Commit Graph

398 Commits

Author SHA1 Message Date
Peter Lobsinger bca34bd17c
perf: report unused inputs for the tar rule (#951)
* perf: report unused inputs for the tar rule

The `mtree` spec passed to the `tar` rule very often selects a subset of the
inputs made available through the `srcs` attribute. In many cases, these
subsets do not break down cleanly along dependency-tree lines and there
is no simple way to just pass less content to the `tar` rule.

One prominent example where this occurs is when constructing the tars
for OCI image layers. For instance when [building a Python-based
container image](https://github.com/bazel-contrib/rules_oci/blob/main/docs/python.md),
we might want to split the Python interpreter, third-party dependencies, and
application code into their own layers. This is done by [filtering the
`mtree_spec`](85cb2aaf8c/oci_python_image/py_layer.bzl (L39)).

However, in the operation to construct a `tar` from a subsetted mtree,
it is usually still an unsubsetted tree of `srcs` that gets passed. As
a result, the subset tarball is considered dependent upon a larger set
of sources than is strictly necessary.

This over-scoping runs counter to a very common objective associated with
breaking up an image into layers - isolating churn to a smaller slice of
the application. Because of the spurious relationships established in
Bazel's dependency graph, all tars get rebuilt anytime any content in
the application gets changed. Tar rebuilds can even be triggered by
changes to files that are completely filtered-out from all layers of the container.
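
To make the over-scoping described above concrete, here is a minimal Python sketch of how the unused subset of `srcs` could be derived from a filtered mtree. It is an illustration only, not the rule's implementation: the function name `unused_inputs` and all file paths are hypothetical, and it compares plain strings and only looks at `content=` fields.

```python
def unused_inputs(mtree_lines, srcs):
    """Return the srcs that no content= field in the mtree refers to."""
    used = set()
    for line in mtree_lines:
        # The first word is the archive path; the rest are key=value fields.
        for field in line.split()[1:]:
            if field.startswith("content="):
                used.add(field[len("content="):])
    return [s for s in srcs if s not in used]


# Hypothetical inputs: all sources are passed, but the layer selects only one.
srcs = ["app/main.py", "site-packages/requests/api.py", "python/bin/python3"]
app_layer_mtree = ["./app/main.py type=file content=app/main.py"]

spurious = unused_inputs(app_layer_mtree, srcs)
```

In this sketch `spurious` holds the two files that the application layer depends on in Bazel's graph without ever reading them.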

Redundant creation of archive content is usually not too computationally
intensive, but the archives can be quite large in some cases, and
avoiding a rebuild might free up gigabytes of disk and/or network
bandwidth for better use. In addition, eliminating the spurious dependency edges
removes erroneous constraints applied to the build action schedule;
these tend to push all Tar-building operations towards the end of a
build, even when some archive construction could be scheduled much earlier.

## Risk assessment and mitigation

The `unused_inputs_list` mechanism used to report spurious dependency
relationships is a bit difficult to use. Reporting an actually-used
input as unused can create difficult-to-diagnose problems down the line.

However, the behaviour of the `mtree`-based `tar` rule is sufficiently
simple and self-contained that I am fairly confident that this rule's
used/unused set can be determined accurately in a maintainable fashion.

Out of an abundance of caution I have gated this feature behind a
default-off flag. The `tar` rule will continue to operate as it had
before - typically over-reporting dependencies - unless the
`--@aspect_bazel_lib//lib:tar_compute_unused_inputs` flag is passed.

### Filter accuracy

The `vis` encoding used by the `mtree` format to resiliently handle path
names has a small amount of "play" to it - it is reversible but the
encoded representation of a string is not unique. Two unequal encoded
strings might decode to the same value; this
can happen when at least one of the encoded strings contains unnecessary
escapes that are nevertheless honoured by the decoder.

The unused-inputs set is determined using a filter that compares
`vis`-encoded strings. In the presence of non-canonically-encoded
paths, false mismatches can lead to falsely reporting that an input is
unused.
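
The non-uniqueness described above can be seen with a small Python sketch. It assumes a deliberately simplified decoder that understands only backslash-octal escapes (such as `\040` for a space); the real `vis` encoding has more forms, and the name `vis_decode_octal` is hypothetical.

```python
import re

# Simplified decoder: handles only backslash-octal escapes such as "\040".
# This is an illustrative subset, not the full vis encoding.
_OCTAL_ESCAPE = re.compile(r"\\([0-7]{3})")


def vis_decode_octal(encoded: str) -> str:
    """Replace each \\ooo escape with the character it denotes."""
    return _OCTAL_ESCAPE.sub(lambda m: chr(int(m.group(1), 8)), encoded)


canonical = "dir/a\\040b.txt"      # only the space is escaped
redundant = "dir/\\141\\040b.txt"  # "a" is escaped unnecessarily as \141

same_decoded = vis_decode_octal(canonical) == vis_decode_octal(redundant)
same_encoded = canonical == redundant
```

Both strings decode to the same path, yet a filter that compares the encoded forms treats them as different, which is the false-mismatch case.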

The only `vis`-encoded path content that is under the control of callers
is the `mtree` content itself; all other `vis`-encoded strings are
constructed internally to this package, not exposed publicly, and are
all derived using the `lib/private/tar.bzl%_vis_encode` function; all of
these paths are expected to compare exactly. Additionally, it is expected that
many/most users will use this package's helpers (e.g. `mtree_spec`) when
crafting their mtree content; such content is also safe. It is only when
the user crafts their own mtree, or modifies an mtree spec's `content=`
fields' encoding in some way, that a risk of inaccurate reporting
arises. The chances for this are expected to be minor since this seems
like an inconvenient and not-particularly-useful thing for a user to go
out of their way to do.

* Also include other bsdtar toolchain files in keep set

* Add tri-state attribute to control unused-inputs behaviour

This control surface provides for granular control of the feature. The
interface is selected to mirror the common behaviour of `stamp` attributes.

* Add bzl_library level dep

* Update docs

* pre-commit

* Add reminder to change flag default on major-version bump

* Add note about how to make unused input computation exactly correct

* Add a test for unused_inputs listing

* Support alternate contents= form

This is accepted by bsdtar/libarchive. In fact `contents=` is the only one of
the pair documented in `mtree(5)`; `content=` is an undocumented
alternate form supported by libarchive.
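
As a rough Python sketch of what accepting both spellings might look like (the helper name `mtree_content_path` and the sample paths are hypothetical, and the fields are assumed to be whitespace-separated `key=value` pairs following the entry path):

```python
from typing import Optional


def mtree_content_path(line: str) -> Optional[str]:
    """Return the value of the content=/contents= keyword on an mtree line."""
    # The first word is the entry path; the remaining words are keywords.
    for field in line.split()[1:]:
        key, sep, value = field.partition("=")
        if sep and key in ("content", "contents"):
            return value
    return None
```

Treating the two keywords identically means an mtree written with either spelling yields the same set of referenced files.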

* Don't try to prune the unprunable

Bazel's interpretation of unused_inputs_list cannot accommodate certain
things in filenames. These are also likely to mess up our own
line-oriented protocol in the shell script that produces this file.

Co-authored-by: Sahin Yort <thesayyn@gmail.com>

* Rerun docs update

---------

Co-authored-by: Sahin Yort <thesayyn@gmail.com>
2024-10-13 09:58:56 -07:00
Greg Magolan 0ed8bdd8ab
chore: update old reference to aspect-build/bazel-lib (#962) 2024-10-13 09:58:26 -07:00
Marcel ca80d07fca
Fix unknown repo error with mtree_mutate and Bzlmod (#948)
With Bzlmod, every repo has its own namespace. Using Label() should make sure it uses the namespace of the .bzl file instead of the caller's one.
2024-10-04 15:41:24 +00:00
Alex Eagle 1b4d9a7f04
feat(presets): java bazelrc options (#947)
* feat(presets): java bazelrc options

* chore: improved comments
2024-10-04 15:34:34 +00:00
Peter Lobsinger 0db9fbe519
feat: support bzlmod runfiles lookups (#953)
* feat: support bzlmod repo name aliases in tarred runfiles

Under bzlmod, repos have aliases in addition to their canonical names;
in order for lookups using these canonical names to function properly,
a file named `_repo_mapping` is located in the root of the runfiles tree
and consulted to perform repo-name translation.

See: https://github.com/bazelbuild/proposals/blob/main/designs/2022-07-21-locating-runfiles-with-bzlmod.md
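
A minimal Python sketch of the lookup that `_repo_mapping` enables, assuming the line format described in the linked design (source repo canonical name, apparent name, target repo canonical name, separated by commas). The helper name and the repository names in the sample are hypothetical.

```python
def parse_repo_mapping(text):
    """Map (source canonical name, apparent name) to target canonical name."""
    mapping = {}
    for line in text.splitlines():
        if not line:
            continue
        source, apparent, target = line.split(",")
        mapping[(source, apparent)] = target
    return mapping


# Hypothetical manifest; the main repository's canonical name is empty.
sample = ",my_app,_main\n,bazel_lib,aspect_bazel_lib~\n"
mapping = parse_repo_mapping(sample)
canonical = mapping[("", "bazel_lib")]
```

A runfiles lookup made from the main repository for `bazel_lib/...` would then be rewritten to use the directory named by `canonical`.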

* pre-commit formatting

* Enable runfiles test under bzlmod

It works now.
2024-09-30 21:10:06 -07:00
Alex Eagle c5f65e8890
fix: pick up bsdtar windows fix (#942)
see https://github.com/aspect-build/bsdtar-prebuilt/pull/10
2024-09-18 18:29:33 -07:00
Derek Cormier 716af223c2
feat: add an option to not include copy_to_directory output in runfiles (#886)
* feat: add an option to not include copy_to_directory output in runfiles

* chore: docs update

* chore: don't repeat default

---------

Co-authored-by: Alex Eagle <alex@aspect.dev>
2024-09-19 01:16:14 +00:00
Justin Pinkul ad48c0d855
fix: moving the preserve mtime test logic to Go for portability (#908)
* Fix: Moving the preserve mtime test logic to Go for portability

* add tags to disable remote caching, execution and force the test to always re-run

* include docs

* use the runfiles library for windows compatibility

* mark the test as manual

* remove duplicate word in comment

* chore: reduce duplication of the long caveats text

---------

Co-authored-by: Alex Eagle <alex@aspect.dev>
2024-09-17 17:25:26 -07:00
Alex Eagle 2f65c8c0c7
chore: update git urls (#926)
This repository was donated to the Linux Foundation and is now in the bazel-contrib GH org
2024-09-17 17:05:35 -07:00
Sahin Yort 8f0b38004e
fix: add empty files to tar (#939) 2024-09-17 17:05:20 -07:00
Alex Eagle 3b6a3d50b1
chore(deps): upgrade to newest bsdtar (#940) 2024-09-17 17:00:54 -07:00
renovate[bot] eb575d5782
chore(deps): update dependency bazel_skylib to v1.7.1 (#924)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-09-15 01:45:22 -07:00
David Zbarsky 4c1267fc27
perf: improve copy_file.bzl progress_message (#931) 2024-09-10 10:08:34 -07:00
Peter Lobsinger de9fd596fd
chore(deps): update coreutils to v0.0.27 (#905)
* chore(deps): update coreutils to v0.0.27

This release has an `aarch64-apple-darwin` binary, eliminating the need
for a `version_override` hack to support that platform.

* chore: restore previous coreutils

Users should be able to pin and not have us break them

---------

Co-authored-by: Alex Eagle <alex@aspect.dev>
2024-09-02 15:39:32 -07:00
Greg Magolan fb0677ad57
chore: cleanup before bazel-contrib handoff (#918)
* chore: cleanup before bazel-contrib handoff

* chore: apply lint fixes

---------

Co-authored-by: Alex Eagle <alex@aspect.dev>
2024-09-02 09:32:38 -07:00
Greg Magolan 0870fadf4c
chore: remove non-bzlmod dep on @internal_platforms_do_not_use//host:constraints.bzl now that root workspace is bzlmod-only (#916) 2024-08-20 16:14:37 -04:00
Greg Magolan 11aacaf5df
chore: enable go, shell, yaml formatters and bazel run //:format (#917) 2024-08-20 11:56:40 -07:00
Greg Magolan 0e1f1e82c9
chore: bump to Bazel 7.3.1 (#914) 2024-08-20 13:37:52 -04:00
Greg Magolan abbbd54a15
chore: right size tests to suppress bazel warning (#913) 2024-08-19 15:55:21 -07:00
Greg Magolan 9b87fa7050
chore: skip linux only tests on non-linux platforms (#912) 2024-08-19 18:26:12 -04:00
Greg Magolan 73d021fb36
fix: correctly split quoted args (#909) 2024-08-19 16:36:41 -04:00
Greg Magolan 62b2fd06aa
chore: fixup test sizes to resolve warnings (#911) 2024-08-19 15:33:54 -04:00
Greg Magolan eb55a3c03f
refactor: deprecated expand_locations which is just pass-through to ctx.expand_location() (#910) 2024-08-19 15:28:45 -04:00
Alex Eagle 385717a2a5
chore: turn off bzlmod misguided warning (#901)
* chore: turn off bzlmod misguided warning

These are misinformed; the module resolver should be permitted to find an MVS solution

* chore: update golden
2024-08-14 10:41:50 -07:00
Alex Eagle 5d09fc1b83
fix(docs): description of jq example didn't match behavior (#897)
* fix(docs): description of jq example didn't match behavior

I think this was wrong? Wish our examples were also executable...

---------

Co-authored-by: Derek Cormier <derek@aspect.dev>
2024-08-11 15:56:06 -07:00
Justin Pinkul 74ac451d8a
Adding a preserve time feature to copy_to_directory and copy_directory (#898) 2024-08-10 22:08:56 -07:00
Alex Eagle 0f5e1dcafd
chore(deps): upgrade stardoc (#894)
* chore(deps): upgrade stardoc

This uses the Bazel 7 'starlark_doc_extract' rule which our docsite expects for slurping data.

* chore: stardoc setup in WORKSPACE too

* chore: skip stardoc on bazel 6 in cases where the legacy extractor produces different docstrings
2024-08-08 12:56:11 -07:00
Alex Eagle 109f32eefb
docs(tar): point to the tests as useful examples (#892)
* docs(tar): point to the tests as useful examples

Improve the content to make it easier to reference as examples of usage.

* fix broken link
2024-08-05 11:18:57 -07:00
Markus Hofbauer cdbfe4190c
fix(typos): Fix almost all typos with hook (#884)
* Fix almost all typos with hook

* align docs
2024-07-31 10:09:17 -04:00
Matt 59453e5c50
fix: Set size to a default value as well as timeout. (#839)
* fix: Set size to a default value as well as timeout.

Currently, we are unable to run our `write_source_files` tests in our pre-upload checks, because we have `--test_size_filter=small`, and setting `size` will attempt to set it on both the run rule and the test rule, the former being invalid.

* code review feedback

* chore: fix one more test that should use size for defaulting

---------

Co-authored-by: Alex Eagle <alex@aspect.dev>
2024-07-19 12:50:50 -07:00
Alex Eagle db5556df6f
chore(deps): update bsdtar prebuilt (#882)
This fixes the dynamic lookup of zstd from the PATH.
Fixes #877
2024-07-19 12:16:27 -07:00
Greg Magolan be4b0d6455
chore: upgrade to Aspect Workflows 5.10.9 (#881) 2024-07-18 17:54:51 -07:00
Synchronization Acknowledgement cc956d8589
fix(tar): append slash to top-level directory mtree entries (#852)
* fix(tar): append slash to top-level directory mtree entries

bsdtar's mtree format has a quirk wherein entries without "/" in their
first word are treated as "relative" entries, and "relative" directories
will cause tar to "change directory" into the declared directory entry.
If such a directory is followed by a "relative" entry, then the file
will be created within the directory, instead of at top-level as
expected. To mitigate, we append a slash to top-level directory entries.
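
The mitigation can be sketched in Python as follows. This is an assumption-laden illustration rather than the actual change: the function name `fix_top_level_dir` is hypothetical, and an entry is treated as a directory only when it carries a literal `type=dir` keyword.

```python
def fix_top_level_dir(line: str) -> str:
    """Append "/" to a directory entry whose path contains no slash."""
    fields = line.split()
    if not fields:
        return line
    path = fields[0]
    # A first word without "/" is what bsdtar treats as a "relative" entry.
    if "type=dir" in fields[1:] and "/" not in path:
        fields[0] = path + "/"
    return " ".join(fields)
```

With the trailing slash in place the first word contains a "/", so the entry is no longer treated as "relative" and later top-level files stay at the top level.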

Fixes #851.

* chore: golden files have BINDIR placeholder

---------

Co-authored-by: Alex Eagle <alex@aspect.dev>
2024-07-02 09:27:06 -07:00
Tobias Schlatter 086624ae47
fix(tar): expose package_dir argument in mtree_mutate (#873)
This was likely forgotten in #829 when making the parameters explicit
during review.
2024-07-02 13:29:24 +03:00
Jason Bedard 31b4bb68f6 perf: reduce concatenation in relative_path 2024-06-21 20:05:45 -07:00
Greg Magolan 18ae5a89a6
fix: allow copy_to_directory to have an empty srcs list (#871) 2024-06-21 11:33:00 -06:00
Alex Eagle 3330c38904
chore: upgrade bsdtar to 3.7.4 (#866) 2024-06-17 07:49:35 -07:00
Alex Eagle fb950d38ae
docs: add missing default stamp var (#865)
* docs: add missing default stamp var

* update docs
2024-06-13 09:56:06 -07:00
Josh Giles 3c0dbd5895
fix: Directory hidden files in write_source_file. (#860)
* Fix #667: Dir hidden files in write_source_file.

Copy and manage hidden files starting with "." in write_source_files.

Previously these files were not supported if they were in the top level
of the directory to copy.

* Add test and fix error messages from cp, chmod.

* Also fix executable dir case.

* Fix issue with copying directory rather than contents.
2024-06-11 00:30:53 -07:00
Greg Magolan 00310a5b91
test: add test / example of using root path on a run_binary directory output (#862) 2024-06-05 13:48:07 -07:00
Alex Eagle 22c33dfc51
fix: integrity hashes are now sha256 since #854 (#855) 2024-05-23 16:26:14 -07:00
Alex Eagle 4ad02b7795
refactor(release): switch release integrity to be dynamic (#854)
* refactor(release): switch release integrity to be dynamic

This matches rules_py as documented by
https://blog.aspect.build/releasing-bazel-rulesets-rust

It has the benefit that developers no longer get yelled at to vendor some updated integrity hashes into bazel-lib every time they touch the Go sources.

* refactor: echo should produce trailing newline

* chore: bump action-gh-release to avoid Node 16 warning

* chore: update test that is sensitive to compilation mode

We now only use --compilation_mode=opt when cutting a release
2024-05-23 16:08:35 -07:00
mrmeku 6959b3f807
fix: coreutils download path for darwin_amd64 (#853)
* fix: coreutils download path for darwin_amd64

* fixup

---------

Co-authored-by: Greg Magolan <greg@aspect.dev>
2024-05-23 13:12:36 -07:00
Greg Magolan 155e3f250e
chore: bump buildifier, go and gazelle deps (#845) 2024-05-13 23:30:19 -07:00
Greg Magolan a704c2608b
chore: run buildifier to green up main (#841) 2024-05-13 20:00:42 -07:00
Malte Poll 1697a3275b
fix: coreutils toolchain: Use statically linked linux amd64 variant (#706)
* coreutils toolchain: Use statically linked linux amd64 variant

Uutils has a musl release artifact for linux amd64.
In the future, it should probably also be possible to add an
aarch64 musl toolchain. At the moment, this is not an upstream release
artifact.

* coreutils toolchain: temporarily add back old darwin variant

On release 0.0.26 of uutils/coreutils, the darwin x86_64 binary is missing.
Also, any releases between 0.0.23 and 0.0.26 are missing binary artifacts.
Downgrade coreutils toolchain on darwin x86_64 for now.

https://github.com/uutils/coreutils/releases/tag/0.0.26
2024-05-08 07:06:12 -07:00
Alex Eagle b15dc31a81
fix(tar): handle spaces in input filenames (#835) 2024-05-07 17:52:35 -07:00
Sahin Yort d1d063f3e5
feat: introduce zstd toolchain (#831) 2024-05-03 16:12:56 -07:00
Alex Eagle 977f27f7a0
feat(tar): add ergonomic way to strip_prefix (#829) 2024-05-01 12:36:39 -07:00
Greg Magolan 41413388da
chore: add bazel test support to javascript --config=debug preset (#825) 2024-04-25 16:33:16 -07:00