Commit Graph

60 Commits

Author SHA1 Message Date
Peter Lobsinger bca34bd17c
perf: report unused inputs for the tar rule (#951)
* perf: report unused inputs for the tar rule

The `mtree` spec passed to the `tar` rule very often selects a subset of the
inputs made available through the `srcs` attribute. In many cases, these
subsets do not break down cleanly along dependency-tree lines and there
is no simple way to just pass less content to the `tar` rule.

One prominent example where this occurs is when constructing the tars
for OCI image layers. For instance when [building a Python-based
container image](https://github.com/bazel-contrib/rules_oci/blob/main/docs/python.md),
we might want to split the Python interpreter, third-party dependencies, and
application code into their own layers. This is done by [filtering the
`mtree_spec`](85cb2aaf8c/oci_python_image/py_layer.bzl (L39)).

However, in the operation to construct a `tar` from a subsetted mtree,
it is usually still an unsubsetted tree of `srcs` that gets passed. As
a result, the subset tarball is considered dependent upon a larger set
of sources than is strictly necessary.

This over-scoping runs counter to a very common objective associated with
breaking up an image into layers - isolating churn to a smaller slice of
the application. Because of the spurious relationships established in
Bazel's dependency graph, all tars get rebuilt anytime any content in
the application gets changed. Tar rebuilds can even be triggered by
changes to files that are completely filtered-out from all layers of the container.

Redundant creation of archive content is usually not too computationally
intensive, but the archives can be quite large in some cases, and
avoiding a rebuild might free up gigabytes of disk and/or network
bandwidth for better use. In addition, eliminating the spurious dependency
edges removes erroneous constraints applied to the build action schedule;
these tend to push all tar-building operations towards the end of a
build, even when some archive construction could be scheduled much earlier.
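
The over-scoping described above can be illustrated with a small Python sketch. This is not the rule's implementation: it assumes a simplified mtree spec in which each entry names its source file through a `content=` (or `contents=`) keyword whose value runs up to the next whitespace, and it ignores `vis` escaping entirely. The helper names `referenced_paths` and `unused_inputs`, and the example paths, are hypothetical.

```python
import re

# Matches the `content=` / `contents=` keyword in an mtree line (simplified:
# the value is assumed to run up to the next whitespace character).
_CONTENT_RE = re.compile(r"(?:^|\s)contents?=(\S+)")


def referenced_paths(mtree_lines):
    """Collect the source paths that an mtree spec actually references."""
    kept = set()
    for line in mtree_lines:
        match = _CONTENT_RE.search(line)
        if match:
            kept.add(match.group(1))
    return kept


def unused_inputs(srcs, mtree_lines):
    """Return the inputs in `srcs` that the (subsetted) mtree never names."""
    kept = referenced_paths(mtree_lines)
    return sorted(path for path in srcs if path not in kept)


# All three files are passed as `srcs`, but the layer's mtree keeps only one.
srcs = ["app/main.py", "site-packages/requests/api.py", "python/bin/python3"]
app_layer = [
    "app/main.py uid=0 gid=0 mode=0755 type=file content=app/main.py",
]
print(unused_inputs(srcs, app_layer))
```

In this toy setup a change to either of the two listed files would not need to invalidate the application layer's tar, which is the relationship the unused-inputs report is meant to convey to Bazel.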

## Risk assessment and mitigation

The `unused_inputs_list` mechanism used to report spurious dependency
relationships is a bit difficult to use. Reporting an actually-used
input as unused can create difficult-to-diagnose problems down the line.

However, the behaviour of the `mtree`-based `tar` rule is sufficiently
simple and self-contained that I am fairly confident that this rule's
used/unused set can be determined accurately in a maintainable fashion.

Out of an abundance of caution I have gated this feature behind a
default-off flag. The `tar` rule will continue to operate as it had
before - typically over-reporting dependencies - unless the
`--@aspect_bazel_lib//lib:tar_compute_unused_inputs` flag is passed.

### Filter accuracy

The `vis` encoding used by the `mtree` format to resiliently handle path
names has a small amount of "play" to it - it is reversible but the
encoded representation of a string is not unique. Two unequal encoded
strings might decode to the same value; this can happen when at least one
of the encoded strings contains unnecessary escapes that are nevertheless
honoured by the decoder.

The unused-inputs set is determined using a filter that compares
`vis`-encoded strings. In the presence of non-canonically-encoded
paths, false-mismatches can lead to falsely reporting that an input is
unused.

The only `vis`-encoded path content that is under the control of callers
is the `mtree` content itself; all other `vis`-encoded strings are
constructed internally to this package, not exposed publicly, and are
all derived using the `lib/private/tar.bzl%_vis_encode` function; all of
these paths are expected to compare exactly. Additionally, it is expected that
many/most users will use this package's helpers (e.g. `mtree_spec`) when
crafting their mtree content; such content is also safe. It is only when
the user crafts their own mtree, or modifies an mtree spec's `content=`
fields' encoding in some way, that a risk of inaccurate reporting
arises. The chances for this are expected to be minor since this seems
like an inconvenient and not-particularly-useful thing for a user to go
out of their way to do.
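
To make the non-uniqueness concrete, here is a hedged Python sketch. It assumes a drastically simplified decoder that understands only three-digit octal escapes (`\ooo`); the real `vis` encoding and the decoder in libarchive handle more forms. `decode_octal_escapes` is a hypothetical helper written for this illustration, not part of this package.

```python
import re

# Three-digit octal escape such as \040 (space).
_OCTAL_ESCAPE = re.compile(r"\\([0-7]{3})")


def decode_octal_escapes(encoded):
    """Decode `\\ooo` octal escapes; every other character passes through."""
    return _OCTAL_ESCAPE.sub(lambda m: chr(int(m.group(1), 8)), encoded)


canonical = r"dir/my\040file"     # only the space is escaped
redundant = r"dir/my\040\146ile"  # 'f' is escaped as well, unnecessarily

# The encoded strings differ, so a filter comparing them sees a mismatch...
print(canonical == redundant)
# ...even though both decode to the same path.
print(decode_octal_escapes(canonical) == decode_octal_escapes(redundant))
```

A filter that compares the encoded forms would treat the second spelling as naming a different file, which is how a hand-written, non-canonically-encoded mtree entry could cause a used input to be reported as unused.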

* Also include other bsdtar toolchain files in keep set

* Add tri-state attribute to control unused-inputs behaviour

This control surface provides for granular control of the feature. The
interface is selected to mirror the common behaviour of `stamp` attributes.

* Add bzl_library level dep

* Update docs

* pre-commit

* Add reminder to change flag default on major-version bump

* Add note about how to make unused input computation exactly correct

* Add a test for unused_inputs listing

* Support alternate contents= form

This is accepted by bsdtar/libarchive. In fact `contents=` is the only one
of the pair documented in `mtree(5)`; `content=` is an undocumented
alternate form supported by libarchive.

* Don't try to prune the unprunable

Bazel's interpretation of unused_inputs_list cannot accommodate certain
things in filenames. These are also likely to mess up our own
line-oriented protocol in the shell script that produces this file.

Co-authored-by: Sahin Yort <thesayyn@gmail.com>

* Rerun docs update

---------

Co-authored-by: Sahin Yort <thesayyn@gmail.com>
2024-10-13 09:58:56 -07:00
Sahin Yort d1d063f3e5
feat: introduce zstd toolchain (#831) 2024-05-03 16:12:56 -07:00
Greg Magolan 41a9295f07
feat: add //lib:enable_runfiles config_setting (#807) 2024-04-01 07:27:38 -07:00
Alex Eagle 0fc838839c
feat: add a helper for rules to work with resource_sets (#792)
* feat: add a helper for rules to work with resource_sets

This API is poorly designed and needs some help to let rule users pick a value in cases where they aren't also the rule author

* chore: add some cpu resource_set values as well
2024-03-18 15:38:24 -07:00
Sahin Yort 197b2da974
feat: support location expansion in tar (#774) 2024-03-01 14:51:47 -08:00
Sahin Yort 4f6b4bd5cb
feat: implement bats test runner (#699) 2023-12-20 13:08:47 -08:00
Alex Eagle 303779e9ef
fix(tar): propagate testonly attr to mtree_spec (#691) 2023-12-13 15:21:02 -08:00
Greg Magolan 602b7b8f80
Revert: feat: expose a config_setting for copy execution_requirements (#606) (#640) 2023-10-31 15:19:38 -07:00
Alex Eagle eda4929c72
chore: add windows binaries (#610)
* chore: add windows binaries

* chore: fix/exclude windows brokenness

* chore: try to see why diff tests fail on windows

* fix: rm bazelisk rc again for windows

* fix: try our own diff_test

* chore: use only our own diff_test
2023-10-10 14:13:17 -07:00
Alex Eagle 4bfe55711a
feat: expose a config_setting for copy execution_requirements (#606)
* feat: expose a config_setting for copy execution_requirements

Fixes #604

* chore: add user docs

* chore: improve docs

* chore: better link to copy_file
2023-10-09 13:57:34 -07:00
Greg Magolan cb9b74f41e
refactor: remove to_workspace_path and to_manifest_path (#590) 2023-10-06 08:37:53 -07:00
Alex Eagle 6a4381bf07
chore: run gazelle (#584)
Also delete the local_config_platform workaround for bazel 5 which made our docs setup complex
2023-10-05 14:22:38 -07:00
Alex Eagle a283a8216d feat: add a tar toolchain (#468)
* feat: add a BSD tar toolchain

@thesayyn discovered that it has a feature which should make it a drop-in replacement for pkg_tar
including fine-grained file permissions and symlinks:
https://man.freebsd.org/cgi/man.cgi?mtree(8)

* show example of mtree usage

* feat: introduce tar rule

* cleanup and get test passing

* more cleanup

* chore: add support for compress flags

* chore: add docs

* chore: add docs

* feat: implement linux bsdtar toolchain (#566)

* chore: improve target naming

* WIP: args

* feat: generate mtree spec

Also allow arbitrary args

* refactor: mtree is required

* refactor: style nits

* fix: support mix of source and generated artifacts

* feat: demonstrate strip_prefix

* chore: regen docs

* fix: make host toolchain a fallback toolchain

* fix: include libarchive13.so when installing BSD tar

* chore: buildifier

* fix: aarch64 cpu constraint

* fix(ci): include libarchive13.so when running tar

* chore: add libnettle

* refactor: inputs mutated less

* refactor: remove unneeded substitution arg

* refactor: don't advertise unsupported modes

* fix: hack enough to make it run on my machine

* chore: dynamic libraries included in sh_binary under toolchain

* make sh_binary work

* refactor: drop arm64 for now

* fix toolchain

* fix test

* chore: improve test naming scheme

---------

Co-authored-by: Sahin Yort <thesayyn@gmail.com>
2023-10-03 13:50:55 -07:00
Alex Eagle e00ea2b977
chore: update pre-commit buildifier (#563)
It needs to match the one CI runs in
https://github.com/aspect-build/bazel-lib/actions/runs/6357283303/job/17268197322
2023-09-29 14:42:33 -07:00
Sahin Yort ef364b54b4
refactor: consume tools from source if unstamped (#543) 2023-09-24 15:06:10 -07:00
Marc Redemske 45986f000d
feat: add list utils (#512)
* feat: add list utils

* chore: add docs for lists

* fixes according to review

---------

Co-authored-by: Alex Eagle <alex@aspect.dev>
2023-09-21 10:12:47 -07:00
Alex Eagle 882bc95615
feat: expand_template allows inline template content (#533)
Co-authored-by: thesayyn <thesayyn@gmail.com>
2023-09-20 18:32:05 -07:00
Jason Bedard 47dd4ce9fd chore: run buildifier 2023-06-13 16:28:17 -07:00
Sahin Yort 6af964f261
feat: support stamp substitutions (#435) 2023-05-16 16:14:50 -07:00
Sahin Yort 781d1cbebf
feat: implement chr, ord, and hex (#425) 2023-05-15 12:15:55 -07:00
Alex Eagle 5162ae4d8c feat: detect bzlmod automatically rather than requiring the user to set the flag.
Thanks to @aherrmann for pointing out this is possible: https://github.com/aherrmann/demo-bazel-detect-bzlmod-config-setting
2023-03-31 14:56:22 -07:00
Alex Eagle 035473873a fix: disable stardoc under bzlmod and windows by default
We've been making these exceptions in downstream repos
2023-03-11 13:58:03 -08:00
Greg Magolan 569fa374ef
feat: add write_aspect_bazelrc_presets macro (#370) 2023-02-17 14:34:06 -08:00
Greg Magolan 63f5aff803
feat: hardlink generated files in copy_to_directory and copy_directory instead of copying (#321) 2023-01-16 17:19:13 -08:00
Greg Magolan ce043b299d
fix: fix pathing issue in new copy_to_directory binary tool on Windows (#334) 2023-01-16 15:09:18 -08:00
Sahin Yort 4dc36a97f2
feat: add coreutils toolchain (#332) 2023-01-16 21:02:17 +03:00
Derek Cormier 0df23e54db feat: modify assert_json_matches api before release 2023-01-09 11:31:15 -08:00
Alex Eagle fe867981ee feat(jq): add a diff_test helper
This is useful in rules_swc where we want to check that tsconfig.json and .swcrc have matching paths, and most users will want that too
2023-01-07 13:10:28 -08:00
Greg Magolan fb3ef6e45d
feat: improved copy_to_directory performance & globbing support using copy_to_directory_bin_action (#311) 2023-01-07 11:26:27 -08:00
Derek Cormier 72a26212f2
Add platform_transition_binary rule (#289)
* feat: add platform_transition_binary rule
2022-12-14 11:21:08 -08:00
Greg Magolan 3d73637ee5
feat: add base64 encode & decode utility functions (#292) 2022-11-25 17:00:02 -08:00
Alex Eagle a9dc052c8b feat: add assert_outputs
It's a simple way to make an executable example demonstrate what outputs a rule produces.
See https://github.com/aspect-build/rules_ts/pull/214 for an example usage in the real world.
2022-10-31 15:25:12 -07:00
Greg Magolan f030847908
feat: add maybe_http_archive (#270) 2022-10-27 15:55:57 -07:00
Greg Magolan 6f37a3808b
fix: isolate bzl_library targets for //lib/private:*.bzl in //lib/private/docs package so that platform_utils dep on @local_config_platform//:constraints doesn't leak unless downstream consumer is generating docs (#254) 2022-09-27 09:59:19 -07:00
Greg Magolan 8e230b0721
feat: add bazel_version value to host_repo repository rule (#246) 2022-09-16 11:51:48 -07:00
Greg Magolan 65e852f774
fix: don't generate @aspect_bazel_lib_local_config_platform repository as it is a leaky abstraction for rule consumers and not just rule authors (#243) 2022-09-13 21:58:08 -07:00
Greg Magolan be5c9d06bc
fix: fix bzl_library breakage created with load from @local_config_platform in copy rules (#242) 2022-09-13 20:37:24 -07:00
Alex Eagle 896ee0c1f0 chore: set test timeouts to short
I recently enabled --test_verbose_timeout_warnings and that caused a bunch of warnings in our build.
This fixes it, and adds a utility for us or others to make test-wrapping macros that set to short by default.
2022-08-20 13:58:43 -07:00
Greg Magolan c867e37856
fix: fix make var expansion in expand_template (#213) 2022-08-11 19:30:45 -07:00
Alex Eagle 20d759f669 feat: add utility for asserting that a file contains a string
Useful for basic smoke tests of bazel outputs
2022-08-04 18:02:29 -07:00
Greg Magolan a945d66eb5
chore: add glob_match to docs (#195) 2022-07-28 11:15:14 -07:00
Alex Eagle de081fb72e feat: basic stamping support 2022-07-14 22:37:42 -07:00
Greg Magolan 41ce34470f
feat: add @aspect_bazel_lib_host repository and normalize function names in repo_utils (#90) 2022-04-28 14:18:06 -07:00
Derek Cormier 3e4024c785
feat: yq (#80) 2022-04-19 21:45:06 -07:00
Greg Magolan b2955dcb05
feat: add copy_to_bin rule (#69) 2022-04-12 16:31:04 -07:00
Greg Magolan 322bbc92df feat: add repo_utils with fork of @bazel_tools patch function that takes a working_directory argument 2022-04-11 18:15:32 -07:00
Derek Cormier a23d1b03f7
feat: add copy_directory (#63) 2022-04-03 17:52:03 -07:00
Greg Magolan e30e89fa3d
feat: add run_binary with output directory support & improved makevar expansion (#57) 2022-03-31 20:04:35 -07:00
Alex Eagle 096133e5d2
feat: platform_transition_filegroup (#55)
* feat: platform_transition_filegroup

Lifted from https://github.com/aspect-build/gcc-toolchain/pull/8/files
See also https://github.com/fmeum/rules_meta/blob/main/meta/internal/meta.bzl#L4

* test: add tests for transition filegroup
2022-03-30 21:04:14 -07:00
Greg Magolan bda5c632be
feat: replace default_info_files with output_files which adds output_group attribute (#50) 2022-03-15 17:36:22 -07:00