* users: create cache for user lookups
This PR introduces a global cache for OS user lookups. This should
relieve pressure on the OS domain/directory lookups, which would be
queried more now that Task API exists.
Hits are cached for 1 hour, and misses are cached for 1 minute. These
values are fairly arbitrary - we can tweak them if there is any reason to.
Closes#16010
* users: delete expired negative entry from cache
This change introduces the Task API: a portable way for tasks to access Nomad's HTTP API. This particular implementation uses a Unix Domain Socket and, unlike the agent's HTTP API, always requires authentication even if ACLs are disabled.
This PR contains the core feature and tests but followup work is required for the following TODO items:
- Docs - might do in a followup since dynamic node metadata / task api / workload id all need to interlink
- Unit tests for auth middleware
- Caching for auth middleware
- Rate limiting on negative lookups for auth middleware
---------
Co-authored-by: Seth Hoenig <shoenig@duck.com>
Add `identity` jobspec block to expose workload identity tokens to tasks.
---------
Co-authored-by: Anders <mail@anars.dk>
Co-authored-by: Tim Gross <tgross@hashicorp.com>
Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
* client: sandbox go-getter subprocess with landlock
This PR re-implements the getter package for artifact downloads as a subprocess.
Key changes include
On all platforms, run getter as a child process of the Nomad agent.
On Linux platforms running as root, run the child process as the nobody user.
On supporting Linux kernels, uses landlock for filesystem isolation (via go-landlock).
On all platforms, restrict environment variables of the child process to a static set.
notably TMP/TEMP now points within the allocation's task directory
kernel.landlock attribute is fingerprinted (version number or unavailable)
These changes make Nomad client more resilient against a faulty go-getter implementation that may panic, and more secure against bad actors attempting to use artifact downloads as a privilege escalation vector.
Adds new e2e/artifact suite for ensuring artifact downloading works.
TODO: Windows git test (need to modify the image, etc... followup PR)
* landlock: fixup items from cr
* cr: fixup tests and go.mod file
* helpers: lockfree lookup of nobody user on linux and darwin
This PR continues the nobody user lookup saga, by making the nobody
user lookup lock-free on linux and darwin.
By doing the lookup in an init block this originally broke on Windows,
where we must avoid doing the lookup at all. We can get around that
breakage by only doing the lookup on linux/darwin where the nobody
user is going to exist.
Also return the nobody user by value so that a copy is created that
cannot be modified by callers of Nobody().
* helper: move nobody code into unix file
In #14742 we introduced a cached lookup of the `nobody` user, which is only ever
called on Unixish machines. But the initial caching was being done in an `init`
block, which meant it was being run on Windows as well. This prevents the Nomad
agent from starting on Windows.
An alternative fix here would be to have a separate `init` block for Windows and
Unix, but this potentially masks incorrect behavior if we accidentally added a
call to the `Nobody()` method on Windows later. This way we're forced to handle
the error in the caller.
* client: protect user lookups with global lock
This PR updates Nomad client to always do user lookups while holding
a global process lock. This is to prevent concurrency unsafe implementations
of NSS, but still enabling NSS lookups of users (i.e. cannot not use osusergo).
* cl: add cl