The `ephemeral_disk` block's `migrate` field allows for best-effort migration of
the ephemeral disk data to new nodes. The documentation says the `migrate` field
is only respected if `sticky=true`, but in fact if client ACLs are not set the
data is migrated even if `sticky=false`.
The existing behavior when client ACLs are disabled has existed since the early
implementation, so "fixing" that case now would silently break backwards
compatibility. Additionally, having `migrate` not imply `sticky` seems
nonsensical: it suggests that if we place on a new node we migrate the data but
if we place on the same node, we throw the data away!
Update so that `migrate=true` implies `sticky=true` as follows:
* The failure mode when client ACLs are enabled comes from the server not passing
along a migration token. Update the server so that the server provides a
migration token whenever `migrate=true` and not just when `sticky=true` too.
* Update the scheduler so that `migrate` implies `sticky`.
* Update the client so that we check for `migrate || sticky` where appropriate.
* Refactor the E2E tests to move them off the old framework and make the intention
of the test more clear.
While working on several open drain issues, I'm fixing up the E2E tests. This
subset of tests being refactored are existing ones that already work. I'm
shipping these as their own PR to keep review sizes manageable when I push up
PRs in the next few days for #9902, #12314, and #12915.
The e2e/acl package has some nice helpers for tracking and cleaning up ACL
objects, but they are currently private. Export them so I can abuse them in
other e2e tests.
This changeset provides a matrix test of ACL enforcement across several
dimensions:
* anonymous vs bogus vs valid tokens
* permitted vs not permitted by policy
* request sent to server vs sent to client (and forwarded)
The vSphere plugin is exclusive to k8s because it relies on k8s-APIs (and
crashes without them being present). Upstream unfortunately will not support
Nomad, so we shouldn't refer to it in our concept docs here.
Nomad's security model requires mTLS in order to secure client-to-server and
server-to-server communications. Configuring ACLs alone is not enough. Loudly
warn the user if mTLS is not configured in non-dev modes.
Requests without an ACL token that pass thru the client's HTTP API are treated
as though they come from the client itself. This allows bypass of ACLs on RPC
requests where ACL permissions are checked (like `Job.Register`). Invalid tokens
are correctly rejected.
Fix the bypass by only setting a client ID on the identity if we have a valid node secret.
Note that this changeset will break rate metrics for RPCs sent by clients
without a client secret such as `Node.GetClientAllocs`; these requests will be
recorded as anonymous.
Future work should:
* Ensure the node secret is sent with all client-driven RPCs except
`Node.Register` which is TOFU.
* Create a new `acl.ACL` object from client requests so that we
can enforce ACLs for all endpoints in a uniform way that's less error-prone.~
These endpoints all refer to JobID by the time you get to the RPC request
layer, but the HTTP handler functions call the field JobName, which is a different
field (... though often with the same value).
This update changes the behaviour when following logs from an
allocation, so that both stdout and stderr files streamed when the
operator supplies the follow flag. The previous behaviour is held
when all other flags and situations are provided.
Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
The allocrunner has a facility for passing data written by allocrunner hooks to
taskrunner hooks. Currently the only consumers of this facility are the
allocrunner CSI hook (which writes data) and the taskrunner volume hook (which
reads that same data).
The allocrunner hook for CSI volumes doesn't set the alloc hook resources
atomically. Instead, it gets the current resources and then writes a new version
back. Because the CSI hook is currently the only writer and all readers happen
long afterwards, this should be safe but #16623 shows there's some sequence of
events during restore where this breaks down.
Refactor hook resources so that hook data is accessed via setters and getters
that hold the mutex.
The workflow described in the docs for apt installation is deprecated. Update to
match the workflow described in the Tutorials and official packaging guide.
Instead of attempting to pick each individual commit in a PR using
`BACKPORT_MERGE_COMMIT` only picks the commit that was merged into
`main`.
This reduces the amount of work done during a backport, generating
cleaner merges and avoiding potential issues on specific commits.
With this setting PRs that are not squashed will fail to backport and
must be handled manually, but those are considered exceptions.
* Bones of JWT detection
* JWT to token pipeline complete
* Some live-demo fixes for template language
* findSelf and loginJWT funcs made async
* Acceptance tests and mirage mocks for JWT login
* [ui] Allow for multiple JWT auth methods in the UI (#16665)
* Split selectable jwt methods
* repositions the dropdown to be next to the input field
When we added recovery of pause containers in #16352 we called the recovery
function from the plugin factory function. But in our plugin setup protocol, a
plugin isn't ready for use until we call `SetConfig`. This meant that
recovering pause containers was always done with the default
config. Setting up the Docker client only happens once, so setting the wrong
config in the recovery function also means that all other Docker API calls will
use the default config.
Move the `recoveryPauseContainers` call into the `SetConfig`. Fix the error
handling so that we return any error but also don't log when the context is
canceled, which happens twice during normal startup as we fingerprint the
driver.
The `docker` driver cannot expand capabilities beyond the default set when the
task is a non-root user. Clarify this in the documentation of `allow_caps` and
update the `cap_add` and `cap_drop` to match the `exec` driver, which has more
clear language overall.
Currently, the `exec` driver is only setting the Bounding set, which is
not sufficient to actually enable the requisite capabilities for the
task process. In order for the capabilities to survive `execve`
performed by libcontainer, the `Permitted`, `Inheritable`, and `Ambient`
sets must also be set.
Per CAPABILITIES (7):
> Ambient: This is a set of capabilities that are preserved across an
> execve(2) of a program that is not privileged. The ambient capability
> set obeys the invariant that no capability can ever be ambient if it
> is not both permitted and inheritable.