Split the flag options for the `secure-variables keyring` into their
own subcommands. The gossip keyring CLI will be similarly refactored
and the old version will be deprecated.
* variable index header and deletion hook
* Some visual improvements to edit header and back button
* Mirage fixtures for variables
* Deletion flow test
* Account for disabled state of add-more button
* Transition destination changes after design chat
* Simplified parent folder path by adding to variable model
* Lint cleanup
* test update to reflect path redirect on delete
* No longer have to re-traverse to a deleted var folder after deletion, tests updated
In order to support implicit ACL policies for tasks to get their own
secrets, each task would need to have its own ACL token. This would
add extra raft overhead as well as new garbage collection jobs for
cleaning up task-specific ACL tokens. Instead, Nomad will create a
workload Identity Claim for each task.
An Identity Claim is a JSON Web Token (JWT) signed by the server’s
private key and attached to an Allocation at the time a plan is
applied. The encoded JWT can be submitted as the X-Nomad-Token header
to replace ACL token secret IDs for the RPCs that support identity
claims.
Whenever a key is added to a server’s keyring, it will use the key
as the seed for an Ed25519 public-private keypair. That keypair
will be used for signing the JWT and for verifying the JWT.
This implementation is a ruthlessly minimal approach to support the
secure variables feature. When a JWT is verified, the allocation ID
will be checked against the Nomad state store, and non-existent or
terminal allocation IDs will cause the validation to be rejected. This
is sufficient to support the secure variables feature at launch
without requiring implementation of a background process to renew
soon-to-expire tokens.
* Notify a user if they try to make a variable with an existing path
* Stylize error fade
* Bugfix: if you click the dupe link to a variable you haven't previously loaded, you lack its keyvalues
* rename and typefix for duplicate path warning
Replication for the secure variables keyring. Because only key
metadata is stored in raft, we need to distribute key material
out-of-band from raft replication. A goroutine runs on each server and
watches for changes to the `RootKeyMeta`. When a new key is received,
attempt to fetch the key from the leader. If the leader doesn't have
the key (which may happen if a key is rotated right before a leader
transition), try to get the key from any peer.
* Recursive trie-building with variable paths
* tree structure applied to new path routes and a new util class
* Breadcrumbs for SV paths and prompt when nothing exists at a path
* Lint and test cleanup
* Pre-review cleanup
* lintfix
* Abstracted pathtree each-ins into a new component class
* Path tree component styles
* Types added and PR feedback addressed
* Path tree to variable paths
* Slightly simpler path QP mods
* More pr feedback handling
* Trim moved into a function on variable model
* Traversal and compaction tests for PathTree
* Trim Path tests
* Variable-paths component tests
* Lint fixup for tests
After internal design review, we decided to remove exposing algorithm
choice to the end-user for the initial release. We'll solve nonce
rotation by forcing rotations automatically on key GC (in a core job,
not included in this changeset). Default to AES-256 GCM for the
following criteria:
* faster implementation when hardware acceleration is available
* FIPS compliant
* implementation in pure Go
* post-quantum resistance
Also fixed a bug in the decoding from keystore and switched to a
harder-to-misuse encoding method.
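For reference, AES-256 GCM encryption with Go's standard library looks roughly like the sketch below. The function names and the layout (a random nonce prepended to the ciphertext) are assumptions for illustration, not necessarily what this changeset does.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// newAEAD builds an AES-256 GCM cipher from a 32-byte key.
func newAEAD(key []byte) (cipher.AEAD, error) {
	if len(key) != 32 {
		return nil, errors.New("AES-256 requires a 32-byte key")
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// encrypt returns nonce || ciphertext, authenticating aad as well.
func encrypt(key, plaintext, aad []byte) ([]byte, error) {
	aead, err := newAEAD(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, aead.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	// Passing nonce as dst appends the ciphertext to it.
	return aead.Seal(nonce, nonce, plaintext, aad), nil
}

// decrypt splits off the nonce and verifies the authentication tag.
func decrypt(key, sealed, aad []byte) ([]byte, error) {
	aead, err := newAEAD(key)
	if err != nil {
		return nil, err
	}
	n := aead.NonceSize()
	if len(sealed) < n {
		return nil, errors.New("ciphertext too short")
	}
	return aead.Open(nil, sealed[:n], sealed[n:], aad)
}

func main() {
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	sealed, err := encrypt(key, []byte("secret value"), []byte("key-id"))
	if err != nil {
		panic(err)
	}
	plain, err := decrypt(key, sealed, []byte("key-id"))
	fmt.Println(string(plain), err)
}
```

Random 96-bit nonces are only safe for a bounded number of encryptions under one key, which is why nonce rotation comes up in the design discussion.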
The core jobs to garbage collect unused keys and perform full key
rotations will need to be able to query secure variables by key ID for
efficiency. Add an index to the state store and associated query
function and test.
When a server becomes leader, it will check if there are any keys in
the state store, and create one if there are none. The key metadata will
be replicated via raft to all followers, who will then get the key
material via key replication (not implemented in this changeset).
* variables.new initialized
* Hacky but savey
* Variable wildcard route and multiple creatable at a time
* multiple KVs per variable
* PR Prep cleanup and lintfix
* Delog
* Data mocking in mirage for variables
* Linting fixes
* Re-implement absent params
* Adapter and model tests
* Moves the path-as-id logic to a serializer instead of adapter
* Classic to serializer and lint cleanup
* Pluralized save button (#13140)
* Autofocus modifier and better Add More button UX (#13145)
* Secure Variables: show/hide functionality when adding new values (#13137)
* Flight Icons added and show hide functionality
* PR cleanup
* Linting cleanup
* Position of icon moved to the right of input
* PR feedback addressed
* Delete button and stylistic changes to show hide
* Hmm, eslint doesn't like jsdoc-usage as only reason for import
* More closely match the button styles and delete test
* Simplified new.js model
* Secure Variables: /variables/*path/edit route and functionality (#13170)
* Variable edit page init
* Significant change to where we house model methods
* Lintfix
* Edit a variable tests
* Remove redundant tests
* Asserts expected
* Mirage factory updated to reflect model state
This changeset implements the keystore serialization/deserialization:
* Adds a JSON serialization extension for the `RootKey` struct, along with a metadata stub. When we serialize RootKey to the on-disk keystore, we want to base64 encode the key material but also exclude any frequently-changing fields which are stored in raft.
* Implements methods for loading/saving keys to the keystore.
* Implements methods for restoring the whole keystore from disk.
* Wires it all up with the `Keyring` RPC handlers and fixes up any fallout on tests.
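A rough sketch of the serialization idea, under loud assumptions: the struct names and fields below (`rootKey`, `keyOnDisk`, `State`) are hypothetical and do not mirror the real `RootKey` definition. It shows the two properties the message describes: key material is base64 encoded on disk, and a frequently-changing field is left out of the on-disk form.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// rootKey is a hypothetical in-memory key; State stands in for a
// frequently-changing field that is assumed to live only in raft.
type rootKey struct {
	KeyID     string
	Algorithm string
	State     string
	Key       []byte
}

// keyOnDisk is the hypothetical on-disk form: no State, base64 key.
type keyOnDisk struct {
	KeyID     string `json:"key_id"`
	Algorithm string `json:"algorithm"`
	Key       string `json:"key"`
}

// marshalForKeystore serializes only the stable fields.
func marshalForKeystore(k rootKey) ([]byte, error) {
	return json.Marshal(keyOnDisk{
		KeyID:     k.KeyID,
		Algorithm: k.Algorithm,
		Key:       base64.StdEncoding.EncodeToString(k.Key),
	})
}

// unmarshalFromKeystore restores a key; State is left empty and would
// be filled in from raft-replicated metadata.
func unmarshalFromKeystore(buf []byte) (rootKey, error) {
	var d keyOnDisk
	if err := json.Unmarshal(buf, &d); err != nil {
		return rootKey{}, err
	}
	raw, err := base64.StdEncoding.DecodeString(d.Key)
	if err != nil {
		return rootKey{}, fmt.Errorf("decoding key material: %w", err)
	}
	return rootKey{KeyID: d.KeyID, Algorithm: d.Algorithm, Key: raw}, nil
}

func main() {
	k := rootKey{KeyID: "key-1", Algorithm: "aes256-gcm", State: "active", Key: []byte{1, 2, 3}}
	buf, err := marshalForKeystore(k)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(buf))
	restored, err := unmarshalFromKeystore(buf)
	fmt.Println(restored.KeyID, restored.Key, err)
}
```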
Implement the basic upsert, list, and delete operations for
`RootKeyMeta` needed by the Keyring RPCs.
This changeset also implements two convenience methods
`RootKeyMetaByID` and `GetActiveRootKeyMeta` which are useful for
testing but also will be needed to implement the rest of the RPCs.
* Route init
* Bones of a mirage-mocked secure variables policy
* Functioning policy for list vars
* Delog and transition on route
* Basic guard test
* Page guard tests for secure variables
* Cleanup and unit tests for variables ability
* Linter cleanup
* Set expectations for test assertions
* PR feedback addressed
* Read label changed to View per suggestion
* Allow running jobs from a namespace-limited token
* qpNamespace cleanup
* Looks like parse can deal with a * namespace
* A little diff cleanup
* Defensive destructuring
* Removing accidental friendly-fire on can-scale
* Testfix: Job run buttons from jobs index
* Testfix: activeRegion job adapter string
* Testfix: unit tests for job abilities correctly reflect the any-namespace rule
* Testfix: job editor test looks for requests with namespace applied on plan
Stream snapshot to FSM when restoring from archive
The `RestoreFromArchive` helper decompresses the snapshot archive to a
temporary file before reading it into the FSM. For large snapshots
this performs a lot of disk IO. Stream decompress the snapshot as we
read it, without first writing to a temporary file.
Add bexpr filters to the `RestoreFromArchive` helper.
The operator can pass these as `-filter` arguments to `nomad operator
snapshot state` (and other commands in the future) to include only
desired data when reading the snapshot.
Use the same output format when listing multiple evals in the `eval
list` command and when `eval status <prefix>` matches more than one
eval.
Include the eval namespace in all output formats and always include the
job ID in `eval status`, since even `node-update` evals are related to a
job.
Add Node ID to the evals table output to help differentiate
`node-update` evals.
Co-authored-by: James Rasell <jrasell@hashicorp.com>
Whenever a node joins the cluster, either for the first time or after
being `down`, we emit an evaluation for every system job to ensure all
applicable system jobs are running on the node.
This patch adds an optimization to skip creating evaluations for system
jobs not in the current node's DC. While the scheduler performs the same
feasibility check, skipping the creation of the evaluation altogether
saves disk, network, and memory.
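A minimal sketch of that datacenter filter, using hypothetical `job` and `systemJobsForDC` names and assuming an exact string match on datacenter names:

```go
package main

import "fmt"

// job is a hypothetical, trimmed-down job record.
type job struct {
	ID          string
	Type        string
	Datacenters []string
}

// systemJobsForDC returns only the system jobs that list the node's
// datacenter, so no evaluation is created for the others.
func systemJobsForDC(jobs []job, nodeDC string) []job {
	var out []job
	for _, j := range jobs {
		if j.Type != "system" {
			continue
		}
		for _, dc := range j.Datacenters {
			if dc == nodeDC {
				out = append(out, j)
				break
			}
		}
	}
	return out
}

func main() {
	jobs := []job{
		{ID: "logging", Type: "system", Datacenters: []string{"dc1", "dc2"}},
		{ID: "metrics", Type: "system", Datacenters: []string{"dc2"}},
		{ID: "web", Type: "service", Datacenters: []string{"dc1"}},
	}
	for _, j := range systemJobsForDC(jobs, "dc1") {
		fmt.Println("create eval for", j.ID)
	}
}
```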
This PR fixes a bug where client configuration max_kill_timeout was
not being enforced. The feature was introduced in 9f44780 but seems
to have been removed during the major drivers refactoring.
We can make sure the value is enforced by plumbing it through the DriverHandler,
which now uses the lesser of the task.killTimeout or client.maxKillTimeout.
Also updates Event.SetKillTimeout to require both the task.killTimeout and
client.maxKillTimeout so that we don't make the mistake of using the wrong
value - as it was being given only the task.killTimeout before.
api: apply new ACL check for wildcard namespace
In #13606 the ACL check was refactored to better support the all
namespaces wildcard (`*`). This commit applies the changes to the jobs
and alloc list endpoints.
Improve how the all namespaces wildcard (`*`) is handled when checking
ACL permissions. When using the wildcard namespace the `AllowNsOp` would
return false since it looks for a namespace called `*` to match.
This commit changes this behavior to return `true` when the queried
namespace is `*` and the token allows the operation in _any_ namespace.
Actual permission must be checked per object. The helper function
`AllowNsOpFunc` returns a function that can be used to make this
verification.
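The wildcard behavior can be illustrated with a simplified stand-in. The `nsACL` type and its capability map are hypothetical and far simpler than the real ACL object; only the method names are borrowed from the message above.

```go
package main

import "fmt"

// nsACL is a simplified stand-in: namespace -> set of allowed operations.
type nsACL struct {
	caps map[string]map[string]bool
}

// AllowNsOp reports whether op is allowed in ns. For the wildcard "*"
// it returns true when op is allowed in any namespace.
func (a *nsACL) AllowNsOp(ns, op string) bool {
	if ns == "*" {
		for _, ops := range a.caps {
			if ops[op] {
				return true
			}
		}
		return false
	}
	return a.caps[ns][op]
}

// AllowNsOpFunc returns a per-object check for op, to be applied to
// each result's actual namespace after a wildcard query.
func (a *nsACL) AllowNsOpFunc(op string) func(ns string) bool {
	return func(ns string) bool { return a.AllowNsOp(ns, op) }
}

func main() {
	acl := &nsACL{caps: map[string]map[string]bool{
		"dev": {"list-jobs": true},
	}}
	// The wildcard query is admitted because "dev" allows the operation.
	fmt.Println(acl.AllowNsOp("*", "list-jobs"))

	// Each returned object is then filtered by its own namespace.
	allow := acl.AllowNsOpFunc("list-jobs")
	for _, ns := range []string{"dev", "prod"} {
		fmt.Println(ns, allow(ns))
	}
}
```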
The `operator debug` command doesn't output the leader anywhere in the
output, which adds extra burden to offline debugging (away from an
ongoing incident where you can simply check manually). Query the
`/v1/status/leader` API but degrade gracefully.