Commit graph

23244 commits

Author SHA1 Message Date
Tim Gross 56deb6f8cc keyring CLI: refactor to use subcommands (#13351)
Split the flag options for the `secure-variables keyring` into their
own subcommands. The gossip keyring CLI will be similarly refactored
and the old version will be deprecated.
2022-07-11 13:34:05 -04:00
Phil Renaud 7eb462454b Secure Variables form validation improvements (#13335)
* Validation tests for secure variables form: duplicate path error

* Validation for dot in key name

* Defensive pattern on copy keyValues
2022-07-11 13:34:05 -04:00
Phil Renaud 311d9d1cda variable index header and deletion hook (#13302)
* variable index header and deletion hook

* Some visual improvements to edit header and back button

* Mirage fixtures for variables

* Deletion flow test

* Account for disabled state of add-more button

* Transition destination changes after design chat

* Simplified parent folder path by adding to variable model

* Lint cleanup

* test update to reflect path redirect on delete

* No longer have to re-traverse to a deleted var folder after deletion, tests updated
2022-07-11 13:34:05 -04:00
Tim Gross bfcbc00f4e workload identity (#13223)
In order to support implicit ACL policies for tasks to get their own
secrets, each task would need to have its own ACL token. This would
add extra raft overhead as well as new garbage collection jobs for
cleaning up task-specific ACL tokens. Instead, Nomad will create a
workload Identity Claim for each task.

An Identity Claim is a JSON Web Token (JWT) signed by the server’s
private key and attached to an Allocation at the time a plan is
applied. The encoded JWT can be submitted as the X-Nomad-Token header
to replace ACL token secret IDs for the RPCs that support identity
claims.

Whenever a key is is added to a server’s keyring, it will use the key
as the seed for a Ed25519 public-private private keypair. That keypair
will be used for signing the JWT and for verifying the JWT.

This implementation is a ruthlessly minimal approach to support the
secure variables feature. When a JWT is verified, the allocation ID
will be checked against the Nomad state store, and non-existent or
terminal allocation IDs will cause the validation to be rejected. This
is sufficient to support the secure variables feature at launch
without requiring implementation of a background process to renew
soon-to-expire tokens.
2022-07-11 13:34:05 -04:00
Phil Renaud e79fea2b4e Notify a user if they try to make a variable with an existing path (#13278)
* Notify a user if they try to make a variable with an existing path

* Stylize error fade

* Bugfix: if you click the dupe link to a variable you havent previously loaded, you lack its keyvalues

* rename and typefix for duplicate path warning
2022-07-11 13:34:04 -04:00
Phil Renaud e58999d62b Disable path input when model is not new (#13273)
* Disable path input when model is not new

* isDisabled tests for secure variables path
2022-07-11 13:34:04 -04:00
Tim Gross 81b0c4fd36 keyring command line (#13169)
Co-authored-by: Charlie Voiselle <464492+angrycub@users.noreply.github.com>
2022-07-11 13:34:04 -04:00
Charlie Voiselle 619e0cbafd Don't write a SecureVariable with no Items (#13258) 2022-07-11 13:34:04 -04:00
Tim Gross 7055ce89b1 keyring replication (#13167)
Replication for the secure variables keyring. Because only key
metadata is stored in raft, we need to distribute key material
out-of-band from raft replication. A goroutine runs on each server and
watches for changes to the `RootKeyMeta`. When a new key is received,
attempt to fetch the key from the leader. If the leader doesn't have
the key (which may happen if a key is rotated right before a leader
transition), try to get the key from any peer.
2022-07-11 13:34:04 -04:00
Phil Renaud da4cb6422e Secure Variables: Build a path tree and traverse it at /variables/*path (#13202)
* Recursive trie-building with variable paths

* tree structure applied to new path routes and a new util class

* Breadcrumbs for SV paths and prompt when nothing exists at a path

* Lint and test cleanup

* Pre-review cleanup

* lintfix

* Abstracted pathtree each-ins into a new component class

* Path tree component styles

* Types added and PR feedback addressed

* Path tree to variable paths

* Slightly simpler path QP mods

* More pr feedback handling

* Trim moved into a function on variable model

* Traversal and compaction tests for PathTree

* Trim Path tests

* Variable-paths component tests

* Lint fixup for tests
2022-07-11 13:34:04 -04:00
Tim Gross d5a214484c core job for root key GC (#13199)
Inactive and unused keys older than a threshold will be periodically
garbage collected.
2022-07-11 13:34:04 -04:00
Tim Gross 5a85d96322 remove end-user algorithm selection (#13190)
After internal design review, we decided to remove exposing algorithm
choice to the end-user for the initial release. We'll solve nonce
rotation by forcing rotations automatically on key GC (in a core job,
not included in this changeset). Default to AES-256 GCM for the
following criteria:

* faster implementation when hardware acceleration is available
* FIPS compliant
* implementation in pure go
* post-quantum resistance

Also fixed a bug in the decoding from keystore and switched to a 
harder-to-misuse encoding method.
2022-07-11 13:34:04 -04:00
Tim Gross 973b474b3c provide state store query for variables by key ID (#13195)
The core jobs to garbage collect unused keys and perform full key
rotations will need to be able to query secure variables by key ID for
efficiency. Add an index to the state store and associated query
function and test.
2022-07-11 13:34:04 -04:00
Tim Gross f2ee585830 bootstrap keyring (#13124)
When a server becomes leader, it will check if there are any keys in
the state store, and create one if there is not. The key metadata will
be replicated via raft to all followers, who will then get the key
material via key replication (not implemented in this changeset).
2022-07-11 13:34:04 -04:00
Phil Renaud a628e2969a Secure Variables UI: /variables/new and /variables/*path (#13069)
* variables.new initialized

* Hacky but savey

* Variable wildcard route and multiple creatable at a time

* multiple KVs per variable

* PR Prep cleanup and lintfix

* Delog

* Data mocking in mirage for variables

* Linting fixes

* Re-implement absent params

* Adapter and model tests

* Moves the path-as-id logic to a serializer instead of adapter

* Classic to serializer and lint cleanup

* Pluralized save button (#13140)

* Autofocus modifier and better Add More button UX (#13145)

* Secure Variables: show/hide functionality when adding new values (#13137)

* Flight Icons added and show hide functionality

* PR cleanup

* Linting cleanup

* Position of icon moved to the right of input

* PR feedback addressed

* Delete button and stylistic changes to show hide

* Hmm, eslint doesnt like jsdoc-usage as only reason for import

* More closely match the button styles and delete test

* Simplified new.js model

* Secure Variables: /variables/*path/edit route and functionality (#13170)

* Variable edit page init

* Significant change to where we house model methods

* Lintfix

* Edit a variable tests

* Remove redundant tests

* Asserts expected

* Mirage factory updated to reflect model state
2022-07-11 13:34:04 -04:00
Charlie Voiselle 3717688f3e Secure Variables: Variables - State store, FSM, RPC (#13098)
* Secure Variables: State Store
* Secure Variables: FSM
* Secure Variables: RPC
* Secure Variables: HTTP API

Co-authored-by: Tim Gross <tgross@hashicorp.com>
2022-07-11 13:34:04 -04:00
Tim Gross 05eef2b95c keystore serialization (#13106)
This changeset implements the keystore serialization/deserialization:

* Adds a JSON serialization extension for the `RootKey` struct, along with a metadata stub. When we serialize RootKey to the on-disk keystore, we want to base64 encode the key material but also exclude any frequently-changing fields which are stored in raft.
* Implements methods for loading/saving keys to the keystore.
* Implements methods for restoring the whole keystore from disk.
* Wires it all up with the `Keyring` RPC handlers and fixes up any fallout on tests.
2022-07-11 13:34:04 -04:00
Tim Gross c6929a6c1e keyring HTTP API (#13077) 2022-07-11 13:34:04 -04:00
Tim Gross 2f0fd556ad keyring RPC handlers (#13075)
Implement the upsert, list, delete, and rotate RPC handlers for the
secure variables keyring. Operations on the keyring itself are still
stubbed out.
2022-07-11 13:34:04 -04:00
Tim Gross b1dc6dcef0 keyring state store operations (#13016)
Implement the basic upsert, list, and delete operations for
`RootKeyMeta` needed by the Keyring RPCs.

This changeset also implements two convenience methods
`RootKeyMetaByID` and `GetActiveRootKeyMeta` which are useful for
testing but also will be needed to implement the rest of the RPCs.
2022-07-11 13:34:04 -04:00
Phil Renaud 7ddc4c8359 Secure Variables UI: Router setup and /variables/index route + guards (#12967)
* Route init

* Bones of a mirage-mocked secure variables policy

* Functinoing policy for list vars

* Delog and transition on route

* Basic guard test

* Page guard tests for secure variables

* Cleanup and unit tests for variables ability

* Linter cleanup

* Set expectations for test assertions

* PR feedback addressed

* Read label changed to View per suggestion
2022-07-11 13:34:03 -04:00
Charlie Voiselle 2019eab2c8 Provide mock secure variables implementation (#12980)
* Add SecureVariable mock
* Add SecureVariableStub
* Add SecureVariable Copy and Stub funcs
2022-07-11 13:34:03 -04:00
Tim Gross d29e85d150 secure variables: initial state store (#12932)
Implement the core SecureVariable and RootKey structs in memdb,
provide the minimal skeleton for FSM, and a dummy storage and keyring
RPC endpoint.
2022-07-11 13:34:01 -04:00
Phil Renaud 462c6653ac
[bugfix, ui] Allow running jobs from a namespace-limited token (#13659)
* Allow running jobs from a namespace-limited token

* qpNamespace cleanup

* Looks like parse can deal with a * namespace

* A little diff cleanup

* Defensive destructuring

* Removing accidental friendly-fire on can-scale

* Testfix: Job run buttons from jobs index

* Testfix: activeRegion job adapter string

* Testfix: unit tests for job abilities correctly reflect the any-namespace rule

* Testfix: job editor test looks for requests with namespace applied on plan
2022-07-11 12:33:17 -04:00
Tim Gross b6dd1191b2
snapshot restore-from-archive streaming and filtering (#13658)
Stream snapshot to FSM when restoring from archive
The `RestoreFromArchive` helper decompresses the snapshot archive to a
temporary file before reading it into the FSM. For large snapshots
this performs a lot of disk IO. Stream decompress the snapshot as we
read it, without first writing to a temporary file.

Add bexpr filters to the `RestoreFromArchive` helper.
The operator can pass these as `-filter` arguments to `nomad operator
snapshot state` (and other commands in the future) to include only
desired data when reading the snapshot.
2022-07-11 10:48:00 -04:00
James Rasell 353323d171
agent: test full object when performing test config parse. (#13668) 2022-07-11 16:21:36 +02:00
James Rasell 9eb63c9e03
cli: ensure node status and drain use correct cmd name. (#13656) 2022-07-11 09:50:42 +02:00
James Rasell 17a467020c
e2e: add terraform init commands to readme doc. (#13655) 2022-07-08 16:52:35 +02:00
Michael Schurter 2a25f2870b
ignore b0a20b4dc965a38b0c843f47c16685ccad7439da (#13648)
Mainly an excuse to experiment with .git-blame-ignore-revs
2022-07-07 15:16:18 -07:00
Seth Hoenig 8c98a8e06c
Merge pull request #13644 from hashicorp/docs-max-kill-timeout
docs: upgrade guide for client max_kill_timeout
2022-07-07 16:47:10 -05:00
Seth Hoenig ed2f2b1a75
docs: move upgrade docs for max_client_timeout
Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
2022-07-07 16:46:26 -05:00
Seth Hoenig 905e673553 docs: upgrade guide for client max_kill_timeout 2022-07-07 15:27:40 -05:00
Seth Hoenig 9861afcbe7
Merge pull request #13630 from hashicorp/b-critical-typo
boltdd: fixup typo and use unreachable assertion
2022-07-07 14:59:38 -05:00
Seth Hoenig 239eaf9a29
Merge pull request #13626 from hashicorp/b-client-max-kill-timeout
client: enforce max_kill_timeout client configuration
2022-07-07 13:44:39 -05:00
Luiz Aoqui 85908415f9
state: fix eval list by prefix with * namespace (#13551) 2022-07-07 14:21:51 -04:00
Luiz Aoqui 03433dd8af
cli: improve output of eval commands (#13581)
Use the same output format when listing multiple evals in the `eval
list` command and when `eval status <prefix>` matches more than one
eval.

Include the eval namespace in all output formats and always include the
job ID in `eval status` since, even `node-update` evals are related to a
job.

Add Node ID to the evals table output to help differentiate
`node-update` evals.

Co-authored-by: James Rasell <jrasell@hashicorp.com>
2022-07-07 13:13:34 -04:00
Ted Behling 6a032a54d2
driver/docker: Don't pull InfraImage if it exists (#13265)
Co-authored-by: James Rasell <jrasell@hashicorp.com>
2022-07-07 17:44:06 +02:00
James Rasell d6ef3432c5
terraform: update installed versions of HashiCorp tools. (#13635) 2022-07-07 16:12:19 +02:00
Seth Hoenig 5ef04ac7c0
Merge pull request #13631 from hashicorp/docs-service-rwh
docs: add docs for simple load balancing nomad services
2022-07-07 08:40:29 -05:00
Seth Hoenig b9fe6c8d2c docs: fixup from cr comments 2022-07-07 08:37:10 -05:00
Seth Hoenig 1c31ef285e docs: add docs for simple load balancing nomad services
This PR adds a section to template docs for simple load balancing with nomad servicse.
2022-07-06 17:34:30 -05:00
Seth Hoenig b1a3a8a5bb helpers: use unreachable assertion 2022-07-06 17:05:35 -05:00
Seth Hoenig 7cca6604e7 helpers: fix critical typo 2022-07-06 17:03:00 -05:00
Michael Schurter f21272065d
core: emit node evals only for sys jobs in dc (#12955)
Whenever a node joins the cluster, either for the first time or after
being `down`, we emit a evaluation for every system job to ensure all
applicable system jobs are running on the node.

This patch adds an optimization to skip creating evaluations for system
jobs not in the current node's DC. While the scheduler performs the same
feasability check, skipping the creation of the evaluation altogether
saves disk, network, and memory.
2022-07-06 14:35:18 -07:00
Seth Hoenig 5dd8aa3e27 client: enforce max_kill_timeout client configuration
This PR fixes a bug where client configuration max_kill_timeout was
not being enforced. The feature was introduced in 9f44780 but seems
to have been removed during the major drivers refactoring.

We can make sure the value is enforced by pluming it through the DriverHandler,
which now uses the lesser of the task.killTimeout or client.maxKillTimeout.
Also updates Event.SetKillTimeout to require both the task.killTimeout and
client.maxKillTimeout so that we don't make the mistake of using the wrong
value - as it was being given only the task.killTimeout before.
2022-07-06 15:29:38 -05:00
Luiz Aoqui a9a66ad018
api: apply new ACL check for wildcard namespace (#13608)
api: apply new ACL check for wildcard namespace

In #13606 the ACL check was refactored to better support the all
namespaces wildcard (`*`). This commit applies the changes to the jobs
and alloc list endpoints.
2022-07-06 16:17:16 -04:00
Luiz Aoqui 74c5578432
api: refactor ACL check for namespace wildcard (#13606)
Improve how the all namespaces wildcard (`*`) is handled when checking
ACL permissions. When using the wildcard namespace the `AllowNsOp` would
return false since it looks for a namespace called `*` to match.

This commit changes this behavior to return `true` when the queried
namespace is `*` and the token allows the operation in _any_ namespace.

Actual permission must be checked per object. The helper function
`AllowNsOpFunc` returns a function that can be used to make this
verification.
2022-07-06 15:22:30 -04:00
Seth Hoenig 29c6b9dfdf
Merge pull request #13586 from hashicorp/build-bump-circleci-macos
build: bump circleci macos version
2022-07-06 10:41:06 -05:00
Tim Gross 1fc8995590
query for leader in operator debug command (#13472)
The `operator debug` command doesn't output the leader anywhere in the
output, which adds extra burden to offline debugging (away from an
ongoing incident where you can simply check manually). Query the
`/v1/status/leader` API but degrade gracefully.
2022-07-06 10:57:44 -04:00
Charlie Voiselle 07418877c6
Fix Testutil for delve debugging API tests (#13589) 2022-07-06 10:47:48 -04:00