open-nomad

Author	SHA1	Message	Date
Tim Gross	83dc3ec758	secure variables ACL policies (#13294 ) Adds a new policy block inside namespaces to control access to secure variables on the basis of path, with support for globbing. Splits out VerifyClaim from ResolveClaim. The ServiceRegistration RPC only needs to be able to verify that a claim is valid for some allocation in the store; it doesn't care about implicit policies or capabilities. Split this out to its own method on the server so that the SecureVariables RPC can reuse it as a separate step from resolving policies (see next commit). Support implicit policies based on workload identity	2022-07-11 13:34:05 -04:00
Tim Gross	160ad9b64f	implement Encrypt/Decrypt methods of encrypter (#13375 ) The `Encrypt` method generates an appropriately-sized nonce and uses that buffer as the prefix for the ciphertext. This keeps the ciphertext and nonce together for decryption, and reuses the buffer as much as possible without presenting the temptation to reuse the cleartext buffer owned by the caller. We include the key ID as the "additional data" field that's used as an extra input to the authentication signature, to provide additional protection that a ciphertext originated with that key. Refactors the locking for the keyring so that the public methods are generally (with one commented exception) responsible for taking the lock and then inner methods are assumed locked.	2022-07-11 13:34:05 -04:00
Charlie Voiselle	1fe080c6de	Implement HTTP search API for Variables (#13257 ) * Add Path only index for SecureVariables * Add GetSecureVariablesByPrefix; refactor tests * Add search for SecureVariables * Add prefix search for secure variables	2022-07-11 13:34:05 -04:00
Charlie Voiselle	06c6a950c4	Secure Variables: Separate Encrypted and Decrypted structs (#13355 ) This PR splits SecureVariable into SecureVariableDecrypted and SecureVariableEncrypted in order to use the type system to help verify that cleartext secret material is not committed to file. * Make Encrypt function return KeyID * Split SecureVariable Co-authored-by: Tim Gross <tgross@hashicorp.com>	2022-07-11 13:34:05 -04:00
Tim Gross	bfcbc00f4e	workload identity (#13223 ) In order to support implicit ACL policies for tasks to get their own secrets, each task would need to have its own ACL token. This would add extra raft overhead as well as new garbage collection jobs for cleaning up task-specific ACL tokens. Instead, Nomad will create a workload Identity Claim for each task. An Identity Claim is a JSON Web Token (JWT) signed by the server’s private key and attached to an Allocation at the time a plan is applied. The encoded JWT can be submitted as the X-Nomad-Token header to replace ACL token secret IDs for the RPCs that support identity claims. Whenever a key is added to a server’s keyring, it will use the key as the seed for an Ed25519 public-private keypair. That keypair will be used for signing the JWT and for verifying the JWT. This implementation is a ruthlessly minimal approach to support the secure variables feature. When a JWT is verified, the allocation ID will be checked against the Nomad state store, and non-existent or terminal allocation IDs will cause the validation to be rejected. This is sufficient to support the secure variables feature at launch without requiring implementation of a background process to renew soon-to-expire tokens.	2022-07-11 13:34:05 -04:00
Tim Gross	7055ce89b1	keyring replication (#13167 ) Replication for the secure variables keyring. Because only key metadata is stored in raft, we need to distribute key material out-of-band from raft replication. A goroutine runs on each server and watches for changes to the `RootKeyMeta`. When a new key is received, attempt to fetch the key from the leader. If the leader doesn't have the key (which may happen if a key is rotated right before a leader transition), try to get the key from any peer.	2022-07-11 13:34:04 -04:00
Tim Gross	d5a214484c	core job for root key GC (#13199 ) Inactive and unused keys older than a threshold will be periodically garbage collected.	2022-07-11 13:34:04 -04:00
Tim Gross	5a85d96322	remove end-user algorithm selection (#13190 ) After internal design review, we decided to remove exposing algorithm choice to the end-user for the initial release. We'll solve nonce rotation by forcing rotations automatically on key GC (in a core job, not included in this changeset). Default to AES-256 GCM for the following criteria: * faster implementation when hardware acceleration is available * FIPS compliant * implementation in pure go * post-quantum resistance Also fixed a bug in the decoding from keystore and switched to a harder-to-misuse encoding method.	2022-07-11 13:34:04 -04:00
Tim Gross	973b474b3c	provide state store query for variables by key ID (#13195 ) The core jobs to garbage collect unused keys and perform full key rotations will need to be able to query secure variables by key ID for efficiency. Add an index to the state store and associated query function and test.	2022-07-11 13:34:04 -04:00
Tim Gross	f2ee585830	bootstrap keyring (#13124 ) When a server becomes leader, it will check if there are any keys in the state store, and create one if there is not. The key metadata will be replicated via raft to all followers, who will then get the key material via key replication (not implemented in this changeset).	2022-07-11 13:34:04 -04:00
Charlie Voiselle	3717688f3e	Secure Variables: Variables - State store, FSM, RPC (#13098 ) * Secure Variables: State Store * Secure Variables: FSM * Secure Variables: RPC * Secure Variables: HTTP API Co-authored-by: Tim Gross <tgross@hashicorp.com>	2022-07-11 13:34:04 -04:00
Tim Gross	05eef2b95c	keystore serialization (#13106 ) This changeset implements the keystore serialization/deserialization: * Adds a JSON serialization extension for the `RootKey` struct, along with a metadata stub. When we serialize RootKey to the on-disk keystore, we want to base64 encode the key material but also exclude any frequently-changing fields which are stored in raft. * Implements methods for loading/saving keys to the keystore. * Implements methods for restoring the whole keystore from disk. * Wires it all up with the `Keyring` RPC handlers and fixes up any fallout on tests.	2022-07-11 13:34:04 -04:00
Tim Gross	2f0fd556ad	keyring RPC handlers (#13075 ) Implement the upsert, list, delete, and rotate RPC handlers for the secure variables keyring. Operations on the keyring itself are still stubbed out.	2022-07-11 13:34:04 -04:00
Tim Gross	b1dc6dcef0	keyring state store operations (#13016 ) Implement the basic upsert, list, and delete operations for `RootKeyMeta` needed by the Keyring RPCs. This changeset also implements two convenience methods `RootKeyMetaByID` and `GetActiveRootKeyMeta` which are useful for testing but also will be needed to implement the rest of the RPCs.	2022-07-11 13:34:04 -04:00
Charlie Voiselle	2019eab2c8	Provide mock secure variables implementation (#12980 ) * Add SecureVariable mock * Add SecureVariableStub * Add SecureVariable Copy and Stub funcs	2022-07-11 13:34:03 -04:00
Tim Gross	d29e85d150	secure variables: initial state store (#12932 ) Implement the core SecureVariable and RootKey structs in memdb, provide the minimal skeleton for FSM, and a dummy storage and keyring RPC endpoint.	2022-07-11 13:34:01 -04:00
Tim Gross	b6dd1191b2	snapshot restore-from-archive streaming and filtering (#13658 ) Stream snapshot to FSM when restoring from archive The `RestoreFromArchive` helper decompresses the snapshot archive to a temporary file before reading it into the FSM. For large snapshots this performs a lot of disk IO. Stream decompress the snapshot as we read it, without first writing to a temporary file. Add bexpr filters to the `RestoreFromArchive` helper. The operator can pass these as `-filter` arguments to `nomad operator snapshot state` (and other commands in the future) to include only desired data when reading the snapshot.	2022-07-11 10:48:00 -04:00
Seth Hoenig	239eaf9a29	Merge pull request #13626 from hashicorp/b-client-max-kill-timeout client: enforce max_kill_timeout client configuration	2022-07-07 13:44:39 -05:00
Luiz Aoqui	85908415f9	state: fix eval list by prefix with * namespace (#13551 )	2022-07-07 14:21:51 -04:00
Michael Schurter	f21272065d	core: emit node evals only for sys jobs in dc (#12955 ) Whenever a node joins the cluster, either for the first time or after being `down`, we emit an evaluation for every system job to ensure all applicable system jobs are running on the node. This patch adds an optimization to skip creating evaluations for system jobs not in the current node's DC. While the scheduler performs the same feasibility check, skipping the creation of the evaluation altogether saves disk, network, and memory.	2022-07-06 14:35:18 -07:00
Seth Hoenig	5dd8aa3e27	client: enforce max_kill_timeout client configuration This PR fixes a bug where client configuration max_kill_timeout was not being enforced. The feature was introduced in 9f44780 but seems to have been removed during the major drivers refactoring. We can make sure the value is enforced by plumbing it through the DriverHandler, which now uses the lesser of the task.killTimeout or client.maxKillTimeout. Also updates Event.SetKillTimeout to require both the task.killTimeout and client.maxKillTimeout so that we don't make the mistake of using the wrong value - as it was being given only the task.killTimeout before.	2022-07-06 15:29:38 -05:00
Luiz Aoqui	a9a66ad018	api: apply new ACL check for wildcard namespace (#13608 ) api: apply new ACL check for wildcard namespace In #13606 the ACL check was refactored to better support the all namespaces wildcard (`*`). This commit applies the changes to the jobs and alloc list endpoints.	2022-07-06 16:17:16 -04:00
Luiz Aoqui	74c5578432	api: refactor ACL check for namespace wildcard (#13606 ) Improve how the all namespaces wildcard (`*`) is handled when checking ACL permissions. When using the wildcard namespace the `AllowNsOp` would return false since it looks for a namespace called `*` to match. This commit changes this behavior to return `true` when the queried namespace is `*` and the token allows the operation in _any_ namespace. Actual permission must be checked per object. The helper function `AllowNsOpFunc` returns a function that can be used to make this verification.	2022-07-06 15:22:30 -04:00
James Rasell	0c0b028a59	core: allow deleting of evaluations (#13492 ) * core: add eval delete RPC and core functionality. * agent: add eval delete HTTP endpoint. * api: add eval delete API functionality. * cli: add eval delete command. * docs: add eval delete website documentation.	2022-07-06 16:30:11 +02:00
James Rasell	181b247384	core: allow pausing and un-pausing of leader broker routine (#13045 ) * core: allow pause/un-pause of eval broker on region leader. * agent: add ability to pause eval broker via scheduler config. * cli: add operator scheduler commands to interact with config. * api: add ability to pause eval broker via scheduler config * e2e: add operator scheduler test for eval broker pause. * docs: include new operator scheduler CLI and pause eval API info.	2022-07-06 16:13:48 +02:00
Seth Hoenig	97726c2fd8	Merge pull request #12862 from hashicorp/f-choose-services api: enable selecting subset of services using rendezvous hashing	2022-06-30 15:17:40 -05:00
Michael Schurter	1cc0ae8795	docs: fix Plan{,Result}.NodeUpdate comment (#13534 ) It appears way back when this was first implemented in 9a917281af9c0a97a6c59575eaa52c5c86ffc60d, it was renamed from NodeEvict (with a correct comment) to NodeUpdate. The comment was changed from referring to only evictions to referring to "all allocs" in the first sentence and "stop or evict" in the second. This confuses every time I see it because I read the name (NodeUpdate) and first sentence ("all the allocs") and assume this represents all allocations... which isn't true. I'm going to assume I'm the only one who doesn't read the 2nd sentence and that's why this suboptimal wording has lasted 7 years, but can we change it for my sake?	2022-06-30 12:47:14 -07:00
James Rasell	d080eed9ae	client: fixed a problem calculating a service namespace. (#13493 ) When calculating a service's namespace for registration, the code assumed the first task within the task array would include a service block. This is incorrect as it is possible only a later task within the array contains a service definition. This change fixes the logic, so we correctly search for a service definition before identifying the namespace.	2022-06-28 09:47:28 +02:00
Shishir Mahajan	6ba8245283	Fix typo: orthogonal. Signed-off-by: Shishir Mahajan <smahajan@roblox.com>	2022-06-27 12:12:51 -07:00
Seth Hoenig	9467bc9eb3	api: enable selecting subset of services using rendezvous hashing This PR adds the 'choose' query parameter to the '/v1/service/<service>' endpoint. The value of 'choose' is in the form '<number>|<key>', number is the number of desired services and key is a value unique but consistent to the requester (e.g. allocID). Folks aren't really expected to use this API directly, but rather through consul-template which will soon be getting a new helper function making use of this query parameter. Example, curl 'localhost:4646/v1/service/redis?choose=2|abc123' Note: consul-template v0.29.1 includes the necessary nomadServices functionality.	2022-06-25 10:37:37 -05:00
Tim Gross	4368dcc02f	fix deadlock in plan_apply (#13407 ) The plan applier has to get a snapshot with a minimum index for the plan it's working on in order to ensure consistency. Under heavy raft loads, we can exceed the timeout. When this happens, we hit a bug where the plan applier blocks waiting on the `indexCh` forever, and all schedulers will block in `Plan.Submit`. Closing the `indexCh` when the `asyncPlanWait` is done with it will prevent the deadlock without impacting correctness of the previous snapshot index. This changeset includes a PoC failing test that works by injecting a large timeout into the state store. We need to turn this into a test we can run normally without breaking the state store before we can merge this PR. Increase `snapshotMinIndex` timeout to 10s. This timeout creates backpressure where any concurrent `Plan.Submit` RPCs will block waiting for results. This sheds load across all servers and gives raft some CPU to catch up, because schedulers won't dequeue more work while waiting. Increase it to 10s based on observations of large production clusters.	2022-06-23 12:06:27 -04:00
Grant Griffiths	99896da443	CSI: make plugin health_timeout configurable in csi_plugin stanza (#13340 ) Signed-off-by: Grant Griffiths <ggriffiths@purestorage.com>	2022-06-14 10:04:16 -04:00
Tim Gross	9d5523a72d	CSI: skip node unpublish on GC'd or down nodes (#13301 ) If the node has been GC'd or is down, we can't send it a node unpublish. The CSI spec requires that we don't send the controller unpublish before the node unpublish, but in the case where a node is gone we can't know the final fate of the node unpublish step. The `csi_hook` on the client will unpublish if the allocation has stopped and if the host is terminated there's no mount for the volume anyways. So we'll now assume that the node has unpublished at its end. If it hasn't, any controller unpublish will potentially hang or error and need to be retried.	2022-06-09 11:33:22 -04:00
James Rasell	f5e78a3791	state: only update index on change when deleting evals. (#13227 ) When deleting evaluations and allocations during a reap event, the index table entries for evals and allocs were updated regardless of whether changes were made. This change modifies the state logic so that the index table is only modified when the corresponding table has actually been modified. Along with matching expected behaviour, this change has the potential to reduce the number of times blocking queries will return without any real state change.	2022-06-07 11:56:43 +02:00
Lance Haig	4bf27d743d	Allow Operator Generated bootstrap token (#12520 )	2022-06-03 07:37:24 -04:00
Huan Wang	7d15157635	adding support for customized ingress tls (#13184 )	2022-06-02 18:43:58 -04:00
Seth Hoenig	0399b7e4c5	Merge pull request #12951 from jorgemarey/f-srv-tagged-addresses Allow setting tagged addresses on services	2022-06-01 10:51:49 -05:00
Seth Hoenig	189176f052	consul: avoid reflection in comparing service map types	2022-06-01 10:22:00 -05:00
Tim Gross	6873670dd6	refactor index threshold calculation for core GC jobs (#13196 ) Almost all GC jobs check the index of the objects being GC'd to see if they're older than a configured threshold. This code was repeated six times in `CoreScheduler` with only logging changes, so it seems safe to extract it as its own method.	2022-06-01 11:12:20 -04:00
Seth Hoenig	dca954faac	build: update golangci-lint to v1.46.2 This version of golangci-lint improves support for generics, but also is more strict in copy vs. loop for slice copying.	2022-05-31 23:32:01 +00:00
Seth Hoenig	54efec5dfe	docs: add docs and tests for tagged_addresses	2022-05-31 13:02:48 -05:00
Jorge Marey	f966614602	Allow setting tagged addresses on services	2022-05-31 10:06:55 -05:00
Seth Hoenig	4631045d83	connect: enable setting connect upstream destination namespace	2022-05-26 09:39:36 -05:00
Luiz Aoqui	769ff1dcc3	Merge pull request #13109 from hashicorp/merge-release-1.3.1-branch Merge release 1.3.1 branch	2022-05-25 10:45:09 -04:00
Seth Hoenig	626a345fb2	ci: switch to 22.04 LTS for GHA Core CI tests	2022-05-25 08:19:40 -05:00
Michael Schurter	2965dc6a1a	artifact: fix numerous go-getter security issues Fix numerous go-getter security issues: - Add timeouts to http, git, and hg operations to prevent DoS - Add size limit to http to prevent resource exhaustion - Disable following symlinks in both artifacts and `job run` - Stop performing initial HEAD request to avoid file corruption on retries and DoS opportunities. Approach Since Nomad has no ability to differentiate a DoS-via-large-artifact vs a legitimate workload, all of the new limits are configurable at the client agent level. The max size of HTTP downloads is also exposed as a node attribute so that if some workloads have large artifacts they can specify a high limit in their jobspecs. In the future all of this plumbing could be extended to enable/disable specific getters or artifact downloading entirely on a per-node basis.	2022-05-24 16:29:39 -04:00
Luiz Aoqui	0a00059f3c	core: test duplicated blocked eval stats In the original test, the eval generator would use a random value for the job ID, resulting in an unexercised code path for duplicate blocked evals.	2022-05-24 15:44:06 -04:00
Seth Hoenig	a5943da0c7	core: add tests for blocked evals math	2022-05-24 09:05:18 -05:00
Seth Hoenig	0c145ac1e4	core: remove correct set of resources on blocked eval	2022-05-23 15:18:55 -05:00
Seth Hoenig	fc58f4972c	cli: correctly use and validate job with vault token set This PR fixes `job validate` to respect '-vault-token', '$VAULT_TOKEN', '-vault-namespace' if set.	2022-05-19 12:13:34 -05:00
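
The nonce-prefix scheme described in commit 160ad9b64f (AES-256 GCM, a fresh nonce used as the prefix of the ciphertext, the key ID passed as additional data) can be illustrated with Go's standard library. This is only a minimal sketch under stated assumptions, not the code from that commit: the function names `encrypt` and `decrypt`, the string key ID, and the error text are hypothetical.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// newAEAD builds an AES-GCM AEAD; a 32-byte key selects AES-256.
func newAEAD(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// encrypt (hypothetical name) returns nonce || ciphertext || tag.
// The key ID is bound to the ciphertext as additional authenticated data.
func encrypt(key []byte, keyID string, cleartext []byte) ([]byte, error) {
	aead, err := newAEAD(key)
	if err != nil {
		return nil, err
	}
	// Allocate the nonce with spare capacity so Seal can append in place,
	// without touching the caller's cleartext buffer.
	n := aead.NonceSize()
	nonce := make([]byte, n, n+len(cleartext)+aead.Overhead())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	// Seal appends the ciphertext and tag to its first argument.
	return aead.Seal(nonce, nonce, cleartext, []byte(keyID)), nil
}

// decrypt (hypothetical name) splits off the nonce prefix and verifies
// the tag against the same key ID.
func decrypt(key []byte, keyID string, sealed []byte) ([]byte, error) {
	aead, err := newAEAD(key)
	if err != nil {
		return nil, err
	}
	n := aead.NonceSize()
	if len(sealed) < n {
		return nil, errors.New("ciphertext shorter than nonce")
	}
	return aead.Open(nil, sealed[:n], sealed[n:], []byte(keyID))
}

func main() {
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	sealed, err := encrypt(key, "example-key-id", []byte("secret"))
	if err != nil {
		panic(err)
	}
	out, err := decrypt(key, "example-key-id", sealed)
	fmt.Println(string(out), err)
}
```

Because the key ID is part of the authenticated data, decrypting with the right key bytes but a different key ID fails authentication instead of returning cleartext.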


4012 commits
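
Commit 9467bc9eb3 selects a subset of services with rendezvous hashing, given a count and a key that is stable for the requester. The general technique can be sketched as follows: score every instance by hashing it together with the key, then keep the highest-scoring entries. This is an illustration only; the helper names `score` and `choose` and the use of FNV-1a are assumptions, and the hash and tie-breaking in Nomad itself may differ.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// score (hypothetical helper) hashes the requester key together with one
// instance name. FNV-1a is an arbitrary choice for this sketch.
func score(key, instance string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(key))
	h.Write([]byte{0}) // separator so ("ab","c") and ("a","bc") differ
	h.Write([]byte(instance))
	return h.Sum64()
}

// choose (hypothetical helper) returns the n instances with the highest
// scores for the given key, without modifying the input slice.
func choose(n int, key string, instances []string) []string {
	ranked := append([]string(nil), instances...)
	sort.Slice(ranked, func(i, j int) bool {
		si, sj := score(key, ranked[i]), score(key, ranked[j])
		if si != sj {
			return si > sj
		}
		return ranked[i] < ranked[j] // deterministic tie-break
	})
	if n < 0 {
		n = 0
	}
	if n > len(ranked) {
		n = len(ranked)
	}
	return ranked[:n]
}

func main() {
	instances := []string{"redis-1", "redis-2", "redis-3", "redis-4"}
	fmt.Println(choose(2, "abc123", instances))
}
```

Each instance's score depends only on the key and that instance, so the same key keeps returning the same subset, and removing an instance that was not selected leaves the selection unchanged.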