open-nomad

Author	SHA1	Message	Date
James Rasell	726d419da1	acl: replicate auth-methods from federated cluster leaders. (#15366 )	2022-11-28 09:20:24 +01:00
Luiz Aoqui	8f91be26ab	scheduler: create placements for non-register MRD (#15325 ) * scheduler: create placements for non-register MRD For multiregion jobs, the scheduler does not create placements on registration because the deployment must wait for the other regions. One of these regions will then trigger the deployment to run. Currently, this is done in the scheduler by considering any eval for a multiregion job as "paused" since it's expected that another region will eventually unpause it. This becomes a problem when evals not triggered by a job registration happen, such as on a node update. These types of regional changes do not have other regions waiting to progress the deployment, and so they were never resulting in placements. The fix is to create a deployment at job registration time. This additional piece of state allows the scheduler to differentiate between a multiregion change, where there are other regions engaged in the deployment so no placements are required, and a regional change, where the scheduler does need to create placements. This deployment starts in the new "initializing" status to signal to the scheduler that it needs to compute the initial deployment state. The multiregion deployment will wait until this deployment state is persisted and its status is set to "pending". Without this state transition it's possible to hit a race condition where the plan applier and the deployment watcher may step on each other and overwrite their changes. * changelog: add entry for #15325	2022-11-25 12:45:34 -05:00
James Rasell	32dfa431f3	sso: add ACL auth-method HTTP API CRUD endpoints (#15338 ) * core: remove custom auth-method TTLS and use ACL token TTLS. * agent: add ACL auth-method HTTP endpoints for CRUD actions. * api: add ACL auth-method client.	2022-11-23 09:38:02 +01:00
Piotr Kazmierczak	bb66b5e770	acl: sso auth method RPC endpoints (#15221 ) This PR implements RPC endpoints for SSO auth methods. This PR is part of the SSO work captured under ☂️ ticket #13120.	2022-11-21 10:15:39 +01:00
Piotr Kazmierczak	d02241cad5	acl: sso auth method event stream (#15280 ) This PR implements SSO auth method support in the event stream. This PR is part of the SSO work captured under ☂️ ticket #13120.	2022-11-21 10:06:05 +01:00
Tim Gross	05a46e6648	make eval cancelation really async with `Eval.Ack` (#15298 ) Ensure we never block in the `Eval.Ack`	2022-11-18 08:38:17 -05:00
Tim Gross	b74a868aae	make eval cancelation async with `Eval.Ack` (#15294 ) In #14621 we added an eval cancelation reaper goroutine with a channel that allowed us to wake it up. But we forgot to actually send on this channel from `Eval.Ack` and are still committing the cancelations synchronously. Fix this by sending on the buffered channel to wake up the reaper instead.	2022-11-17 16:40:41 -05:00
Tim Gross	d0f9e887f7	autopilot: include only servers from the same region (#15290 ) When we migrated to the updated autopilot library in Nomad 1.4.0, the interface for finding servers changed. Previously autopilot would get the serf members and call `IsServer` on each of them, leaving it up to the implementor to filter out clients (and in Nomad's case, other regions). But in the "new" autopilot library, the equivalent interface is `KnownServers` for which we did not filter by region. This causes spurious attempts for the cross-region stats fetching, which results in TLS errors and a lot of log noise. Filter the member set by region to fix the regression.	2022-11-17 12:09:36 -05:00
Tim Gross	510eb435dc	remove deprecated `AllocUpdateRequestType` raft entry (#15285 ) After Deployments were added in Nomad 0.6.0, the `AllocUpdateRequestType` raft log entry was no longer in use. Mark this as deprecated, remove the associated dead code, and remove references to the metrics it emits from the docs. We'll leave the entry itself just in case we encounter old raft logs that we need to be able to safely load.	2022-11-17 12:08:04 -05:00
Tim Gross	dd3a07302e	keyring: update handle to state inside replication loop (#15227 ) * keyring: update handle to state inside replication loop When keyring replication starts, we take a handle to the state store. But whenever a snapshot is restored, this handle is invalidated and no longer points to a state store that is receiving new keys. This leaks a bunch of memory too! In addition to operator-initiated restores, when fresh servers are added to existing clusters with large-enough state, the keyring replication can get started quickly enough that it's running before the snapshot from the existing cluster has been restored. Fix this by updating the handle to the state store on each pass.	2022-11-17 08:40:12 -05:00
Tim Gross	6415fb4284	eval broker: shed all but one blocked eval per job after ack (#14621 ) When an evaluation is acknowledged by a scheduler, the resulting plan is guaranteed to cover up to the `waitIndex` set by the worker based on the most recent evaluation for that job in the state store. At that point, we no longer need to retain blocked evaluations in the broker that are older than that index. Move all but the highest priority / highest `ModifyIndex` blocked eval into a canceled set. When the `Eval.Ack` RPC returns from the eval broker it will signal a reap of a batch of cancelable evals to write to raft. This paces the cancelations by how frequently the schedulers are acknowledging evals; this should reduce the risk of cancelations overwhelming raft relative to scheduler progress. In order to avoid straggling batches when the cluster is quiet, we also include a periodic sweep through the cancelable list.	2022-11-16 16:10:11 -05:00
Tim Gross	e8c83c3ecc	test: ensure leader is still valid in reelection test (#15267 ) The `TestLeader_Reelection` test waits for a leader to be elected and then makes some other assertions. But it implicitly assumes that there's no failure of leadership before shutting down the leader, which can lead to a panic in the tests. Assert there's still a leader before the shutdown.	2022-11-16 11:04:02 -05:00
Tim Gross	37134a4a37	eval delete: move batching of deletes into RPC handler and state (#15117 ) During unusual outage recovery scenarios on large clusters, a backlog of millions of evaluations can appear. In these cases, the `eval delete` command can put excessive load on the cluster by listing large sets of evals to extract the IDs and then sending large batches of IDs. Although the command's batch size was carefully tuned, we still need to JSON deserialize, re-serialize to MessagePack, send the log entries through raft, and get the FSM applied. To improve performance of this recovery case, move the batching process into the RPC handler and the state store. The design here is a little weird, so let's look at the failed options first: * A naive solution here would be to just send the filter as the raft request and let the FSM apply delete the whole set in a single operation. Benchmarking with 1M evals on a 3 node cluster demonstrated this can block the FSM apply for several minutes, which puts the cluster at risk if there's a leadership failover (the barrier write can't be made while this apply is in-flight). * A less naive but still bad solution would be to have the RPC handler filter and paginate, and then hand a list of IDs to the existing raft log entry. Benchmarks showed this blocked the FSM apply for 20-30s at a time and took roughly an hour to complete. Instead, we're filtering and paginating in the RPC handler to find a page token, and then passing both the filter and page token in the raft log. The FSM apply recreates the paginator using the filter and page token to get roughly the same page of evaluations, which it then deletes. The pagination process is fairly cheap (only about 5% of the total FSM apply time), so counter-intuitively this rework ends up being much faster. A benchmark of 1M evaluations showed this blocked the FSM apply for 20-30ms at a time (typical for normal operations) and completes in less than 4 minutes.
Note that, as with the existing design, this delete is not consistent: a new evaluation inserted "behind" the cursor of the pagination will fail to be deleted.	2022-11-14 14:08:13 -05:00
Piotr Kazmierczak	4851f9e68a	acl: sso auth method schema and store functions (#15191 ) This PR implements ACLAuthMethod type, acl_auth_methods table schema and crud state store methods. It also updates nomadSnapshot.Persist and nomadSnapshot.Restore methods in order for them to work with the new table, and adds two new Raft messages: ACLAuthMethodsUpsertRequestType and ACLAuthMethodsDeleteRequestType This PR is part of the SSO work captured under ☂️ ticket #13120.	2022-11-10 19:42:41 +01:00
Drew Gonzales	aac9404ee5	server: add git revision to serf tags (#9159 )	2022-11-07 10:34:33 -05:00
Tim Gross	9e1c0b46d8	API for `Eval.Count` (#15147 ) Add a new `Eval.Count` RPC and associated HTTP API endpoints. This API is designed to support interactive use in the `nomad eval delete` command to get a count of evals expected to be deleted before doing so. The state store operations to do this sort of thing are somewhat expensive, but it's cheaper than serializing a big list of evals to JSON. Note that although it seems like this could be done as an extra parameter and response field on `Eval.List`, having it as its own endpoint avoids having to change the response body shape and lets us avoid handling the legacy filter params supported by `Eval.List`.	2022-11-07 08:53:19 -05:00
Luiz Aoqui	e4c8b59919	Update alloc after reconnect and enforce client heartbeat order (#15068 ) * scheduler: allow updates after alloc reconnects When an allocation reconnects to a cluster the scheduler needs to run special logic to handle the reconnection, check if a replacement was created and stop one of them. If the allocation kept running while the node was disconnected, it will be reconnected with `ClientStatus: running` and the node will have `Status: ready`. This combination is the same as the normal steady state of allocation, where everything is running as expected. In order to differentiate between the two states (an allocation that is reconnecting and one that is just running) the scheduler needs an extra piece of state. The current implementation uses the presence of a `TaskClientReconnected` task event to detect when the allocation has reconnected and thus must go through the reconnection process. But this event remains even after the allocation is reconnected, causing all future evals to consider the allocation as still reconnecting. This commit changes the reconnect logic to use an `AllocState` to register when the allocation was reconnected. This provides the following benefits: - Only a limited number of task states are kept, and they are used for many other events. It's possible that, upon reconnecting, several actions are triggered that could cause the `TaskClientReconnected` event to be dropped. - Task events are set by clients and so their timestamps are subject to time skew from servers. This prevents using time to determine if an allocation reconnected after a disconnect event. - Disconnect events are already stored as `AllocState` and so storing reconnects there as well makes it the only source of information required. With the new logic, the reconnection logic is only triggered if the last `AllocState` is a disconnect event, meaning that the allocation has not been reconnected yet.
After the reconnection is handled, the new `ClientStatus` is stored in `AllocState` allowing future evals to skip the reconnection logic. * scheduler: prevent spurious placement on reconnect When a client reconnects it makes two independent RPC calls: - `Node.UpdateStatus` to heartbeat and set its status as `ready`. - `Node.UpdateAlloc` to update the status of its allocations. These two calls can happen in any order, and in case the allocations are updated before a heartbeat it causes the state to be the same as a node being disconnected: the node status will still be `disconnected` while the allocation `ClientStatus` is set to `running`. The current implementation did not handle this order of events properly, and the scheduler would create an unnecessary placement since it considered the allocation was being disconnected. This extra allocation would then be quickly stopped by the heartbeat eval. This commit adds a new code path to handle this order of events. If the node is `disconnected` and the allocation `ClientStatus` is `running` the scheduler will check if the allocation is actually reconnecting using its `AllocState` events. * rpc: only allow alloc updates from `ready` nodes Clients interact with servers using three main RPC methods: - `Node.GetAllocs` reads allocation data from the server and writes it to the client. - `Node.UpdateAlloc` reads allocations from the client and writes them to the server. - `Node.UpdateStatus` writes the client status to the server and is used as the heartbeat mechanism. These three methods are called periodically by the clients and are done so independently from each other, meaning that there can't be any assumptions in their ordering. This can generate scenarios that are hard to reason about and to code for. For example, when a client misses too many heartbeats it will be considered `down` or `disconnected` and the allocations it was running are set to `lost` or `unknown`.
When connectivity is restored to the rest of the cluster, the natural mental model is to think that the client will heartbeat first and then update its allocations status into the servers. But since there's no inherent order in these calls the reverse is just as possible: the client updates the alloc status and then heartbeats. This results in a state where allocs are, for example, `running` while the client is still `disconnected`. This commit adds a new verification to the `Node.UpdateAlloc` method to reject updates from nodes that are not `ready`, forcing clients to heartbeat first. Since this check is done server-side there is no need to coordinate operations client-side: they can continue sending these requests independently and alloc update will succeed after the heartbeat is done. * changelog: add entry for #15068 * code review * client: skip terminal allocations on reconnect When the client reconnects with the server it synchronizes the state of its allocations by sending data using the `Node.UpdateAlloc` RPC and fetching data using the `Node.GetClientAllocs` RPC. If the data fetch happens before the data write, `unknown` allocations will still be in this state and would trigger the `allocRunner.Reconnect` flow. But when the server `DesiredStatus` for the allocation is `stop` the client should not reconnect the allocation. * apply more code review changes * scheduler: persist changes to reconnected allocs Reconnected allocs have a new AllocState entry that must be persisted by the plan applier. * rpc: read node ID from allocs in UpdateAlloc The AllocUpdateRequest struct is used in three disjoint use cases: 1. Stripped allocs from clients Node.UpdateAlloc RPC using the Allocs, and WriteRequest fields 2. Raft log message using the Allocs, Evals, and WriteRequest fields 3. Plan updates using the AllocsStopped, AllocsUpdated, and Job fields Adding a new field that would only be used in one of these cases (1) made things more confusing and error prone.
While in theory an AllocUpdateRequest could send allocations from different nodes, in practice this never actually happens since only clients call this method with their own allocations. * scheduler: remove logic to handle exceptional case This condition could only be hit if, somehow, the allocation status was set to "running" while the client was "unknown". This was addressed by enforcing an order in "Node.UpdateStatus" and "Node.UpdateAlloc" RPC calls, so this scenario is not expected to happen. Adding unnecessary code to the scheduler makes it harder to read and reason about it. * more code review * remove another unused test	2022-11-04 16:25:11 -04:00
Charlie Voiselle	79c4478f5b	template: error on missing key (#15141 ) * Support error_on_missing_value for templates * Update docs for template stanza	2022-11-04 13:23:01 -04:00
Charlie Voiselle	83e43e01c1	Add missing timer reset (#15134 )	2022-11-03 18:57:57 -04:00
Phil Renaud	ffb4c63af7	[ui] Adds meta to job list stub and displays a pack logo on the jobs index (#14833 ) * Adds meta to job list stub and displays a pack logo on the jobs index * Changelog * Modifying struct for optional meta param * Explicitly ask for meta anytime I look up a job from index or job page * Test case for the endpoint * adding meta field to API struct and omitting from response if empty * passthru method added to api/jobs.list * Meta param listed in docs for jobs list * Update api/jobs.go Co-authored-by: Tim Gross <tgross@hashicorp.com>	2022-11-02 16:58:24 -04:00
Tim Gross	4d7a4171cd	volumewatcher: prevent panic on nil volume (#15101 ) If a GC claim is written and then volume is deleted before the `volumewatcher` enters its run loop, we panic on the nil-pointer access. Simply doing a nil-check at the top of the loop reveals a race condition around shutting down the loop just as a new update is coming in. Have the parent `volumeswatcher` send an initial update on the channel before returning, so that we're still holding the lock. Update the watcher's `Stop` method to set the running state, which lets us avoid having a second context and makes stopping synchronous. This reduces the cases we have to handle in the run loop. Updated the tests now that we'll safely return from the goroutine and stop the runner in a larger set of cases. Ran the tests with the `-race` detection flag and fixed up any problems found here as well.	2022-11-01 16:53:10 -04:00
Tim Gross	38542f256e	variables: limit rekey eval to half the nack timeout (#15102 ) In order to limit how much the rekey job can monopolize a scheduler worker, we limit how long it can run to 1min before stopping work and emitting a new eval. But this exactly matches the default nack timeout, so it'll fail the eval rather than getting a chance to emit a new one. Set the timeout for the rekey eval to half the configured nack timeout.	2022-11-01 16:50:50 -04:00
Tim Gross	903b5baaa4	keyring: safely handle missing keys and restore GC (#15092 ) When replication of a single key fails, the replication loop breaks early and therefore keys that fall later in the sorting order will never get replicated. This is particularly a problem for clusters impacted by the bug that caused #14981 and that were later upgraded; the keys that were never replicated can now never be replicated, and so we need to handle them safely. Included in the replication fix: * Refactor the replication loop so that each key is replicated in a function call that returns an error, to make the workflow more clear and reduce nesting. Log the error and continue. * Improve stability of keyring replication tests. We no longer block leadership on initializing the keyring, so there's a race condition in the keyring tests where we can test for the existence of the root key before the keyring has been initialized. Change this to an "eventually" test. But these fixes aren't enough to fix #14981 because they'll end up seeing an error once a second complaining about the missing key, so we also need to fix keyring GC so the keys can be removed from the state store. Now we'll store the key ID used to sign a workload identity in the Allocation, and we'll index the Allocation table on that so we can track whether any live Allocation was signed with a particular key ID.	2022-11-01 15:00:50 -04:00
Tim Gross	dab9388c75	refactor eval delete safety check (#15070 ) The `Eval.Delete` endpoint has a helper that takes a list of jobs and allocs and determines whether the eval associated with those is safe to delete (based on their state). Filtering improvements to the `Eval.Delete` endpoint are going to need this check to run in the state store itself for consistency. Refactor to push this check down into the state store to keep the eventual diff for that work reasonable.	2022-10-28 09:10:33 -04:00
Tim Gross	9c37a234e7	test: refactor EvalEndpoint_Delete (#15065 ) While working on filtering improvements to the `Eval.Delete` endpoint I noticed that this test was going to need to expand significantly and needed some refactoring to make that work nicely. In order to reduce the size of the eventual diff, I've pulled this refactoring out into its own changeset.	2022-10-27 15:29:22 -04:00
hc-github-team-nomad-core	38b1c8a22a	Prepare for next release	2022-10-27 13:08:05 -04:00
hc-github-team-nomad-core	fbef8881cd	Generate files for 1.4.2 release	2022-10-27 13:08:05 -04:00
Tim Gross	9d906d4632	variables: fix filter on List RPC The List RPC correctly authorized against the prefix argument. But when filtering results underneath the prefix, it only checked authorization for standard ACL tokens and not Workload Identity. This results in WI tokens being able to read List results (metadata only: variable paths and timestamps) for variables under the `nomad/` prefix that belong to other jobs in the same namespace. Fixes the filtering and splits the `handleMixedAuthEndpoint` function into separate authentication and authorization steps so that we don't need to re-verify the claim token on each filtered object. Also includes: * update semgrep rule for mixed auth endpoints * variables: List returns empty set when all results are filtered	2022-10-27 13:08:05 -04:00
James Rasell	da5069bded	event stream: ensure token expiry is correctly checked for subs. This change ensures that a token's expiry is checked before every event is sent to the caller. Previously, a token could still be used to listen for events after it had expired, as long as the subscription was made while it was unexpired. This would last until the token was garbage collected from state. The check occurs within the RPC as there is currently no state update when a token expires.	2022-10-27 13:08:05 -04:00
Tim Gross	aca95c0bc6	keyring: remove root key GC (#15034 )	2022-10-25 17:06:18 -04:00
Tim Gross	c45d9a9ea8	keyring: refactor to hold locks for less time (#15026 ) Follow-up from https://github.com/hashicorp/nomad/pull/14987/files#r1003611644 We don't need to hold the lock when querying the state store, so move the read-lock to the interior of the `activeKeySet` function.	2022-10-24 16:23:44 -04:00
Tim Gross	b9922631bd	keyring: fix missing GC config, don't rotate on manual GC (#15009 ) The configuration knobs for root keyring garbage collection are present in the consumer and present in the user-facing config, but we missed the spot where we copy from one to the other. Fix this so that users can set their own thresholds. The root key is automatically rotated every ~30d, but the function that does both rotation and key GC was wired up such that `nomad system gc` caused an unexpected key rotation. Split this into two functions so that `nomad system gc` cleans up old keys without forcing a rotation, which will be done periodically or by the `nomad operator root keyring rotate` command.	2022-10-24 08:43:42 -04:00
Tim Gross	3a811ac5e7	keyring: fixes for keyring replication on cluster join (#14987 ) * keyring: don't unblock early if rate limit burst exceeded The rate limiter returns an error and unblocks early if its burst limit is exceeded (unless the burst limit is Inf). Ensure we're not unblocking early, otherwise we'll only slow down the cases where we're already pausing to make external RPC requests. * keyring: set MinQueryIndex on stale queries When keyring replication makes a stale query to non-leader peers to find a key the leader doesn't have, we need to make sure the peer we're querying has had a chance to catch up to the most current index for that key. Otherwise it's possible for newly-added servers to query another newly-added server and get a non-error nil response for that key ID. Ensure that we're setting the correct reply index in the blocking query. Note that the "not found" case does not return an error, just an empty key. So as a belt-and-suspenders, update the handling of empty responses so that we don't break the loop early if we hit a server that doesn't have the key. * test for adding new servers to keyring * leader: initialize keyring after we have consistent reads Wait until we're sure the FSM is current before we try to initialize the keyring. Also, if a key is rotated immediately following a leader election, plans that are in-flight may get signed before the new leader has the key. Allow for a short timeout-and-retry to avoid rejecting plans	2022-10-21 12:33:16 -04:00
James Rasell	206fb04dc1	acl: allow tokens to read policies linked via roles to the token. (#14982 ) ACL tokens are granted permissions either by direct policy links or via ACL role links. Callers should therefore be able to read policies directly assigned to the caller token or indirectly by ACL role links.	2022-10-21 09:05:17 +02:00
James Rasell	215b4e7e36	acl: add ACL roles to event stream topic and resolve policies. (#14923 ) This change adds ACL role creation and deletion to the event stream. It is exposed as a single topic with two types; the filter is primarily the role ID but also includes the role name. While conducting this work it was also discovered that the event stream has its own ACL resolution logic. This did not account for ACL tokens which included role links, or tokens with expiry times. ACL role links are now resolved to their policies and tokens are checked for expiry correctly.	2022-10-20 09:43:35 +02:00
James Rasell	8e25048f3d	acl: gate ACL role write and delete RPC usage on v1.4.0 or greater. (#14908 )	2022-10-18 16:46:11 +02:00
James Rasell	9923f9e6f3	nsd: gate registration write & delete RPC use on v1.3.0 or greater. (#14924 )	2022-10-18 15:30:28 +02:00
Tim Gross	3c78980b78	make version checks specific to region (1.4.x) (#14912 ) * One-time tokens are not replicated between regions, so we don't want to enforce that the version check across all of serf, just members in the same region. * Scheduler: Disconnected clients handling is specific to a single region, so we don't want to enforce that the version check across all of serf, just members in the same region. * Variables: enforce version check in Apply RPC * Cleans up a bunch of legacy checks. This changeset is specific to 1.4.x and the changes for previous versions of Nomad will be manually backported in a separate PR.	2022-10-17 16:23:51 -04:00
Tim Gross	c721ce618e	keyring: filter by region before checking version (#14901 ) In #14821 we fixed a panic that can happen if a leadership election happens in the middle of an upgrade. That fix checks that all servers are at the minimum version before initializing the keyring (which blocks evaluation processing during the upgrade). But the check we implemented is over the serf membership, which includes servers in any federated regions, which don't necessarily have the same upgrade cycle. Filter the version check by the leader's region. Also bump up log levels of major keyring operations	2022-10-17 13:21:16 -04:00
Seth Hoenig	1593963cd1	servicedisco: implicit constraint for nomad v1.4 when using nsd checks (#14868 ) This PR adds a jobspec mutator to constrain jobs making use of checks in the nomad service provider to nomad clients of at least v1.4.0. Before, in a mixed client version cluster it was possible to submit an NSD job making use of checks and for that job to land on an older, incompatible client node. Closes #14862	2022-10-11 08:21:42 -05:00
Seth Hoenig	69ced2a2bd	services: remove assertion on 'task' field being set (#14864 ) This PR removes the assertion around when the 'task' field of a check may be set. Starting in Nomad 1.4 we automatically set the task field on all checks in support of the NSD checks feature. This is causing validation problems elsewhere, e.g. when a group service using the Consul provider sets 'task' it will fail validation that worked previously. The assertion of leaving 'task' unset was only about making sure job submitters weren't expecting some behavior, but in practice is causing bugs now that we need the task field for more than it was originally added for. We can simply update the docs, noting when the task field set by job submitters actually has value.	2022-10-10 13:02:33 -05:00
Seth Hoenig	5e38a0e82c	cleanup: rename Equals to Equal for consistency (#14759 )	2022-10-10 09:28:46 -05:00
Hemanth Krishna	e516fc266f	enhancement: UpdateTask when Task is waiting for ShutdownDelay (#14775 ) Signed-off-by: Hemanth Krishna <hkpdev008@gmail.com>	2022-10-06 16:33:28 -04:00
Giovani Avelar	a625de2062	Allow specification of a custom job name/prefix for parameterized jobs (#14631 )	2022-10-06 16:21:40 -04:00
Tim Gross	80ec5e1346	fix panic from keyring raft entries being written during upgrade (#14821 ) During an upgrade to Nomad 1.4.0, if a server running 1.4.0 becomes the leader before one of the 1.3.x servers, the old server will crash because the keyring is initialized and writes a raft entry. Wait until all members are on a version that supports the keyring before initializing it.	2022-10-06 12:47:02 -04:00
Michael Schurter	0df5c7d5ae	test: fix flaky test (#14713 ) Need to wait for Stop evals to be processed before you can expect subsequent RPCs to see the alloc's DesiredStatus=stop.	2022-09-27 10:36:16 -07:00
Tim Gross	87681fca68	CSI: ensure initial unpublish state is checkpointed (#14675 ) A test flake revealed a bug in the CSI unpublish workflow, where an unpublish that comes from a client that's successfully done the node-unpublish step will not have the claim checkpointed if the controller-unpublish step fails. This will result in a delay in releasing the volume claim until the next GC. This changeset also ensures we're using a new snapshot after each write to raft, and fixes two timing issues in tests where either the volume watcher can unpublish before the unpublish RPC is sent or we don't wait long enough in resource-restricted environments like GHA.	2022-09-27 08:43:45 -04:00
Seth Hoenig	87ec5fdee5	deps: update set and test (#14680 ) This PR updates go-set and shoenig/test, which introduced some breaking API changes.	2022-09-26 08:28:03 -05:00
Seth Hoenig	ae5b800085	cleanup: rearrange mocks package (#14660 ) This PR splits up the nomad/mock package into more files. Specific features that have a lot of mocks get their own file (e.g. acl, variables, csi, connect, etc.). Above that, functions that return jobs/allocs/nodes are in the job/alloc/node file. And lastly other mocks/helpers are in mock.go	2022-09-22 13:49:58 -05:00
Derek Strickland	6874997f91	scheduler: Fix bug where the scheduler would treat multiregion jobs as paused for job types that don't use deployments (#14659 ) * scheduler: Fix bug where the scheduler would treat multiregion jobs as paused for job types that don't use deployments Co-authored-by: Tim Gross <tgross@hashicorp.com>	2022-09-22 14:31:27 -04:00
Florian Apolloner	f66d61e17f	consul: Removed unused ConsulUsage.Kinds. (#11303 )	2022-09-22 10:07:14 -05:00
Jorge Marey	584ddfe859	Add Namespace, Job and Group to envoy stats (#14311 )	2022-09-22 10:38:21 -04:00
Seth Hoenig	2088ca3345	cleanup more helper updates (#14638 ) * cleanup: refactor MapStringStringSliceValueSet to be cleaner * cleanup: replace SliceStringToSet with actual set * cleanup: replace SliceStringSubset with real set * cleanup: replace SliceStringContains with slices.Contains * cleanup: remove unused function SliceStringHasPrefix * cleanup: fixup StringHasPrefixInSlice doc string * cleanup: refactor SliceSetDisjoint to use real set * cleanup: replace CompareSliceSetString with SliceSetEq * cleanup: replace CompareMapStringString with maps.Equal * cleanup: replace CopyMapStringString with CopyMap * cleanup: replace CopyMapStringInterface with CopyMap * cleanup: fixup more CopyMapStringString and CopyMapStringInt * cleanup: replace CopySliceString with slices.Clone * cleanup: remove unused CopySliceInt * cleanup: refactor CopyMapStringSliceString to be generic as CopyMapOfSlice * cleanup: replace CopyMap with maps.Clone * cleanup: run go mod tidy	2022-09-21 14:53:25 -05:00
Michael Schurter	bd4b4b8f66	Data race fixes in tests and a new semgrep rule (#14594 ) * test: don't use loop vars in goroutines fixes a data race in the test * test: copy objects in statestore before mutating fixes data race in test * test: @lgfa29's semgrep rule for loops/goroutines Found 2 places where we were improperly using loop variables inside goroutines.	2022-09-15 10:35:08 -07:00
Tim Gross	89dfdef95d	variables: handler should catch errors before conflicts (#14591 )	2022-09-14 13:14:17 -04:00
Mahmood Ali	a9d5e4c510	scheduler: stopped-yet-running allocs are still running (#10446 ) * scheduler: stopped-yet-running allocs are still running * scheduler: test new stopped-but-running logic * test: assert nonoverlapping alloc behavior Also add a simpler Wait test helper to improve line numbers and save a few lines of code. * docs: tried my best to describe #10446 it's not concise... feedback welcome * scheduler: fix test that allowed overlapping allocs * devices: only free devices when ClientStatus is terminal * test: output nicer failure message if err==nil Co-authored-by: Mahmood Ali <mahmood@hashicorp.com> Co-authored-by: Michael Schurter <mschurter@hashicorp.com>	2022-09-13 12:52:47 -07:00
Seth Hoenig	bf4dd30919	Merge pull request #14553 from hashicorp/f-nsd-check-watcher servicedisco: implement check_restart support for nomad service checks	2022-09-13 09:55:51 -05:00
Charlie Voiselle	6ab59d2aa6	var: Correct 0-index CAS Deletes (#14555 ) * Add missing 0 case for VarDeleteCAS, more comments * Add tests for VarDeleteCAS	2022-09-13 10:12:08 -04:00
Seth Hoenig	9a943107c7	servicedisco: implement check_restart for nomad service checks This PR implements support for check_restart for checks registered in the Nomad service provider. Unlike Consul, Nomad service checks never report a "warning" status, and so the check_restart.ignore_warnings configuration is not valid for Nomad service checks.	2022-09-13 08:59:23 -05:00
Seth Hoenig	b960925939	Merge pull request #14546 from hashicorp/f-refactor-check-watcher client: refactor check watcher to be reusable	2022-09-13 07:32:32 -05:00
Tim Gross	03312f3227	variables: restrict allowed paths for variables (#14547 ) Restrict variable paths to RFC3986 URL-safe characters that don't conflict with the use of characters "@" and "." in `template` blocks. This prevents users from writing variables that will require tricky templating syntax or that they simply won't be able to use. Also restrict the length so that a user can't make queries in the state store unusually expensive (as they are O(k) on the key length).	2022-09-12 16:37:33 -04:00
Seth Hoenig	feff36f3f7	client: refactor check watcher to be reusable This PR refactors agent/consul/check_watcher into client/serviceregistration, and abstracts away the Consul-specific check lookups. In doing so we should be able to reuse the existing check watcher logic for also watching NSD checks in a follow-up PR. A chunk of consul/unit_test.go is removed - we'll cover that in e2e tests in a follow-up PR if needed. In the long run I'd like to remove this whole file.	2022-09-12 10:13:31 -05:00
Derek Strickland	5ca934015b	job_endpoint: check spec for all regions (#14519 ) * job_endpoint: check spec for all regions	2022-09-12 09:24:26 -04:00
Charlie Voiselle	b55112714f	Vars: CLI commands for `var get`, `var put`, `var purge` (#14400 ) * Includes updates to `var init`	2022-09-09 17:55:20 -04:00
Charlie Voiselle	e58998e218	Add client scheduling eligibility to heartbeat (#14483 )	2022-09-08 14:31:36 -04:00
Tim Gross	3fc7482ecd	CSI: failed allocation should not block its own controller unpublish (#14484 ) A Nomad user reported problems with CSI volumes associated with failed allocations, where the Nomad server did not send a controller unpublish RPC. The controller unpublish is skipped if other non-terminal allocations on the same node claim the volume. The check has a bug where the allocation belonging to the claim being freed was included in the check incorrectly. During a normal allocation stop for job stop or a new version of the job, the allocation is terminal. But allocations that fail are not yet marked terminal at the point in time when the client sends the unpublish RPC to the server. For CSI plugins that support controller attach/detach, this means that the controller will not be able to detach the volume from the allocation's host and the replacement claim will fail until a GC is run. This changeset fixes the conditional so that the claim's own allocation is not included, and makes the logic easier to read. Include a test case covering this path. Also includes two minor extra bugfixes: * Entities we get from the state store should always be copied before altering. Ensure that we copy the volume in the top-level unpublish workflow before handing off to the steps. * The list stub object for volumes in `nomad/structs` did not match the stub object in `api`. The `api` package also did not include the current readers/writers fields that are expected by the UI. True up the two objects and add the previously undocumented fields to the docs.	2022-09-08 13:30:05 -04:00
James Rasell	3fa8b0b270	client: fix RPC forwarding when querying checks for alloc. (#14498 ) When querying the checks for an allocation, the request must be forwarded to the agent that is running the allocation. If the initial request is made to a server agent, the request can be made directly to the client agent running the allocation. If the request is made to a client agent not running the alloc, the request needs to be forwarded to a server and then the correct client.	2022-09-08 16:55:23 +02:00
Tim Gross	2a961af44c	test: fix concurrent map access in `TestStatsFetcher` (#14496 ) The map of in-flight RPCs gets cleared by a goroutine in the test without first locking it to make sure that it's not being accessed concurrently by the stats fetcher itself. This can cause a panic in tests.	2022-09-08 10:41:15 -04:00
Tim Gross	5c57a84e99	autopilot: deflake tests (#14475 ) Includes: * Remove leader upgrade raft version test, as older versions of raft are now incompatible with our autopilot library. * Remove attempt to assert initial non-voter status on the `PromoteNonVoter` test, as this happens too quickly to reliably detect. * Unskip some previously-skipped tests which we should make stable. * Remove the `consul/sdk` retry helper for these tests; this uses panic recovery in a kind of a clever/gross way to reduce LoC but it seems to introduce some timing issues in the process. * Add more test step logging and reduce logging noise from the scheduler goroutines to make it easier to debug failing tests. * Be more consistent about using the `waitForStableLeadership` helper so that we can assert the cluster is fully stable and not just that we've added peers.	2022-09-07 09:35:01 -04:00
James Rasell	962b1f78e8	core: clarify ACL token expiry GC messages to show global param. (#14466 )	2022-09-06 15:42:45 +02:00
Kellen Fox	5086368a1e	Add a log line to help track node eligibility (#14125 ) Co-authored-by: James Rasell <jrasell@hashicorp.com>	2022-09-06 14:03:33 +02:00
Yan	6e927fa125	warn destructive update only when count > 1 (#13103 )	2022-09-02 15:30:06 -04:00
Tim Gross	7921f044e5	migrate autopilot implementation to raft-autopilot (#14441 ) Nomad's original autopilot was importing from a private package in Consul. It has been moved out to a shared library. Switch Nomad to use this library so that we can eliminate the import of Consul, which is necessary to build Nomad ENT with the current version of the Consul SDK. This also will let us pick up autopilot improvements shared with Consul more easily.	2022-09-01 14:27:10 -04:00
Luiz Aoqui	19de803503	cli: ignore VaultToken when generating job diff (#14424 )	2022-09-01 10:01:53 -04:00
James Rasell	4b9bcf94da	chore: remove use of "err" a log line context key for errors. (#14433 ) Log lines which include an error should use the full term "error" as the context key. This provides consistency across the codebase and avoids a Go style which operators might not be aware of.	2022-09-01 15:06:10 +02:00
Luiz Aoqui	dc6525336b	ci: fix TestNomad_BootstrapExpect_NonVoter test (#14407 ) PR #12130 refactored the test to use the `wantPeers` helper, but this function only returns the number of voting peers, which in this test should be equal to 2. I think the tests were passing back then because of a bug in Raft (https://github.com/hashicorp/raft/pull/483) where a non-voting server was able to transition to candidate state. One possible piece of evidence for this is that a successful test run would have the following log line: ``` raft@v1.3.5/raft.go:1058: nomad.raft: updating configuration: command=AddVoter server-id=127.0.0.1:9101 server-addr=127.0.0.1:9101 servers="[{Suffrage:Voter ID:127.0.0.1:9107 Address:127.0.0.1:9107} {Suffrage:Voter ID:127.0.0.1:9105 Address:127.0.0.1:9105} {Suffrage:Voter ID:127.0.0.1:9103 Address:127.0.0.1:9103} {Suffrage:Voter ID:127.0.0.1:9101 Address:127.0.0.1:9101}]" ``` This commit reverts the test logic to check for peer count, regardless of voting status.	2022-08-30 16:32:54 -04:00
Tim Gross	5784fb8c58	search: enforce correct ACL for search over variables (#14397 )	2022-08-30 13:27:31 -04:00
Tim Gross	c9d678a91a	keyring: wrap root key in key encryption key (#14388 ) Update the on-disk format for the root key so that it's wrapped with a unique per-key/per-server key encryption key. This is a bit of security theatre for the current implementation, but it uses `go-kms-wrapping` as the interface for wrapping the key. This provides a shim for future support of external KMS such as cloud provider APIs or Vault transit encryption. * Removes the JSON serialization extension we had on the `RootKey` struct; this struct is now only used for key replication and not for disk serialization, so we don't need this helper. * Creates a helper for generating cryptographically random slices of bytes that properly accounts for short reads from the source. * No observable functional changes outside of the on-disk format, so there are no test updates.	2022-08-30 10:59:25 -04:00
Seth Hoenig	52de2dc09d	Merge pull request #14290 from hashicorp/cleanup-more-helper-cleanup cleanup: tidy up helper package some more	2022-08-30 08:19:48 -05:00
James Rasell	755b4745ed	Merge branch 'main' into f-gh-13120-sso-umbrella-merged-main	2022-08-30 08:59:13 +01:00
Seth Hoenig	3e1e2001b9	Merge pull request #14143 from hashicorp/cleanup-slice-sets-3 cleanup: more cleanup of slices that are really sets	2022-08-29 13:52:59 -05:00
Tim Gross	7d1eb2efd5	keyring: split structs to its own file (#14378 )	2022-08-29 14:18:35 -04:00
Seth Hoenig	9d0e274f27	cleanup: cleanup more slice-set comparisons	2022-08-29 12:04:21 -05:00
Tim Gross	62a968f443	Merge pull request #14351 from hashicorp/variables-rename Variables rename	2022-08-29 11:36:50 -04:00
Piotr Kazmierczak	5f353503e5	bugfix: fixed template validation panic in case of incorrect ChangeScript configuration (#14374 ) Fixes #14367	2022-08-29 17:11:15 +02:00
Tim Gross	1dc053b917	rename SecureVariables to Variables throughout	2022-08-26 16:06:24 -04:00
Tim Gross	dcfd31296b	file rename	2022-08-26 16:06:24 -04:00
Seth Hoenig	b87689d2d1	Merge pull request #14318 from hashicorp/cleanup-create-pointer-compare cleanup: create pointer.Compare helper function	2022-08-26 09:15:41 -05:00
Seth Hoenig	6b2655ad86	cleanup: create pointer.Compare helper function This PR creates a pointer.Compare helper for comparing equality of two pointers. Strictly only works with primitive types we know are safe to dereference and compare using '=='.	2022-08-26 08:55:59 -05:00
James Rasell	601588df6b	Merge branch 'main' into f-gh-13120-sso-umbrella-merged-main	2022-08-25 12:14:29 +01:00
James Rasell	7a0798663d	acl: fix a bug where roles could be duplicated by name. An ACL role's name must be unique; however, a bug meant multiple roles of the same name could be created. This fixes that problem with checks in the RPC handler and state store.	2022-08-25 09:20:43 +01:00
Luiz Aoqui	e012d9411e	Task lifecycle restart (#14127 ) * allocrunner: handle lifecycle when all tasks die When all tasks die the Coordinator must transition to its terminal state, coordinatorStatePoststop, to unblock poststop tasks. Since this could happen at any time (for example, a prestart task dies), all states must be able to transition to this terminal state. * allocrunner: implement different alloc restarts Add a new alloc restart mode where all tasks are restarted, even if they have already exited. Also unifies the alloc restart logic to use the implementation that restarts tasks concurrently and ignores ErrTaskNotRunning errors since those are expected when restarting the allocation. * allocrunner: allow tasks to run again Prevent the task runner Run() method from exiting to allow a dead task to run again. When the task runner is signaled to restart, the function will jump back to the MAIN loop and run it again. The task runner determines if a task needs to run again based on two new task events that were added to differentiate between a request to restart a specific task, the tasks that are currently running, or all tasks that have already run. * api/cli: add support for all tasks alloc restart Implement the new -all-tasks alloc restart CLI flag and its API counterpart, AllTasks. The client endpoint calls the appropriate restart method from the allocrunner depending on the restart parameters used. * test: fix tasklifecycle Coordinator test * allocrunner: kill taskrunners if all tasks are dead When all non-poststop tasks are dead we need to kill the taskrunners so we don't leak their goroutines, which are blocked in the alloc restart loop. This also ensures the allocrunner exits on its own.
* taskrunner: fix tests that waited on WaitCh Now that "dead" tasks may run again, the taskrunner Run() method will not return when the task finishes running, so tests must wait for the task state to be "dead" instead of using the WaitCh, since it won't be closed until the taskrunner is killed. * tests: add tests for all tasks alloc restart * changelog: add entry for #14127 * taskrunner: fix restore logic. The first implementation of the task runner restore process relied on server data (`tr.Alloc().TerminalStatus()`) which may not be available to the client at the time of restore. It also had the incorrect code path. When restoring a dead task the driver handle always needs to be cleared cleanly using `clearDriverHandle`; otherwise, after exiting the MAIN loop, the task may be killed by `tr.handleKill`. The fix is to store the state of the Run() loop in the task runner local client state: if the task runner ever exits this loop cleanly (not with a shutdown) it will never be able to run again. So if the Run() loop starts with this local state flag set, it must exit early. This local state flag is also being checked on task restart requests. If the task is "dead" and its Run() loop is not active it will never be able to run again. * address code review requests * apply more code review changes * taskrunner: add different Restart modes Using the task event to differentiate between the allocrunner restart methods proved to be confusing for developers to understand how it all worked. So instead of relying on the event type, this commit separated the logic of restarting a taskRunner into two methods: - `Restart` will retain the current behaviour and will only restart the task if it's currently running. - `ForceRestart` is the new method where a `dead` task is allowed to restart if its `Run()` method is still active. Callers will need to restart the allocRunner taskCoordinator to make sure it will allow the task to run again. * minor fixes	2022-08-24 17:43:07 -04:00
Tim Gross	c732b215f0	vault: detect namespace change in config reload (#14298 ) The `namespace` field was not included in the equality check between old and new Vault configurations, which meant that a Vault config change that only changed the namespace would not be detected as a change and the clients would not be reloaded. Also, the comparison for boolean fields such as `enabled` and `allow_unauthenticated` was on the pointer and not the value of that pointer, which results in spurious reloads in real config reload that is easily missed in typical test scenarios. Includes a minor refactor of the order of fields for `Copy` and `Merge` to match the struct fields in hopes it makes it harder to make this mistake in the future, as well as additional test coverage.	2022-08-24 17:03:29 -04:00
Piotr Kazmierczak	7077d1f9aa	template: custom change_mode scripts (#13972 ) This PR adds the functionality of allowing custom scripts to be executed on template change. Resolves #2707	2022-08-24 17:43:01 +02:00
Seth Hoenig	bff6c88683	cleanup: remove more copies of min/max from helper	2022-08-24 09:56:15 -05:00
James Rasell	9782d6d7ff	acl: allow tokens to lookup linked roles. (#14227 ) When listing or reading an ACL role, roles linked to the ACL token used for authentication can be returned to the caller.	2022-08-24 13:51:51 +02:00
Michael Schurter	0114bcfe5b	core: move LicenseConfig to shared file (#14247 ) This moves LicenseConfig and its Copy method to a shared file so that it can be shared with enterprise code.	2022-08-23 13:44:10 -07:00
Luiz Aoqui	7a8cacc9ec	allocrunner: refactor task coordinator (#14009 ) The current implementation for the task coordinator unblocks tasks by performing destructive operations over its internal state (like closing channels and deleting keys from maps). This presents a problem in situations where we would like to revert the state of a task, such as when restarting an allocation with tasks that have already exited. With this new implementation the task coordinator behaves more like a finite state machine where a task may be blocked/unblocked multiple times by performing a state transition. This initial part of the work only refactors the task coordinator and is functionally equivalent to the previous implementation. Future work will build upon this to provide bug fixes and enhancements.	2022-08-22 18:38:49 -04:00
Tim Gross	bf57d76ec7	allow ACL policies to be associated with workload identity (#14140 ) The original design for workload identities and ACLs allows for operators to extend the automatic capabilities of a workload by using a specially-named policy. This has shown to be potentially unsafe because of naming collisions, so instead we'll allow operators to explicitly attach a policy to a workload identity. This changeset adds workload identity fields to ACL policy objects and threads that all the way down to the command line. It also adds a new secondary index to the ACL policy table on namespace and job so that claim resolution can efficiently query for related policies.	2022-08-22 16:41:21 -04:00
Charlie Voiselle	29e63a6cb2	Make var get a blocking query as expected (#14205 )	2022-08-22 16:37:21 -04:00
Luiz Aoqui	dbffdca92e	template: use pointer values for gid and uid (#14203 ) When a Nomad agent starts and loads jobs that already existed in the cluster, the default template uid and gid was being set to 0, since this is the zero value for int. This caused these jobs to fail in environments where it was not possible to use 0, such as in Windows clients. In order to differentiate between an explicit 0 and a template where these properties were not set we need to use a pointer.	2022-08-22 16:25:49 -04:00
James Rasell	2736cf0dfa	acl: make listing RPC and HTTP API a stub return object. (#14211 ) Making the ACL Role listing return object a stub future-proofs the endpoint. In the event the role object grows, we are not bound by having to return all fields within the list endpoint or change the signature of the endpoint to reduce the list return size.	2022-08-22 17:20:23 +02:00
James Rasell	802d005ef5	acl: add replication to ACL Roles from authoritative region. (#14176 ) ACL Roles along with policies and global tokens will be replicated from the authoritative region to all federated regions. This involves a new replication loop running on the federated leader. Policies and roles may be replicated at different times, meaning the policies and role references may not be present within the local state upon replication upsert. In order to bypass the RPC and state check, a new RPC request parameter has been added. This is used by the replication process; all other callers will trigger the ACL role policy validation check. There is a new ACL RPC endpoint to allow the reading of a set of ACL Roles which is required by the replication process and matches ACL Policies and Tokens. A bug within the ACL Role listing RPC has also been fixed which returned incorrect data during blocking queries where a deletion had occurred.	2022-08-22 08:54:07 +02:00
Michael Schurter	26637ab55d	core: fix race mutating jobs in scaling api (#14192 ) Since the state store returns a pointer to the shared job structs in memdb we must always copy it before mutating it and applying the new version via raft. Otherwise if the rpc fails before the mutated job is committed to raft (either due to validation, bug, crash, or other exit condition), the leader server will have an updated copy of the job that other servers will not have.	2022-08-19 15:46:54 -07:00
Michael Schurter	3b57df33e3	client: fix data races in config handling (#14139 ) Before this change, Client had 2 copies of the config object: config and configCopy. There was no guidance around which to use where (other than configCopy's comment to pass it to alloc runners), both are shared among goroutines and mutated in data racy ways. At least at one point I think the idea was to have `config` be mutable and then grab a lock to overwrite `configCopy`'s pointer atomically. This would have allowed alloc runners to read their config copies in data race safe ways, but this isn't how the current implementation worked. This change takes the following approach to safely handling configs in the client: 1. `Client.config` is the only copy of the config and all access must go through the `Client.configLock` mutex 2. Since the mutex only protects the config pointer itself and not fields inside the Config struct: all config mutation must be done on a copy of the config, and then Client's config pointer is overwritten while the mutex is acquired. Alloc runners and other goroutines with the old config pointer will not see config updates. 3. Deep copying is implemented on the Config struct to satisfy the previous approach. The TLS Keyloader is an exception because it has its own internal locking to support mutating in place. An unfortunate complication but one I couldn't find a way to untangle in a timely fashion. 4. To facilitate deep copying I made an internally backward incompatible API change: our `helper/funcs` used to turn containers (slices and maps) with 0 elements into nils. This probably saves a few memory allocations but makes it very easy to cause panics. Since my new config handling approach uses more copying, it became very difficult to ensure all code that used containers on configs could handle nils properly. 
Since this code has caused panics in the past, I fixed it: nil containers are copied as nil, but 0-element containers properly return a new 0-element container. No more "downgrading to nil!"	2022-08-18 16:32:04 -07:00
Derek Strickland	8dba52cee2	sentinel: add support for Nomad ACL Token and Namespace (#14171 ) * sentinel: add ability to reference Nomad ACL Token and Namespace in Sentinel policies	2022-08-18 16:33:00 -04:00
Michael Schurter	1d1526cd24	events: fix race in acl event handling (#14188 )	2022-08-18 11:39:14 -07:00
Piotr Kazmierczak	b63944b5c1	cleanup: replace TypeToPtr helper methods with pointer.Of (#14151 ) Bumping compile time requirement to go 1.18 allows us to simplify our pointer helper methods.	2022-08-17 18:26:34 +02:00
James Rasell	9e3f1581fb	core: add ACL role functionality to ACL tokens. ACL tokens can now utilize ACL roles in order to provide API authorization. Each ACL token can be created and linked to an array of policies as well as an array of ACL role links. The link can be provided via the role name or ID, but internally, is always resolved to the ID as this is immutable whereas the name can be changed by operators. When resolving an ACL token, the policies linked from an ACL role are unpacked and combined with the policy array to form the complete auth set for the token. The ACL token creation endpoint handles deduplicating ACL role links as well as ensuring they exist within state. When reading a token, Nomad will also ensure the ACL role link is current. This handles ACL roles being deleted from under a token from a UX standpoint.	2022-08-17 14:45:01 +01:00
Seth Hoenig	bb58b94e15	Merge pull request #14141 from hashicorp/cleanup-slice-sets-2 cleanup: helper func for comparing slices of .Equals types	2022-08-16 15:12:39 -05:00
Michael Schurter	cdf5a74998	core: fix data races in blocked eval chan handling (#14142 ) Similar to the deployment watcher fix in #14121 - the server code loves these mutable structs so we need to guard access to the struct fields with locks. Capturing ch := b.capacityChangeCh is sufficient to satisfy the data race detector, but I noticed it was also possible to leak goroutines: Since the watchCapacity loop is in charge of receiving from capacityChangeCh and exits when stopCh is closed, senders to capacityChangeCh also must exit when stopCh is closed. Otherwise they may block forever if capacityChangeCh is full because it will never be received on again. I did not find evidence of this occurring in the meager smattering of prod goroutine dumps I have lying around, but this isn't surprising as the chan has a buffer of 8096! I would imagine that is sufficient to handle "late" sends and then just get GC'd away when the last reference to the old chan is dropped. This is just additional safety/correctness.	2022-08-16 12:33:53 -07:00
Seth Hoenig	47f5452825	cleanup: helper funcs for comparing slices of references	2022-08-16 13:47:47 -05:00
Michael Schurter	0c3cfb073a	nomad: fix race in Bootstrapped access (#14120 ) * nomad: fix race in Bootstrapped access	2022-08-16 11:22:33 -07:00
Michael Schurter	285979e96c	deployments: fix data races (#14121 ) * deployments: fix data races Both priority and state related fields may be mutated concurrently and need to be accessed with the lock acquired.	2022-08-16 10:50:40 -07:00
Seth Hoenig	7728cf5a9a	Merge pull request #14132 from hashicorp/build-update-go1.19 build: update to go1.19	2022-08-16 11:20:27 -05:00
Seth Hoenig	b3ea68948b	build: run gofmt on all go source files Go 1.19 will forcefully format all your doc strings. To get this out of the way, here is one big commit with all the changes gofmt wants to make.	2022-08-16 11:14:11 -05:00
Seth Hoenig	56b0b456dc	Merge pull request #14102 from hashicorp/cleanup-mesh-gateway-value cleanup: consul mesh gateway type need not be pointer	2022-08-16 10:07:16 -05:00
Tim Gross	a4e89d72a8	secure vars: filter by path in List RPCs (#14036 ) The List RPCs only checked the ACL for the Prefix argument of the request. Add an ACL filter to the paginator for the List RPC. Extend test coverage of ACLs in the List RPC and in the `acl` package, and add a "deny" capability so that operators can deny specific paths or prefixes below an allowed path.	2022-08-15 11:38:20 -04:00
Tim Gross	4005759d28	move secure variable conflict resolution to state store (#13922 ) Move conflict resolution implementation into the state store with a new Apply RPC. This also makes the RPC for secure variables much more similar to Consul's KV, which will help us support soft deletes in a post-1.4.0 version of Nomad. Reimplement quotas in the state store functions. Co-authored-by: Charlie Voiselle <464492+angrycub@users.noreply.github.com>	2022-08-15 11:19:53 -04:00
Seth Hoenig	f9355c29fb	cleanup: consul mesh gateway type need not be pointer This PR changes the use of structs.ConsulMeshGateway to value types instead of via pointers. This will help in a follow up PR where we cleanup a lot of custom comparison code with helper functions instead.	2022-08-13 11:26:58 -05:00
James Rasell	581a5bb6ad	rpc: add ACL Role RPC endpoint for CRUD actions. New ACL Role RPC endpoints have been created to allow the creation, update, read, and deletion of ACL roles. All endpoints require a management token; in the future readers will also be allowed to view roles associated to their ACL token. The create endpoint in particular is responsible for deduplicating ACL policy links and ensuring named policies are found within state. This is done within the RPC handler so we perform a single loop through the links for slight efficiency.	2022-08-11 08:43:50 +01:00
Seth Hoenig	3aaaedf52e	cli: forward request for job validation to nomad leader This PR changes the behavior of 'nomad job validate' to forward the request to the nomad leader, rather than responding from any server. This is because we need the leader when validating Vault tokens, since the leader is the only server with an active vault client.	2022-08-10 14:34:04 -05:00
James Rasell	e660c9a908	core: add ACL role state schema and functionality. (#13955 ) This commit includes the new state schema for ACL roles along with state interaction functions for CRUD actions. The change also includes snapshot persist and restore functionality and the addition of FSM messages for Raft updates which will come via RPC endpoints.	2022-08-09 09:33:41 +02:00
Seth Hoenig	0b52c27a15	Merge pull request #14045 from Abirdcfly/main fix minor unreachable code caused by t.Fatal	2022-08-08 11:47:02 -05:00
Abirdcfly	d66943d4f7	fix minor unreachable code caused by t.Fatal Signed-off-by: Abirdcfly <fp544037857@gmail.com>	2022-08-08 23:50:11 +08:00
Seth Hoenig	2b6bda49b9	core: automatically plumb task name into task-level services and checks	2022-08-05 12:42:41 -05:00
Seth Hoenig	f6f26fb72c	nsd: add support for setting request body in http checks This PR adds support for setting check.body in checks of services making use of Nomad's service provider.	2022-08-04 14:40:23 -05:00
Charles Z	7a8ec90fbe	allow unhealthy canaries without blocking autopromote (#14001 )	2022-08-04 11:53:50 -04:00
Seth Hoenig	dcda57e729	nsd: add support for setting headers on nomad service http checks This PR enables setting of the headers block on services registered into Nomad's service provider. Works just like the existing support in Consul checks.	2022-08-03 10:06:44 -05:00
Seth Hoenig	067aa00a6a	Merge pull request #13953 from hashicorp/f-nsd-check-methods nsd: add support for specifying check.method in nomad service checks	2022-08-03 08:28:38 -05:00
Piotr Kazmierczak	530280505f	client: enable specifying user/group permissions in the template stanza (#13755 ) * Adds Uid/Gid parameters to template. * Updated diff_test * fixed order * update jobspec and api * removed obsolete code * helper functions for jobspec parse test * updated documentation * adjusted API jobs test. * propagate uid/gid setting to job_endpoint * adjusted job_endpoint tests * making uid/gid into pointers * refactor * updated documentation * updated documentation * Update client/allocrunner/taskrunner/template/template_test.go Co-authored-by: Luiz Aoqui <luiz@hashicorp.com> * Update website/content/api-docs/json-jobs.mdx Co-authored-by: Luiz Aoqui <luiz@hashicorp.com> * propagating documentation change from Luiz * formatting * changelog entry * changed changelog entry Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>	2022-08-02 22:15:38 +02:00
Eric Weber	cbce13c1ac	Add stage_publish_base_dir field to csi_plugin stanza of a job (#13919 ) * Allow specification of CSI staging and publishing directory path * Add website documentation for stage_publish_dir * Replace erroneous reference to csi_plugin.mount_config with csi_plugin.mount_dir * Avoid requiring CSI plugins to be redeployed after introducing StagePublishDir	2022-08-02 09:42:44 -04:00
Tim Gross	e5ac6464f6	secure vars: enforce ENT quotas (OSS work) (#13951 ) Move the secure variables quota enforcement calls into the state store to ensure quota checks are atomic with quota updates (in the same transaction). Switch to a machine-size int instead of a uint64 for quota tracking. The ENT-side quota spec is described as int, and negative values have a meaning as "not permitted at all". Using the same type for tracking will make it easier to do the math around checks, and uint64 is infeasibly large anyways. Add secure vars to quota HTTP API and CLI outputs and API docs.	2022-08-02 09:32:09 -04:00
James Rasell	663aa92b7a	Merge branch 'main' into f-gh-13120-sso-umbrella	2022-08-02 08:30:03 +01:00
Seth Hoenig	a4d4a76994	nsd: add support for specifying check.method in nomad service checks Unblock 'check.method' in service validation. Add tests around making sure this value gets plumbed through.	2022-08-01 16:13:48 -05:00
Tim Gross	04677d205e	block deleting namespace if it contains a secure variable (#13888 ) When we delete a namespace, we check to ensure that there are no non-terminal jobs or CSI volumes, which also covers evals, allocs, etc. Secure variables are also namespaced, so extend this check to them as well.	2022-07-22 10:06:35 -04:00
Seth Hoenig	5aaa31a6dc	Merge pull request #13882 from hashicorp/cleanup-onupdate-consts cleanup: use constants for on_update values	2022-07-22 08:58:36 -05:00
Tim Gross	c7a11a86c6	block deleting namespaces if the namespace contains a volume (#13880 ) When we delete a namespace, we check to ensure that there are no non-terminal jobs, which effectively covers evals, allocs, etc. CSI volumes are also namespaced, so extend this check to cover CSI volumes.	2022-07-21 16:13:52 -04:00
Seth Hoenig	d8fe1d10ba	cleanup: use constants for on_update values	2022-07-21 13:09:47 -05:00
Seth Hoenig	c61e779b48	Merge pull request #13715 from hashicorp/dev-nsd-checks client: add support for checks in nomad services	2022-07-21 10:22:57 -05:00
Seth Hoenig	606e3ebdd4	client: updates from pr feedback	2022-07-21 09:54:27 -05:00
Seth Hoenig	8e6eeaa37e	Merge pull request #13869 from hashicorp/b-uniq-services-2 servicedisco: ensure service uniqueness in job validation	2022-07-21 08:24:24 -05:00
Tim Gross	d11da1df5c	workload identity: use parent ID for dispatch/periodic jobs (#13748 ) Workload identities grant implicit access to policies, and operators will not want to craft separate policies for each invocation of a periodic or dispatch job. Use the parent job's ID as the JobID claim.	2022-07-21 09:05:54 -04:00
Tim Gross	9c43c28575	search: use secure vars ACL policy for secure vars context (#13788 ) The search RPC used a placeholder policy for searching within the secure variables context. Now that we have ACL policies built for secure variables, we can use them for search. Requires a new loose policy for checking if a token has any secure variables access within a namespace, so that we can filter on specific paths in the iterator.	2022-07-21 08:39:36 -04:00
Tim Gross	97a6346da0	keyring: use nanos for `CreateTime` in key metadata (#13849 ) Most of our objects use int64 timestamps derived from `UnixNano()` instead of `time.Time` objects. Switch the keyring metadata to use `UnixNano()` for consistency across the API.	2022-07-20 14:46:57 -04:00
Tim Gross	428e23043c	secure vars: limit maximum size of variable data (#13743 ) To discourage accidentally DoS'ing the cluster with secure variables data, we're providing a very low limit to the maximum size of a given secure variable. This currently matches the limit for dispatch payloads. In future versions, we may increase this limit or make it configurable, once we have better metrics from real-world operators.	2022-07-20 14:46:43 -04:00
Seth Hoenig	e5978a9cbf	jobspec: ensure service uniqueness in job validation	2022-07-20 12:38:08 -05:00
Seth Hoenig	d83aae253f	cleanup: track task names and providers using set	2022-07-20 11:48:36 -05:00
Seth Hoenig	bd2935ee54	cleanup: tweaks from cr feedback	2022-07-20 10:42:35 -05:00
Seth Hoenig	93cfeb177b	cleanup: example refactoring out map[string]struct{} using set.Set This PR is a little demo of using github.com/hashicorp/go-set to replace the use of map[T]struct{} as a makeshift set.	2022-07-19 22:50:49 -05:00

4286 commits