open-nomad

Author	SHA1	Message	Date
Mahmood Ali	38a01c050e	Merge pull request #8192 from hashicorp/f-status-allnamespaces-2 CLI Allow querying all namespaces for jobs and allocations - Try 2	2020-06-18 20:16:52 -04:00
Nick Ethier	4a44deaa5c	CNI Implementation (#7518 )	2020-06-18 11:05:29 -07:00
Nick Ethier	0bc0403cc3	Task DNS Options (#7661 ) Co-Authored-By: Tim Gross <tgross@hashicorp.com> Co-Authored-By: Seth Hoenig <shoenig@hashicorp.com>	2020-06-18 11:01:31 -07:00
Mahmood Ali	c0aa06d9c7	rpc: allow querying allocs across namespaces This implements the backend handling for querying across namespaces for allocation list endpoints.	2020-06-17 16:31:06 -04:00
Mahmood Ali	e784fe331a	use '*' to indicate all namespaces This reverts the introduction of AllNamespaces parameter that was merged earlier but never got released.	2020-06-17 16:27:43 -04:00
Tim Gross	81ae581da6	test: remove flaky test from volumewatcher (#8189 ) The volumewatcher restores itself on notification, but detecting this is racy because it may reap any claim (or find there are no claims to reap) and shutdown before we can test whether it's running. This appears to have become flaky with a new version of golang. The other cases in this test case sufficiently exercise the start/stop behavior of the volumewatcher, so remove the flaky section.	2020-06-17 15:41:51 -04:00
Chris Baker	fe9d654640	Merge pull request #8187 from hashicorp/f-8143-block-scaling-during-deployment modify Job.Scale RPC to return an error if there is an active deployment	2020-06-17 14:38:55 -05:00
Chris Baker	cd903218f7	added changelog entry and satisfied `make check`	2020-06-17 17:43:45 +00:00
Chris Baker	ab2b15d8cb	modify Job.Scale RPC to return an error if there is an active deployment resolves #8143	2020-06-17 17:03:35 +00:00
Tim Gross	6b1cb61888	remove test for ent-only behavior	2020-06-17 11:27:29 -04:00
Tim Gross	c14a75bfab	multiregion: use pending instead of paused The `paused` state is used as an operator safety mechanism, so that they can debug a deployment or halt one that's causing a wider failure. By using the `paused` state as the first state of a multiregion deployment, we risked resuming an intentionally operator-paused deployment because of activity in a peer region. This changeset replaces the use of the `paused` state with a `pending` state, and provides a `Deployment.Run` internal RPC to replace the use of the `Deployment.Pause` (resume) RPC we were using in `deploymentwatcher`.	2020-06-17 11:06:14 -04:00
Tim Gross	fd50b12ee2	multiregion: integrate with deploymentwatcher * `nextRegion` should take status parameter * thread Deployment/Job RPCs thru `nextRegion` * add `nextRegion` calls to `deploymentwatcher` * use a better description for paused for peer	2020-06-17 11:06:00 -04:00
Tim Gross	7b12445f29	multiregion: change AutoRevert to OnFailure	2020-06-17 11:05:45 -04:00
Tim Gross	5c4d0a73f4	start all but first region deployment in paused state	2020-06-17 11:05:34 -04:00
Tim Gross	48e9f75c1e	multiregion: deploymentwatcher hooks This changeset establishes hooks in deploymentwatcher for multiregion deployments (for the enterprise version of Nomad).	2020-06-17 11:05:18 -04:00
Tim Gross	b09b7a2475	Multiregion job registration Integration points for multiregion jobs to be registered in the enterprise version of Nomad: * hook in `Job.Register` for enterprise to send job to peer regions * remove monitoring from `nomad job run` and `nomad job stop` for multiregion jobs	2020-06-17 11:04:58 -04:00
Drew Bailey	9263fcb0d3	Multiregion deploy status and job status CLI	2020-06-17 11:03:34 -04:00
Tim Gross	473a0f1d44	multiregion: unblock and cancel RPCs	2020-06-17 11:02:26 -04:00
Tim Gross	ede3a4f1c4	multiregion: request structs	2020-06-17 11:00:34 -04:00
Tim Gross	6851024925	Multiregion structs Initial struct definitions, jobspec parsing, validation, and conversion between Nomad structs and API structs for multi-region deployments.	2020-06-17 11:00:14 -04:00
Chris Baker	9fc66bc1aa	support in API client and Job.Register RPC for PreserveCounts	2020-06-16 18:45:28 +00:00
Chris Baker	1e3563e08c	wip: added PreserveCounts to struct.JobRegisterRequest, development test for Job.Register	2020-06-16 18:45:17 +00:00
Chris Baker	7ed06cced0	core: update Job.Scale to save the previous job count in the ScalingEvent	2020-06-15 19:49:22 +00:00
Chris Baker	aeb3ed449e	wip: added .PreviousCount to api.ScalingEvent and structs.ScalingEvent, with developmental tests	2020-06-15 19:40:21 +00:00
Mahmood Ali	c17ffb2d35	Merge pull request #8131 from hashicorp/f-snapshot-restore Implement snapshot restore	2020-06-15 08:32:34 -04:00
Mahmood Ali	9bfc3e28d9	Apply suggestions from code review Co-authored-by: Michael Schurter <mschurter@hashicorp.com>	2020-06-15 08:32:16 -04:00
Lang Martin	069840bef8	scheduler/reconcile: set FollowupEvalID on lost stop_after_client_disconnect (#8105 ) (#8138 ) * scheduler/reconcile: set FollowupEvalID on lost stop_after_client_disconnect * scheduler/reconcile: thread follupEvalIDs through to results.stop * scheduler/reconcile: comment typo * nomad/_test: correct arguments for plan.AppendStoppedAlloc * scheduler/reconcile: avoid nil, cleanup handleDelayed(Lost\|Reschedules)	2020-06-09 17:13:53 -04:00
Mahmood Ali	63e048e972	clarify comments, esp related to leadership code	2020-06-09 12:01:31 -04:00
Mahmood Ali	b543460e0a	loosen raft timeout	2020-06-07 16:38:11 -04:00
Mahmood Ali	69bb42acf8	tests: prefix agent logs to identify agent sources	2020-06-07 16:38:11 -04:00
Mahmood Ali	47a163b63f	reassert leadership	2020-06-07 15:47:06 -04:00
Mahmood Ali	9eb13ae144	basic snapshot restore	2020-06-07 15:46:23 -04:00
Mahmood Ali	bf7a3583e5	Merge pull request #8089 from hashicorp/b-leader-worker-count leadership: pause and unpause workers consistently	2020-06-04 12:01:01 -04:00
Mahmood Ali	cd8e1b4d62	stop periodic dispatch at end of tests (#8111 )	2020-06-04 09:15:00 -04:00
Lang Martin	ac7c39d3d3	Delayed evaluations for `stop_after_client_disconnect` can cause unwanted extra followup evaluations around job garbage collection (#8099 ) * client/heartbeatstop: reversed time condition for startup grace * scheduler/generic_sched: use `delayInstead` to avoid a loop Without protecting the loop that creates followUpEvals, a delayed eval is allowed to create an immediate subsequent delayed eval. For both `stop_after_client_disconnect` and the `reschedule` block, a delayed eval should always produce some immediate result (running or blocked) and then only after the outcome of that eval produce a second delayed eval. * scheduler/reconcile: lostLater are different than delayedReschedules Just slightly. `lostLater` allocs should be used to create batched evaluations, but `handleDelayedReschedules` assumes that the allocations are in the untainted set. When it creates the in-place updates to those allocations at the end, it causes the allocation to be treated as running over in the planner, which causes the initial `stop_after_client_disconnect` evaluation to be retried by the worker.	2020-06-03 09:48:38 -04:00
Mahmood Ali	70fbcb99c2	leadership: pause and unpause workers consistently This fixes a bug where leadership establishment pauses 3/4 of workers but stepping down unpauses only 1/2!	2020-06-01 10:57:53 -04:00
Mahmood Ali	891fb3f8a9	test for paused workers upon leadership revocation	2020-06-01 10:48:42 -04:00
Mahmood Ali	de44d9641b	Merge pull request #8047 from hashicorp/f-snapshot-save API for atomic snapshot backups	2020-06-01 07:55:16 -04:00
Mahmood Ali	e37a3312d5	If leadership fails, consider it handled The callers for `forward` and the old implementation expect failures to be accompanied by a true value! This fixes the issue and gets the tests passing!	2020-05-31 22:06:17 -04:00
Mahmood Ali	30ab9c84e5	more review feedback	2020-05-31 21:39:09 -04:00
Mahmood Ali	a73cd01a00	Merge pull request #8001 from hashicorp/f-jobs-list-across-nses endpoint to expose all jobs across all namespaces	2020-05-31 21:28:03 -04:00
Mahmood Ali	082c085068	Merge pull request #8036 from hashicorp/f-background-vault-revoke-on-restore Speed up leadership establishment	2020-05-31 21:27:16 -04:00
Mahmood Ali	1af32e65bc	clarify rpc consistency readiness comment	2020-05-31 21:26:41 -04:00
Mahmood Ali	0819ea60ea	Apply suggestions from code review Co-authored-by: Michael Schurter <mschurter@hashicorp.com>	2020-05-31 21:04:39 -04:00
Mahmood Ali	37c6160b96	Handle nil/empty cluster metadata Handle case where a snapshot is made before cluster metadata is created. This fixes a bug where a server may have empty cluster metadata if it created and installed a Raft snapshot before a new cluster metadata ID is generated. This case is very unlikely to arise. Most likely reason is when upgrading from an old version slowly where servers may use snapshots before all servers upgrade. This happened for a user with a log line like: ``` 2020-05-21T15:21:56.996Z [ERROR] nomad.fsm: ClusterSetMetadata failed: error=""set cluster metadata failed: refusing to set new cluster id, previous: , new: <<redacted> ```	2020-05-29 13:34:21 -04:00
Drew Bailey	23d24c7a7f	removes pro tags (#8014 )	2020-05-28 15:40:17 -04:00
Mahmood Ali	475b3b77ad	Merge pull request #8060 from hashicorp/tests-deflake-20200526 Deflake some tests - 2020-05-27 edition	2020-05-27 15:24:31 -04:00
Drew Bailey	34871f89be	Oss license support for ent builds (#8054 ) * changes necessary to support oss licensing shims revert nomad fmt changes update test to work with enterprise changes update tests to work with new ent enforcements make check update cas test to use scheduler algorithm back out preemption changes add comments * remove unused method	2020-05-27 13:46:52 -04:00
Mahmood Ali	61e4f5aaf9	tests: use GreaterOrEqual and apply change to other tests	2020-05-27 11:22:48 -04:00
Mahmood Ali	6dfe0f5d3b	tests: use t.Fatalf when it's clearer	2020-05-27 10:09:56 -04:00
Mahmood Ali	ec1fcedb93	tests: node drain events may be duplicated	2020-05-27 08:59:06 -04:00
Mahmood Ali	c3c2a85314	tests: wait until clients are in the state store	2020-05-26 18:53:24 -04:00
Mahmood Ali	5d80d2a511	tests: eval may be processed quickly	2020-05-26 18:53:24 -04:00
Mahmood Ali	19141f8103	{volume\|deployment}watcher: check for nil batcher	2020-05-26 14:54:27 -04:00
Mahmood Ali	81ac098a22	deploymentwatcher: no batcher when disabling When disabling deploymentwatcher (at the end of a test), avoid starting a new update batcher with its new goroutine.	2020-05-26 14:44:47 -04:00
Mahmood Ali	ccc89f940a	terminate leader goroutines on shutdown Ensure that nomad steps down (and terminate leader goroutines) on shutdown, when the server is the leader. Without this change, `monitorLeadership` may handle `shutdownCh` event and exit early before handling the raft `leaderCh` event and end up leaking leadership goroutines.	2020-05-26 10:18:10 -04:00
Mahmood Ali	e671913e56	fix a trace logline	2020-05-26 10:18:09 -04:00
Mahmood Ali	1c79c3b93d	refactor: context is first parameter By convention, go functions take `context.Context` as the first argument.	2020-05-26 10:18:09 -04:00
Mahmood Ali	1eff8b0ed8	volumewatcher: no batcher when disabling When disabling volumewatcher (at the end of a test), avoid starting a new update batcher with its new goroutine.	2020-05-26 10:18:09 -04:00
Mahmood Ali	b895cef622	always set purgeFunc purgeFunc cannot be nil, so ensure it's set to a no-op function in tests.	2020-05-21 21:05:53 -04:00
Mahmood Ali	2108681c1d	Endpoint for snapshotting server state	2020-05-21 20:04:38 -04:00
Mahmood Ali	fbe140b26c	vault: ensure ttl expired tokens are purged If a token that is scheduled for revocation expires before we revoke it, ensure that it is marked as purged in raft and is only removed from local vault state if the purge operation succeeds. Prior to this change, we may remove the accessor from local state but not purge it from Raft. This causes unnecessary churn in the next leadership elections (and until 0.11.2 resulted in indefinite retries).	2020-05-21 19:54:50 -04:00
Mahmood Ali	aa8e79e55b	Reorder leadership handling Start serving RPC immediately after leader components are enabled, and move clean up to the bottom as they don't block leadership responsibilities.	2020-05-21 08:30:31 -04:00
Mahmood Ali	1cf1114627	apply the same change to consul revocation	2020-05-21 08:30:31 -04:00
Mahmood Ali	1399d02f45	rate limit revokeDaemon	2020-05-21 08:30:31 -04:00
Mahmood Ali	6e749d12a0	on leadership establishment, revoke Vault tokens in background Establishing leadership should be very fast and never make external API calls. This fixes a situation where there is a long backlog of Vault tokens to be revoked when leadership is gained. In such a case, revoking the tokens will significantly slow down leadership establishment and slow down processing. Worse, the revocation call does not honor leadership `stopCh` signals, so it will not stop when the leader loses leadership.	2020-05-21 07:38:27 -04:00
Tim Gross	72430a4e62	csi: don't pass volume claim releases thru GC eval (#8021 ) Following the new volumewatcher in #7794 and performance improvements to it that landed afterwards, there's no particular reason we should be threading claim releases through the GC eval rather than writing an empty `CSIVolumeClaimRequest` with the mode set to `CSIVolumeClaimRelease`, just as the GC evaluation would do. Also, by batching up these raft messages, we can reduce the amount of raft writes by 1 and cross-server RPCs by 1 per volume we release claims on.	2020-05-20 15:22:51 -04:00
Tim Gross	3902709c0a	csi: check for empty arguments on CSI endpoint (#8027 ) Some of the CSI RPC endpoints were missing validation that the ID or the Volume definition was present. This could result in nonsense `CSIVolume` structs being written to raft during registration. This changeset corrects that bug and adds validation checks to present nicer error messages to operators in some other cases.	2020-05-20 10:22:24 -04:00
Charlie Voiselle	70303c906c	Simplify comments Co-authored-by: Mahmood Ali <mahmood@hashicorp.com>	2020-05-19 15:05:24 -04:00
Charlie Voiselle	6976a7699e	Set Updated to true for all non-CAS requests	2020-05-19 12:59:39 -04:00
Mahmood Ali	406fce90c3	list all jobs on namespaces the token can access	2020-05-19 09:51:41 -04:00
Seth Hoenig	f6c8db8a8a	consul/connect: use task kind to get service name Fixes #8000 When requesting a Service Identity token from Consul, use the TaskKind of the Task to get at the service name associated with the task. In the past using the TaskName worked because it was generated as a sidecar task with a name that included the service. In the Native context, we need to get at the service name in a more correct way, i.e. using the TaskKind which is defined to include the service name.	2020-05-18 13:46:00 -06:00
Mahmood Ali	5ab2d52e27	endpoint to expose all jobs across all namespaces Allow a `/v1/jobs?all_namespaces=true` to list all jobs across all namespaces. The returned list is to contain a `Namespace` field indicating the job namespace. If ACL is enabled, the request token needs to be a management token or have `namespace:list-jobs` capability on all existing namespaces.	2020-05-18 13:50:46 -04:00
Tim Gross	2082cf738a	csi: support for VolumeContext and VolumeParameters (#7957 ) The MVP for CSI in the 0.11.0 release of Nomad did not include support for opaque volume parameters or volume context. This changeset adds support for both. This also moves args for ControllerValidateCapabilities into a struct. The CSI plugin `ControllerValidateCapabilities` struct that we turn into a CSI RPC is accumulating arguments, so moving it into a request struct will reduce the churn of this internal API, make the plugin code more readable, and make this method consistent with the other plugin methods in that package.	2020-05-15 08:16:01 -04:00
Mahmood Ali	b385a5d063	Merge pull request #7959 from hashicorp/b-deleted-vault-accessors vault: ensure that token revocation is idempotent	2020-05-14 12:39:06 -04:00
Mahmood Ali	077342c528	vault: ensure that token revocation is idempotent This ensures that token revocation is idempotent and can handle when tokens are revoked out of band. Idempotency is important to handle some transient failures and retries. Consider when a single token of a batch fails to be revoked, nomad would retry revoking the entire batch; tokens already revoked should be gracefully handled, otherwise, nomad may retry revoking the same tokens forever.	2020-05-14 11:30:32 -04:00
Mahmood Ali	6ac166e1aa	vault: failing test for repeated revocation	2020-05-14 11:30:29 -04:00
Lang Martin	d3c4700cd3	server: stop after client disconnect (#7939 ) * jobspec, api: add stop_after_client_disconnect * nomad/state/state_store: error message typo * structs: alloc methods to support stop_after_client_disconnect 1. a global AllocStates to track status changes with timestamps. We need this to track the time at which the alloc became lost originally. 2. ShouldClientStop() and WaitClientStop() to actually do the math * scheduler/reconcile_util: delayByStopAfterClientDisconnect * scheduler/reconcile: use delayByStopAfterClientDisconnect * scheduler/util: updateNonTerminalAllocsToLost comments This was setup to only update allocs to lost if the DesiredStatus had already been set by the scheduler. It seems like the intention was to update the status from any non-terminal state, and not all lost allocs have been marked stop or evict by now * scheduler/testing: AssertEvalStatus just use require * scheduler/generic_sched: don't create a blocked eval if delayed * scheduler/generic_sched_test: several scheduling cases	2020-05-13 16:39:04 -04:00
Mahmood Ali	3b4116e0db	Merge pull request #7894 from hashicorp/b-cronexpr-dst-fix Fix Daylight saving transition handling	2020-05-12 16:36:11 -04:00
Tim Gross	4374c1a837	csi: support Secrets parameter in CSI RPCs (#7923 ) CSI plugins can require credentials for some publishing and unpublishing workflow RPCs. Secrets are configured at the time of volume registration, stored in the volume struct, and then passed around as an opaque map by Nomad to the plugins.	2020-05-11 17:12:51 -04:00
Mahmood Ali	938e916d9c	When serializing msgpack, only consider codec tag When serializing structs with msgpack, only consider type tags of `codec`. Hashicorp/go-msgpack (based on ugorji/go) defaults to interpreting the `codec` tag if it's available, but falls back to using `json` if `codec` isn't present. This behavior is surprising in cases where we want to serialize json differently from msgpack, e.g. serializing `ConsulExposeConfig`.	2020-05-11 14:14:10 -04:00
Mahmood Ali	b4fa8e9588	codec: we use hashicorp/go-msgpack exclusively No need to maintain two msgpack handles!	2020-05-11 14:05:29 -04:00
Tim Gross	6554e9ee37	csi: log fallthrough on invalid node IDs for client RPC (#7918 ) When a CSI client RPC is given a specific node for a controller but the lookup fails (because the node is gone or is an older version), we fallthrough to select a node from all those available. This adds logging to this case to aid in diagnostics.	2020-05-11 12:26:10 -04:00
Tim Gross	1ec41b6770	volumewatcher: stop watcher goroutines when there's no work (#7909 ) The watcher goroutines will be automatically started if a volume has updates, but when idle we shouldn't keep a goroutine running and taking up memory.	2020-05-11 09:32:05 -04:00
Mahmood Ali	061a439f2c	Merge pull request #7912 from hashicorp/f-scheduler-algorithm-followup Scheduler Algorithm Defaults handling and docs	2020-05-11 09:30:58 -04:00
Mahmood Ali	0384543d05	Merge pull request #7913 from hashicorp/deflake-TestTaskTemplateManager_BlockedEvents Deflake TestTaskTemplateManager_BlockedEvents test	2020-05-11 09:30:44 -04:00
Mahmood Ali	dff0fcf2f3	Merge pull request #7914 from hashicorp/b-csi-fix-slice-initialization Fix slice initialization	2020-05-11 09:27:01 -04:00
Tim Gross	3aa761b151	Periodic GC for volume claims (#7881 ) This changeset implements a periodic garbage collection of CSI volumes with missing allocations. This can happen in a scenario where a node update fails partially and the allocation updates are written to raft but the evaluations to GC the volumes are dropped. This feature will cover this edge case and ensure that upgrades from 0.11.0 and 0.11.1 get any stray claims cleaned up.	2020-05-11 08:20:50 -04:00
James Rasell	aaf2fe033e	Merge pull request #7903 from hashicorp/b-gh-7902 api: validate scale count value is not negative.	2020-05-11 09:17:01 +02:00
Mahmood Ali	9fac6ea5d9	Fix slice initialization	2020-05-09 21:35:42 -04:00
Mahmood Ali	64de395df0	tests: ease debugging TestClientEndpoint_CreateNodeEvals TestClientEndpoint_CreateNodeEvals flakes a bit but its output is very confusing, as `structs.Evaluations` overrides GoString. Here, we emit the entire struct of the evaluation, and hopefully we'll figure out the problem the next time it happens	2020-05-09 16:04:32 -04:00
Mahmood Ali	ff5c3e81b0	avoid logging after a test completes	2020-05-09 14:40:00 -04:00
Mahmood Ali	2c963885b0	handle upgrade path and defaults Ensure that `""` Scheduler Algorithm gets explicitly set to binpack on upgrades or on API handling when user misses the value. The scheduler already treats `""` value as binpack. This PR merely ensures that the operator API returns the effective value.	2020-05-09 12:34:08 -04:00
Tim Gross	8373e917fc	volumewatcher: set maximum batch size for raft update (#7907 ) The `volumewatcher` has a 250ms batch window so claim updates will not typically be large enough to risk exceeding the maximum raft message size. But large jobs might have enough volume claims that this could be a danger. Set a maximum batch size of 100 messages per batch (roughly 33K), as a very conservative safety/robustness guard. Co-authored-by: Chris Baker <1675087+cgbaker@users.noreply.github.com>	2020-05-08 16:53:57 -04:00
James Rasell	55a2ad3854	api: validate scale count value is not negative. An operator could submit a scale request including a negative count value. This negative value caused the Nomad server to panic. The fix adds validation to the submitted count, returning an error to the caller if it is negative.	2020-05-08 16:51:40 +02:00
Mahmood Ali	57435950d7	Update current DST and some code style issues	2020-05-07 19:27:05 -04:00
Mahmood Ali	c8fb132956	Update cronexpr to point to hashicorp/cronexpr	2020-05-07 17:50:45 -04:00
Mahmood Ali	507c0b8f64	tests for periodic job scheduling and DST	2020-05-07 17:36:59 -04:00
Tim Gross	42f9d517d8	CSI volumewatcher testability improvements (#7889 ) * volumewatcher: remove redundant log fields The constructor for `volumeWatcher` already sets a `logger.With` that includes the volume ID and namespace fields. Remove them from the various trace logs. * volumewatcher: advance state for controller already released One way of bypassing client RPCs in testing is to set a claim status to controller-detached, but this results in an incorrect claim state when we checkpoint.	2020-05-07 15:57:24 -04:00
Tim Gross	801ebcfe8d	periodic GC for CSI plugins (#7878 ) This changeset implements a periodic garbage collection of unused CSI plugins. Plugins are self-cleaning when the last allocation for a plugin is stopped, but this feature will cover any missing edge cases and ensure that upgrades from 0.11.0 and 0.11.1 get any stray plugins cleaned up.	2020-05-06 16:49:12 -04:00
Tim Gross	00c9bd7ff0	reorder volume claim batch request raft message (#7871 ) For backwards compatibility during upgrades, new raft message types need to come at the end of the enum.	2020-05-06 08:57:51 -04:00
Tim Gross	ce86a594a6	csi: fix plugin counts on node update (#7844 ) In this changeset: * If a Nomad client node is running both a controller and a node plugin (which is a common case), then if only the controller or the node is removed, the plugin was not being updated with the correct counts. * The existing test for plugin cleanup didn't go back to the state store, which normally is ok but is complicated in this case by denormalization which changes the behavior. This commit makes the test more comprehensive. * Set "controller required" when plugin has `PUBLISH_READONLY`. All known controllers that support `PUBLISH_READONLY` also support `PUBLISH_UNPUBLISH_VOLUME` but we shouldn't assume this. * Only create plugins when the allocs for those plugins are healthy. If we allow a plugin to be created for the first time when the alloc is not healthy, then we'll recreate deleted plugins when the job's allocs all get marked terminal. * Terminal plugin alloc updates should cleanup the plugin. The client fingerprint can't tell if the plugin is unhealthy intentionally (for the case of updates or job stop). Allocations that are server-terminal should delete themselves from the plugin and trigger a plugin self-GC, the same as an unused node.	2020-05-05 15:39:57 -04:00
Tim Gross	22e3815e8c	docstring improvements and typo fixes (#7862 )	2020-05-05 10:30:50 -04:00
Tim Gross	1c6dcab56b	volumewatcher: remove spurious nil-check (#7858 ) The nil-check here is left-over from an earlier approach that didn't get merged. It doesn't do anything for us now as we can't ever pass it `nil` and if we leave it in the `getVolume` call it guards will panic anyways.	2020-05-04 12:28:32 -04:00
Mahmood Ali	78ae7b885a	Merge pull request #7810 from hashicorp/spread-configuration spread scheduling algorithm	2020-05-01 13:15:19 -04:00
Mahmood Ali	3da74068dd	changelog and fix typo	2020-05-01 13:14:20 -04:00
Mahmood Ali	b9e3cde865	tests and some clean up	2020-05-01 13:13:30 -04:00
Charlie Voiselle	d8e5e02398	Wiring algorithm to scheduler calls	2020-05-01 13:13:29 -04:00
Charlie Voiselle	663fb677cf	Add SchedulerAlgorithm to SchedulerConfig	2020-05-01 13:13:29 -04:00
Lang Martin	28bac139cb	client/heartbeatstop: destroy allocs when disconnected from servers - track lastHeartbeat, the client local time of the last successful heartbeat round trip - track allocations with `stop_after_client_disconnect` configured - trigger allocation destroy (which handles cleanup) - restore heartbeat/killable allocs tracking when allocs are recovered from disk - on client restart, stop those allocs after a grace period if the servers are still partitioned	2020-05-01 12:35:49 -04:00
Michael Schurter	c901d0e7dd	Merge branch 'master' into b-reserved-scoring	2020-04-30 14:48:14 -07:00
Tim Gross	52e805a6a6	csi: ensure Read/WriteAllocs aren't released early (#7841 ) We should only remove the `ReadAllocs`/`WriteAllocs` values for a volume after the claim has entered the "ready to free" state. The volume will eventually be released as expected. But querying the volume API will show the volume is released before the controller unpublish has finished and this can cause a race with starting new jobs. Test updates are to cover cases where we're dropping claims but not running through the whole reaping process.	2020-04-30 17:11:31 -04:00
Tim Gross	a7a64443e1	csi: move volume claim release into volumewatcher (#7794 ) This changeset adds a subsystem to run on the leader, similar to the deployment watcher or node drainer. The `Watcher` performs a blocking query on updates to the `CSIVolumes` table and triggers reaping of volume claims. This will avoid tying up scheduling workers by immediately sending volume claim workloads into their own loop, rather than blocking the scheduling workers in the core GC job doing things like talking to CSI controllers. The volume watcher is enabled on leader step-up and disabled on leader step-down. The volume claim GC mechanism now makes an empty claim RPC for the volume to trigger an index bump. That in turn unblocks the blocking query in the volume watcher so it can assess which claims can be released for a volume.	2020-04-30 09:13:00 -04:00
Tim Gross	e34f099d20	csi: read-repair CSI volume claims (#7824 ) The `CSIVolumeClaim` fields were added after 0.11.1, so claims made before that may be missing the value. Repair this when we read the volume out of the state store. The `NodeID` field was added after 0.11.0, so we need to ensure it's been populated during upgrades from 0.11.0.	2020-04-29 11:57:19 -04:00
Mahmood Ali	18f16cfb12	Merge pull request #7818 from greut/codegen structs: give codecgen import	2020-04-28 12:16:41 -04:00
Chris Baker	315bcf1060	Merge pull request #7816 from hashicorp/b-7789-job-scaling-status-issues fix issues in Job.ScaleStatus	2020-04-28 06:33:42 -05:00
Yoan Blanc	5ca31f23e5	structs: give codecgen import Signed-off-by: Yoan Blanc <yoan@dosimple.ch>	2020-04-28 08:23:20 +02:00
Nick Ethier	4b810b697a	nomad: build dynamic port for exposed checks if not specified (#7800 )	2020-04-28 00:07:41 -04:00
Chris Baker	73f1390316	modified Job.ScaleStatus to ignore deployments and look directly at the allocations, ignoring canaries	2020-04-27 21:45:39 +00:00
Tim Gross	083b35d651	csi: checkpoint volume claim garbage collection (#7782 ) Adds a `CSIVolumeClaim` type to be tracked as current and past claims on a volume. Allows for a client RPC failure during node or controller detachment without having to keep the allocation around after the first garbage collection eval. This changeset lays groundwork for moving the actual detachment RPCs into a volume watching loop outside the GC eval.	2020-04-23 11:06:23 -04:00
Chris Baker	09d980be2b	modify state store so that autoscaling policies are deleted from their table as job is stopped (and recreated when job is started)	2020-04-21 23:01:26 +00:00
Tim Gross	bd74b593d0	csi: nil-check allocs for VolumeDenormalize and claim methods (#7760 )	2020-04-21 08:32:24 -04:00
Michael Dwan	ba70c54340	fix panic while deleting CSI plugins for missing job (#7758 )	2020-04-20 17:13:33 -04:00
Seth Hoenig	40e0f8a346	Merge pull request #7690 from hashicorp/b-inspect-proxy-output two fixes for inspect on connect proxy	2020-04-20 10:17:54 -06:00
Anthony Scalisi	9664c6b270	fix spelling errors (#6985 )	2020-04-20 09:28:19 -04:00
Jorge Marey	30b877c63a	Fix get all vault token policies	2020-04-16 16:38:24 +02:00
Michael Schurter	4c5a0cae35	core: fix node reservation scoring The BinPackIter accounted for node reservations twice when scoring nodes which could bias scores toward nodes with reservations. Pseudo-code for previous algorithm: ``` proposed = reservedResources + sum(allocsResources) available = nodeResources - reservedResources score = 1 - (proposed / available) ``` The node's reserved resources are added to the total resources used by allocations, and then the node's reserved resources are later subtracted from the node's overall resources. The new algorithm is: ``` proposed = sum(allocResources) available = nodeResources - reservedResources score = 1 - (proposed / available) ``` The node's reserved resources are no longer added to the total resources used by allocations. My guess as to how this bug happened is that the resource utilization variable (`util`) is calculated and returned by the `AllocsFit` function which needs to take reserved resources into account as a basic feasibility check. To avoid re-calculating alloc resource usage (because there may be a large number of allocs), we reused `util` in the `ScoreFit` function. `ScoreFit` properly accounts for reserved resources by subtracting them from the node's overall resources. However since `util` _also_ took reserved resources into account the score would be incorrect. Prior to the fix the added test output: ``` Node: reserved Score: 1.0000 Node: reserved2 Score: 1.0000 Node: no-reserved Score: 0.9741 ``` The scores being 1.0 for both nodes with reserved resources is a good hint something is wrong as they should receive different scores. Upon further inspection the double accounting of reserved resources caused their scores to be >1.0 and clamped. After the fix the added test outputs: ``` Node: no-reserved Score: 0.9741 Node: reserved Score: 0.9480 Node: reserved2 Score: 0.8717 ```	2020-04-15 15:13:30 -07:00
Seth Hoenig	d5ad580d5c	structs: fix compatibility between api and nomad/structs proxy definitions The field names within the structs representing the Connect proxy definition were not the same (nomad/structs/ vs api/), causing the values to be lost in translation for the 'nomad job inspect' command. Since the field names already shipped in v0.11.0 we cannot simply fix the names. Instead, use the json struct tag on the structs/ structs to remap the name to match the publicly exposed api/ package on json encoding. This means existing jobs from v0.11.0 will continue to work, and the JSON API for job submission will remain backwards compatible.	2020-04-13 15:59:45 -06:00
Tim Gross	4e9bd1e1d1	refactor: consolidate private methods for CSI RPC (#7702 ) Follow-up for a method missed in the refactor for #7688. The `volAndPluginLookup` method is only ever called from the server's `CSI` RPC and never the `ClientCSI` RPC, so move it into that scope.	2020-04-13 10:46:43 -04:00
Tim Gross	f37e986b1b	refactor: make nodeForControllerPlugin private to ClientCSI (#7688 ) The current design of `ClientCSI` RPC requires that callers in the server know about the free-standing `nodeForControllerPlugin` function. This makes it difficult to send `ClientCSI` RPC messages from subpackages of `nomad` and adds a bunch of boilerplate to every server-side caller of a controller RPC. This changeset makes it so that the `ClientCSI` RPCs will populate and validate the controller's client node ID if it hasn't been passed by the caller, centralizing the logic of picking and validating controller targets into the `nomad.ClientCSI` struct.	2020-04-10 16:47:21 -04:00
Seth Hoenig	20802da8fd	connect: correctly deal with nil sidecar_service task stanza Before, if the sidecar_service stanza of a connect enabled service was missing, the job submission would cause a panic in the nomad agent. Since the panic was happening in the API handler the agent itself continued running, but this change will handle the condition more gracefully. By fixing the `Copy` method, the API handler now returns the proper error. $ nomad job run foo.nomad Error submitting job: Unexpected response code: 500 (1 error occurred: * Task group api validation failed: 2 errors occurred: * Missing tasks for task group * Task group service validation failed: 1 error occurred: * Service[0] count-api validation failed: 1 error occurred: * Consul Connect must be native or use a sidecar service	2020-04-09 20:28:17 -06:00
Drew Bailey	4ab7c03641	Merge pull request #7618 from hashicorp/b-shutdown-delay-updates Fixes bug that prevented group shutdown_delay updates	2020-04-06 13:05:20 -04:00
Drew Bailey	0d4bb6bf92	guard against nil maps	2020-04-06 12:25:50 -04:00
Drew Bailey	3b8afce9e6	test added and removed	2020-04-06 11:53:46 -04:00
Drew Bailey	9874e7b21d	Group shutdown delay fixes Group shutdown delay updates were not properly handled in Update hook. This commit also ensures that plan output is displayed.	2020-04-06 11:29:12 -04:00
Tim Gross	73dc2ad443	e2e/csi: add waiting for alloc stop	2020-04-06 10:15:55 -04:00
Tim Gross	027277a0d9	csi: make volume GC in job deregister safely async The `Job.Deregister` call will block on the client CSI controller RPCs while the alloc still exists on the Nomad client node. So we need to make the volume claim reaping async from the `Job.Deregister`. This allows `nomad job stop` to return immediately. In order to make this work, this changeset changes the volume GC so that the GC jobs are on a by-volume basis rather than a by-job basis; we won't have to query the (possibly deleted) job at the time of volume GC. We smuggle the volume ID and whether it's a purge into the GC eval ID the same way we smuggled the job ID previously.	2020-04-06 10:15:55 -04:00
Tim Gross	5a3b45864d	csi: fix unpublish workflow ID mismatches The CSI plugins use the external volume ID for all operations, but the Client CSI RPCs use the Nomad volume ID (human-friendly) for the mount paths. Pass the External ID as an arg in the RPC call so that the unpublish workflows have it without calling back to the server to find the external ID. The controller CSI plugins need the CSI node ID (or in other words, the storage provider's view of node ID like the EC2 instance ID), not the Nomad node ID, to determine how to detach the external volume.	2020-04-06 10:15:55 -04:00
Lang Martin	1750426d04	csi: run volume claim GC on `job stop -purge` (#7615 ) * nomad/state/state_store: error message copy/paste error * nomad/structs/structs: add a VolumeEval to the JobDeregisterResponse * nomad/job_endpoint: synchronously, volumeClaimReap on job Deregister * nomad/core_sched: make volumeClaimReap available without a CoreSched * nomad/job_endpoint: Deregister return early if the job is missing * nomad/job_endpoint_test: job Deregistration is idempotent * nomad/core_sched: conditionally ignore alloc status in volumeClaimReap * nomad/job_endpoint: volumeClaimReap all allocations, even running * nomad/core_sched_test: extra argument to collectClaimsToGCImpl * nomad/job_endpoint: job deregistration is not idempotent	2020-04-03 17:37:26 -04:00
Mahmood Ali	816a93ed4a	tests: deflake TestAutopilot_RollingUpdate I hypothesize that the flakiness in rolling update is due to shutting down s3 server before s4 is properly added as a voter. The chain of the flakiness is as follows: 1. Bootstrap with s1, s2, s3 2. Add s4 3. Wait for servers to register with 3 voting peers * But we already have 3 voters (s1, s2, and s3) * s4 is added as a non-voter in Raft v3 and must wait until autopilot promotes it 4. Test proceeds without s4 being a voter 5. s3 shutdown 6. cluster changes stall due to leader election and too many pending configuration changes (e.g. removing s3 from raft, promoting s4). Here, I have the test wait until s4 is marked as a voter before shutting down s3, so we don't have too many configuration changes at once. In https://circleci.com/gh/hashicorp/nomad/57092, I noticed the following events: ``` TestAutopilot_RollingUpdate: autopilot_test.go:204: adding server s4 TestAutopilot_RollingUpdate: testlog.go:34: 2020-04-03T20:08:19.789Z [INFO] nomad/serf.go:60: nomad: adding server: server="nomad-137.global (Addr: 127.0.0.1:9177) (DC: dc1)" TestAutopilot_RollingUpdate: testlog.go:34: 2020-04-03T20:08:19.789Z [INFO] raft/raft.go:1018: nomad.raft: updating configuration: command=AddNonvoter server-id=c54b5bf4-1159-34f6-032d-56aefeb08425 server-addr=127.0.0.1:9177 servers="[{Suffrage:Voter ID:df01ba65-d1b2-17a9-f792-a4459b3a7c09 Address:127.0.0.1:9171} {Suffrage:Voter ID:c3337778-811e-2675-87f5-006309888387 Address:127.0.0.1:9173} {Suffrage:Voter ID:186d5e15-c473-e2b3-b5a4-3259a84e10ef Address:127.0.0.1:9169} {Suffrage:Nonvoter ID:c54b5bf4-1159-34f6-032d-56aefeb08425 Address:127.0.0.1:9177}]" TestAutopilot_RollingUpdate: autopilot_test.go:218: shutting down server s3 TestAutopilot_RollingUpdate: testlog.go:34: 2020-04-03T20:08:19.797Z [INFO] raft/replication.go:456: nomad.raft: aborting pipeline replication: peer="{Nonvoter c54b5bf4-1159-34f6-032d-56aefeb08425 127.0.0.1:9177}" TestAutopilot_RollingUpdate: 
autopilot_test.go:235: waiting for s4 to stabalize and be promoted TestAutopilot_RollingUpdate: testlog.go:34: 2020-04-03T20:08:19.975Z [ERROR] raft/raft.go:1656: nomad.raft: failed to make requestVote RPC: target="{Voter c3337778-811e-2675-87f5-006309888387 127.0.0.1:9173}" error="dial tcp 127.0.0.1:9173: connect: connection refused" TestAutopilot_RollingUpdate: retry.go:121: autopilot_test.go:241: don't want "c3337778-811e-2675-87f5-006309888387" autopilot_test.go:241: didn't find map[c54b5bf4-1159-34f6-032d-56aefeb08425:true] in []raft.ServerID{"df01ba65-d1b2-17a9-f792-a4459b3a7c09", "186d5e15-c473-e2b3-b5a4-3259a84e10ef"} ``` Note how s3, c3337778, is present in the peers list in the final failure, but s4, c54b5bf4, is added as a Nonvoter and isn't present in the final peers list.	2020-04-03 17:15:41 -04:00
Mahmood Ali	5587dc58c0	Use lowercase for hcl keys This is not a change in behavior; hcl key matching is case insensitive as demonstrated in `command.agent/TestConfig_Parse`	2020-04-03 07:56:00 -04:00
Tim Gross	f6b3d38eb8	CSI: move node unmount to server-driven RPCs (#7596 ) If a volume-claiming alloc stops and the CSI Node plugin that serves that alloc's volumes is missing, there's no way for the allocrunner hook to send the `NodeUnpublish` and `NodeUnstage` RPCs. This changeset addresses this issue with a redesign of the client-side for CSI. Rather than unmounting in the alloc runner hook, the alloc runner hook will simply exit. When the server gets the `Node.UpdateAlloc` for the terminal allocation that had a volume claim, it creates a volume claim GC job. This job will make client RPCs to a new node plugin RPC endpoint, and only once that succeeds, move on to making the client RPCs to the controller plugin. If the node plugin is unavailable, the GC job will fail and be requeued.	2020-04-02 16:04:56 -04:00
Nick Ethier	3557099f4c	Merge pull request #7594 from hashicorp/f-connect-lifecycle connect: set task lifecycle config for injected sidecar task	2020-04-02 12:51:01 -04:00
Lang Martin	24449e23af	csi: volume validate namespace (#7587 ) * nomad/state/state_store: enforce that the volume namespace exists * nomad/csi_endpoint_test: a couple of broken namespaces now * nomad/csi_endpoint_test: one more test * nomad/node_endpoint_test: use structs.DefaultNamespace * nomad/state/state_store_test: use DefaultNamespace	2020-04-02 10:13:41 -04:00
Nick Ethier	90b5d2b13f	lint: gofmt	2020-04-01 21:23:47 -04:00
Nick Ethier	92f8bfc729	connect: set task lifecycle config for injected sidecar task fixes #7593	2020-04-01 21:19:41 -04:00
Chris Baker	c3ab837d9e	job_endpoint: fixed bad test	2020-04-01 18:11:58 +00:00
Chris Baker	285728f3fa	Merge branch 'f-7422-scaling-events' of github.com:hashicorp/nomad into f-7422-scaling-events	2020-04-01 17:28:50 +00:00
Chris Baker	8ec252e627	added indices to the job scaling events, so we could properly do blocking queries on the job scaling status	2020-04-01 17:28:19 +00:00
Chris Baker	4ac36b7c89	Update nomad/state/state_store.go Co-Authored-By: Drew Bailey <2614075+drewbailey@users.noreply.github.com>	2020-04-01 11:56:12 -05:00
Chris Baker	eb19fe16d2	Update nomad/state/state_store.go Co-Authored-By: Drew Bailey <2614075+drewbailey@users.noreply.github.com>	2020-04-01 11:56:01 -05:00
Chris Baker	6dbfb36e14	Update nomad/job_endpoint.go Co-Authored-By: Drew Bailey <2614075+drewbailey@users.noreply.github.com>	2020-04-01 11:55:55 -05:00
Chris Baker	b2ab42afbb	scaling api: more testing around the scaling events api	2020-04-01 16:39:23 +00:00
Chris Baker	40d6b3bbd1	adding raft and state_store support to track job scaling events updated ScalingEvent API to record "message string,error bool" instead of confusing "reason,error *string"	2020-04-01 16:15:14 +00:00
Mahmood Ali	37c0dbcfe6	fix codegen for ugorji/go When generating ugorji/go package, we should use github.com/hashicorp/go-msgpack/codec instead. Also fix the reference for codegen_generated	2020-03-31 21:30:21 -04:00
Seth Hoenig	9880e798bf	docs: note why check.Expose is not part of check.Hash	2020-03-31 17:15:50 -06:00
Seth Hoenig	14c7cebdea	connect: enable automatic expose paths for individual group service checks Part of #6120 Building on the support for enabling connect proxy paths in #7323, this change adds the ability to configure the 'service.check.expose' flag on group-level service check definitions for services that are connect-enabled. This is a slight deviation from the "magic" that Consul provides. With Consul, the 'expose' flag exists on the connect.proxy stanza, which will then auto-generate expose paths for every HTTP and gRPC service check associated with that connect-enabled service. A first attempt at providing similar magic for Nomad's Consul Connect integration followed that pattern exactly, as seen in #7396. However, on reviewing the PR we realized having the `expose` flag on the proxy stanza inseparably ties together the automatic path generation with every HTTP/gRPC check defined on the service. This makes sense in Consul's context, because a service definition is reasonably associated with a single "task". With Nomad's group level service definitions however, there is a reasonable expectation that a service definition is more abstractly representative of multiple services within the task group. In this case, one would want to define checks of that service which concretely make HTTP or gRPC requests to different underlying tasks. Such a model is not possible with the coarse `proxy.expose` flag. Instead, we now have the flag made available within the check definitions themselves. By making the expose feature specific to each check, it is possible to have some HTTP/gRPC checks which make use of the envoy exposed paths, as well as some HTTP/gRPC checks which make use of some orthogonal port-mapping to do checks on some other task (or even some other bound port of the same task) within the task group. 
Given this example, group "server-group" { network { mode = "bridge" port "forchecks" { to = -1 } } service { name = "myserver" port = 2000 connect { sidecar_service { } } check { name = "mycheck-myserver" type = "http" port = "forchecks" interval = "3s" timeout = "2s" method = "GET" path = "/classic/responder/health" expose = true } } } Nomad will automatically inject (via job endpoint mutator) the extrapolated expose path configuration, i.e. expose { path { path = "/classic/responder/health" protocol = "http" local_path_port = 2000 listener_port = "forchecks" } } Documentation is coming in #7440 (needs updating, doing next) Modifications to the `countdash` examples in https://github.com/hashicorp/demo-consul-101/pull/6 which will make the examples in the documentation actually runnable. Will add some e2e tests based on the above when it becomes available.	2020-03-31 17:15:50 -06:00
Seth Hoenig	0266f056b8	connect: enable proxy.passthrough configuration Enable configuration of HTTP and gRPC endpoints which should be exposed by the Connect sidecar proxy. This changeset is the first "non-magical" pass that lays the groundwork for enabling Consul service checks for tasks running in a network namespace because they are Connect-enabled. The changes here provide for full configuration of the connect { sidecar_service { proxy { expose { paths = [{ path = <exposed endpoint> protocol = <http or grpc> local_path_port = <local endpoint port> listener_port = <inbound mesh port> }, ... ] } } } stanza. Everything from `expose` and below is new, and partially implements the precedent set by Consul: https://www.consul.io/docs/connect/registration/service-registration.html#expose-paths-configuration-reference Combined with a task-group level network port-mapping in the form: port "exposeExample" { to = -1 } it is now possible to "punch a hole" through the network namespace to a specific HTTP or gRPC path, with the anticipated use case of creating Consul checks on Connect enabled services. A future PR may introduce more automagic behavior, where we can do things like 1) auto-fill the 'expose.path.local_path_port' with the default value of the 'service.port' value for task-group level connect-enabled services. 2) automatically generate a port-mapping 3) enable an 'expose.checks' flag which automatically creates exposed endpoints for every compatible consul service check (http/grpc checks on connect enabled services).	2020-03-31 17:15:27 -06:00
Lang Martin	e03c328792	csi: use node MaxVolumes during scheduling (#7565 ) * nomad/state/state_store: CSIVolumesByNodeID ignores namespace * scheduler/scheduler: add CSIVolumesByNodeID to the state interface * scheduler/feasible: check node MaxVolumes * nomad/csi_endpoint: no namespace in CSIVolumesByNodeID anymore * nomad/state/state_store: avoid DenormalizeAllocationSlice * nomad/state/iterator: clean up SliceIterator Next * scheduler/feasible_test: block with MaxVolumes * nomad/state/state_store_test: fix args to CSIVolumesByNodeID	2020-03-31 17:16:47 -04:00
Lang Martin	8d4f39fba1	csi: add node events to report progress mounting and unmounting volumes (#7547 ) * nomad/structs/structs: new NodeEventSubsystemCSI * client/client: pass triggerNodeEvent in the CSIConfig * client/pluginmanager/csimanager/instance: add eventer to instanceManager * client/pluginmanager/csimanager/manager: pass triggerNodeEvent * client/pluginmanager/csimanager/volume: node event on [un]mount * nomad/structs/structs: use storage, not CSI * client/pluginmanager/csimanager/volume: use storage, not CSI * client/pluginmanager/csimanager/volume_test: eventer * client/pluginmanager/csimanager/volume: event on error * client/pluginmanager/csimanager/volume_test: check event on error * command/node_status: remove an extra space in event detail format * client/pluginmanager/csimanager/volume: use snake_case for details * client/pluginmanager/csimanager/volume_test: snake_case details	2020-03-31 17:13:52 -04:00
Yoan Blanc	225c9c1215	fixup! vendor: explicit use of hashicorp/go-msgpack Signed-off-by: Yoan Blanc <yoan@dosimple.ch>	2020-03-31 09:48:07 -04:00
Yoan Blanc	761d014071	vendor: explicit use of hashicorp/go-msgpack Signed-off-by: Yoan Blanc <yoan@dosimple.ch>	2020-03-31 09:45:21 -04:00
Michael Schurter	464dae514c	test: assert HostVolumes included in ListNodes	2020-03-30 17:34:44 -07:00
Michael Lange	4707a625d6	Add HostVolumes to the NodeListStub	2020-03-30 17:33:43 -07:00
Seth Hoenig	b3664c628c	Merge pull request #7524 from hashicorp/docs-consul-acl-minimums consul: annotate Consul interfaces with ACLs	2020-03-30 13:27:27 -06:00
Seth Hoenig	0a812ab689	consul: annotate Consul interfaces with ACLs	2020-03-30 10:17:28 -06:00
Tim Gross	54b3573fc9	state: support snapshot of CSI plugin and volume tables (#7546 ) The `csi_plugins` and `csi_volumes` tables were missing support for snapshot persist and restore. This means restoring a snapshot would result in missing information for CSI.	2020-03-30 11:17:16 -04:00
Drew Bailey	a98dc8c768	update audit examples to an endpoint that is audited	2020-03-30 10:03:11 -04:00
Mahmood Ali	e76ff9f679	Merge pull request #7543 from hashicorp/test-flakiness-20200330_1 Test flakiness fixes - 2020-03-30 Edition	2020-03-30 09:26:26 -04:00
Mahmood Ali	57bebfdb5c	tests: avoid logging after test completion	2020-03-30 09:08:34 -04:00
Mahmood Ali	13381448e0	avoid logging in draining job watcher In tests where the logger is a test logger, emitting a trace log in a background thread while it's shutting down may trigger a panic. Thus avoid logging Trace if err != nil. Note that we already log an error when err isn't a trace. This fixes cases where tests panic with a trace like: ``` panic: Log in goroutine after TestAllocGarbageCollector_MakeRoomFor_MaxAllocs has completed goroutine 30 [running]: testing.(common).logDepth(0xc000aa9e60, 0xc000c4a000, 0xab, 0x3) /usr/local/Cellar/go/1.14/libexec/src/testing/testing.go:680 +0x4d3 testing.(common).log(...) /usr/local/Cellar/go/1.14/libexec/src/testing/testing.go:662 testing.(common).Logf(0xc000aa9e60, 0x690b941, 0x4, 0xc001366c00, 0x2, 0x2) /usr/local/Cellar/go/1.14/libexec/src/testing/testing.go:701 +0x7e github.com/hashicorp/nomad/helper/testlog.(writer).Write(0xc000a82a60, 0xc0000b48c0, 0xab, 0x13f, 0x0, 0x0, 0x0) /Users/notnoop/go/src/github.com/hashicorp/nomad/helper/testlog/testlog.go:34 +0x106 github.com/hashicorp/nomad/vendor/github.com/hashicorp/go-hclog.(writer).Flush(0xc000a80900, 0xbf9870f000000001, 0x20a87556e, 0x8b12bc0) /Users/notnoop/go/src/github.com/hashicorp/nomad/vendor/github.com/hashicorp/go-hclog/writer.go:29 +0x14f github.com/hashicorp/nomad/vendor/github.com/hashicorp/go-hclog.(intLogger).log(0xc000e2c180, 0xc0003b6880, 0x17, 0x1, 0x6974edc, 0x22, 0xc000db57a0, 0x6, 0x6) /Users/notnoop/go/src/github.com/hashicorp/nomad/vendor/github.com/hashicorp/go-hclog/intlogger.go:139 +0x15d github.com/hashicorp/nomad/vendor/github.com/hashicorp/go-hclog.(intLogger).Trace(0xc000e2c180, 0x6974edc, 0x22, 0xc000db57a0, 0x6, 0x6) /Users/notnoop/go/src/github.com/hashicorp/nomad/vendor/github.com/hashicorp/go-hclog/intlogger.go:446 +0x7a github.com/hashicorp/nomad/vendor/github.com/hashicorp/go-hclog.(interceptLogger).Trace(0xc0002f1ad0, 0x6974edc, 0x22, 0xc000db57a0, 0x6, 0x6) 
/Users/notnoop/go/src/github.com/hashicorp/nomad/vendor/github.com/hashicorp/go-hclog/interceptlogger.go:48 +0x9c github.com/hashicorp/nomad/nomad/drainer.(*drainingJobWatcher).watch(0xc0002f2380) /Users/notnoop/go/src/github.com/hashicorp/nomad/nomad/drainer/watch_jobs.go:147 +0x1125 created by github.com/hashicorp/nomad/nomad/drainer.NewDrainingJobWatcher /Users/notnoop/go/src/github.com/hashicorp/nomad/nomad/drainer/watch_jobs.go:89 +0x1e3 FAIL github.com/hashicorp/nomad/client 10.605s FAIL ```	2020-03-30 07:06:53 -04:00
Mahmood Ali	36ad8ee2e0	tests: add debugging for TestAutopilot_RollingUpdate	2020-03-30 07:06:53 -04:00
Chris Baker	d6287c43b9	clean up some tests	2020-03-29 23:38:36 +00:00
Chris Baker	5e3c38be2f	state_store: * added method to retrieve all scaling policies for use in snapshotting, plus test * better testing for ScalingPoliciesByNamespace * added scaling policy snapshot persist and restore (and test of restore) manually tested snapshot restore. resolves #7539	2020-03-29 13:32:44 +00:00
Lang Martin	50ff9ccd44	csi: plugin deregistration on plugin job GC (#7502 ) * nomad/structs/csi: delete just one plugin type from a node * nomad/structs/csi: add DeleteAlloc * nomad/state/state_store: add deleteJobFromPlugin * nomad/state/state_store: use DeleteAlloc not DeleteNodeType * move CreateTestCSIPlugin to state to avoid an import cycle * nomad/state/state_store_test: delete a plugin by deleting its jobs * nomad/_test: move CreateTestCSIPlugin to state nomad/state/state_store: update one plugin per transaction * command/plugin_status_test: move CreateTestCSIPlugin * nomad: csi: handle nils CSIPlugin methods, clarity	2020-03-26 17:07:18 -04:00
Lang Martin	3375c92aa0	csi: make volume registration idempotent (#7490 ) If not in use and not changing external ids, it should not be an error to register a volume again. * nomad/state/state_store: make volume registration idempotent	2020-03-26 12:27:19 -04:00
Lang Martin	ea80330aaa	csi: nomad/structs: test volume denormalize without plugin (#7472 )	2020-03-26 09:43:59 -04:00
Mahmood Ali	b33dbe539b	tests: TestCSIPluginEndpoint_ACLNamespaceAlloc is ent TestCSIPluginEndpoint_ACLNamespaceAlloc uses namespace features not present in OSS.	2020-03-25 08:45:44 -04:00
Mahmood Ali	281fc9837c	tests: relax index checks TestStateStore_Indexes specifically tests for `nodes` index, but asserts on the exact number of indexes present in the state. This is fragile and will break almost every time we add a state index.	2020-03-25 08:45:38 -04:00
Mahmood Ali	ceed57b48f	per-task restart policy	2020-03-24 17:00:41 -04:00
Chris Baker	ffd79583f6	Merge pull request #7474 from hashicorp/f-scaling-changes-from-review more testing for scaling API	2020-03-24 15:32:10 -05:00
Chris Baker	c638c2c352	update RPC scaling endpoint tests to use renamed 'scale' policy disposition	2020-03-24 20:18:12 +00:00
Chris Baker	5979d6a81e	more testing for ScalingPolicy, mainly around parsing and canonicalization for Min/Max	2020-03-24 19:43:50 +00:00
Chris Baker	aa5beafe64	Job.Scale should not result in job update or eval create if args.Count == nil plus tests	2020-03-24 17:36:06 +00:00
Tim Gross	913da68296	csi: remove client from plugin on client node update (#7462 ) Plugins track the client nodes where they are placed. On client updates, remove the client from the plugin tracking if the client is no longer running an instance of that controller/node plugin. Extends the state store tests to ensure deregistration works as expected and that controllers and nodes are being tracked independently.	2020-03-24 13:26:31 -04:00
Chris Baker	9e530e167d	Merge pull request #7409 from hashicorp/scaling-api Scaling API changes	2020-03-24 11:02:09 -05:00
Chris Baker	606c79b320	add acl validation to Scaling.ListPolicies and Scaling.GetPolicy	2020-03-24 14:39:05 +00:00
Chris Baker	f6ec5f9624	made count optional during job scaling actions added ACL protection in Job.Scale in Job.Scale, only perform a Job.Register if the Count was non-nil	2020-03-24 14:39:05 +00:00
Chris Baker	41b002eecc	wip: ACL checking for RPC Job.ScaleStatus	2020-03-24 14:39:05 +00:00
Lang Martin	bd22afd003	csi: volume deregister fails for volumes actively in use (#7445 ) * nomad/structs/csi: add InUse to CSIVolume * nomad/state/state_store: block volume deregistration for in use vols	2020-03-24 10:10:44 -04:00
Chris Baker	233db5258a	changes to Canonicalize, Validate, and api->struct conversion so that tg.Count, tg.Scaling.Min/Max are well-defined with reasonable defaults. - tg.Count defaults to tg.Scaling.Min if present (falls back on previous default of 1 if Scaling is absent) - Validate() enforces tg.Scaling.Min <= tg.Count <= tg.Scaling.Max modification in ApiScalingPolicyToStructs, api.TaskGroup.Validate so that defaults are handled for TaskGroup.Count and	2020-03-24 13:57:17 +00:00
Chris Baker	f9876a487e	finished Job.ScaleStatus RPC, need to work on http endpoint	2020-03-24 13:57:16 +00:00
Chris Baker	925b59e1d2	wip: scaling status return, almost done	2020-03-24 13:57:15 +00:00
James Rasell	f125b5fb2d	scaling: ensure min and max int64s are in toplevel of block.	2020-03-24 13:57:15 +00:00
Chris Baker	42270d862c	wip: some tests still failing updating job scaling endpoints to match RFC, cleaning up the API object as well	2020-03-24 13:57:14 +00:00
Chris Baker	abc7a52f56	finished refactoring state store, schema, etc	2020-03-24 13:57:14 +00:00
Chris Baker	116aa98ed7	wip: removed some commented junk from scaling poc	2020-03-24 13:57:13 +00:00
Chris Baker	3d54f1feba	wip: added Enabled to ScalingPolicyListStub, removed JobID from body of scaling request	2020-03-24 13:57:12 +00:00
Chris Baker	024d203267	wip: added tests for client methods around group scaling	2020-03-24 13:57:11 +00:00
Chris Baker	179ab68258	wip: added job.scale rpc endpoint, needs explicit test (tested via http now)	2020-03-24 13:57:09 +00:00
Chris Baker	8453e667c2	wip: working on job group scaling endpoint	2020-03-24 13:55:20 +00:00
Chris Baker	6665d0bfb0	wip: added policy get endpoint, added UUID to policy	2020-03-24 13:55:20 +00:00
Chris Baker	9c2560ceeb	wip: upsert/delete scaling policies on job upsert/delete	2020-03-24 13:55:18 +00:00
Chris Baker	65d92f1fbf	WIP: adding ScalingPolicy to api/structs and state store	2020-03-24 13:55:18 +00:00
Tim Gross	fa01a6ea59	csi: fix missing health count from volume list stub	2020-03-24 09:42:59 -04:00
Lang Martin	0847cb513c	csi: volume/plugin list should return an empty array, not nil (#7443 ) * nomad/csi_endpoint: return an empty list, not nil * nomad/csi_endpoint_test: volume list returns non-nil	2020-03-23 21:21:40 -04:00
Lang Martin	d994990ef0	csi: the scheduler allows a job with a volume write claim to be updated (#7438 ) * nomad/structs/csi: split CanWrite into health, in use * scheduler/scheduler: expose AllocByID in the state interface * nomad/state/state_store_test * scheduler/stack: SetJobID on the matcher * scheduler/feasible: when a volume writer is in use, check if it's us * scheduler/feasible: remove SetJob * nomad/state/state_store: denormalize allocs before Claim * nomad/structs/csi: return errors on claim, with context * nomad/csi_endpoint_test: new alloc doesn't look like an update * nomad/state/state_store_test: change test reference to CanWrite	2020-03-23 21:21:04 -04:00
Tim Gross	076fbbf08f	Merge pull request #7012 from hashicorp/f-csi-volumes Container Storage Interface Support	2020-03-23 14:19:46 -04:00
Lang Martin	e100444740	csi: add mount_options to volumes and volume requests (#7398 ) Add mount_options to both the volume definition on registration and to the volume block in the group where the volume is requested. If both are specified, the options provided in the request replace the options defined in the volume. They get passed to the NodePublishVolume, which causes the node plugin to actually mount the volume on the host. Individual tasks just mount bind into the host mounted volume (unchanged behavior). An operator can mount the same volume with different options by specifying it twice in the group context. closes #7007 * nomad/structs/volumes: add MountOptions to volume request * jobspec/test-fixtures/basic.hcl: add mount_options to volume block * jobspec/parse_test: add expected MountOptions * api/tasks: add mount_options * jobspec/parse_group: use hcl decode not mapstructure, mount_options * client/allocrunner/csi_hook: pass MountOptions through client/allocrunner/csi_hook: add a VolumeMountOptions client/allocrunner/csi_hook: drop Options client/allocrunner/csi_hook: use the structs options * client/pluginmanager/csimanager/interface: UsageOptions.MountOptions * client/pluginmanager/csimanager/volume: pass MountOptions in capabilities * plugins/csi/plugin: remove todo 7007 comment * nomad/structs/csi: MountOptions * api/csi: add options to the api for parsing, match structs * plugins/csi/plugin: move VolumeMountOptions to structs * api/csi: use specific type for mount_options * client/allocrunner/csi_hook: merge MountOptions here * rename CSIOptions to CSIMountOptions * client/allocrunner/csi_hook * client/pluginmanager/csimanager/volume * nomad/structs/csi * plugins/csi/fake/client: add PrevVolumeCapability * plugins/csi/plugin * client/pluginmanager/csimanager/volume_test: remove debugging * client/pluginmanager/csimanager/volume: fix odd merging logic * api: rename CSIOptions -> CSIMountOptions * nomad/csi_endpoint: remove a 7007 comment * 
command/alloc_status: show mount options in the volume list * nomad/structs/csi: include MountOptions in the volume stub * api/csi: add MountOptions to stub * command/volume_status_csi: clean up csiVolMountOption, add it * command/alloc_status: csiVolMountOption lives in volume_csi_status * command/node_status: display mount flags * nomad/structs/volumes: npe * plugins/csi/plugin: npe in ToCSIRepresentation * jobspec/parse_test: expand volume parse test cases * command/agent/job_endpoint: ApiTgToStructsTG needs MountOptions * command/volume_status_csi: copy paste error * jobspec/test-fixtures/basic: hclfmt * command/volume_status_csi: clean up csiVolMountOption	2020-03-23 13:59:25 -04:00
Lang Martin	6b6ae6c2bd	csi: ACLs for plugin endpoints (#7380 ) * acl/policy: add PolicyList for global ACLs * acl/acl: plugin policy * acl/acl: maxPrivilege is required to allow "list" * nomad/csi_endpoint: enforce plugin access with PolicyPlugin * nomad/csi_endpoint: check job ACL swapped params * nomad/csi_endpoint_test: test alloc filtering * acl/policy: add namespace csi-register-plugin * nomad/job_endpoint: check csi-register-plugin ACL on registration * nomad/job_endpoint_test: add plugin job cases	2020-03-23 13:59:25 -04:00
Lang Martin	b596e67f47	csi: implement volume ACLs (#7339 ) * acl/policy: add the volume ACL policies * nomad/csi_endpoint: enforce ACLs for volume access * nomad/search_endpoint_oss: volume acls * acl/acl: add plugin read as a global policy * acl/policy: add PluginPolicy global cap type * nomad/csi_endpoint: check the global plugin ACL policy * nomad/mock/acl: PluginPolicy * nomad/csi_endpoint: fix list rebase * nomad/core_sched_test: new test since #7358 * nomad/csi_endpoint_test: use correct permissions for list * nomad/csi_endpoint: allowCSIMount keeps ACL checks together * nomad/job_endpoint: check mount permission for jobs * nomad/job_endpoint_test: need plugin read, too	2020-03-23 13:59:25 -04:00
Lang Martin	3621df1dbf	csi: volume ids are only unique per namespace (#7358 ) * nomad/state/schema: use the namespace compound index * scheduler/scheduler: CSIVolumeByID interface signature namespace * scheduler/stack: SetJob on CSIVolumeChecker to capture namespace * scheduler/feasible: pass the captured namespace to CSIVolumeByID * nomad/state/state_store: use namespace in csi_volume index * nomad/fsm: pass namespace to CSIVolumeDeregister & Claim * nomad/core_sched: pass the namespace in volumeClaimReap * nomad/node_endpoint_test: namespaces in Claim testing * nomad/csi_endpoint: pass RequestNamespace to state.* * nomad/csi_endpoint_test: appropriately failed test * command/alloc_status_test: appropriately failed test * node_endpoint_test: avoid notTheNamespace for the job * scheduler/feasible_test: call SetJob to capture the namespace * nomad/csi_endpoint: ACL check the req namespace, query by namespace * nomad/state/state_store: remove deregister namespace check * nomad/state/state_store: remove unused CSIVolumes * scheduler/feasible: CSIVolumeChecker SetJob -> SetNamespace * nomad/csi_endpoint: ACL check * nomad/state/state_store_test: remove call to state.CSIVolumes * nomad/core_sched_test: job namespace match so claim gc works	2020-03-23 13:59:25 -04:00
Tim Gross	22e9f679c3	csi: implement controller detach RPCs (#7356 ) This changeset implements the remaining controller detach RPCs: server-to-client and client-to-controller. The tests also uncovered a bug in our RPC for claims which is fixed here; the volume claim RPC is used for both claiming and releasing a claim on a volume. We should only submit a controller publish RPC when the claim is new and not when it's being released.	2020-03-23 13:59:25 -04:00
Tim Gross	0cd2d3cc29	csi: make claims on volumes idempotent for the same alloc (#7328 ) Nomad clients will push node updates during client restart which can cause an extra claim for a volume by the same alloc. If an alloc already claims a volume, we can allow it to be treated as a valid claim and continue.	2020-03-23 13:58:30 -04:00
Lang Martin	6750c262a4	csi: use `ExternalID`, when set, to identify volumes for outside RPC calls (#7326 ) * nomad/structs/csi: new RemoteID() uses the ExternalID if set * nomad/csi_endpoint: pass RemoteID to volume request types * client/pluginmanager/csimanager/volume: pass RemoteID to NodePublishVolume	2020-03-23 13:58:30 -04:00
Lang Martin	80619137ab	csi: volumes listed in `nomad node status` (#7318 ) * api/allocations: GetTaskGroup finds the taskgroup struct * command/node_status: display CSI volume names * nomad/state/state_store: new CSIVolumesByNodeID * nomad/state/iterator: new SliceIterator type implements memdb.ResultIterator * nomad/csi_endpoint: deal with a slice of volumes * nomad/state/state_store: CSIVolumesByNodeID return a SliceIterator * nomad/structs/csi: CSIVolumeListRequest takes a NodeID * nomad/csi_endpoint: use the return iterator * command/agent/csi_endpoint: parse query params for CSIVolumes.List * api/nodes: new CSIVolumes to list volumes by node * command/node_status: use the new list endpoint to print volumes * nomad/state/state_store: error messages consider the operator * command/node_status: include the Provider	2020-03-23 13:58:30 -04:00
Lang Martin	de25fc6cf4	csi: csi-hostpath plugin unimplemented error on controller publish (#7299 ) * client/allocrunner/csi_hook: tag errors * nomad/client_csi_endpoint: tag errors * nomad/client_rpc: remove an unnecessary error tag * nomad/state/state_store: ControllerRequired fix intent We use ControllerRequired to indicate that a volume should use the publish/unpublish workflow, rather than that it has a controller. We need to check both RequiresControllerPlugin and SupportsAttachDetach from the fingerprint to check that. * nomad/csi_endpoint: tag errors * nomad/csi_endpoint_test: longer error messages, mock fingerprints	2020-03-23 13:58:30 -04:00
Tim Gross	b04d23dae0	csi: ensure volume query is idempotent (#7303 ) We denormalize the `CSIVolume` struct when we query it from the state store by getting the plugin and its health. But unless we copy the volume, this denormalization gets synced back to the state store without passing through the fsm (which is invalid).	2020-03-23 13:58:30 -04:00
Tim Gross	b57df162ce	csi: ensure GET for plugin is idempotent (#7298 ) We denormalize the `CSIPlugin` struct when we query it from the state store by getting the current set of allocations that provide the plugin. But unless we copy the plugin, this denormalization gets synced back to the state store and each time we query we'll add another copy of the current allocations.	2020-03-23 13:58:30 -04:00
Tim Gross	de4ad6ca38	csi: add Provider field to CSI CLIs and APIs (#7285 ) Derive a provider name and version for plugins (and the volumes that use them) from the CSI identity API `GetPluginInfo`. Expose the vendor name as `Provider` in the API and CLI commands.	2020-03-23 13:58:30 -04:00
Lang Martin	887e1f28c9	csi: CLI for volume status, registration/deregistration and plugin status (#7193 ) * command/csi: csi, csi_plugin, csi_volume * helper/funcs: move ExtraKeys from parse_config to UnusedKeys * command/agent/config_parse: use helper.UnusedKeys * api/csi: annotate CSIVolumes with hcl fields * command/csi_plugin: add Synopsis * command/csi_volume_register: use hcl.Decode style parsing * command/csi_volume_list * command/csi_volume_status: list format, cleanup * command/csi_plugin_list * command/csi_plugin_status * command/csi_volume_deregister * command/csi_volume: add Synopsis * api/contexts/contexts: add csi search contexts to the constants * command/commands: register csi commands * api/csi: fix struct tag for linter * command/csi_plugin_list: unused struct vars * command/csi_plugin_status: unused struct vars * command/csi_volume_list: unused struct vars * api/csi: add allocs to CSIPlugin * command/csi_plugin_status: format the allocs * api/allocations: copy Allocation.Stub in from structs * nomad/client_rpc: add some error context with Errorf * api/csi: collapse read & write alloc maps to a stub list * command/csi_volume_status: cleanup allocation display * command/csi_volume_list: use Schedulable instead of Healthy * command/csi_volume_status: use Schedulable instead of Healthy * command/csi_volume_list: sprintf string * command/csi: delete csi.go, csi_plugin.go * command/plugin: refactor csi components to sub-command plugin status * command/plugin: remove csi * command/plugin_status: remove csi * command/volume: remove csi * command/volume_status: split out csi specific * helper/funcs: add RemoveEqualFold * command/agent/config_parse: use helper.RemoveEqualFold * api/csi: do ,unusedKeys right * command/volume: refactor csi components to `nomad volume` * command/volume_register: split out csi specific * command/commands: use the new top level commands * command/volume_deregister: hardwired type csi for now * command/volume_status: 
csiFormatVolumes rescued from volume_list * command/plugin_status: avoid a panic on no args * command/volume_status: avoid a panic on no args * command/plugin_status: predictVolumeType * command/volume_status: predictVolumeType * nomad/csi_endpoint_test: move CreateTestPlugin to testing * command/plugin_status_test: use CreateTestCSIPlugin * nomad/structs/structs: add CSIPlugins and CSIVolumes search consts * nomad/state/state_store: add CSIPlugins and CSIVolumesByIDPrefix * nomad/search_endpoint: add CSIPlugins and CSIVolumes * command/plugin_status: move the header to the csi specific * command/volume_status: move the header to the csi specific * nomad/state/state_store: CSIPluginByID prefix * command/status: rename the search context to just Plugins/Volumes * command/plugin,volume_status: test return ids now * command/status: rename the search context to just Plugins/Volumes * command/plugin_status: support -json and -t * command/volume_status: support -json and -t * command/plugin_status_csi: comments * command/_status: clean up text api/csi: fix stale comments * command/volume: make deregister sound less fearsome * command/plugin_status: set the id length * command/plugin_status_csi: more compact plugin health * command/volume: better error message, comment	2020-03-23 13:58:30 -04:00
Tim Gross	b3bf64485e	csi: remove DevDisableBootstrap flag from tests (#7267 ) In #7252 we removed the `DevDisableBootstrap` flag to require tests to honor only `BootstrapExpect`, in order to reduce a source of test flakiness. This changeset applies the same fix to the CSI tests.	2020-03-23 13:58:30 -04:00
Lang Martin	369b0e54b9	csi: volumes use `Schedulable` rather than `Healthy` (#7250 ) * structs: add ControllerRequired, volume.Name, no plug.Type * structs: Healthy -> Schedulable * state_store: Healthy -> Schedulable * api: add ControllerRequired to api data types * api: copy csi structs changes * nomad/structs/csi: include name and external id * api/csi: include Name and ExternalID * nomad/structs/csi: comments for the 3 ids	2020-03-23 13:58:30 -04:00
Lang Martin	a4784ef258	csi add allocation context to fingerprinting results (#7133 ) * structs: CSIInfo include AllocID, CSIPlugins no Jobs * state_store: eliminate plugin Jobs, delete an empty plugin * nomad/structs/csi: detect empty plugins correctly * client/allocrunner/taskrunner/plugin_supervisor_hook: option AllocID * client/pluginmanager/csimanager/instance: allocID * client/pluginmanager/csimanager/fingerprint: set AllocID * client/node_updater: split controller and node plugins * api/csi: remove Jobs The CSI Plugin API will map plugins to allocations, which allows plugins to be defined by jobs in many configurations. In particular, multiple plugins can be defined in the same job, and multiple jobs can be used to define a single plugin. Because we now map the allocation context directly from the node, it's no longer necessary to track the jobs associated with a plugin directly. * nomad/csi_endpoint_test: CreateTestPlugin & register via fingerprint * client/dynamicplugins: lift AllocID into the struct from Options * api/csi_test: remove Jobs test * nomad/structs/csi: CSIPlugins has an array of allocs * nomad/state/state_store: implement CSIPluginDenormalize * nomad/state/state_store: CSIPluginDenormalize npe on missing alloc * nomad/csi_endpoint_test: defer deleteNodes for clarity * api/csi_test: disable this test awaiting mocks: https://github.com/hashicorp/nomad/issues/7123	2020-03-23 13:58:30 -04:00
Danielle Lancashire	e75f057df3	csi: Fix Controller RPCs Currently the handling of CSINode RPCs does not correctly handle forwarding RPCs to Nodes. This commit fixes this by introducing a shim RPC (nomad/client_csi_endpoint) that will correctly forward the request to the owning node, or submit the RPC to the client. In the process it also cleans up handling a little bit by adding the `CSIControllerQuery` embedded struct for required forwarding state. The CSIControllerQuery embedding the requirement of a `PluginID` also means we could move node targeting into the shim RPC if wanted in the future.	2020-03-23 13:58:30 -04:00
Tim Gross	8bc5641438	csi: volume claim garbage collection (#7125 ) When an alloc is marked terminal (and after node unstage/unpublish have been called), the client syncs the terminal alloc state with the server via `Node.UpdateAlloc` RPC. For each job that has a terminal alloc, the `Node.UpdateAlloc` RPC handler at the server will emit an eval for a new core job to garbage collect CSI volume claims. When this eval is handled on the core scheduler, it will call a `volumeReap` method to release the claims for all terminal allocs on the job. The volume reap will issue a `ControllerUnpublishVolume` RPC for any node that has no alloc claiming the volume. Once this returns (or is skipped), the volume reap will send a new `CSIVolume.Claim` RPC that releases the volume claim for that allocation in the state store, making it available for scheduling again. This same `volumeReap` method will be called from the core job GC, which gives us a second chance to reclaim volumes during GC if there were controller RPC failures.	2020-03-23 13:58:30 -04:00
Danielle Lancashire	9d4307a3ef	csi_endpoint: Provide AllocID in req, and return Volume Currently, the client has to ship an entire allocation to the server as part of performing a VolumeClaim. This has a few problems: Firstly, it means the client is sending significantly more data than is required (an allocation contains the entire contents of a Nomad job, alongside other irrelevant state) which has a non-zero (de)serialization cost. Secondly, because the allocation was never re-fetched from the state store, it means that we were potentially open to issues caused by stale state on a misbehaving or malicious client. The change removes both of those issues at the cost of a couple more state store lookups, but they should be relatively cheap. We also now provide the CSIVolume in the response for a claim, so the client can perform a Claim without first going ahead and fetching all of the volumes.	2020-03-23 13:58:30 -04:00
Danielle Lancashire	c3b1154703	csi: Validate Volumes during registration This PR implements some initial support for doing deeper validation of a volume during its registration with the server. This allows us to validate the capabilities before users attempt to use the volumes in most cases, and also prevents registering volumes without first setting up a plugin, which should help to catch typos and the like during registration. This does have the downside of requiring users to wait for one instance of a plugin to be running in their cluster before they can register volumes.	2020-03-23 13:58:30 -04:00
Tim Gross	b03b78b212	csi: server-to-controller publish/unpublish RPCs (#7124 ) Nomad servers need to make requests to CSI controller plugins running on a client for publish/unpublish. The RPC needs to look up the client node based on the plugin, load balancing across controllers, and then perform the required client RPC to that node (via server forwarding if necessary).	2020-03-23 13:58:30 -04:00
Tim Gross	b9b315f8d1	csi: stub methods for server-to-controller RPC calls (#7117 )	2020-03-23 13:58:30 -04:00
Danielle Lancashire	77bcaa8183	csi_endpoint: Support No ACLs and restrict Nodes This commit refactors the ACL code for the CSI endpoint to support environments that run without ACLs enabled (e.g. developer environments) and also provides an easy way to restrict which endpoints may be accessed with a client's SecretID to limit the blast radius of a malicious client on the state of the environment.	2020-03-23 13:58:30 -04:00
Danielle Lancashire	22e8317a53	csi: Disable validation of volume topology	2020-03-23 13:58:30 -04:00
Tim Gross	01c704ab9d	csi: add PublishContext to CSIVolumeClaimResponse (#7113 ) The `ControllerPublishVolumeResponse` CSI RPC includes the publish context intended to be passed by the orchestrator as an opaque value to the node plugins. This changeset adds it to our response to a volume claim request to proxy the controller's response back to the client node.	2020-03-23 13:58:29 -04:00
Tim Gross	fb1aad66ee	csi: implement releasing volume claims for terminal allocs (#7076 ) When an alloc is marked terminal, and after node unstage/unpublish have been called, the client will sync the terminal alloc state with the server via `Node.UpdateAlloc` RPC. This changeset implements releasing the volume claim for each volume associated with the terminal alloc. It doesn't yet implement the RPC call we need to make to the `ControllerUnpublishVolume` CSI RPC.	2020-03-23 13:58:29 -04:00
Tim Gross	d4cd272de3	csi: implement VolumeClaimRPC (#7048 ) When the client receives an allocation which includes a CSI volume, the alloc runner will block its main `Run` loop. The alloc runner will issue a `VolumeClaim` RPC to the Nomad servers. This changeset implements the portions of the `VolumeClaim` RPC endpoint that have not been previously completed.	2020-03-23 13:58:29 -04:00
Lang Martin	421d7ed2e4	nomad: csi_endpoint send register & deregister requests to raft (#7059 )	2020-03-23 13:58:29 -04:00
Lang Martin	7b675f89ac	csi: fix index maintenance for CSIVolume and CSIPlugin tables (#7049 ) * state_store: csi volumes/plugins store the index in the txn * nomad: csi_endpoint_test require index checks need uint64() * nomad: other tests using int 0 not uint64(0) * structs: pass index into New, but not other struct methods * state_store: csi plugin indexes, use new struct interface * nomad: csi_endpoint_test check index/query meta (on explicit 0) * structs: NewCSIVolume takes an index arg now * scheduler/test: NewCSIVolume takes an index arg now	2020-03-23 13:58:29 -04:00
Lang Martin	a0a6766740	CSI: Scheduler knows about CSI constraints and availability (#6995 ) * structs: piggyback csi volumes on host volumes for job specs * state_store: CSIVolumeByID always includes plugins, matches usecase * scheduler/feasible: csi volume checker * scheduler/stack: add csi volumes * contributing: update rpc checklist * scheduler: add volumes to State interface * scheduler/feasible: introduce new checker collection tgAvailable * scheduler/stack: taskGroupCSIVolumes checker is transient * state_store CSIVolumeDenormalizePlugins comment clarity * structs: remove TODO comment in TaskGroup Validate * scheduler/feasible: CSIVolumeChecker hasPlugins improve comment * scheduler/feasible_test: set t.Parallel * Update nomad/state/state_store.go Co-Authored-By: Danielle <dani@hashicorp.com> * Update scheduler/feasible.go Co-Authored-By: Danielle <dani@hashicorp.com> * structs: lift ControllerRequired to each volume * state_store: store plug.ControllerRequired, use it for volume health * feasible: csi match fast path remove stale host volume copied logic * scheduler/feasible: improve comments Co-authored-by: Danielle <dani@builds.terrible.systems>	2020-03-23 13:58:29 -04:00
Tim Gross	8673ea5cba	csi: add empty CSI volume publication GC to scheduled core jobs (#7014 ) This changeset adds a new core job `CoreJobCSIVolumePublicationGC` to the leader's loop for scheduling core job evals. Right now this is an empty method body without even a config file stanza. Later changesets will implement the logic of volume publication GC.	2020-03-23 13:58:29 -04:00
Lang Martin	88316208a0	csi: server-side plugin state tracking and api (#6966 ) * structs: CSIPlugin indexes jobs acting as plugins and node updates * schema: csi_plugins table for CSIPlugin * nomad: csi_endpoint use vol.Denormalize, plugin requests * nomad: csi_volume_endpoint: rename to csi_endpoint * agent: add CSI plugin endpoints * state_store_test: use generated ids to avoid t.Parallel conflicts * contributing: add note about registering new RPC structs * command: agent http register plugin lists * api: CSI plugin queries, ControllerHealthy -> ControllersHealthy * state_store: copy on write for volumes and plugins * structs: copy on write for volumes and plugins * state_store: CSIVolumeByID returns an unhealthy volume, denormalize * nomad: csi_endpoint use CSIVolumeDenormalizePlugins * structs: remove struct errors for missing objects * nomad: csi_endpoint return nil for missing objects, not errors * api: return meta from Register to avoid EOF error * state_store: CSIVolumeDenormalize keep allocs in their own maps * state_store: CSIVolumeDeregister error on missing volume * state_store: CSIVolumeRegister set indexes * nomad: csi_endpoint use CSIVolumeDenormalizePlugins tests	2020-03-23 13:58:29 -04:00
Lang Martin	61cfc806ad	csi_volume_endpoint_test: gen uuids to avoid t.Parallel conflicts	2020-03-23 13:58:29 -04:00
Lang Martin	334979a754	nomad/rpc: indicate missing region in error message	2020-03-23 13:58:29 -04:00
Lang Martin	5b31b140c3	csi: do not use namespace specific identifiers	2020-03-23 13:58:29 -04:00
Lang Martin	e922531aaf	structs: move the content of csi_volumes into csi	2020-03-23 13:58:29 -04:00
Lang Martin	04b6e7c7fb	server: rpc register CSIVolume	2020-03-23 13:58:29 -04:00
Lang Martin	8f33fb9a6d	csi volume endpoint: new RPC endpoint for CSI volumes	2020-03-23 13:58:29 -04:00
Lang Martin	4bb4dd98eb	state_store: CSIVolume insert, get, delete, claim state_store: change claim counts state_store: get volumes by all, by driver state_store: process volume claims state_store: csi volume register error on update	2020-03-23 13:58:29 -04:00
Lang Martin	0422b967db	schema: csi_volumes schema	2020-03-23 13:58:29 -04:00
Lang Martin	857cd37ab5	fsm: dispatch CSIVolume register, deregister, claim	2020-03-23 13:58:29 -04:00
Lang Martin	f9d9faf673	structs: eliminate MaxReaders & MaxWriters	2020-03-23 13:58:29 -04:00
Lang Martin	3a7e1b6d14	client structs: move CSIVolumeAttachmentMode and CSIVolumeAccessMode	2020-03-23 13:58:29 -04:00
Lang Martin	637ce9dfad	structs: new CSIVolume, request types	2020-03-23 13:58:29 -04:00
Danielle Lancashire	57ae1d2cd6	csimanager: Fingerprint Node Service capabilities	2020-03-23 13:58:29 -04:00
Danielle Lancashire	564f5cec93	csimanager: Fingerprint controller capabilities	2020-03-23 13:58:29 -04:00
Danielle Lancashire	426c26d7c0	CSI Plugin Registration (#6555 ) This changeset implements the initial registration and fingerprinting of CSI Plugins as part of #5378. At a high level, it introduces the following: * A `csi_plugin` stanza as part of a Nomad task configuration, to allow a task to expose that it is a plugin. * A new task runner hook: `csi_plugin_supervisor`. This hook does two things. When the `csi_plugin` stanza is detected, it will automatically configure the plugin task to receive bidirectional mounts to the CSI intermediary directory. At runtime, it will then perform an initial heartbeat of the plugin and handle submitting it to the new `dynamicplugins.Registry` for further use by the client, and then run a lightweight heartbeat loop that will emit task events when health changes. * The `dynamicplugins.Registry` for handling plugins that run as Nomad tasks, in contrast to the existing catalog that requires `go-plugin` type plugins and to know the plugin configuration in advance. * The `csimanager` which fingerprints CSI plugins, in a similar way to `drivermanager` and `devicemanager`. It currently only fingerprints the NodeID from the plugin, and assumes that all plugins are monolithic. Missing features * We do not use the live updates of the `dynamicplugin` registry in the `csimanager` yet. * We do not deregister the plugins from the client when they shut down yet; they are just marked as unhealthy indefinitely. This is deliberate until we figure out how we should manage deploying new versions of plugins/transitioning them.	2020-03-23 13:58:28 -04:00
Drew Bailey	b09abef332	Audit config, seams for enterprise audit features allow oss to parse sink duration clean up audit sink parsing ent eventer config reload fix typo SetEnabled to eventer interface client acl test rm dead code fix failing test	2020-03-23 13:47:42 -04:00
Jasmine Dahilig	73a64e4397	change jobspec lifecycle stanza to use sidecar attribute instead of block_until status	2020-03-21 17:52:57 -04:00
Jasmine Dahilig	1485b342e2	remove deadline code for now	2020-03-21 17:52:56 -04:00
Jasmine Dahilig	d54a83afee	fix linting errors	2020-03-21 17:52:53 -04:00
Jasmine Dahilig	a0fe570317	clean up restore test	2020-03-21 17:52:52 -04:00
Jasmine Dahilig	7ed08eb75a	partial test for restore functionality	2020-03-21 17:52:52 -04:00
Jasmine Dahilig	81d051d7e8	fix bug in lifecycle scheduler test mocks	2020-03-21 17:52:51 -04:00
Jasmine Dahilig	b7f08c9d13	add appropriate lifecycle deadline default of 120s	2020-03-21 17:52:48 -04:00
Jasmine Dahilig	0cc9212a54	add test cases for scheduler alloc placement with lifecycle resources	2020-03-21 17:52:47 -04:00
Jasmine Dahilig	0d2988652c	add lifecycle job mock	2020-03-21 17:52:47 -04:00
Jasmine Dahilig	c27223207c	update task hook coordinator tests	2020-03-21 17:52:46 -04:00
Mahmood Ali	b880607bad	update scheduler to account for hooks	2020-03-21 17:52:45 -04:00
Jasmine Dahilig	12393f90e7	add test for lifecycle coordinator	2020-03-21 17:52:42 -04:00
Jasmine Dahilig	f6e58d6dad	add canonicalize in the right place	2020-03-21 17:52:41 -04:00
Jasmine Dahilig	4498c8c24f	add canonicalization	2020-03-21 17:52:39 -04:00
Jasmine Dahilig	67262d841b	add validation tests and more validation	2020-03-21 17:52:39 -04:00
Mahmood Ali	214d128bd9	it's running now	2020-03-21 17:52:37 -04:00
Jasmine Dahilig	fc13fa9739	change TaskLifecycle RunLevel to Hook and add Deadline time duration	2020-03-21 17:52:37 -04:00
Mahmood Ali	4ebeac721a	update structs with lifecycle	2020-03-21 17:52:36 -04:00
Yoan Blanc	67692789b7	vendor: vault api and sdk Signed-off-by: Yoan Blanc <yoan@dosimple.ch>	2020-03-21 17:57:48 +01:00
Mahmood Ali	53e20e5cc2	Deflake TestRPC_Limits_Streaming test The test starts enough connections to hit the limit, then closes a connection and immediately starts one, expecting the new one to succeed. We must wait until the server side recognizes the connection closing and frees up a limits slot. The current test attempts to achieve that by waiting to get an error on conn.Read; however, this error is returned from the local client without waiting for a server update. As such, I changed the logic so it retries on connection rejection but forces the first non-EOF failure to be a deadline error.	2020-03-20 17:21:43 -04:00
Mahmood Ali	0da7130a1a	Protect against args being modified	2020-03-18 08:11:16 -04:00
Mahmood Ali	52fd31af80	server: node connections must not be forwarded This fixes a bug where a forwarded node update request may be assumed to be the actual direct client connection if the server just lost leadership. When a Nomad non-leader server receives a Node.UpdateStatus request, it forwards the RPC request to the leader, and holds on to the request's Yamux connection in a cache to allow for server<->client forwarding. When the leader handles the request, it must differentiate between a forwarded connection vs the actual connection. This is done in https://github.com/hashicorp/nomad/blob/v0.10.4/nomad/node_endpoint.go#L412 . Now, consider if the non-leader server forwards the connection to a recently deposed Nomad leader, which in turn forwards the RPC request to the new leader. Without this change, the deposed leader will mistake the forwarded connection for the actual client connection and cache it mapped to the client ID. If the server attempts to connect to that client, it will attempt to start a connection/session to the other server instead and the call will hang forever. This change ensures that we only add node connection mapping if the request is not a forwarded request, regardless of circumstances.	2020-03-17 16:39:01 -04:00
Mahmood Ali	9d88f1d568	tests: deflake deploymentwatcher package This deflakes the tests in the deploymentwatcher package. The package uses a mock deployment watcher backend, where the watcher in a background goroutine calls UpdateDeploymentStatus. If the mock isn't configured to expect the call, the background goroutine will fail. One UpdateDeploymentStatus call is made at the end of the background goroutine, which may occur after the test completes, thus explaining the flakiness.	2020-03-12 15:42:01 -04:00
Michael Schurter	2dcc85bed1	jobspec: fixup vault_grace deprecation Followup to #7170 - Moved canonicalization of VaultGrace back into `api/` package. - Fixed tests. - Made docs styling consistent.	2020-03-10 14:58:49 -07:00
Michael Schurter	b72b3e765c	Merge pull request #7170 from fredrikhgrelland/consul_template_upgrade Update consul-template to v0.24.1 and remove deprecated vault grace	2020-03-10 14:15:47 -07:00
Mahmood Ali	005bd37758	tests: deflake TestServer_ReconcileMember TestServer_ReconcileMember assumes that S3 isn't the leader: `reconcileMembers` call would fail when attempting to remove itself!	2020-03-06 14:14:41 -05:00
Mahmood Ali	17ee94b52b	fix typo	2020-03-03 16:55:54 -05:00
Mahmood Ali	acbfeb5815	Simplify Bootstrap logic in tests This change updates tests to honor `BootstrapExpect` exclusively when forming test clusters and removes test-only knobs, e.g. `config.DevDisableBootstrap`. Background: Test cluster creation is fragile. Test servers don't follow the BootstrapExpected route like production clusters. Instead they start as single node clusters and then get rejoined, which risks causing split brain or other test flakiness. The test framework exposes a few knobs (e.g. `config.DevDisableBootstrap` and `config.Bootstrap`) that control whether a server should bootstrap the cluster. These flags are confusing and it's unclear when to use them: their usage in multi-node clusters isn't properly documented. Furthermore, they have some bad side-effects as they don't control the Raft library: if `config.DevDisableBootstrap` is true, the test server may not immediately attempt to bootstrap a cluster, but after an election timeout (~50ms), Raft may force a leadership election and win it (with only one vote) and cause a split brain. The knobs are also confusing as Bootstrap is an overloaded term. In BootstrapExpect, we refer to bootstrapping the cluster only after N servers are connected. But in tests and the knobs above, it refers to whether the server is a single node cluster and shouldn't wait for any other server. Changes: This commit makes two changes: First, it relies on `BootstrapExpected` instead of `Bootstrap` and/or `DevMode` flags. This change is relatively trivial. Second, it introduces a `Bootstrapped` flag to track if the cluster is bootstrapped. This allows us to keep `BootstrapExpected` immutable. Previously, the flag was a config value but it got set to 0 after cluster bootstrap completed.	2020-03-02 13:47:43 -05:00
Fredrik Hoem Grelland	edb3bd0f3f	Update consul-template to v0.24.1 and remove deprecated vault_grace (#7170 )	2020-02-23 16:24:53 +01:00
Seth Hoenig	0f99cdd0d9	Merge pull request #7192 from hashicorp/b-connect-stanza-ignore consul/connect: in-place update sidecar service registrations on changes	2020-02-21 09:24:53 -06:00
Seth Hoenig	07b9b24ceb	nomad: note why AddressMode is not part of CSD hash Co-Authored-By: Michael Schurter <mschurter@hashicorp.com>	2020-02-21 09:24:42 -06:00
Seth Hoenig	54b5173eca	consul/connect: in-place update sidecar service registrations on changes Fix a bug where consul service definitions would not be updated if changes were made to the service in the Nomad job. Currently this only fixes the bug for cases where the fix is a matter of updating consul agent's service registration. There is a related bug where destructive changes are required (see #6877) which will be fixed in another PR. The enable_tag_override configuration setting for the parent service is applied to the sidecar service. Fixes #6459	2020-02-19 13:07:04 -06:00
Mahmood Ali	98ad59b1de	update rest of consul packages	2020-02-16 16:25:04 -06:00
Mahmood Ali	f492ab6d9e	implement MinQuorum	2020-02-16 16:04:59 -06:00
Mahmood Ali	3dcc65d58d	Update consul autopilot dependency	2020-02-16 15:41:43 -06:00
Mahmood Ali	cf53ee57cd	remove unused dropButLastChannel	2020-02-13 18:56:53 -05:00
Mahmood Ali	fd51982018	tests: Avoid StartAsLeader raft config flag It's being deprecated	2020-02-13 18:56:53 -05:00
Mahmood Ali	367133a399	Use latest raft patterns	2020-02-13 18:56:52 -05:00
Seth Hoenig	543354aabe	Merge pull request #7106 from hashicorp/f-ctag-override client: enable configuring enable_tag_override for services	2020-02-13 12:34:48 -06:00
Michael Schurter	8c332a3757	Merge pull request #7102 from hashicorp/test-limits Fix some race conditions and flaky tests	2020-02-13 10:19:11 -08:00
Mahmood Ali	bc70beeb4a	Merge pull request #7044 from hashicorp/f-use-multiplexv2 rpc: Use MultiplexV2 for connections	2020-02-13 12:07:20 -05:00
Drew Bailey	24a5d36fcf	Merge pull request #7112 from hashicorp/f-include-pro-tag include pro tag in several oss.go files	2020-02-13 11:26:41 -05:00
Seth Hoenig	2829b4cd23	Merge pull request #7129 from hashicorp/b-consistent-ct-name command: use consistent CONSUL_HTTP_TOKEN name	2020-02-12 12:27:46 -06:00
Seth Hoenig	7f33b92e0b	command: use consistent CONSUL_HTTP_TOKEN name Consul CLI uses CONSUL_HTTP_TOKEN, so Nomad should use the same. Note that consul-template uses CONSUL_TOKEN, which Nomad also uses, so be careful to preserve any reference to that in the consul-template context.	2020-02-12 10:42:33 -06:00
Seth Hoenig	ce50345b7a	nomad: assert consul token is unset on job register in tests	2020-02-12 10:17:42 -06:00
Seth Hoenig	02151dee45	nomad: unset consul token on job register	2020-02-12 09:58:51 -06:00
Drew Bailey	6bd6c6638c	include pro tag in several oss.go files	2020-02-10 15:56:14 -05:00
Seth Hoenig	0e44094d1a	client: enable configuring enable_tag_override for services Consul provides a feature of Service Definitions where the tags associated with a service can be modified through the Catalog API, overriding the value(s) configured in the agent's service configuration. To enable this feature, the flag enable_tag_override must be configured in the service definition. Previously, Nomad did not allow configuring this flag, and thus the default value of false was used. Now, it is configurable. Because Nomad itself acts as a state machine around the service definitions of the tasks it manages, it's worth describing what happens when this feature is enabled and why. Consider the basic case where there is no Nomad, and your service is provided to consul as a boring JSON file. The ultimate source of truth for the definition of that service is the file, and is stored in the agent. Later, Consul performs "anti-entropy" which synchronizes the Catalog (stored only on the leaders). Then with enable_tag_override=true, the tags field is available for "external" modification through the Catalog API (rather than directly configuring the service definition file, or using the Agent API). The important observation is that if the service definition ever changes (i.e. the file is changed & config reloaded OR the Agent API is used to modify the service), those "external" tag values are thrown away, and the new service definition is once again the source of truth. In the Nomad case, Nomad itself is the source of truth over the Agent in the same way the JSON file was the source of truth in the example above. That means any time Nomad sets a new service definition, any externally configured tags are going to be replaced. When does this happen? Only on major lifecycle events, for example when a task is modified because of an updated job spec from the 'nomad job run <existing>' command. 
Otherwise, Nomad's periodic re-sync's with Consul will now no longer try to restore the externally modified tag values (as long as enable_tag_override=true). Fixes #2057	2020-02-10 08:00:55 -06:00
Michael Schurter	c5073f61a7	test: add timeout to ease debugging	2020-02-07 15:50:53 -08:00
Michael Schurter	9905dec6a3	test: workaround limits race	2020-02-07 15:50:53 -08:00
Michael Schurter	14c5ef3a8d	test: fix race around reused default rpc addr The default RPC addr was a global which is fine for normal runtime use when it only has a single user. However many tests modify it and cause races. Follow our convention of returning defaults from funcs instead of using globals.	2020-02-07 15:50:53 -08:00
Mahmood Ali	e106d373b2	rpc: Use MultiplexV2 for connections MultiplexV2 is a new connection multiplex header that supports multiplex both RPC and streaming requests over the same Yamux connection. MultiplexV2 was added in 0.8.0 as part of https://github.com/hashicorp/nomad/pull/3892 . So Nomad 0.11 can expect it to be supported. Though, some more rigorous testing is required before merging this. I want to call out some implementation details: First, the current connection pool reuses the Yamux stream for multiple RPC calls, and doesn't close them until an error is encountered. This commit doesn't change it, and sets the `RpcNomad` byte only at stream creation. Second, the StreamingRPC session gets closed by callers and cannot be reused. Every StreamingRPC opens a new Yamux session.	2020-02-03 19:31:39 -05:00
Drew Bailey	9a65556211	add state store test to ensure PlacedCanaries is updated	2020-02-03 13:58:01 -05:00
Drew Bailey	f51a3d1f37	nomad state store must be modified through raft, rm local state change	2020-02-03 13:57:34 -05:00
Drew Bailey	74779f23e6	keep placed canaries aligned with alloc status	2020-02-03 13:57:33 -05:00
Michael Schurter	9bedd0202e	sentinel: copy jobs to prevent mutation It's unclear whether Sentinel code can mutate values passed to the eval, so ensure it cannot by copying the job.	2020-02-03 08:48:51 -05:00
Seth Hoenig	6bfa50acdc	nomad: remove unused default scheduler variable This is from a merge conflict resolution that went the wrong direction. I assumed the block had been added, but really it had been removed. Now, it is removed once again.	2020-01-31 19:06:37 -06:00
Seth Hoenig	d3cd6afd7e	nomad: min cluster version for connect ACLs is now v0.10.4	2020-01-31 19:06:19 -06:00
Seth Hoenig	587a5d4a8d	nomad: make TaskGroup.UsesConnect helper a public helper	2020-01-31 19:05:11 -06:00
Seth Hoenig	ee89a754f1	nomad: fix leftover missed refactoring in consul policy checking	2020-01-31 19:05:06 -06:00
Seth Hoenig	4ee55fcd6c	nomad,client: apply more comment/style PR tweaks	2020-01-31 19:04:52 -06:00
Seth Hoenig	be7c671919	nomad,client: apply smaller PR suggestions Apply smaller suggestions like doc strings, variable names, etc. Co-Authored-By: Nick Ethier <nethier@hashicorp.com> Co-Authored-By: Michael Schurter <mschurter@hashicorp.com>	2020-01-31 19:04:40 -06:00
Seth Hoenig	78a7d1e426	comments: cleanup some leftover debug comments and such	2020-01-31 19:04:35 -06:00
Seth Hoenig	8219c78667	nomad: handle SI token revocations concurrently Be able to revoke SI token accessors concurrently, and also ratelimit the requests being made to Consul for the various ACL API uses.	2020-01-31 19:04:14 -06:00
Seth Hoenig	2c7ac9a80d	nomad: fixup token policy validation	2020-01-31 19:04:08 -06:00
Seth Hoenig	9df33f622f	nomad: proxy requests for Service Identity tokens between Clients and Consul Nomad jobs may be configured with a TaskGroup which contains a Service definition that is Consul Connect enabled. These service definitions end up establishing a Consul Connect Proxy Task (e.g. envoy, by default). In the case where Consul ACLs are enabled, a Service Identity token is required for these tasks to run & connect, etc. This changeset enables the Nomad Server to receive RPC requests for the derivation of SI tokens on behalf of instances of Consul Connect using Tasks. Those tokens are then relayed back to the requesting Client, which then injects the tokens in the secrets directory of the Task.	2020-01-31 19:03:53 -06:00
Seth Hoenig	93cf770edb	client: enable nomad client to request and set SI tokens for tasks When a job is configured with Consul Connect aware tasks (i.e. sidecar), the Nomad Client should be able to request from Consul (through Nomad Server) Service Identity tokens specific to those tasks.	2020-01-31 19:03:38 -06:00
Seth Hoenig	2b66ce93bb	nomad: ensure a unique ClusterID exists when leader (gh-6702) Enable any Server to lookup the unique ClusterID. If one has not been generated, and this node is the leader, generate a UUID and attempt to apply it through raft. The value is not yet used anywhere in this changeset, but is a prerequisite for gh-6701.	2020-01-31 19:03:26 -06:00
Seth Hoenig	f030a22c7c	command, docs: create and document consul token configuration for connect acls (gh-6716) This change provides an initial pass at setting up the configuration necessary to enable use of Connect with Consul ACLs. Operators will be able to pass in a Consul Token through `-consul-token` or `$CONSUL_TOKEN` in the `job run` and `job revert` commands (similar to Vault tokens). These values are not actually used yet in this changeset.	2020-01-31 19:02:53 -06:00
Michael Schurter	dd7712795d	Merge branch 'master' into b-tls-validation	2020-01-30 11:05:15 -08:00
Mahmood Ali	a9f551542d	Merge pull request #160 from hashicorp/b-mtls-hostname server: validate role and region for RPC w/ mTLS	2020-01-30 12:59:17 -06:00
Michael Schurter	c82b14b0c4	core: add limits to unauthorized connections Introduce limits to prevent unauthorized users from exhausting all ephemeral ports on agents: * `{https,rpc}_handshake_timeout` * `{http,rpc}_max_conns_per_client` The handshake timeout closes connections that have not completed the TLS handshake by the deadline (5s by default). For RPC connections this timeout also separately applies to the first byte being read, so RPC connections with TLS enabled have `rpc_handshake_timeout * 2` as their deadline. The connection limit per client prevents a single remote TCP peer from exhausting all ephemeral ports. The default is 100, but can be lowered to a minimum of 26. Since streaming RPC connections create a new TCP connection (until MultiplexV2 is used), 20 connections are reserved for Raft and non-streaming RPCs to prevent connection exhaustion due to streaming RPCs. All limits are configurable and may be disabled by setting them to `0`. This also includes a fix that closes connections that attempt to create TLS RPC connections recursively. While only users with valid mTLS certificates could perform such an operation, it was added as a safeguard to prevent programming errors before they could cause resource exhaustion.	2020-01-30 10:38:25 -08:00
Drew Bailey	da4af9bef3	fix tests, update changelog	2020-01-29 13:55:39 -05:00
Drew Bailey	a61bf32314	Allow nomad monitor command to lookup server UUID Allows addressing servers with nomad monitor using the server's name or ID. Also unifies logic for addressing servers for client_agent_endpoint commands and makes addressing logic region aware. rpc getServer test	2020-01-29 13:55:29 -05:00
Mahmood Ali	9611324654	Merge pull request #6922 from hashicorp/b-alloc-canoncalize Handle Upgrades and Alloc.TaskResources modification	2020-01-28 15:12:41 -05:00
Mahmood Ali	90cae566e5	Merge pull request #6935 from hashicorp/b-default-preemption-flag scheduler: allow configuring default preemption for system scheduler	2020-01-28 15:11:06 -05:00
Mahmood Ali	af17b4afc7	Support customizing full scheduler config	2020-01-28 14:51:42 -05:00
Mahmood Ali	f7a51a14c6	Merge pull request #6977 from hashicorp/b-leadership-flapping-2 Handle Nomad leadership flapping (attempt 2)	2020-01-28 11:40:41 -05:00
Mahmood Ali	687d2b7054	tests: defer closing shutdownCh	2020-01-28 09:53:48 -05:00
Mahmood Ali	ded4233c27	tweak leadership flapping log messages	2020-01-28 09:49:36 -05:00
Mahmood Ali	79823ae07d	handle channel close signal Always deliver last value then send close signal.	2020-01-28 09:44:34 -05:00
Mahmood Ali	d202924a93	include test and address review comments	2020-01-28 09:06:52 -05:00
Nick Ethier	5cbb94e16e	consul: add support for canary meta	2020-01-27 09:53:30 -05:00
Mahmood Ali	e436d2701a	Handle Nomad leadership flapping Fixes a deadlock in leadership handling if leadership flapped. Raft propagates leadership transition to Nomad through a NotifyCh channel. Raft blocks when writing to this channel, so the channel must be buffered or aggressively consumed[1]. Otherwise, Raft blocks indefinitely in `raft.runLeader` until the channel is consumed[2] and does not move on to executing follower related logic (in `raft.runFollower`). While the Raft `runLeader` defer function blocks, Raft cannot process any other Raft operations. For example, `run{Leader|Follower}` methods consume `raft.applyCh`, and while the runLeader defer is blocked, all Raft log applications or config lookups will block indefinitely. Sadly, `leaderLoop` and `establishLeader` make a few Raft calls! `establishLeader` attempts to auto-create autopilot/scheduler config [3]; and `leaderLoop` attempts to check raft configuration [4]. All of these calls occur without a timeout. Thus, if leadership flapped quickly while `leaderLoop/establishLeadership` is invoked and hit any of these Raft calls, the Raft handler _deadlocks_ forever. Depending on how many times it flapped and where exactly we get stuck, I suspect it's possible to get in the following case: * Agent metrics/stats http and RPC calls hang as they check raft.Configurations * raft.State remains in Leader state, and server attempts to handle RPC calls (e.g. node/alloc updates) and these hang as well As we create goroutines per RPC call, the number of goroutines grows over time and may trigger out-of-memory errors in addition to missed updates. [1] `d90d6d6bda/config.go (L190-L193)` [2] `d90d6d6bda/raft.go (L425-L436)` [3] `2a89e47746/nomad/leader.go (L198-L202)` [4] `2a89e47746/nomad/leader.go (L877)`	2020-01-22 13:08:34 -05:00
Mahmood Ali	129c884105	extract leader step function	2020-01-22 10:55:48 -05:00
Mahmood Ali	f36cc54efd	actually always canonicalize alloc.Job alloc.Job may be stale as well and need to migrate it. It does cost extra cycles but should be negligible.	2020-01-15 09:02:48 -05:00
Mahmood Ali	b1b714691c	address review comments	2020-01-15 08:57:05 -05:00
Mahmood Ali	1ab682f622	scheduler: allow configuring default preemption for system scheduler Some operators want a greater control over when preemption is enabled, especially during an upgrade to limit potential side-effects.	2020-01-13 08:30:49 -05:00
Drew Bailey	ff4bfb8809	Merge pull request #6841 from hashicorp/f-agent-pprof-acl Remote agent pprof endpoints	2020-01-10 14:52:39 -05:00
Mahmood Ali	bfa33cf471	canonicalize allocs from plan results too	2020-01-10 10:41:12 -05:00
Nick Ethier	1f28633954	Merge pull request #6816 from hashicorp/b-multiple-envoy connect: configure envoy to support multiple sidecars in the same alloc	2020-01-09 23:25:39 -05:00
Drew Bailey	b702dede49	adds gc param, address pr feedback	2020-01-09 15:15:11 -05:00
Drew Bailey	45210ed901	Rename profile package to pprof Address pr feedback, rename profile package to pprof to more accurately describe its purpose. Adds gc param for heap lookup profiles.	2020-01-09 15:15:10 -05:00
Drew Bailey	1b8af920f3	address pr feedback	2020-01-09 15:15:09 -05:00

3656 commits