open-nomad

Author	SHA1	Message	Date
Tim Gross	2854298089	csi: release claims via csi_hook postrun unpublish RPC (#8580) Add a Postrun hook to send the `CSIVolume.Unpublish` RPC to the server. This may forward client RPCs to the node plugins or to the controller plugins, depending on whether other allocations on this node have claims on this volume. By making clients responsible for running the `CSIVolume.Unpublish` RPC (and making the RPC available to a `nomad volume detach` command), the volumewatcher becomes only used by the core GC job and we no longer need async volume GC from job deregister and node update.	2020-08-06 14:51:46 -04:00
Mahmood Ali	6dfe0f5d3b	tests: use t.Fatalf when it's clearer	2020-05-27 10:09:56 -04:00
Mahmood Ali	5d80d2a511	tests: eval may be processed quickly	2020-05-26 18:53:24 -04:00
Tim Gross	72430a4e62	csi: don't pass volume claim releases thru GC eval (#8021) Following the new volumewatcher in #7794 and performance improvements to it that landed afterwards, there's no particular reason we should be threading claim releases through the GC eval rather than writing an empty `CSIVolumeClaimRequest` with the mode set to `CSIVolumeClaimRelease`, just as the GC evaluation would do. Also, by batching up these raft messages, we can reduce the amount of raft writes by 1 and cross-server RPCs by 1 per volume we release claims on.	2020-05-20 15:22:51 -04:00
Seth Hoenig	f6c8db8a8a	consul/connect: use task kind to get service name Fixes #8000 When requesting a Service Identity token from Consul, use the TaskKind of the Task to get at the service name associated with the task. In the past using the TaskName worked because it was generated as a sidecar task with a name that included the service. In the Native context, we need to get at the service name in a more correct way, i.e. using the TaskKind which is defined to include the service name.	2020-05-18 13:46:00 -06:00
Mahmood Ali	64de395df0	tests: ease debugging TestClientEndpoint_CreateNodeEvals TestClientEndpoint_CreateNodeEvals flakes a bit but its output is very confusing, as `structs.Evaluations` overrides GoString. Here, we emit the entire struct of the evaluation, and hopefully we'll figure out the problem the next time it happens	2020-05-09 16:04:32 -04:00
Tim Gross	a7a64443e1	csi: move volume claim release into volumewatcher (#7794) This changeset adds a subsystem to run on the leader, similar to the deployment watcher or node drainer. The `Watcher` performs a blocking query on updates to the `CSIVolumes` table and triggers reaping of volume claims. This will avoid tying up scheduling workers by immediately sending volume claim workloads into their own loop, rather than blocking the scheduling workers in the core GC job doing things like talking to CSI controllers. The volume watcher is enabled on leader step-up and disabled on leader step-down. The volume claim GC mechanism now makes an empty claim RPC for the volume to trigger an index bump. That in turn unblocks the blocking query in the volume watcher so it can assess which claims can be released for a volume.	2020-04-30 09:13:00 -04:00
Tim Gross	083b35d651	csi: checkpoint volume claim garbage collection (#7782) Adds a `CSIVolumeClaim` type to be tracked as current and past claims on a volume. Allows for a client RPC failure during node or controller detachment without having to keep the allocation around after the first garbage collection eval. This changeset lays groundwork for moving the actual detachment RPCs into a volume watching loop outside the GC eval.	2020-04-23 11:06:23 -04:00
Tim Gross	027277a0d9	csi: make volume GC in job deregister safely async The `Job.Deregister` call will block on the client CSI controller RPCs while the alloc still exists on the Nomad client node. So we need to make the volume claim reaping async from the `Job.Deregister`. This allows `nomad job stop` to return immediately. In order to make this work, this changeset changes the volume GC so that the GC jobs are on a by-volume basis rather than a by-job basis; we won't have to query the (possibly deleted) job at the time of volume GC. We smuggle the volume ID and whether it's a purge into the GC eval ID the same way we smuggled the job ID previously.	2020-04-06 10:15:55 -04:00
Lang Martin	24449e23af	csi: volume validate namespace (#7587) * nomad/state/state_store: enforce that the volume namespace exists * nomad/csi_endpoint_test: a couple of broken namespaces now * nomad/csi_endpoint_test: one more test * nomad/node_endpoint_test: use structs.DefaultNamespace * nomad/state/state_store_test: use DefaultNamespace	2020-04-02 10:13:41 -04:00
Michael Schurter	464dae514c	test: assert HostVolumes included in ListNodes	2020-03-30 17:34:44 -07:00
Lang Martin	3621df1dbf	csi: volume ids are only unique per namespace (#7358) * nomad/state/schema: use the namespace compound index * scheduler/scheduler: CSIVolumeByID interface signature namespace * scheduler/stack: SetJob on CSIVolumeChecker to capture namespace * scheduler/feasible: pass the captured namespace to CSIVolumeByID * nomad/state/state_store: use namespace in csi_volume index * nomad/fsm: pass namespace to CSIVolumeDeregister & Claim * nomad/core_sched: pass the namespace in volumeClaimReap * nomad/node_endpoint_test: namespaces in Claim testing * nomad/csi_endpoint: pass RequestNamespace to state.* * nomad/csi_endpoint_test: appropriately failed test * command/alloc_status_test: appropriately failed test * node_endpoint_test: avoid notTheNamespace for the job * scheduler/feasible_test: call SetJob to capture the namespace * nomad/csi_endpoint: ACL check the req namespace, query by namespace * nomad/state/state_store: remove deregister namespace check * nomad/state/state_store: remove unused CSIVolumes * scheduler/feasible: CSIVolumeChecker SetJob -> SetNamespace * nomad/csi_endpoint: ACL check * nomad/state/state_store_test: remove call to state.CSIVolumes * nomad/core_sched_test: job namespace match so claim gc works	2020-03-23 13:59:25 -04:00
Tim Gross	8bc5641438	csi: volume claim garbage collection (#7125) When an alloc is marked terminal (and after node unstage/unpublish have been called), the client syncs the terminal alloc state with the server via `Node.UpdateAlloc` RPC. For each job that has a terminal alloc, the `Node.UpdateAlloc` RPC handler at the server will emit an eval for a new core job to garbage collect CSI volume claims. When this eval is handled on the core scheduler, it will call a `volumeReap` method to release the claims for all terminal allocs on the job. The volume reap will issue a `ControllerUnpublishVolume` RPC for any node that has no alloc claiming the volume. Once this returns (or is skipped), the volume reap will send a new `CSIVolume.Claim` RPC that releases the volume claim for that allocation in the state store, making it available for scheduling again. This same `volumeReap` method will be called from the core job GC, which gives us a second chance to reclaim volumes during GC if there were controller RPC failures.	2020-03-23 13:58:30 -04:00
Tim Gross	fb1aad66ee	csi: implement releasing volume claims for terminal allocs (#7076) When an alloc is marked terminal, and after node unstage/unpublish have been called, the client will sync the terminal alloc state with the server via `Node.UpdateAlloc` RPC. This changeset implements releasing the volume claim for each volume associated with the terminal alloc. It doesn't yet implement the RPC call we need to make to the `ControllerUnpublishVolume` CSI RPC.	2020-03-23 13:58:29 -04:00
Lang Martin	7b675f89ac	csi: fix index maintenance for CSIVolume and CSIPlugin tables (#7049) * state_store: csi volumes/plugins store the index in the txn * nomad: csi_endpoint_test require index checks need uint64() * nomad: other tests using int 0 not uint64(0) * structs: pass index into New, but not other struct methods * state_store: csi plugin indexes, use new struct interface * nomad: csi_endpoint_test check index/query meta (on explicit 0) * structs: NewCSIVolume takes an index arg now * scheduler/test: NewCSIVolume takes an index arg now	2020-03-23 13:58:29 -04:00
Mahmood Ali	acbfeb5815	Simplify Bootstrap logic in tests This change updates tests to honor `BootstrapExpect` exclusively when forming test clusters and removes test-only knobs, e.g. `config.DevDisableBootstrap`. Background: Test cluster creation is fragile. Test servers don't follow the BootstrapExpected route like production clusters. Instead they start as single node clusters and then rejoin, which risks causing split brain or other test flakiness. The test framework exposes a few knobs (e.g. `config.DevDisableBootstrap` and `config.Bootstrap`) that control whether a server should bootstrap the cluster. These flags are confusing and it's unclear when to use them: their usage in multi-node clusters isn't properly documented. Furthermore, they have some bad side-effects as they don't control the Raft library: If `config.DevDisableBootstrap` is true, the test server may not immediately attempt to bootstrap a cluster, but after an election timeout (~50ms), Raft may force a leadership election and win it (with only one vote) and cause a split brain. The knobs are also confusing as Bootstrap is an overloaded term. In BootstrapExpect, we refer to bootstrapping the cluster only after N servers are connected. But in tests and the knobs above, it refers to whether the server is a single node cluster and shouldn't wait for any other server. Changes: This commit makes two changes: First, it relies on `BootstrapExpected` instead of `Bootstrap` and/or `DevMode` flags. This change is relatively trivial. Second, it introduces a `Bootstrapped` flag to track if the cluster is bootstrapped. This allows us to keep `BootstrapExpected` immutable. Previously, the flag was a config value but it got set to 0 after cluster bootstrap completed.	2020-03-02 13:47:43 -05:00
Seth Hoenig	587a5d4a8d	nomad: make TaskGroup.UsesConnect helper a public helper	2020-01-31 19:05:11 -06:00
Seth Hoenig	8219c78667	nomad: handle SI token revocations concurrently Be able to revoke SI token accessors concurrently, and also ratelimit the requests being made to Consul for the various ACL API uses.	2020-01-31 19:04:14 -06:00
Seth Hoenig	9df33f622f	nomad: proxy requests for Service Identity tokens between Clients and Consul Nomad jobs may be configured with a TaskGroup which contains a Service definition that is Consul Connect enabled. These service definitions end up establishing a Consul Connect Proxy Task (e.g. envoy, by default). In the case where Consul ACLs are enabled, a Service Identity token is required for these tasks to run & connect, etc. This changeset enables the Nomad Server to receive RPC requests for the derivation of SI tokens on behalf of instances of Consul Connect using Tasks. Those tokens are then relayed back to the requesting Client, which then injects the tokens in the secrets directory of the Task.	2020-01-31 19:03:53 -06:00
Seth Hoenig	f0c3dca49c	tests: swap lib/freeport for tweaked helper/freeport Copy the updated version of freeport (sdk/freeport), and tweak it for use in Nomad tests. This means staying below port 10000 to avoid conflicts with the lib/freeport that is still transitively used by the old version of consul that we vendor. Also provide implementations to find ephemeral ports of macOS and Windows environments. Ports acquired through freeport are supposed to be returned to freeport, which this change now also introduces. Many tests are modified to include calls to a cleanup function for Server objects. This should help quite a bit with some flakey tests, but not all of them. Our port problems will not go away completely until we upgrade our vendor version of consul. With Go modules, we'll probably do a 'replace' to swap out other copies of freeport with the one now in 'nomad/helper/freeport'.	2019-12-09 08:37:32 -06:00
Luiz Aoqui	e862b61daa	api: use the same initial time for all drain properties	2019-11-14 16:06:09 -05:00
Luiz Aoqui	5bd7cdd5c3	api: add `StartedAt` in `Node.DrainStrategy`	2019-11-13 17:54:40 -05:00
Jasmine Dahilig	8d980edd2e	add create and modify timestamps to evaluations (#5881)	2019-08-07 09:50:35 -07:00
Lang Martin	a95225d754	NodeDeregisterBatch -> NodeBatchDeregister match JobBatch pattern	2019-07-10 13:56:20 -04:00
Lang Martin	fa5649998e	node endpoint support new NodeDeregisterBatchRequest	2019-07-10 13:56:19 -04:00
Lang Martin	09fd05bd8f	node_endpoint raft store then shutdown, test deprecation	2019-07-10 13:56:19 -04:00
Lang Martin	b176066d42	node_endpoint deregister the batch of nodes	2019-07-10 13:56:19 -04:00
Mahmood Ali	9ff5f163b5	update callers in tests	2019-05-21 21:10:17 -04:00
Alex Dadgar	4bdccab550	goimports	2019-01-22 15:44:31 -08:00
Nick Ethier	2555bff4f5	nomad: add error check in test	2018-06-06 14:08:42 -04:00
Nick Ethier	d35bf6d184	nomad: handle edge case where node drain event shouldn't be emitted	2018-06-06 14:02:10 -04:00
Alex Dadgar	21c5ed850d	Register events	2018-05-22 14:06:33 -07:00
Alex Dadgar	17aac1c9de	node heartbeat missed event	2018-05-22 14:05:46 -07:00
Alex Dadgar	5f2080bc26	Emit events based on eligibility	2018-05-22 14:04:59 -07:00
Alex Dadgar	0cb31feb1f	Add node event when draining is set/removed/updated	2018-05-10 16:54:43 -07:00
Alex Dadgar	a35248d1d8	Plumb event via FSM	2018-05-10 16:30:54 -07:00
Preetha Appan	a569d34f25	Add custom status description for rescheduling follow up evals, and make unit test robust	2018-04-10 15:30:15 -05:00
Preetha Appan	b3402efd0b	Adds a new custom description for update alloc triggered evals to make it easier to unit test.	2018-04-10 14:00:07 -05:00
Preetha Appan	24203ae2f7	Remove duplicate commit	2018-04-09 15:08:09 -05:00
Preetha Appan	d1cb5df477	Batch evals for rescheduling failed allocs correctly and group them by job ID	2018-04-09 14:05:31 -05:00
Alex Dadgar	de4b3772f1	Create evals for system jobs when drain is unset This PR creates evals for system jobs when: * Drain is unset and mark eligible is true * Eligibility is restored to the node	2018-03-27 15:53:24 -07:00
Chelsea Holland Komlo	31557cc44f	move tests to use time.Time	2018-03-27 15:43:57 -04:00
Alex Dadgar	5dacb057b7	Only track nodes if the conn is from the node Fixes a bug in which a connection to a Nomad server was treated as a connection to a node because the server forwarded a node specific RPC.	2018-03-27 09:59:31 -07:00
Michael Schurter	e432c9af55	test: disable node drainer during tests Node drainer would throw off the index checks	2018-03-21 16:51:45 -07:00
Michael Schurter	5c8c4bce2a	test: disable drain during fsm test drainer was unsetting drain before fsm could read written value	2018-03-21 16:51:45 -07:00
Alex Dadgar	7b2bad8c5e	Toggle Drain allows resetting eligibility This PR allows marking a node as eligible for scheduling while toggling drain. By default the `nomad node drain -disable` command will mark it as eligible but the drainer will maintain in-eligibility.	2018-03-21 16:51:44 -07:00
Alex Dadgar	e63bcb474d	Drainer	2018-03-21 16:51:44 -07:00
Alex Dadgar	0fba0101b6	RPC/FSM/State Store for Eligibility	2018-03-21 16:51:44 -07:00
Alex Dadgar	2f5309d82a	Remove update time	2018-03-21 16:51:43 -07:00
Alex Dadgar	0965c9ed28	Fix tests	2018-03-21 16:51:43 -07:00

152 commits