open-nomad

Tim Gross 622ed093ae CSI: node unmount from the client before unpublish RPC (#11892)	2022-01-28 14:43:58 -05:00

When an allocation stops, the `csi_hook` makes an unpublish RPC to the servers to unpublish via the CSI RPCs: first to the node plugins and then the controller plugins. The controller RPCs must happen after the node RPCs so that the node has had a chance to unmount the volume before the controller tries to detach the associated device.

But the client has local access to the node plugins and can independently determine if it's safe to send unpublish RPCs to those plugins. This will allow the server to treat the node plugin as abandoned if a client is disconnected and `stop_after_client_disconnect` is set. This will let the server try to send unpublish RPCs to the controller plugins, under the assumption that the client will be trying to unmount the volume on its end first.

Note that the CSI `NodeUnpublishVolume`/`NodeUnstageVolume` RPCs can return ignorable errors in the case where the volume has already been unmounted from the node. Handle all other errors by retrying until we get success so as to give operators the opportunity to reschedule a failed node plugin (e.g. in the case where they accidentally drained a node without `-ignore-system`). Fan out the work for each volume into its own goroutine so that we can release a subset of volumes if only one is stuck.

..
allocdir	replace 'a alloc' with 'an alloc' where appropriate (#11792)	2022-01-10 11:59:46 -05:00
allochealth	e2e test for on_update service checks	2021-02-08 08:32:40 -05:00
allocrunner	CSI: node unmount from the client before unpublish RPC (#11892)	2022-01-28 14:43:58 -05:00
allocwatcher	client: never embed alloc_dir in chroot	2021-10-18 09:22:01 -07:00
config	Update IsEmpty to check for pre-1.2.4 fields (#11930)	2022-01-28 14:41:49 -05:00
consul	consul: plumbing for specifying consul namespace in job/group	2021-04-05 10:03:19 -06:00
devicemanager	Fix some errcheck errors (#9811)	2021-01-14 12:46:35 -08:00
dynamicplugins	Add gosimple linter (#9590)	2020-12-09 11:05:18 -08:00
fingerprint	fix host network reserved port fingerprint (#11728)	2021-12-22 15:29:54 -05:00
interfaces	replace 'a alloc' with 'an alloc' where appropriate (#11792)	2022-01-10 11:59:46 -05:00
lib	gofmt all the files	2021-10-01 10:14:28 -04:00
logmon	logmon: Fix a memory leak on task restart	2021-10-05 12:11:53 -04:00
pluginmanager	CSI: node unmount from the client before unpublish RPC (#11892)	2022-01-28 14:43:58 -05:00
servers	client: drop unused DC field from servers list	2019-05-20 14:19:15 -07:00
state	gofmt all the files	2021-10-01 10:14:28 -04:00
stats	deps: bump gopsutil to v3.21.2	2021-03-30 16:02:51 -04:00
structs	Generate files for 1.2.4 release	2022-01-18 23:43:00 +00:00
taskenv	client: task env vars should take precedence over host env vars.	2021-09-20 09:15:28 +01:00
testutil	cli: refactor operator debug capture (#11466)	2021-11-05 19:43:10 -04:00
vaultclient	Add gosimple linter (#9590)	2020-12-09 11:05:18 -08:00
acl.go	Audit config, seams for enterprise audit features	2020-03-23 13:47:42 -04:00
acl_test.go	Event Stream: Track ACL changes, unsubscribe on invalidating changes (#9447)	2020-12-01 11:11:34 -05:00
agent_endpoint.go	json handles were moved to a new package in #10202	2021-04-02 13:31:10 +00:00
agent_endpoint_test.go	fix params for Agent.Host client RPC (#8795)	2020-08-31 17:14:26 -04:00
alloc_endpoint.go	client: fix multiple imports (#10537)	2021-05-13 14:30:31 -04:00
alloc_endpoint_test.go	tests: remove duplicate import statements.	2021-06-11 09:39:22 +02:00
alloc_watcher_e2e_test.go	tests: swap lib/freeport for tweaked helper/freeport	2019-12-09 08:37:32 -06:00
client.go	replace 'a alloc' with 'an alloc' where appropriate (#11792 )	2022-01-10 11:59:46 -05:00
client_stats_endpoint.go
client_stats_endpoint_test.go	tests: swap lib/freeport for tweaked helper/freeport	2019-12-09 08:37:32 -06:00
client_test.go	client: defensively log reserved ports	2021-10-04 15:43:35 -07:00
csi_endpoint.go	CSI: use AccessMode/AttachmentMode from CSIVolumeClaim	2021-04-07 11:24:09 -04:00
csi_endpoint_test.go	CSI ListSnapshots secrets implementation	2021-07-28 11:30:29 -07:00
driver_manager_test.go	tests: fix data race in client TestDriverManager_Fingerprint_Periodic	2019-05-21 09:49:56 -04:00
enterprise_client_oss.go	gofmt all the files	2021-10-01 10:14:28 -04:00
fingerprint_manager.go	chore: fixup inconsistent method receiver names. (#11704)	2021-12-20 11:44:21 +01:00
fingerprint_manager_test.go	use allow/deny instead of the colored alternatives (#9019)	2020-10-12 08:47:05 -04:00
fs_endpoint.go	Fix log streaming missing frames (#11721)	2022-01-04 14:07:16 -05:00
fs_endpoint_test.go	Fix log streaming missing frames (#11721)	2022-01-04 14:07:16 -05:00
gc.go	chore: fix incorrect docstring formatting.	2021-08-30 11:08:12 +02:00
gc_test.go	Events/msgtype cleanup (#9117)	2020-10-19 09:30:15 -04:00
heartbeatstop.go	Delayed evaluations for `stop_after_client_disconnect` can cause unwanted extra followup evaluations around job garbage collection (#8099)	2020-06-03 09:48:38 -04:00
heartbeatstop_test.go	docs: s/hearbeat/heartbeat and fix link	2020-07-23 11:33:34 -07:00
node_updater.go	client: use NewNodeEvent builder for consistency (#7559)	2020-03-31 10:02:16 -04:00
rpc.go	pool: track usage of incoming streams (#10710)	2021-06-07 10:22:37 -04:00
rpc_test.go	Simplify Bootstrap logic in tests	2020-03-02 13:47:43 -05:00
testing.go	test infrastructure for mock client RPCs (#10193)	2021-03-31 16:37:09 -04:00
util.go	Revert "client: defensive against getting stale alloc updates"	2020-06-19 15:39:44 -04:00
util_test.go	Update state with server	2018-10-16 16:53:29 -07:00