open-nomad/client
Tim Gross 66b4b28b1a
CSI: node unmount from the client before unpublish RPC (#11892)
When an allocation stops, the `csi_hook` sends an unpublish RPC to the
servers, which then unpublish the volume via the CSI RPCs: first to the
node plugins and then to the controller plugins. The controller RPCs
must happen after the node RPCs so that the node has had a chance to
unmount the volume before the controller tries to detach the associated
device.

But the client has local access to the node plugins and can
independently determine whether it's safe to send unpublish RPCs to
those plugins. This will allow the server to treat the node plugin as
abandoned if a client is disconnected and `stop_after_client_disconnect`
is set. The server can then try to send unpublish RPCs to the
controller plugins, under the assumption that the client will be
trying to unmount the volume on its end first.

Note that the CSI `NodeUnpublishVolume`/`NodeUnstageVolume` RPCs can
return ignorable errors in the case where the volume has already been
unmounted from the node. Handle all other errors by retrying until we
get success, so as to give operators the opportunity to reschedule a
failed node plugin (e.g. in the case where they accidentally drained a
node without `-ignore-system`). Fan out the work for each volume into
its own goroutine so that we can release a subset of volumes if only
one is stuck.
2022-01-28 08:30:31 -05:00
..
allocdir replace 'a alloc' with 'an alloc' where appropriate (#11792) 2022-01-10 11:59:46 -05:00
allochealth e2e test for on_update service checks 2021-02-08 08:32:40 -05:00
allocrunner CSI: node unmount from the client before unpublish RPC (#11892) 2022-01-28 08:30:31 -05:00
allocwatcher client: never embed alloc_dir in chroot 2021-10-18 09:22:01 -07:00
config Update IsEmpty to check for pre-1.2.4 fields (#11930) 2022-01-26 11:31:37 -05:00
consul consul: plumbing for specifying consul namespace in job/group 2021-04-05 10:03:19 -06:00
devicemanager Fix some errcheck errors (#9811) 2021-01-14 12:46:35 -08:00
dynamicplugins Add gosimple linter (#9590) 2020-12-09 11:05:18 -08:00
fingerprint fix host network reserved port fingerprint (#11728) 2021-12-22 15:29:54 -05:00
interfaces replace 'a alloc' with 'an alloc' where appropriate (#11792) 2022-01-10 11:59:46 -05:00
lib client: change test to not poke cgroupv2 edge case 2022-01-27 08:27:40 -06:00
logmon logmon: Fix a memory leak on task restart 2021-10-05 12:11:53 -04:00
pluginmanager CSI: node unmount from the client before unpublish RPC (#11892) 2022-01-28 08:30:31 -05:00
servers
state gofmt all the files 2021-10-01 10:14:28 -04:00
stats deps: bump gopsutil to v3.21.2 2021-03-30 16:02:51 -04:00
structs metrics: added mapped_file metric (#11500) 2022-01-10 15:35:19 -05:00
taskenv client: task env vars should take precedence over host env vars. 2021-09-20 09:15:28 +01:00
testutil cli: refactor operator debug capture (#11466) 2021-11-05 19:43:10 -04:00
vaultclient Add gosimple linter (#9590) 2020-12-09 11:05:18 -08:00
acl.go
acl_test.go Event Stream: Track ACL changes, unsubscribe on invalidating changes (#9447) 2020-12-01 11:11:34 -05:00
agent_endpoint.go json handles were moved to a new package in #10202 2021-04-02 13:31:10 +00:00
agent_endpoint_test.go
alloc_endpoint.go client: fix multiple imports (#10537) 2021-05-13 14:30:31 -04:00
alloc_endpoint_test.go tests: remove duplicate import statements. 2021-06-11 09:39:22 +02:00
alloc_watcher_e2e_test.go
client.go replace 'a alloc' with 'an alloc' where appropriate (#11792) 2022-01-10 11:59:46 -05:00
client_stats_endpoint.go
client_stats_endpoint_test.go
client_test.go client: defensively log reserved ports 2021-10-04 15:43:35 -07:00
csi_endpoint.go CSI: use AccessMode/AttachmentMode from CSIVolumeClaim 2021-04-07 11:24:09 -04:00
csi_endpoint_test.go CSI ListSnapshots secrets implementation 2021-07-28 11:30:29 -07:00
driver_manager_test.go
enterprise_client_oss.go gofmt all the files 2021-10-01 10:14:28 -04:00
fingerprint_manager.go chore: fixup inconsistent method receiver names. (#11704) 2021-12-20 11:44:21 +01:00
fingerprint_manager_test.go use allow/deny instead of the colored alternatives (#9019) 2020-10-12 08:47:05 -04:00
fs_endpoint.go Fix log streaming missing frames (#11721) 2022-01-04 14:07:16 -05:00
fs_endpoint_test.go Fix log streaming missing frames (#11721) 2022-01-04 14:07:16 -05:00
gc.go chore: fix incorrect docstring formatting. 2021-08-30 11:08:12 +02:00
gc_test.go Events/msgtype cleanup (#9117) 2020-10-19 09:30:15 -04:00
heartbeatstop.go
heartbeatstop_test.go
node_updater.go
rpc.go pool: track usage of incoming streams (#10710) 2021-06-07 10:22:37 -04:00
rpc_test.go
testing.go test infrastructure for mock client RPCs (#10193) 2021-03-31 16:37:09 -04:00
util.go
util_test.go