open-nomad

Tim Gross 66b4b28b1a CSI: node unmount from the client before unpublish RPC (#11892)	2022-01-28 08:30:31 -05:00

When an allocation stops, the `csi_hook` makes an unpublish RPC to the servers to unpublish via the CSI RPCs: first to the node plugins and then the controller plugins. The controller RPCs must happen after the node RPCs so that the node has had a chance to unmount the volume before the controller tries to detach the associated device.

But the client has local access to the node plugins and can independently determine if it's safe to send unpublish RPC to those plugins. This will allow the server to treat the node plugin as abandoned if a client is disconnected and `stop_on_client_disconnect` is set. This will let the server try to send unpublish RPCs to the controller plugins, under the assumption that the client will be trying to unmount the volume on its end first.

Note that the CSI `NodeUnpublishVolume`/`NodeUnstageVolume` RPCs can return ignorable errors in the case where the volume has already been unmounted from the node. Handle all other errors by retrying until we get success so as to give operators the opportunity to reschedule a failed node plugin (ex. in the case where they accidentally drained a node without `-ignore-system`). Fan-out the work for each volume into its own goroutine so that we can release a subset of volumes if only one is stuck.
csimanager	CSI: node unmount from the client before unpublish RPC (#11892)	2022-01-28 08:30:31 -05:00
drivermanager	Log error if there are no event handlers registered	2021-10-11 19:44:52 +00:00
group.go	chore: fix incorrect docstring formatting.	2021-08-30 11:08:12 +02:00
group_test.go	pluginmanager: WaitForFirstFingerprint times out (#9597)	2020-12-10 07:27:15 -08:00
manager.go	client: batch initial fingerprinting in plugin managers	2018-12-18 22:56:19 -05:00
testing.go	pluginmanager: WaitForFirstFingerprint times out (#9597)	2020-12-10 07:27:15 -08:00