open-nomad

Tim Gross a6652bffad CSI: reorder controller volume detachment (#12387)	2022-03-29 09:44:00 -04:00

In #12112 and #12113 we solved for the problem of races in releasing volume claims, but there was a case that we missed. During a node drain with a controller attach/detach, we can hit a race where we call controller publish before the unpublish has completed. This is discouraged in the spec but plugins are supposed to handle it safely. But if the storage provider's API is slow enough and the plugin doesn't handle the case safely, the volume can get "locked" into a state where the provider's API won't detach it cleanly.

Check the claim before making any external controller publish RPC calls so that Nomad is responsible for the canonical information about whether a volume is currently claimed.

This has a couple side-effects that also had to get fixed here:

* Changing the order means that the volume will have a past claim without a valid external node ID because it came from the client, and this uncovered a separate bug where we didn't assert the external node ID was valid before returning it. Fallthrough to getting the ID from the plugins in the state store in this case. We avoided this originally because of concerns around plugins getting lost during node drain but now that we've fixed that we may want to revisit it in future work.
* We should make sure we're handling `FailedPrecondition` cases from the controller plugin the same way we handle other retryable cases.
* Several tests had to be updated because they were assuming we fail in a particular order that we're no longer doing.
allocdir	ci: swap ci parallelization for unconstrained gomaxprocs	2022-03-15 12:58:52 -05:00
allochealth	Merge branch 'main' into f-1.3-boogie-nights	2022-03-23 09:41:25 +01:00
allocrunner	CSI: reorder controller volume detachment (#12387)	2022-03-29 09:44:00 -04:00
allocwatcher	ci: swap ci parallelization for unconstrained gomaxprocs	2022-03-15 12:58:52 -05:00
config	Merge branch 'main' into f-1.3-boogie-nights	2022-03-25 16:40:32 +01:00
consul	Merge branch 'main' into f-1.3-boogie-nights	2022-03-23 09:41:25 +01:00
devicemanager	ci: swap ci parallelization for unconstrained gomaxprocs	2022-03-15 12:58:52 -05:00
dynamicplugins	ci: swap ci parallelization for unconstrained gomaxprocs	2022-03-15 12:58:52 -05:00
fingerprint	Merge pull request #12368 from hashicorp/f-1.3-boogie-nights	2022-03-25 18:04:47 +01:00
interfaces	replace 'a alloc' with 'an alloc' where appropriate (#11792)	2022-01-10 11:59:46 -05:00
lib	client: cgroups v2 code review followup	2022-03-24 13:40:42 -05:00
logmon	ci: swap ci parallelization for unconstrained gomaxprocs	2022-03-15 12:58:52 -05:00
pluginmanager	ci: swap ci parallelization for unconstrained gomaxprocs	2022-03-15 12:58:52 -05:00
servers	ci: swap ci parallelization for unconstrained gomaxprocs	2022-03-15 12:58:52 -05:00
serviceregistration	client: modify service wrapper to accomodate restore behaviour.	2022-03-21 09:49:39 +01:00
state	Merge branch 'main' into f-1.3-boogie-nights	2022-03-23 09:41:25 +01:00
stats	ci: swap ci parallelization for unconstrained gomaxprocs	2022-03-15 12:58:52 -05:00
structs	ci: swap ci parallelization for unconstrained gomaxprocs	2022-03-15 12:58:52 -05:00
taskenv	ci: swap ci parallelization for unconstrained gomaxprocs	2022-03-15 12:58:52 -05:00
testutil	client: cgroups v2 code review followup	2022-03-24 13:40:42 -05:00
vaultclient	ci: swap ci parallelization for unconstrained gomaxprocs	2022-03-15 12:58:52 -05:00
acl.go	Audit config, seams for enterprise audit features	2020-03-23 13:47:42 -04:00
acl_test.go	ci: swap ci parallelization for unconstrained gomaxprocs	2022-03-15 12:58:52 -05:00
agent_endpoint.go	json handles were moved to a new package in #10202	2021-04-02 13:31:10 +00:00
agent_endpoint_test.go	ci: swap ci parallelization for unconstrained gomaxprocs	2022-03-15 12:58:52 -05:00
alloc_endpoint.go	client: fix multiple imports (#10537)	2021-05-13 14:30:31 -04:00
alloc_endpoint_test.go	client: enable support for cgroups v2	2022-03-23 11:35:27 -05:00
alloc_watcher_e2e_test.go	ci: swap ci parallelization for unconstrained gomaxprocs	2022-03-15 12:58:52 -05:00
client.go	Merge branch 'main' into f-1.3-boogie-nights	2022-03-25 16:40:32 +01:00
client_stats_endpoint.go	Server side impl + touch ups	2018-02-15 13:59:02 -08:00
client_stats_endpoint_test.go	ci: swap ci parallelization for unconstrained gomaxprocs	2022-03-15 12:58:52 -05:00
client_test.go	client: enable support for cgroups v2	2022-03-23 11:35:27 -05:00
csi_endpoint.go	CSI: allow updates to volumes on re-registration (#12167)	2022-03-07 11:06:59 -05:00
csi_endpoint_test.go	ci: swap ci parallelization for unconstrained gomaxprocs	2022-03-15 12:58:52 -05:00
driver_manager_test.go	ci: swap ci parallelization for unconstrained gomaxprocs	2022-03-15 12:58:52 -05:00
enterprise_client_oss.go	gofmt all the files	2021-10-01 10:14:28 -04:00
fingerprint_manager.go	chore: fixup inconsistent method receiver names. (#11704)	2021-12-20 11:44:21 +01:00
fingerprint_manager_test.go	ci: swap ci parallelization for unconstrained gomaxprocs	2022-03-15 12:58:52 -05:00
fs_endpoint.go	Fix log streaming missing frames (#11721)	2022-01-04 14:07:16 -05:00
fs_endpoint_test.go	ci: swap ci parallelization for unconstrained gomaxprocs	2022-03-15 12:58:52 -05:00
gc.go	chore: fix incorrect docstring formatting.	2021-08-30 11:08:12 +02:00
gc_test.go	ci: swap ci parallelization for unconstrained gomaxprocs	2022-03-15 12:58:52 -05:00
heartbeatstop.go	Delayed evaluations for `stop_after_client_disconnect` can cause unwanted extra followup evaluations around job garbage collection (#8099)	2020-06-03 09:48:38 -04:00
heartbeatstop_test.go	ci: swap ci parallelization for unconstrained gomaxprocs	2022-03-15 12:58:52 -05:00
node_updater.go	client: use NewNodeEvent builder for consistency (#7559)	2020-03-31 10:02:16 -04:00
rpc.go	core: remove all traces of unused protocol version	2022-02-18 16:12:36 -08:00
rpc_test.go	ci: swap ci parallelization for unconstrained gomaxprocs	2022-03-15 12:58:52 -05:00
testing.go	client: refactor common service registration objects from Consul.	2022-03-15 09:38:30 +01:00
util.go	Revert "client: defensive against getting stale alloc updates"	2020-06-19 15:39:44 -04:00