* acl/policy: add the volume ACL policies
* nomad/csi_endpoint: enforce ACLs for volume access
* nomad/search_endpoint_oss: volume acls
* acl/acl: add plugin read as a global policy
* acl/policy: add PluginPolicy global cap type
* nomad/csi_endpoint: check the global plugin ACL policy
* nomad/mock/acl: PluginPolicy
* nomad/csi_endpoint: fix list rebase
* nomad/core_sched_test: new test since #7358
* nomad/csi_endpoint_test: use correct permissions for list
* nomad/csi_endpoint: allowCSIMount keeps ACL checks together
* nomad/job_endpoint: check mount permission for jobs
* nomad/job_endpoint_test: need plugin read, too
* nomad/state/schema: use the namespace compound index
* scheduler/scheduler: CSIVolumeByID interface signature namespace
* scheduler/stack: SetJob on CSIVolumeChecker to capture namespace
* scheduler/feasible: pass the captured namespace to CSIVolumeByID
* nomad/state/state_store: use namespace in csi_volume index
* nomad/fsm: pass namespace to CSIVolumeDeregister & Claim
* nomad/core_sched: pass the namespace in volumeClaimReap
* nomad/node_endpoint_test: namespaces in Claim testing
* nomad/csi_endpoint: pass RequestNamespace to state.*
* nomad/csi_endpoint_test: appropriately failed test
* command/alloc_status_test: appropriately failed test
* node_endpoint_test: avoid notTheNamespace for the job
* scheduler/feasible_test: call SetJob to capture the namespace
* nomad/csi_endpoint: ACL check the req namespace, query by namespace
* nomad/state/state_store: remove deregister namespace check
* nomad/state/state_store: remove unused CSIVolumes
* scheduler/feasible: CSIVolumeChecker SetJob -> SetNamespace
* nomad/csi_endpoint: ACL check
* nomad/state/state_store_test: remove call to state.CSIVolumes
* nomad/core_sched_test: job namespace match so claim gc works
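The mount checks listed above (namespace mount permission plus plugin read) can be pictured roughly as follows. This is only a sketch: the `token` type and its fields are hypothetical, and the real `allowCSIMount` helper works against Nomad's ACL objects rather than a flat struct.

```go
package main

import "fmt"

// token is a hypothetical, flattened view of what an ACL token grants.
// It is not Nomad's ACL type.
type token struct {
	PluginRead      bool            // global plugin read policy
	MountNamespaces map[string]bool // namespaces where mounting volumes is allowed
}

// allowCSIMount keeps both checks in one place: the caller needs the mount
// permission in the request namespace and plugin read as well.
func allowCSIMount(t token, namespace string) bool {
	return t.PluginRead && t.MountNamespaces[namespace]
}

func main() {
	t := token{PluginRead: true, MountNamespaces: map[string]bool{"default": true}}
	fmt.Println(allowCSIMount(t, "default"))         // allowed
	fmt.Println(allowCSIMount(t, "notTheNamespace")) // denied: wrong namespace
}
```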
This changeset implements the remaining controller detach RPCs: server-to-client and client-to-controller. The tests also uncovered a bug in our RPC for claims, which is fixed here. The volume claim RPC is used for both claiming and releasing a claim on a volume, so we should only submit a controller publish RPC when the claim is new, not when it is being released.
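The claim fix described above amounts to a guard on the claim mode. A minimal sketch, assuming a hypothetical `ClaimMode` type with a release value (the names here are illustrative and are not taken from the Nomad source):

```go
package main

import "fmt"

// ClaimMode is a hypothetical stand-in for the mode carried by a volume
// claim request.
type ClaimMode int

const (
	ClaimRead ClaimMode = iota
	ClaimWrite
	ClaimRelease
)

// needsControllerPublish reports whether a claim request should trigger a
// controller publish RPC. New claims (read or write) publish; a release
// must not.
func needsControllerPublish(mode ClaimMode) bool {
	return mode != ClaimRelease
}

func main() {
	for _, m := range []ClaimMode{ClaimRead, ClaimWrite, ClaimRelease} {
		fmt.Printf("mode=%d publish=%v\n", m, needsControllerPublish(m))
	}
}
```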
When an alloc is marked terminal (and after node unstage/unpublish
have been called), the client syncs the terminal alloc state with the
server via the `Node.UpdateAlloc` RPC.
For each job that has a terminal alloc, the `Node.UpdateAlloc` RPC
handler at the server will emit an eval for a new core job to garbage
collect CSI volume claims. When this eval is handled on the core
scheduler, it will call a `volumeReap` method to release the claims
for all terminal allocs on the job.
The volume reap will issue a `ControllerUnpublishVolume` RPC for any
node that has no alloc claiming the volume. Once this returns (or
is skipped), the volume reap will send a new `CSIVolume.Claim` RPC
that releases the volume claim for that allocation in the state store,
making it available for scheduling again.
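The node selection in the volume reap can be sketched as below: a node only needs a `ControllerUnpublishVolume` call once no live alloc on it still claims the volume. The `claim` type and `nodesToUnpublish` helper are hypothetical and only illustrate the rule; the actual implementation reads this from the state store.

```go
package main

import (
	"fmt"
	"sort"
)

// claim is a hypothetical record of one allocation's claim on a volume.
type claim struct {
	AllocID  string
	NodeID   string
	Terminal bool // true once the alloc has reached a terminal state
}

// nodesToUnpublish returns, in sorted order, the nodes that hold at least
// one terminal claim and no live claim. Nodes that still have a live claim
// are skipped.
func nodesToUnpublish(claims []claim) []string {
	live := map[string]bool{}     // nodes with a non-terminal claim
	terminal := map[string]bool{} // nodes with a terminal claim
	for _, c := range claims {
		if c.Terminal {
			terminal[c.NodeID] = true
		} else {
			live[c.NodeID] = true
		}
	}
	var out []string
	for node := range terminal {
		if !live[node] {
			out = append(out, node)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	claims := []claim{
		{AllocID: "a1", NodeID: "node1", Terminal: true},
		{AllocID: "a2", NodeID: "node1", Terminal: false}, // node1 still in use
		{AllocID: "a3", NodeID: "node2", Terminal: true},
	}
	fmt.Println(nodesToUnpublish(claims)) // prints [node2]
}
```

In this sketch the claim for `a1` would still be released in the state store afterwards; only the controller unpublish for `node1` is skipped.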
This same `volumeReap` method will be called from the core job GC,
which gives us a second chance to reclaim volumes during GC if there
were controller RPC failures.
This changeset adds a new core job `CoreJobCSIVolumePublicationGC` to
the leader's loop for scheduling core job evals. Right now this is an
empty method body without even a config file stanza. Later changesets
will implement the logic of volume publication GC.
Copy the updated version of freeport (sdk/freeport), and tweak it for use
in Nomad tests. This means staying below port 10000 to avoid conflicts with
the lib/freeport that is still transitively used by the old version of
Consul that we vendor. Also provide implementations to find ephemeral ports
on macOS and Windows environments.
Ports acquired through freeport are supposed to be returned to freeport,
which this change now also introduces. Many tests are modified to include
calls to a cleanup function for Server objects.
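The take-and-return pattern described above can be illustrated with a toy allocator. This is not the freeport package: `portPool`, `newPortPool`, and `take` are hypothetical names, and the sketch is not safe for concurrent use.

```go
package main

import (
	"errors"
	"fmt"
)

// portPool is a toy allocator used only to illustrate the pattern.
type portPool struct {
	free []int
}

// newPortPool builds a pool of count consecutive ports starting at first.
func newPortPool(first, count int) *portPool {
	p := &portPool{}
	for i := 0; i < count; i++ {
		p.free = append(p.free, first+i)
	}
	return p
}

// take removes n ports from the pool and returns them together with a
// cleanup function that gives them back. Calling cleanup twice is harmless.
func (p *portPool) take(n int) ([]int, func(), error) {
	if n > len(p.free) {
		return nil, nil, errors.New("not enough free ports")
	}
	ports := append([]int(nil), p.free[:n]...) // copy so the caller owns the slice
	p.free = p.free[n:]
	returned := false
	cleanup := func() {
		if returned {
			return
		}
		returned = true
		p.free = append(p.free, ports...)
	}
	return ports, cleanup, nil
}

func main() {
	pool := newPortPool(9000, 4) // stays below port 10000
	ports, cleanup, err := pool.take(3)
	if err != nil {
		panic(err)
	}
	defer cleanup() // a test would defer this next to its server setup
	fmt.Println("using ports", ports, "free:", len(pool.free))
}
```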
This should help quite a bit with some flaky tests, but not all of them.
Our port problems will not go away completely until we upgrade our vendored
version of Consul. With Go modules, we'll probably do a 'replace' to swap
out other copies of freeport with the one now in 'nomad/helper/freeport'.
This PR fixes an edge case where we could GC an allocation that was in a
desired stop state but had not terminated yet. This can be hit if the
client hasn't shut down the allocation yet or if the allocation is still
shutting down (for example, with a long `kill_timeout`).
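The edge case can be shown by comparing two eligibility checks. This is a sketch under assumptions: the `alloc` struct and both functions are hypothetical simplifications, and the status strings are used only for illustration rather than quoted from the Nomad GC code.

```go
package main

import "fmt"

// alloc is a hypothetical, trimmed-down view of an allocation.
type alloc struct {
	DesiredStatus string // what the server wants, e.g. "run" or "stop"
	ClientStatus  string // what the client reports, e.g. "running" or "complete"
}

// clientTerminal reports whether the client has finished with the alloc.
func clientTerminal(a alloc) bool {
	switch a.ClientStatus {
	case "complete", "failed", "lost":
		return true
	}
	return false
}

// gcEligibleNaive treats a desired stop as enough, which is the edge case:
// the alloc may still be running or shutting down on the client.
func gcEligibleNaive(a alloc) bool {
	return a.DesiredStatus == "stop" || clientTerminal(a)
}

// gcEligible also requires the client to have reported a terminal state.
func gcEligible(a alloc) bool {
	return clientTerminal(a)
}

func main() {
	stopping := alloc{DesiredStatus: "stop", ClientStatus: "running"}
	fmt.Println("naive:", gcEligibleNaive(stopping), "fixed:", gcEligible(stopping))
}
```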
Fixes https://github.com/hashicorp/nomad/issues/4940