Commit graph

17425 commits

Author SHA1 Message Date
Danielle Lancashire e75f057df3 csi: Fix Controller RPCs
Currently the handling of CSINode RPCs does not correctly handle
forwarding RPCs to Nodes.

This commit fixes this by introducing a shim RPC
(nomad/client_csi_enpdoint) that will correctly forward the request to
the owning node, or submit the RPC to the client.

In the process it also cleans up handling a little bit by adding the
`CSIControllerQuery` embeded struct for required forwarding state.

The CSIControllerQuery embeding the requirement of a `PluginID` also
means we could move node targetting into the shim RPC if wanted in the
future.
2020-03-23 13:58:30 -04:00
Danielle Lancashire d5e255f97a client: Rename ClientCSI -> CSIController 2020-03-23 13:58:30 -04:00
Danielle Lancashire 5b05baf9f6 csi: Add /dev mounts to CSI Plugins
CSI Plugins that manage devices need not just access to the CSI
directory, but also to manage devices inside `/dev`.

This commit introduces a `/dev:/dev` mount to the container so that they
may do so.
2020-03-23 13:58:30 -04:00
Tim Gross 8bc5641438 csi: volume claim garbage collection (#7125)
When an alloc is marked terminal (and after node unstage/unpublish
have been called), the client syncs the terminal alloc state with the
server via `Node.UpdateAlloc RPC`.

For each job that has a terminal alloc, the `Node.UpdateAlloc` RPC
handler at the server will emit an eval for a new core job to garbage
collect CSI volume claims. When this eval is handled on the core
scheduler, it will call a `volumeReap` method to release the claims
for all terminal allocs on the job.

The volume reap will issue a `ControllerUnpublishVolume` RPC for any
node that has no alloc claiming the volume. Once this returns (or
is skipped), the volume reap will send a new `CSIVolume.Claim` RPC
that releases the volume claim for that allocation in the state store,
making it available for scheduling again.

This same `volumeReap` method will be called from the core job GC,
which gives us a second chance to reclaim volumes during GC if there
were controller RPC failures.
2020-03-23 13:58:30 -04:00
Danielle Lancashire 6fc7f7779d csimanager/volume: Update MountVolume docstring 2020-03-23 13:58:30 -04:00
Danielle Lancashire cd5b4923d0 api: Register CSIPlugin before registering a Volume 2020-03-23 13:58:30 -04:00
Danielle Lancashire 1b70fb1398 hook resources: Init with empty resources during setup 2020-03-23 13:58:30 -04:00
Danielle Lancashire 511b7775a6 csi: Claim CSI Volumes during csi_hook.Prerun
This commit is the initial implementation of claiming volumes from the
server and passes through any publishContext information as appropriate.

There's nothing too fancy here.
2020-03-23 13:58:30 -04:00
Danielle Lancashire 9d4307a3ef csi_endpoint: Provide AllocID in req, and return Volume
Currently, the client has to ship an entire allocation to the server as
part of performing a VolumeClaim, this has a few problems:

Firstly, it means the client is sending significantly more data than is
required (an allocation contains the entire contents of a Nomad job,
alongside other irrelevant state) which has a non-zero (de)serialization
cost.

Secondly, because the allocation was never re-fetched from the state
store, it means that we were potentially open to issues caused by stale
state on a misbehaving or malicious client.

The change removes both of those issues at the cost of a couple of more
state store lookups, but they should be relatively cheap.

We also now provide the CSIVolume in the response for a claim, so the
client can perform a Claim without first going ahead and fetching all of
the volumes.
2020-03-23 13:58:30 -04:00
Danielle Lancashire f79351915c csi: Basic volume usage tracking 2020-03-23 13:58:30 -04:00
Danielle Lancashire 0203341033 csi: Add comment to UsageOptions.ToFS() 2020-03-23 13:58:30 -04:00
Danielle Lancashire c3b1154703 csi: Validate Volumes during registration
This PR implements some intitial support for doing deeper validation of
a volume during its registration with the server. This allows us to
validate the capabilities before users attempt to use the volumes during
most cases, and also prevents registering volumes without first setting
up a plugin, which should help to catch typos and the like during
registration.

This does have the downside of requiring users to wait for (1) instance
of a plugin to be running in their cluster before they can register
volumes.
2020-03-23 13:58:30 -04:00
Danielle Lancashire 9f1a076bd5 client: Implement ClientCSI.ControllerValidateVolume 2020-03-23 13:58:30 -04:00
Danielle Lancashire 34acb596e3 plugins/csi: Implement ConvtrollerValidateCapabilities RPC 2020-03-23 13:58:30 -04:00
Danielle Lancashire 6b7ee96a88 csi: Move VolumeCapabilties helper to package 2020-03-23 13:58:30 -04:00
Danielle Lancashire e227f31584 sched/feasible: Return more detailed CSI Failure messages 2020-03-23 13:58:30 -04:00
Danielle Lancashire da4f6b60a2 csi: Pass through usage options to the csimanager
The CSI Spec requires us to attach and stage volumes based on different
types of usage information when it may effect how they are bound. Here
we pass through some basic usage options in the CSI Hook (specifically
the volume aliases ReadOnly field), and the attachment/access mode from
the volume. We pass the attachment/access mode seperately from the
volume as it simplifies some handling and doesn't necessarily force
every attachment to use the same mode should more be supported (I.e if
we let each `volume "foo" {}` specify an override in the future).
2020-03-23 13:58:30 -04:00
Danielle Lancashire a62a90e03c csi: Unpublish volumes during ar.Postrun
This commit introduces initial support for unmounting csi volumes.

It takes a relatively simplistic approach to performing
NodeUnpublishVolume calls, optimising for cleaning up any leftover state
rather than terminating early in the case of errors.

This is because it happens during an allocation's shutdown flow and may
not always have a corresponding call to `NodePublishVolume` that
succeeded.
2020-03-23 13:58:30 -04:00
Danielle Lancashire 6762442199 csiclient: Add grpc.CallOption support to NodeUnpublishVolume 2020-03-23 13:58:30 -04:00
Danielle Lancashire 6665bdec2e taskrunner/volume_hook: Cleanup arg order of prepareHostVolumes 2020-03-23 13:58:30 -04:00
Danielle Lancashire 8692ca86bb taskrunner/volume_hook: Mounts for CSI Volumes
This commit implements support for creating driver mounts for CSI
Volumes.

It works by fetching the created mounts from the allocation resources
and then iterates through the volume requests, creating driver mount
configs as required.

It's a little bit messy primarily because there's _so_ much terminology
overlap and it's a bit difficult to follow.
2020-03-23 13:58:30 -04:00
Danielle Lancashire 7a33864edf volume_hook: Loosen validation in host volume prep 2020-03-23 13:58:30 -04:00
Danielle Lancashire d8334cf884 allocrunner: Push state from hooks to taskrunners
This commit is an initial (read: janky) approach to forwarding state
from an allocrunner hook to a taskrunner using a similar `hookResources`
approach that tr's use internally.

It should eventually probably be replaced with something a little bit
more message based, but for things that only come from pre-run hooks,
and don't change, it's probably fine for now.
2020-03-23 13:58:30 -04:00
Danielle Lancashire 3ef41fbb86 csi_hook: Stage/Mount volumes as required
This commit introduces the first stage of volume mounting for an
allocation. The csimanager.VolumeMounter interface manages the blocking
and actual minutia of the CSI implementation allowing this hook to do
the minimal work of volume retrieval and creating mount info.

In the future the `CSIVolume.Get` request should be replaced by
`CSIVolume.Claim(Batch?)` to minimize the number of RPCs and to handle
external triggering of a ControllerPublishVolume request as required.

We also need to ensure that if pre-run hooks fail, we still get a full
unwinding of any publish and staged volumes to ensure that there are no hanging
references to volumes. That is not handled in this commit.
2020-03-23 13:58:30 -04:00
Danielle Lancashire 4a2492ecb1 client: Pass an RPC Client to AllocRunners
As part of introducing support for CSI, AllocRunner hooks need to be
able to communicate with Nomad Servers for validation of and interaction
with storage volumes. Here we create a small RPCer interface and pass
the client (rpc client) to the AR in preparation for making these RPCs.
2020-03-23 13:58:30 -04:00
Tim Gross b03b78b212 csi: server-to-controller publish/unpublish RPCs (#7124)
Nomad servers need to make requests to CSI controller plugins running
on a client for publish/unpublish. The RPC needs to look up the client
node based on the plugin, load balancing across controllers, and then
perform the required client RPC to that node (via server forwarding if
neccessary).
2020-03-23 13:58:30 -04:00
Tim Gross b9b315f8d1 csi: stub methods for server-to-controller RPC calls (#7117) 2020-03-23 13:58:30 -04:00
Danielle Lancashire 77bcaa8183 csi_endpoint: Support No ACLs and restrict Nodes
This commit refactors the ACL code for the CSI endpoint to support
environments that run without acls enabled (e.g developer environments)
and also provides an easy way to restrict which endpoints may be
accessed with a client's SecretID to limit the blast radius of a
malicious client on the state of the environment.
2020-03-23 13:58:30 -04:00
Danielle Lancashire a2e01c4369 sched/feasible: Validate CSIVolume's correctly
Previously we were looking up plugins based on the Alias Name for a CSI
Volume within the context of its task group.

Here we first look up a volume based on its identifier and then validate
the existence of the plugin based on its `PluginID`.
2020-03-23 13:58:30 -04:00
Danielle Lancashire 22e8317a53 csi: Disable validation of volume topology 2020-03-23 13:58:30 -04:00
Danielle Lancashire 15c6c05ccf api: Parse CSI Volumes
Previously when deserializing volumes we skipped over volumes that were
not of type `host`. This commit ensures that we parse both host and csi
volumes correctly.
2020-03-23 13:58:30 -04:00
Danielle Lancashire e56c677221 sched/feasible: CSI - Filter applicable volumes
This commit filters the jobs volumes when setting them on the
feasibility checker. This ensures that the rest of the checker does not
have to worry about non-csi volumes.
2020-03-23 13:58:30 -04:00
Tim Gross 01c704ab9d csi: add PublishContext to CSIVolumeClaimResponse (#7113)
The `ControllerPublishVolumeResponse` CSI RPC includes the publish
context intended to be passed by the orchestrator as an opaque value
to the node plugins. This changeset adds it to our response to a
volume claim request to proxy the controller's response back to the
client node.
2020-03-23 13:58:29 -04:00
Tim Gross 60901fa764 csi: implement CSI controller detach request/response (#7107)
This changeset implements the minimal structs on the client-side we
need to compile the work-in-progress implementation of the
server-to-controller RPCs. It doesn't include implementing the
`ClientCSI.DettachVolume` RPC on the client.
2020-03-23 13:58:29 -04:00
Danielle Lancashire f77d3813d1 csi: Fix broken call to newVolumeManager 2020-03-23 13:58:29 -04:00
Danielle Lancashire 3bff9fefae csi: Provide plugin-scoped paths during RPCs
When providing paths to plugins, the path needs to be in the scope of
the plugins container, rather than that of the host.

Here we enable that by providing the mount point through the plugin
registration and then use it when constructing request target paths.
2020-03-23 13:58:29 -04:00
Danielle Lancashire 94e87fbe9c csimanager: Cleanup volumemanager setup 2020-03-23 13:58:29 -04:00
Danielle Lancashire ee85c468c0 csimanager: Instantiate fingerprint manager's csiclient 2020-03-23 13:58:29 -04:00
Tim Gross fb1aad66ee csi: implement releasing volume claims for terminal allocs (#7076)
When an alloc is marked terminal, and after node unstage/unpublish
have been called, the client will sync the terminal alloc state with
the server via `Node.UpdateAlloc` RPC.

This changeset implements releasing the volume claim for each volume
associated with the terminal alloc. It doesn't yet implement the RPC
call we need to make to the `ControllerUnpublishVolume` CSI RPC.
2020-03-23 13:58:29 -04:00
Tim Gross d4cd272de3 csi: implement VolumeClaimRPC (#7048)
When the client receives an allocation which includes a CSI volume,
the alloc runner will block its main `Run` loop. The alloc runner will
issue a `VolumeClaim` RPC to the Nomad servers. This changeset
implements the portions of the `VolumeClaim` RPC endpoint that have
not been previously completed.
2020-03-23 13:58:29 -04:00
Lang Martin 421d7ed2e4 nomad: csi_endpoint send register & deregister requests to raft (#7059) 2020-03-23 13:58:29 -04:00
Lang Martin 7b675f89ac csi: fix index maintenance for CSIVolume and CSIPlugin tables (#7049)
* state_store: csi volumes/plugins store the index in the txn

* nomad: csi_endpoint_test require index checks need uint64()

* nomad: other tests using int 0 not uint64(0)

* structs: pass index into New, but not other struct methods

* state_store: csi plugin indexes, use new struct interface

* nomad: csi_endpoint_test check index/query meta (on explicit 0)

* structs: NewCSIVolume takes an index arg now

* scheduler/test: NewCSIVolume takes an index arg now
2020-03-23 13:58:29 -04:00
Danielle Lancashire bbf6a9c14b volume_manager: cleanup of mount detection
No functional changes, but makes ensure.*Dir follow a nicer return
style.
2020-03-23 13:58:29 -04:00
Danielle Lancashire 80b7aa0a31 volume_manager: Add support for publishing volumes 2020-03-23 13:58:29 -04:00
Danielle Lancashire a5c96ce2e1 csi: Add grpc.CallOption support to NodePublishVolume 2020-03-23 13:58:29 -04:00
Lang Martin a0a6766740 CSI: Scheduler knows about CSI constraints and availability (#6995)
* structs: piggyback csi volumes on host volumes for job specs

* state_store: CSIVolumeByID always includes plugins, matches usecase

* scheduler/feasible: csi volume checker

* scheduler/stack: add csi volumes

* contributing: update rpc checklist

* scheduler: add volumes to State interface

* scheduler/feasible: introduce new checker collection tgAvailable

* scheduler/stack: taskGroupCSIVolumes checker is transient

* state_store CSIVolumeDenormalizePlugins comment clarity

* structs: remote TODO comment in TaskGroup Validate

* scheduler/feasible: CSIVolumeChecker hasPlugins improve comment

* scheduler/feasible_test: set t.Parallel

* Update nomad/state/state_store.go

Co-Authored-By: Danielle <dani@hashicorp.com>

* Update scheduler/feasible.go

Co-Authored-By: Danielle <dani@hashicorp.com>

* structs: lift ControllerRequired to each volume

* state_store: store plug.ControllerRequired, use it for volume health

* feasible: csi match fast path remove stale host volume copied logic

* scheduler/feasible: improve comments

Co-authored-by: Danielle <dani@builds.terrible.systems>
2020-03-23 13:58:29 -04:00
Danielle Lancashire e619ae5a42 volume_manager: Initial support for unstaging volumes 2020-03-23 13:58:29 -04:00
Danielle Lancashire add55e37b8 csi: Expose gRPC Options on NodeUnstageVolume 2020-03-23 13:58:29 -04:00
Tim Gross 8673ea5cba csi: add empty CSI volume publication GC to scheduled core jobs (#7014)
This changeset adds a new core job `CoreJobCSIVolumePublicationGC` to
the leader's loop for scheduling core job evals. Right now this is an
empty method body without even a config file stanza. Later changesets
will implement the logic of volume publication GC.
2020-03-23 13:58:29 -04:00
Danielle Lancashire 6e71baa77d volume_manager: NodeStageVolume Support
This commit introduces support for staging volumes when a plugin
implements the STAGE_UNSTAGE_VOLUME capability.

See the following for further reference material:
 4731db0e0b/spec.md (nodestagevolume)
2020-03-23 13:58:29 -04:00