open-nomad

Author	SHA1	Message	Date
Tim Gross	d2e479505c	CSI: capability check ListVolumes at RPC for nicer error messages The plugin stub object does not include fine-grained capability checks, which means `nomad volume status -verbose` will return ugly and verbose error "Unimplemented" messages from the plugin if it does not support the CSI `ListVolumes` RPC. Return a nicer error message from our RPC handler instead.	2021-04-07 12:00:22 -04:00
Tim Gross	276633673d	CSI: use AccessMode/AttachmentMode from CSIVolumeClaim Registration of Nomad volumes previously allowed for a single volume capability (access mode + attachment mode pair). The recent `volume create` command requires that we pass a list of requested capabilities, but the existing workflow for claiming volumes and attaching them on the client assumed that the volume's single capability was correct and unchanging. Add `AccessMode` and `AttachmentMode` to `CSIVolumeClaim`, use these fields to set the initial claim value, and add backwards compatibility logic to handle the existing volumes that already have claims without these fields.	2021-04-07 11:24:09 -04:00
Tim Gross	466b620fa4	CSI: volume snapshot	2021-04-01 11:16:52 -04:00
Tim Gross	9fc4cf1419	CSI: fingerprint detailed controller capabilities In order to support new controller RPCs, we need to fingerprint volume capabilities in more detail and perform controller RPCs only when the specific capability is present. This fixes a bug in Ceph support where the plugin can only support create/delete but we assume that it also supports attach/detach.	2021-03-31 16:37:09 -04:00
Tim Gross	aec5337862	CSI: HTTP handlers for create/delete/list	2021-03-31 16:37:09 -04:00
Tim Gross	d38008176e	CSI: create/delete/list volume RPCs This commit implements the RPC handlers on the client that talk to the CSI plugins on that client for the Create/Delete/List RPC.	2021-03-31 16:37:09 -04:00
Tim Gross	9b2b580d1a	CSI: remove prefix matching from CSIVolumeByID and fix CLI prefix matching (#10158 ) Callers of `CSIVolumeByID` generally assume they will receive a single volume. This potentially results in feasibility checking being performed against the wrong volume if a volume's ID is a prefix substring of another volume's ID (for example: "test" and "testing"). Removing the incorrect prefix matching from `CSIVolumeByID` breaks prefix matching in the command line client. Add the required elements for prefix matching to the commands and API.	2021-03-18 14:32:40 -04:00
Kris Hicks	93155ba3da	Add gocritic to golangci-lint config (#9556 )	2020-12-08 12:47:04 -08:00
Tim Gross	b2cd0da0a2	CSI: fix transaction handling in state store (#9438 ) When making updates to CSI plugins, the state store methods that have open write transactions were querying the state store using the same methods used by the CSI RPC endpoint, but these methods create their own top-level read transactions. During concurrent plugin updates (as happens when a plugin job is stopped), this can cause write skew in the plugin counts. * Refactor the CSIPlugin query methods to have an implementation method that accepts a transaction, which can be called with either a read txn or a write txn. * Refactor the CSIVolume query methods to have an implementation method that accepts a transaction, which can be called with either a read txn or a write txn. * CSI volumes need to be "denormalized" with their plugins and (optionally) allocations. Read-only RPC endpoints should take a snapshot so that we can make multiple state store method calls with a consistent view.	2020-11-25 11:15:57 -05:00
Tim Gross	60874ebe25	csi: Postrun hook should not change mode (#9323 ) The unpublish workflow requires that we know the mode (RW vs RO) if we want to unpublish the node. Update the hook and the Unpublish RPC so that we mark the claim for release in a new state but leave the mode alone. This fixes a bug where RO claims were failing node unpublish. The core job GC doesn't know the mode, but we don't need it for that workflow, so add a mode specifically for GC; the volumewatcher uses this as a sentinel to check whether claims (with their specific RW vs RO modes) need to be claimed.	2020-11-11 13:06:30 -05:00
Tim Gross	ec1e75d9f4	csi: remove stray TODO comment This item was completed in #8626	2020-10-09 11:03:51 -04:00
Tim Gross	443fdaa86b	csi: nomad volume detach command (#8584 ) The soundness guarantees of the CSI specification leave a little to be desired in our ability to provide a 100% reliable automated solution for managing volumes. This changeset provides a new command to bridge this gap by providing the operator the ability to intervene. The command doesn't take an allocation ID so that the operator doesn't have to keep track of alloc IDs that may have been GC'd. Handle this case in the unpublish RPC by sending the client RPC for all the terminal/nil allocs on the selected node.	2020-08-11 10:18:54 -04:00
Tim Gross	7d53ed88d6	csi: client RPCs should return wrapped errors for checking (#8605 ) When the client-side actions of a CSI client RPC succeed but we get disconnected during the RPC or we fail to checkpoint the claim state, we want to be able to retry the client RPC without getting blocked by the client-side state (ex. mount points) already having been cleaned up in previous calls.	2020-08-07 11:01:36 -04:00
Tim Gross	81b604fa13	csi: controller unpublish should check current alloc count (#8604 ) Using the count of node claims from earlier in the `CSIVolume.Unpublish` RPC doesn't correctly account for cases where the RPC was interrupted but checkpointed. Instead, we'll check the current allocation count and status to determine whether we need to send a controller unpublish.	2020-08-07 10:43:45 -04:00
Tim Gross	eaa14ab64c	csi: add unpublish RPC (#8572 ) This changeset is plumbing for a `nomad volume detach` command that will be reused by the volumewatcher claim GC as well.	2020-08-06 13:51:29 -04:00
Tim Gross	3902709c0a	csi: check for empty arguments on CSI endpoint (#8027 ) Some of the CSI RPC endpoints were missing validation that the ID or the Volume definition was present. This could result in nonsense `CSIVolume` structs being written to raft during registration. This changeset corrects that bug and adds validation checks to present nicer error messages to operators in some other cases.	2020-05-20 10:22:24 -04:00
Tim Gross	2082cf738a	csi: support for VolumeContext and VolumeParameters (#7957 ) The MVP for CSI in the 0.11.0 release of Nomad did not include support for opaque volume parameters or volume context. This changeset adds support for both. This also moves args for ControllerValidateCapabilities into a struct. The CSI plugin `ControllerValidateCapabilities` struct that we turn into a CSI RPC is accumulating arguments, so moving it into a request struct will reduce the churn of this internal API, make the plugin code more readable, and make this method consistent with the other plugin methods in that package.	2020-05-15 08:16:01 -04:00
Tim Gross	4374c1a837	csi: support Secrets parameter in CSI RPCs (#7923 ) CSI plugins can require credentials for some publishing and unpublishing workflow RPCs. Secrets are configured at the time of volume registration, stored in the volume struct, and then passed around as an opaque map by Nomad to the plugins.	2020-05-11 17:12:51 -04:00
Tim Gross	801ebcfe8d	periodic GC for CSI plugins (#7878 ) This changeset implements a periodic garbage collection of unused CSI plugins. Plugins are self-cleaning when the last allocation for a plugin is stopped, but this feature will cover any missing edge cases and ensure that upgrades from 0.11.0 and 0.11.1 get any stray plugins cleaned up.	2020-05-06 16:49:12 -04:00
Tim Gross	e34f099d20	csi: read-repair CSI volume claims (#7824 ) The `CSIVolumeClaim` fields were added after 0.11.1, so claims made before that may be missing the value. Repair this when we read the volume out of the state store. The `NodeID` field was added after 0.11.0, so we need to ensure it's been populated during upgrades from 0.11.0.	2020-04-29 11:57:19 -04:00
Tim Gross	083b35d651	csi: checkpoint volume claim garbage collection (#7782 ) Adds a `CSIVolumeClaim` type to be tracked as current and past claims on a volume. Allows for a client RPC failure during node or controller detachment without having to keep the allocation around after the first garbage collection eval. This changeset lays groundwork for moving the actual detachment RPCs into a volume watching loop outside the GC eval.	2020-04-23 11:06:23 -04:00
Tim Gross	4e9bd1e1d1	refactor: consolidate private methods for CSI RPC (#7702 ) Follow-up for a method missed in the refactor for #7688. The `volAndPluginLookup` method is only ever called from the server's `CSI` RPC and never the `ClientCSI` RPC, so move it into that scope.	2020-04-13 10:46:43 -04:00
Tim Gross	f37e986b1b	refactor: make nodeForControllerPlugin private to ClientCSI (#7688 ) The current design of `ClientCSI` RPC requires that callers in the server know about the free-standing `nodeForControllerPlugin` function. This makes it difficult to send `ClientCSI` RPC messages from subpackages of `nomad` and adds a bunch of boilerplate to every server-side caller of a controller RPC. This changeset makes it so that the `ClientCSI` RPCs will populate and validate the controller's client node ID if it hasn't been passed by the caller, centralizing the logic of picking and validating controller targets into the `nomad.ClientCSI` struct.	2020-04-10 16:47:21 -04:00
Tim Gross	f6b3d38eb8	CSI: move node unmount to server-driven RPCs (#7596 ) If a volume-claiming alloc stops and the CSI Node plugin that serves that alloc's volumes is missing, there's no way for the allocrunner hook to send the `NodeUnpublish` and `NodeUnstage` RPCs. This changeset addresses this issue with a redesign of the client-side for CSI. Rather than unmounting in the alloc runner hook, the alloc runner hook will simply exit. When the server gets the `Node.UpdateAlloc` for the terminal allocation that had a volume claim, it creates a volume claim GC job. This job will make client RPCs to a new node plugin RPC endpoint, and only once that succeeds, move on to making the client RPCs to the controller plugin. If the node plugin is unavailable, the GC job will fail and be requeued.	2020-04-02 16:04:56 -04:00
Lang Martin	e03c328792	csi: use node MaxVolumes during scheduling (#7565 ) * nomad/state/state_store: CSIVolumesByNodeID ignores namespace * scheduler/scheduler: add CSIVolumesByNodeID to the state interface * scheduler/feasible: check node MaxVolumes * nomad/csi_endpoint: no namespace in CSIVolumesByNodeID anymore * nomad/state/state_store: avoid DenormalizeAllocationSlice * nomad/state/iterator: clean up SliceIterator Next * scheduler/feasible_test: block with MaxVolumes * nomad/state/state_store_test: fix args to CSIVolumesByNodeID	2020-03-31 17:16:47 -04:00
Lang Martin	0847cb513c	csi: volume/plugin list should return an empty array, not nil (#7443 ) * nomad/csi_endpoint: return an empty list, not nil * nomad/csi_endpoint_test: volume list returns non-nil	2020-03-23 21:21:40 -04:00
Lang Martin	e100444740	csi: add mount_options to volumes and volume requests (#7398 ) Add mount_options to both the volume definition on registration and to the volume block in the group where the volume is requested. If both are specified, the options provided in the request replace the options defined in the volume. They get passed to the NodePublishVolume, which causes the node plugin to actually mount the volume on the host. Individual tasks just mount bind into the host mounted volume (unchanged behavior). An operator can mount the same volume with different options by specifying it twice in the group context. closes #7007 * nomad/structs/volumes: add MountOptions to volume request * jobspec/test-fixtures/basic.hcl: add mount_options to volume block * jobspec/parse_test: add expected MountOptions * api/tasks: add mount_options * jobspec/parse_group: use hcl decode not mapstructure, mount_options * client/allocrunner/csi_hook: pass MountOptions through client/allocrunner/csi_hook: add a VolumeMountOptions client/allocrunner/csi_hook: drop Options client/allocrunner/csi_hook: use the structs options * client/pluginmanager/csimanager/interface: UsageOptions.MountOptions * client/pluginmanager/csimanager/volume: pass MountOptions in capabilities * plugins/csi/plugin: remove todo 7007 comment * nomad/structs/csi: MountOptions * api/csi: add options to the api for parsing, match structs * plugins/csi/plugin: move VolumeMountOptions to structs * api/csi: use specific type for mount_options * client/allocrunner/csi_hook: merge MountOptions here * rename CSIOptions to CSIMountOptions * client/allocrunner/csi_hook * client/pluginmanager/csimanager/volume * nomad/structs/csi * plugins/csi/fake/client: add PrevVolumeCapability * plugins/csi/plugin * client/pluginmanager/csimanager/volume_test: remove debugging * client/pluginmanager/csimanager/volume: fix odd merging logic * api: rename CSIOptions -> CSIMountOptions * nomad/csi_endpoint: remove a 7007 comment * command/alloc_status: show mount options in the volume list * nomad/structs/csi: include MountOptions in the volume stub * api/csi: add MountOptions to stub * command/volume_status_csi: clean up csiVolMountOption, add it * command/alloc_status: csiVolMountOption lives in volume_csi_status * command/node_status: display mount flags * nomad/structs/volumes: npe * plugins/csi/plugin: npe in ToCSIRepresentation * jobspec/parse_test: expand volume parse test cases * command/agent/job_endpoint: ApiTgToStructsTG needs MountOptions * command/volume_status_csi: copy paste error * jobspec/test-fixtures/basic: hclfmt * command/volume_status_csi: clean up csiVolMountOption	2020-03-23 13:59:25 -04:00
Lang Martin	6b6ae6c2bd	csi: ACLs for plugin endpoints (#7380 ) * acl/policy: add PolicyList for global ACLs * acl/acl: plugin policy * acl/acl: maxPrivilege is required to allow "list" * nomad/csi_endpoint: enforce plugin access with PolicyPlugin * nomad/csi_endpoint: check job ACL swapped params * nomad/csi_endpoint_test: test alloc filtering * acl/policy: add namespace csi-register-plugin * nomad/job_endpoint: check csi-register-plugin ACL on registration * nomad/job_endpoint_test: add plugin job cases	2020-03-23 13:59:25 -04:00
Lang Martin	b596e67f47	csi: implement volume ACLs (#7339 ) * acl/policy: add the volume ACL policies * nomad/csi_endpoint: enforce ACLs for volume access * nomad/search_endpoint_oss: volume acls * acl/acl: add plugin read as a global policy * acl/policy: add PluginPolicy global cap type * nomad/csi_endpoint: check the global plugin ACL policy * nomad/mock/acl: PluginPolicy * nomad/csi_endpoint: fix list rebase * nomad/core_sched_test: new test since #7358 * nomad/csi_endpoint_test: use correct permissions for list * nomad/csi_endpoint: allowCSIMount keeps ACL checks together * nomad/job_endpoint: check mount permission for jobs * nomad/job_endpoint_test: need plugin read, too	2020-03-23 13:59:25 -04:00
Lang Martin	3621df1dbf	csi: volume ids are only unique per namespace (#7358 ) * nomad/state/schema: use the namespace compound index * scheduler/scheduler: CSIVolumeByID interface signature namespace * scheduler/stack: SetJob on CSIVolumeChecker to capture namespace * scheduler/feasible: pass the captured namespace to CSIVolumeByID * nomad/state/state_store: use namespace in csi_volume index * nomad/fsm: pass namespace to CSIVolumeDeregister & Claim * nomad/core_sched: pass the namespace in volumeClaimReap * nomad/node_endpoint_test: namespaces in Claim testing * nomad/csi_endpoint: pass RequestNamespace to state.* * nomad/csi_endpoint_test: appropriately failed test * command/alloc_status_test: appropriately failed test * node_endpoint_test: avoid notTheNamespace for the job * scheduler/feasible_test: call SetJob to capture the namespace * nomad/csi_endpoint: ACL check the req namespace, query by namespace * nomad/state/state_store: remove deregister namespace check * nomad/state/state_store: remove unused CSIVolumes * scheduler/feasible: CSIVolumeChecker SetJob -> SetNamespace * nomad/csi_endpoint: ACL check * nomad/state/state_store_test: remove call to state.CSIVolumes * nomad/core_sched_test: job namespace match so claim gc works	2020-03-23 13:59:25 -04:00
Tim Gross	22e9f679c3	csi: implement controller detach RPCs (#7356 ) This changeset implements the remaining controller detach RPCs: server-to-client and client-to-controller. The tests also uncovered a bug in our RPC for claims which is fixed here; the volume claim RPC is used for both claiming and releasing a claim on a volume. We should only submit a controller publish RPC when the claim is new and not when it's being released.	2020-03-23 13:59:25 -04:00
Lang Martin	6750c262a4	csi: use `ExternalID`, when set, to identify volumes for outside RPC calls (#7326 ) * nomad/structs/csi: new RemoteID() uses the ExternalID if set * nomad/csi_endpoint: pass RemoteID to volume request types * client/pluginmanager/csimanager/volume: pass RemoteID to NodePublishVolume	2020-03-23 13:58:30 -04:00
Lang Martin	80619137ab	csi: volumes listed in `nomad node status` (#7318 ) * api/allocations: GetTaskGroup finds the taskgroup struct * command/node_status: display CSI volume names * nomad/state/state_store: new CSIVolumesByNodeID * nomad/state/iterator: new SliceIterator type implements memdb.ResultIterator * nomad/csi_endpoint: deal with a slice of volumes * nomad/state/state_store: CSIVolumesByNodeID return a SliceIterator * nomad/structs/csi: CSIVolumeListRequest takes a NodeID * nomad/csi_endpoint: use the return iterator * command/agent/csi_endpoint: parse query params for CSIVolumes.List * api/nodes: new CSIVolumes to list volumes by node * command/node_status: use the new list endpoint to print volumes * nomad/state/state_store: error messages consider the operator * command/node_status: include the Provider	2020-03-23 13:58:30 -04:00
Lang Martin	de25fc6cf4	csi: csi-hostpath plugin unimplemented error on controller publish (#7299 ) * client/allocrunner/csi_hook: tag errors * nomad/client_csi_endpoint: tag errors * nomad/client_rpc: remove an unnecessary error tag * nomad/state/state_store: ControllerRequired fix intent We use ControllerRequired to indicate that a volume should use the publish/unpublish workflow, rather than that it has a controller. We need to check both RequiresControllerPlugin and SupportsAttachDetach from the fingerprint to check that. * nomad/csi_endpoint: tag errors * nomad/csi_endpoint_test: longer error messages, mock fingerprints	2020-03-23 13:58:30 -04:00
Tim Gross	b04d23dae0	csi: ensure volume query is idempotent (#7303 ) We denormalize the `CSIVolume` struct when we query it from the state store by getting the plugin and its health. But unless we copy the volume, this denormalization gets synced back to the state store without passing through the fsm (which is invalid).	2020-03-23 13:58:30 -04:00
Tim Gross	b57df162ce	csi: ensure GET for plugin is idempotent (#7298 ) We denormalize the `CSIPlugin` struct when we query it from the state store by getting the current set of allocations that provide the plugin. But unless we copy the plugin, this denormalization gets synced back to the state store and each time we query we'll add another copy of the current allocations.	2020-03-23 13:58:30 -04:00
Tim Gross	de4ad6ca38	csi: add Provider field to CSI CLIs and APIs (#7285 ) Derive a provider name and version for plugins (and the volumes that use them) from the CSI identity API `GetPluginInfo`. Expose the vendor name as `Provider` in the API and CLI commands.	2020-03-23 13:58:30 -04:00
Danielle Lancashire	e75f057df3	csi: Fix Controller RPCs Currently the handling of CSINode RPCs does not correctly handle forwarding RPCs to Nodes. This commit fixes this by introducing a shim RPC (nomad/client_csi_endpoint) that will correctly forward the request to the owning node, or submit the RPC to the client. In the process it also cleans up handling a little bit by adding the `CSIControllerQuery` embedded struct for required forwarding state. The CSIControllerQuery embedding the requirement of a `PluginID` also means we could move node targeting into the shim RPC if wanted in the future.	2020-03-23 13:58:30 -04:00
Tim Gross	8bc5641438	csi: volume claim garbage collection (#7125 ) When an alloc is marked terminal (and after node unstage/unpublish have been called), the client syncs the terminal alloc state with the server via the `Node.UpdateAlloc` RPC. For each job that has a terminal alloc, the `Node.UpdateAlloc` RPC handler at the server will emit an eval for a new core job to garbage collect CSI volume claims. When this eval is handled on the core scheduler, it will call a `volumeReap` method to release the claims for all terminal allocs on the job. The volume reap will issue a `ControllerUnpublishVolume` RPC for any node that has no alloc claiming the volume. Once this returns (or is skipped), the volume reap will send a new `CSIVolume.Claim` RPC that releases the volume claim for that allocation in the state store, making it available for scheduling again. This same `volumeReap` method will be called from the core job GC, which gives us a second chance to reclaim volumes during GC if there were controller RPC failures.	2020-03-23 13:58:30 -04:00
Danielle Lancashire	9d4307a3ef	csi_endpoint: Provide AllocID in req, and return Volume Currently, the client has to ship an entire allocation to the server as part of performing a VolumeClaim. This has a few problems: Firstly, it means the client is sending significantly more data than is required (an allocation contains the entire contents of a Nomad job, alongside other irrelevant state), which has a non-zero (de)serialization cost. Secondly, because the allocation was never re-fetched from the state store, we were potentially open to issues caused by stale state on a misbehaving or malicious client. The change removes both of those issues at the cost of a couple more state store lookups, but they should be relatively cheap. We also now provide the CSIVolume in the response for a claim, so the client can perform a Claim without first going ahead and fetching all of the volumes.	2020-03-23 13:58:30 -04:00
Danielle Lancashire	c3b1154703	csi: Validate Volumes during registration This PR implements some initial support for doing deeper validation of a volume during its registration with the server. This allows us to validate the capabilities before users attempt to use the volumes in most cases, and also prevents registering volumes without first setting up a plugin, which should help to catch typos and the like during registration. This does have the downside of requiring users to wait for one instance of a plugin to be running in their cluster before they can register volumes.	2020-03-23 13:58:30 -04:00
Tim Gross	b03b78b212	csi: server-to-controller publish/unpublish RPCs (#7124 ) Nomad servers need to make requests to CSI controller plugins running on a client for publish/unpublish. The RPC needs to look up the client node based on the plugin, load balancing across controllers, and then perform the required client RPC to that node (via server forwarding if necessary).	2020-03-23 13:58:30 -04:00
Tim Gross	b9b315f8d1	csi: stub methods for server-to-controller RPC calls (#7117 )	2020-03-23 13:58:30 -04:00
Danielle Lancashire	77bcaa8183	csi_endpoint: Support No ACLs and restrict Nodes This commit refactors the ACL code for the CSI endpoint to support environments that run without ACLs enabled (e.g. developer environments) and also provides an easy way to restrict which endpoints may be accessed with a client's SecretID to limit the blast radius of a malicious client on the state of the environment.	2020-03-23 13:58:30 -04:00
Tim Gross	d4cd272de3	csi: implement VolumeClaimRPC (#7048 ) When the client receives an allocation which includes a CSI volume, the alloc runner will block its main `Run` loop. The alloc runner will issue a `VolumeClaim` RPC to the Nomad servers. This changeset implements the portions of the `VolumeClaim` RPC endpoint that have not been previously completed.	2020-03-23 13:58:29 -04:00
Lang Martin	421d7ed2e4	nomad: csi_endpoint send register & deregister requests to raft (#7059 )	2020-03-23 13:58:29 -04:00
Lang Martin	88316208a0	csi: server-side plugin state tracking and api (#6966 ) * structs: CSIPlugin indexes jobs acting as plugins and node updates * schema: csi_plugins table for CSIPlugin * nomad: csi_endpoint use vol.Denormalize, plugin requests * nomad: csi_volume_endpoint: rename to csi_endpoint * agent: add CSI plugin endpoints * state_store_test: use generated ids to avoid t.Parallel conflicts * contributing: add note about registering new RPC structs * command: agent http register plugin lists * api: CSI plugin queries, ControllerHealthy -> ControllersHealthy * state_store: copy on write for volumes and plugins * structs: copy on write for volumes and plugins * state_store: CSIVolumeByID returns an unhealthy volume, denormalize * nomad: csi_endpoint use CSIVolumeDenormalizePlugins * structs: remove struct errors for missing objects * nomad: csi_endpoint return nil for missing objects, not errors * api: return meta from Register to avoid EOF error * state_store: CSIVolumeDenormalize keep allocs in their own maps * state_store: CSIVolumeDeregister error on missing volume * state_store: CSIVolumeRegister set indexes * nomad: csi_endpoint use CSIVolumeDenormalizePlugins tests	2020-03-23 13:58:29 -04:00

47 commits