Commit Graph

66 Commits

Author SHA1 Message Date
Piotr Kazmierczak 14b53df3b6
renamed stanza to block for consistency with other projects (#15941) 2023-01-30 15:48:43 +01:00
Seth Hoenig 2088ca3345
cleanup more helper updates (#14638)
* cleanup: refactor MapStringStringSliceValueSet to be cleaner

* cleanup: replace SliceStringToSet with actual set

* cleanup: replace SliceStringSubset with real set

* cleanup: replace SliceStringContains with slices.Contains

* cleanup: remove unused function SliceStringHasPrefix

* cleanup: fixup StringHasPrefixInSlice doc string

* cleanup: refactor SliceSetDisjoint to use real set

* cleanup: replace CompareSliceSetString with SliceSetEq

* cleanup: replace CompareMapStringString with maps.Equal

* cleanup: replace CopyMapStringString with CopyMap

* cleanup: replace CopyMapStringInterface with CopyMap

* cleanup: fixup more CopyMapStringString and CopyMapStringInt

* cleanup: replace CopySliceString with slices.Clone

* cleanup: remove unused CopySliceInt

* cleanup: refactor CopyMapStringSliceString to be generic as CopyMapOfSlice

* cleanup: replace CopyMap with maps.Clone

* cleanup: run go mod tidy
2022-09-21 14:53:25 -05:00
Tim Gross 3fc7482ecd
CSI: failed allocation should not block its own controller unpublish (#14484)
A Nomad user reported problems with CSI volumes associated with failed
allocations, where the Nomad server did not send a controller unpublish RPC.

The controller unpublish is skipped if other non-terminal allocations on the
same node claim the volume. The check has a bug where the allocation belonging
to the claim being freed was included in the check incorrectly. During a normal
allocation stop for job stop or a new version of the job, the allocation is
terminal. But allocations that fail are not yet marked terminal at the point in
time when the client sends the unpublish RPC to the server.

For CSI plugins that support controller attach/detach, this means that the
controller will not be able to detach the volume from the allocation's host and
the replacement claim will fail until a GC is run. This changeset fixes the
conditional so that the claim's own allocation is not included, and makes the
logic easier to read. Include a test case covering this path.

Also includes two minor extra bugfixes:

* Entities we get from the state store should always be copied before
altering. Ensure that we copy the volume in the top-level unpublish workflow
before handing off to the steps.

* The list stub object for volumes in `nomad/structs` did not match the stub
object in `api`. The `api` package also did not include the current
readers/writers fields that are expected by the UI. True up the two objects and
add the previously undocumented fields to the docs.
2022-09-08 13:30:05 -04:00
Eric Weber cbce13c1ac
Add stage_publish_base_dir field to csi_plugin stanza of a job (#13919)
* Allow specification of CSI staging and publishing directory path
* Add website documentation for stage_publish_dir
* Replace erroneous reference to csi_plugin.mount_config with csi_plugin.mount_dir
* Avoid requiring CSI plugins to be redeployed after introducing StagePublishDir
2022-08-02 09:42:44 -04:00
Grant Griffiths 99896da443
CSI: make plugin health_timeout configurable in csi_plugin stanza (#13340)
Signed-off-by: Grant Griffiths <ggriffiths@purestorage.com>
2022-06-14 10:04:16 -04:00
Tim Gross 826d9d47f9
CSI: replace structs->api with serialization extension (#12583)
The CSI HTTP API has to transform the CSI volume to redact secrets,
remove the claims fields, and to consolidate the allocation stubs into
a single slice of alloc stubs. This was done manually in #8590 but
this is a large amount of code and has proven both very bug prone
(see #8659, #8666, #8699, #8735, and #12150) and requires updating
lots of code every time we add a field to volumes or plugins.

In #10202 we introduce encoding improvements for the `Node` struct
that allow a more minimal transformation. Apply this same approach to
serializing `structs.CSIVolume` to API responses.

Also, the original reasoning behind #8590 for plugins no longer holds
because the counts are now denormalized within the state store, so we
can simply remove this transformation entirely.
2022-04-15 14:29:34 -04:00
Grant Griffiths 18a0a2c9a4
CSI: Add secrets flag support for delete volume (#11245) 2022-04-05 08:59:11 -04:00
Luiz Aoqui ab8ce87bba
Add pagination, filtering and sort to more API endpoints (#12186) 2022-03-08 20:54:17 -05:00
Tim Gross 2dafe46fe3
CSI: allow updates to volumes on re-registration (#12167)
CSI `CreateVolume` RPC is idempotent given that the topology,
capabilities, and parameters are unchanged. CSI volumes have many
user-defined fields that are immutable once set, and many fields that
are not user-settable.

Update the `Register` RPC so that updating a volume via the API merges
onto any existing volume without touching Nomad-controlled fields,
while validating it with the same strict requirements expected for
idempotent `CreateVolume` RPCs.

Also, clarify that this state store method is used for everything, not just
for the `Register` RPC.
2022-03-07 11:06:59 -05:00
Tim Gross f2a4ad0949
CSI: implement support for topology (#12129) 2022-03-01 10:15:46 -05:00
Tim Gross cfe3117af8
CSI: enforce usage at claim time (#12112)
* Remove redundant schedulable check in `FreeWriteClaims`. If a volume
  has been created but not yet claimed, its capabilities will be checked
  in `WriteSchedulable` at both scheduling time and claim time. We don't
  need to also check them in the `FreeWriteClaims` method.

* Enforce maximum volume claims for writers.

  When the scheduler checks feasibility for CSI volumes, the check is
  fairly loose: earlier versions of the same job are not counted as
  active claims. This allows the scheduler to place new allocations
  for the new version of a job, under the assumption that we'll replace
  the existing allocations and their volume claims.

  But when the alloc runner claims the volume, we need to enforce the
  active claims even if they're for allocations of an earlier version of
  the job. Otherwise we'll try to mount a volume that's currently being
  unmounted, and this will cause replacement allocations to frequently
  fail.

* Enforce single-node reader check for read-only volumes. When the
  alloc runner makes a claim for a read-only volume, we only check that
  the volume is potentially schedulable and not that it actually has
  free read claims.
2022-02-24 09:37:37 -05:00
Tim Gross 57a546489f
CSI: minor refactoring (#12105)
* rename method checking that free write claims are available
* use package-level variables for claim errors
* semgrep fix for testify
2022-02-23 11:13:51 -05:00
James Rasell 45f4689f9c
chore: fixup inconsistent method receiver names. (#11704) 2021-12-20 11:44:21 +01:00
James Rasell b6813f1221
chore: fix incorrect docstring formatting. 2021-08-30 11:08:12 +02:00
Grant Griffiths fecbbaee22 CSI ListSnapshots secrets implementation
Signed-off-by: Grant Griffiths <ggriffiths@purestorage.com>
2021-07-28 11:30:29 -07:00
Tim Gross 0892d34ff9 CSI: capability block is required for volume registration 2021-04-08 13:02:24 -04:00
Tim Gross 276633673d CSI: use AccessMode/AttachmentMode from CSIVolumeClaim
Registration of Nomad volumes previously allowed for a single volume
capability (access mode + attachment mode pair). The recent `volume create`
command requires that we pass a list of requested capabilities, but the
existing workflow for claiming volumes and attaching them on the client
assumed that the volume's single capability was correct and unchanging.

Add `AccessMode` and `AttachmentMode` to `CSIVolumeClaim`, use these fields to
set the initial claim value, and add backwards compatibility logic to handle
the existing volumes that already have claims without these fields.
2021-04-07 11:24:09 -04:00
Tim Gross 72c07f15fb refactor: internal claim methods should be private 2021-04-07 11:24:09 -04:00
Tim Gross 0856483115 CSI: fingerprint detailed node capabilities
In order to support new node RPCs, we need to fingerprint plugin capabilities
in more detail. This changeset mirrors recent work to fingerprint controller
capabilities, but is not yet in use by any Nomad RPC.
2021-04-01 16:00:58 -04:00
Tim Gross 466b620fa4
CSI: volume snapshot 2021-04-01 11:16:52 -04:00
Tim Gross 9fc4cf1419 CSI: fingerprint detailed controller capabilities
In order to support new controller RPCs, we need to fingerprint volume
capabilities in more detail and perform controller RPCs only when the specific
capability is present. This fixes a bug in Ceph support where the plugin can
only suport create/delete but we assume that it also supports attach/detach.
2021-03-31 16:37:09 -04:00
Tim Gross f149abfa41 CSI: volume creation/registration should not validate attachment
The CSI specification requires that we validate a list of `Capability` (access
mode + accessibility) when we create volume, but the existing volume
registration workflow incorrectly validates a single capability. The
specific capability required by a volume claim is checked at the time we make
the claim, so remove the check for `AttachmentMode`/`AcccessMode`.
2021-03-31 16:37:09 -04:00
Tim Gross aec5337862 CSI: HTTP handlers for create/delete/list 2021-03-31 16:37:09 -04:00
Tim Gross d38008176e CSI: create/delete/list volume RPCs
This commit implements the RPC handlers on the client that talk to the CSI
plugins on that client for the Create/Delete/List RPC.
2021-03-31 16:37:09 -04:00
Kris Hicks 39e369c3bb
csi: Return error when deleting node (#9803)
In this change we'll properly return the error in the
CSIPluginTypeMonolith case (which is the type given in DeleteNode()),
and also return the error when the given ID is not found.

This was found via errcheck.
2021-01-14 12:44:50 -08:00
Tim Gross c320c1ba57
CSI: fix struct copying errors (#9239)
The CSIVolume struct "denormalizes" allocations when it's first queried from
the state store. The CSIVolumeByID method on the state store copies the volume
before denormalizing so that we don't end up with unexpected changes. The
copying has some subtle bugs that meant that Allocations (as well as
Topologies and MountOptions) were not getting copied when expected.

Also, ensure we never write allocations attached to volumes to the state store
during claims.
2020-11-18 10:59:25 -05:00
Tim Gross 60874ebe25
csi: Postrun hook should not change mode (#9323)
The unpublish workflow requires that we know the mode (RW vs RO) if we want to
unpublish the node. Update the hook and the Unpublish RPC so that we mark the
claim for release in a new state but leave the mode alone. This fixes a bug
where RO claims were failing node unpublish.

The core job GC doesn't know the mode, but we don't need it for that workflow,
so add a mode specifically for GC; the volumewatcher uses this as a sentinel
to check whether claims (with their specific RW vs RO modes) need to be claimed.
2020-11-11 13:06:30 -05:00
Tim Gross e8c13a2307
csi: validate mount options during volume registration (#9044)
Volumes using attachment mode `file-system` use the CSI filesystem API when
they're mounted, and can be passed mount options. But `block-device` mode
volumes don't have this option. When RPCs are made to plugins, we are silently
dropping the mount options we don't expect to see, but this results in a poor
operator experience when the mount options aren't honored. This changeset
makes passing mount options to a `block-device` volume a validation error.
2020-10-08 09:23:21 -04:00
Tim Gross 3ceb5b36b1
csi: allow more than 1 writer claim for multi-writer mode (#9040)
Fixes a bug where CSI volumes with the `MULTI_NODE_MULTI_WRITER` access mode
were using the same logic as `MULTI_NODE_SINGLE_WRITER` to determine whether
the volume had writer claims available for scheduling.

Extends CSI claim endpoint test to exercise multi-reader and make sure `WriteFreeClaims`
is exercised for multi-writer in feasibility test.
2020-10-07 10:43:23 -04:00
Mahmood Ali 117aec0036 Fix accidental broken clones
Fix CSIMountOptions.Copy() and VolumeRequest.Copy() where they
accidentally returned a reference to self rather than a deep copy.

`&(*ref)` in Golang apparently equivalent to plain `&ref`.
2020-08-28 15:29:22 -04:00
Lang Martin 7d483f93c0
csi: plugins track jobs in addition to allocations, and use job information to set expected counts (#8699)
* nomad/structs/csi: add explicit job support
* nomad/state/state_store: capture job updates directly
* api/nodes: CSIInfo needs the AllocID
* command/agent/csi_endpoint: AllocID was missing
Co-authored-by: Tim Gross <tgross@hashicorp.com>
2020-08-27 17:20:00 -04:00
Tim Gross eaa14ab64c
csi: add unpublish RPC (#8572)
This changeset is plumbing for a `nomad volume detach` command that will be
reused by the volumewatcher claim GC as well.
2020-08-06 13:51:29 -04:00
Tim Gross 23be116da0
csi: add -force flag to volume deregister (#8295)
The `nomad volume deregister` command currently returns an error if the volume
has any claims, but in cases where the claims can't be dropped because of
plugin errors, providing a `-force` flag gives the operator an escape hatch.

If the volume has no allocations or if they are all terminal, this flag
deletes the volume from the state store, immediately and implicitly dropping
all claims without further CSI RPCs. Note that this will not also
unmount/detach the volume, which we'll make the responsibility of a separate
`nomad volume detach` command.
2020-07-01 12:17:51 -04:00
Tim Gross 2082cf738a
csi: support for VolumeContext and VolumeParameters (#7957)
The MVP for CSI in the 0.11.0 release of Nomad did not include support
for opaque volume parameters or volume context. This changeset adds
support for both.

This also moves args for ControllerValidateCapabilities into a struct.
The CSI plugin `ControllerValidateCapabilities` struct that we turn
into a CSI RPC is accumulating arguments, so moving it into a request
struct will reduce the churn of this internal API, make the plugin
code more readable, and make this method consistent with the other
plugin methods in that package.
2020-05-15 08:16:01 -04:00
Tim Gross 4374c1a837
csi: support Secrets parameter in CSI RPCs (#7923)
CSI plugins can require credentials for some publishing and
unpublishing workflow RPCs. Secrets are configured at the time of
volume registration, stored in the volume struct, and then passed
around as an opaque map by Nomad to the plugins.
2020-05-11 17:12:51 -04:00
Tim Gross 801ebcfe8d
periodic GC for CSI plugins (#7878)
This changeset implements a periodic garbage collection of unused CSI
plugins. Plugins are self-cleaning when the last allocation for a
plugin is stopped, but this feature will cover any missing edge cases
and ensure that upgrades from 0.11.0 and 0.11.1 get any stray plugins
cleaned up.
2020-05-06 16:49:12 -04:00
Tim Gross ce86a594a6
csi: fix plugin counts on node update (#7844)
In this changeset:

* If a Nomad client node is running both a controller and a node
  plugin (which is a common case), then if only the controller or the
  node is removed, the plugin was not being updated with the correct
  counts.
* The existing test for plugin cleanup didn't go back to the state
  store, which normally is ok but is complicated in this case by
  denormalization which changes the behavior. This commit makes the
  test more comprehensive.
* Set "controller required" when plugin has `PUBLISH_READONLY`. All
  known controllers that support `PUBLISH_READONLY` also support
  `PUBLISH_UNPUBLISH_VOLUME` but we shouldn't assume this.
* Only create plugins when the allocs for those plugins are
  healthy. If we allow a plugin to be created for the first time when
  the alloc is not healthy, then we'll recreate deleted plugins when
  the job's allocs all get marked terminal.
* Terminal plugin alloc updates should cleanup the plugin. The client
  fingerprint can't tell if the plugin is unhealthy intentionally (for
  the case of updates or job stop). Allocations that are
  server-terminal should delete themselves from the plugin and trigger
  a plugin self-GC, the same as an unused node.
2020-05-05 15:39:57 -04:00
Tim Gross 22e3815e8c
docstring improvements and typo fixes (#7862) 2020-05-05 10:30:50 -04:00
Tim Gross 52e805a6a6
csi: ensure Read/WriteAllocs aren't released early (#7841)
We should only remove the `ReadAllocs`/`WriteAllocs` values for a
volume after the claim has entered the "ready to free"
state. The volume will eventually be released as expected. But
querying the volume API will show the volume is released before the
controller unpublish has finished and this can cause a race with
starting new jobs.

Test updates are to cover cases where we're dropping claims but not
running through the whole reaping process.
2020-04-30 17:11:31 -04:00
Tim Gross a7a64443e1
csi: move volume claim release into volumewatcher (#7794)
This changeset adds a subsystem to run on the leader, similar to the
deployment watcher or node drainer. The `Watcher` performs a blocking
query on updates to the `CSIVolumes` table and triggers reaping of
volume claims.

This will avoid tying up scheduling workers by immediately sending
volume claim workloads into their own loop, rather than blocking the
scheduling workers in the core GC job doing things like talking to CSI
controllers

The volume watcher is enabled on leader step-up and disabled on leader
step-down.

The volume claim GC mechanism now makes an empty claim RPC for the
volume to trigger an index bump. That in turn unblocks the blocking
query in the volume watcher so it can assess which claims can be
released for a volume.
2020-04-30 09:13:00 -04:00
Tim Gross 083b35d651
csi: checkpoint volume claim garbage collection (#7782)
Adds a `CSIVolumeClaim` type to be tracked as current and past claims
on a volume. Allows for a client RPC failure during node or controller
detachment without having to keep the allocation around after the
first garbage collection eval.

This changeset lays groundwork for moving the actual detachment RPCs
into a volume watching loop outside the GC eval.
2020-04-23 11:06:23 -04:00
Tim Gross bd74b593d0
csi: nil-check allocs for VolumeDenormalize and claim methods (#7760) 2020-04-21 08:32:24 -04:00
Lang Martin 50ff9ccd44
csi: plugin deregistration on plugin job GC (#7502)
* nomad/structs/csi: delete just one plugin type from a node

* nomad/structs/csi: add DeleteAlloc

* nomad/state/state_store: add deleteJobFromPlugin

* nomad/state/state_store: use DeleteAlloc not DeleteNodeType

* move CreateTestCSIPlugin to state to avoid an import cycle

* nomad/state/state_store_test: delete a plugin by deleting its jobs

* nomad/*_test: move CreateTestCSIPlugin to state

* nomad/state/state_store: update one plugin per transaction

* command/plugin_status_test: move CreateTestCSIPlugin

* nomad: csi: handle nils CSIPlugin methods, clarity
2020-03-26 17:07:18 -04:00
Lang Martin ea80330aaa
csi: nomad/structs: test volume denormalize without plugin (#7472) 2020-03-26 09:43:59 -04:00
Lang Martin bd22afd003
csi: volume deregister fails for volumes actively in use (#7445)
* nomad/structs/csi: add InUse to CSIVolume

* nomad/state/state_store: block volume deregistration for in use vols
2020-03-24 10:10:44 -04:00
Tim Gross fa01a6ea59 csi: fix missing health count from volume list stub 2020-03-24 09:42:59 -04:00
Lang Martin d994990ef0
csi: the scheduler allows a job with a volume write claim to be updated (#7438)
* nomad/structs/csi: split CanWrite into health, in use

* scheduler/scheduler: expose AllocByID in the state interface

* nomad/state/state_store_test

* scheduler/stack: SetJobID on the matcher

* scheduler/feasible: when a volume writer is in use, check if it's us

* scheduler/feasible: remove SetJob

* nomad/state/state_store: denormalize allocs before Claim

* nomad/structs/csi: return errors on claim, with context

* nomad/csi_endpoint_test: new alloc doesn't look like an update

* nomad/state/state_store_test: change test reference to CanWrite
2020-03-23 21:21:04 -04:00
Lang Martin e100444740 csi: add mount_options to volumes and volume requests (#7398)
Add mount_options to both the volume definition on registration and to the volume block in the group where the volume is requested. If both are specified, the options provided in the request replace the options defined in the volume. They get passed to the NodePublishVolume, which causes the node plugin to actually mount the volume on the host.

Individual tasks just mount bind into the host mounted volume (unchanged behavior). An operator can mount the same volume with different options by specifying it twice in the group context.

closes #7007

* nomad/structs/volumes: add MountOptions to volume request

* jobspec/test-fixtures/basic.hcl: add mount_options to volume block

* jobspec/parse_test: add expected MountOptions

* api/tasks: add mount_options

* jobspec/parse_group: use hcl decode not mapstructure, mount_options

* client/allocrunner/csi_hook: pass MountOptions through

client/allocrunner/csi_hook: add a VolumeMountOptions

client/allocrunner/csi_hook: drop Options

client/allocrunner/csi_hook: use the structs options

* client/pluginmanager/csimanager/interface: UsageOptions.MountOptions

* client/pluginmanager/csimanager/volume: pass MountOptions in capabilities

* plugins/csi/plugin: remove todo 7007 comment

* nomad/structs/csi: MountOptions

* api/csi: add options to the api for parsing, match structs

* plugins/csi/plugin: move VolumeMountOptions to structs

* api/csi: use specific type for mount_options

* client/allocrunner/csi_hook: merge MountOptions here

* rename CSIOptions to CSIMountOptions

* client/allocrunner/csi_hook

* client/pluginmanager/csimanager/volume

* nomad/structs/csi

* plugins/csi/fake/client: add PrevVolumeCapability

* plugins/csi/plugin

* client/pluginmanager/csimanager/volume_test: remove debugging

* client/pluginmanager/csimanager/volume: fix odd merging logic

* api: rename CSIOptions -> CSIMountOptions

* nomad/csi_endpoint: remove a 7007 comment

* command/alloc_status: show mount options in the volume list

* nomad/structs/csi: include MountOptions in the volume stub

* api/csi: add MountOptions to stub

* command/volume_status_csi: clean up csiVolMountOption, add it

* command/alloc_status: csiVolMountOption lives in volume_csi_status

* command/node_status: display mount flags

* nomad/structs/volumes: npe

* plugins/csi/plugin: npe in ToCSIRepresentation

* jobspec/parse_test: expand volume parse test cases

* command/agent/job_endpoint: ApiTgToStructsTG needs MountOptions

* command/volume_status_csi: copy paste error

* jobspec/test-fixtures/basic: hclfmt

* command/volume_status_csi: clean up csiVolMountOption
2020-03-23 13:59:25 -04:00
Tim Gross 0cd2d3cc29 csi: make claims on volumes idempotent for the same alloc (#7328)
Nomad clients will push node updates during client restart which can
cause an extra claim for a volume by the same alloc. If an alloc
already claims a volume, we can allow it to be treated as a valid
claim and continue.
2020-03-23 13:58:30 -04:00
Lang Martin 6750c262a4 csi: use `ExternalID`, when set, to identify volumes for outside RPC calls (#7326)
* nomad/structs/csi: new RemoteID() uses the ExternalID if set

* nomad/csi_endpoint: pass RemoteID to volume request types

* client/pluginmanager/csimanager/volume: pass RemoteID to NodePublishVolume
2020-03-23 13:58:30 -04:00