Commit Graph

28 Commits

Author SHA1 Message Date
Tim Gross cbae10333c
csi: check returned volume capability validation (#7831)
This changeset corrects handling of the `ValidationVolumeCapabilities`
response:

* The CSI spec for the `ValidationVolumeCapabilities` requires that
  plugins only set the `Confirmed` field if they've validated all
  capabilities. The Nomad client improperly assumes that the lack of a
  `Confirmed` field should be treated as a failure. This breaks the
  Azure and Linode block storage plugins, which don't set this
  optional field.

* The CSI spec also requires that the orchestrator check the validation
  responses to guard against older versions of a plugin reporting
  "valid" for newer fields it doesn't understand.
2020-04-30 17:12:32 -04:00
Tim Gross f3bae55fae
set safe default for CSI plugin MaxVolumes (#7583) 2020-04-01 11:08:55 -04:00
Tim Gross 6ffd36c4e5
csi: add grpc retries to client controller RPCs (#7549)
The CSI Specification defines various gRPC Errors and how they may be retried. After auditing all our CSI RPC calls in #6863, this changeset:

* adds retries and backoffs to the where they were needed but not implemented
* annotates those CSI RPCs that do not need retries so that we don't wonder whether it's been left off accidentally
* added a timeout and cancellation context to the `Probe` call, which didn't have one.
2020-03-30 16:26:03 -04:00
Lang Martin e100444740 csi: add mount_options to volumes and volume requests (#7398)
Add mount_options to both the volume definition on registration and to the volume block in the group where the volume is requested. If both are specified, the options provided in the request replace the options defined in the volume. They get passed to the NodePublishVolume, which causes the node plugin to actually mount the volume on the host.

Individual tasks just mount bind into the host mounted volume (unchanged behavior). An operator can mount the same volume with different options by specifying it twice in the group context.

closes #7007

* nomad/structs/volumes: add MountOptions to volume request

* jobspec/test-fixtures/basic.hcl: add mount_options to volume block

* jobspec/parse_test: add expected MountOptions

* api/tasks: add mount_options

* jobspec/parse_group: use hcl decode not mapstructure, mount_options

* client/allocrunner/csi_hook: pass MountOptions through

client/allocrunner/csi_hook: add a VolumeMountOptions

client/allocrunner/csi_hook: drop Options

client/allocrunner/csi_hook: use the structs options

* client/pluginmanager/csimanager/interface: UsageOptions.MountOptions

* client/pluginmanager/csimanager/volume: pass MountOptions in capabilities

* plugins/csi/plugin: remove todo 7007 comment

* nomad/structs/csi: MountOptions

* api/csi: add options to the api for parsing, match structs

* plugins/csi/plugin: move VolumeMountOptions to structs

* api/csi: use specific type for mount_options

* client/allocrunner/csi_hook: merge MountOptions here

* rename CSIOptions to CSIMountOptions

* client/allocrunner/csi_hook

* client/pluginmanager/csimanager/volume

* nomad/structs/csi

* plugins/csi/fake/client: add PrevVolumeCapability

* plugins/csi/plugin

* client/pluginmanager/csimanager/volume_test: remove debugging

* client/pluginmanager/csimanager/volume: fix odd merging logic

* api: rename CSIOptions -> CSIMountOptions

* nomad/csi_endpoint: remove a 7007 comment

* command/alloc_status: show mount options in the volume list

* nomad/structs/csi: include MountOptions in the volume stub

* api/csi: add MountOptions to stub

* command/volume_status_csi: clean up csiVolMountOption, add it

* command/alloc_status: csiVolMountOption lives in volume_csi_status

* command/node_status: display mount flags

* nomad/structs/volumes: npe

* plugins/csi/plugin: npe in ToCSIRepresentation

* jobspec/parse_test: expand volume parse test cases

* command/agent/job_endpoint: ApiTgToStructsTG needs MountOptions

* command/volume_status_csi: copy paste error

* jobspec/test-fixtures/basic: hclfmt

* command/volume_status_csi: clean up csiVolMountOption
2020-03-23 13:59:25 -04:00
Tim Gross 32b94bf1a4 csi: stub fingerprint on instance manager shutdown (#7388)
Run the plugin fingerprint one last time with a closed client during
instance manager shutdown. This will return quickly and will give us a
correctly-populated `PluginInfo` marked as unhealthy so the Nomad
client can update the server about plugin health.
2020-03-23 13:59:25 -04:00
Tim Gross de4ad6ca38 csi: add Provider field to CSI CLIs and APIs (#7285)
Derive a provider name and version for plugins (and the volumes that
use them) from the CSI identity API `GetPluginInfo`. Expose the vendor
name as `Provider` in the API and CLI commands.
2020-03-23 13:58:30 -04:00
Danielle Lancashire 247e86bb35 csi: VolumeCapabilities for ControllerPublishVolume
This commit introduces support for providing VolumeCapabilities during
requests to `ControllerPublishVolumes` as this is a required field.
2020-03-23 13:58:30 -04:00
Danielle Lancashire 34acb596e3 plugins/csi: Implement ConvtrollerValidateCapabilities RPC 2020-03-23 13:58:30 -04:00
Danielle Lancashire 6b7ee96a88 csi: Move VolumeCapabilties helper to package 2020-03-23 13:58:30 -04:00
Danielle Lancashire 6762442199 csiclient: Add grpc.CallOption support to NodeUnpublishVolume 2020-03-23 13:58:30 -04:00
Tim Gross 60901fa764 csi: implement CSI controller detach request/response (#7107)
This changeset implements the minimal structs on the client-side we
need to compile the work-in-progress implementation of the
server-to-controller RPCs. It doesn't include implementing the
`ClientCSI.DettachVolume` RPC on the client.
2020-03-23 13:58:29 -04:00
Danielle Lancashire a5c96ce2e1 csi: Add grpc.CallOption support to NodePublishVolume 2020-03-23 13:58:29 -04:00
Danielle Lancashire add55e37b8 csi: Expose gRPC Options on NodeUnstageVolume 2020-03-23 13:58:29 -04:00
Danielle Lancashire 65d9ddc9af csi: Expose grpc.CallOptions for NodeStageVolume 2020-03-23 13:58:29 -04:00
Danielle Lancashire 51270ae0f4 csi: Support for NodeUnpublishVolume RPCs 2020-03-23 13:58:29 -04:00
Danielle Lancashire a4b96aff33 csi: Nil check ToCSIRepresentation implementations 2020-03-23 13:58:29 -04:00
Danielle Lancashire 02c4612e65 csi: Add NodePublishVolume RPCs 2020-03-23 13:58:29 -04:00
Danielle Lancashire 98f00a9220 csi: Add NodeUnstageVolume RPCs to CSIPlugin 2020-03-23 13:58:29 -04:00
Danielle Lancashire 5c447396fa csi: Add NodeUnstageVolume as a CSI Dependency 2020-03-23 13:58:29 -04:00
Danielle Lancashire f208770e94 csi: Add NodeStageVolume to fake client 2020-03-23 13:58:29 -04:00
Danielle Lancashire 07651a5231 csi: Add NodeStageVolume RPC 2020-03-23 13:58:29 -04:00
Danielle Lancashire 317b680744 csi: Add csi.NodeStageVolume to the NodeClient
Implements a fake version of NodeStageVolume as a dependency of
implementing the client.NodeStageVolume request
2020-03-23 13:58:29 -04:00
Danielle Lancashire ab1edd4e24 csi: Add Nomad Model for VolumeCapabilities
This commit introduces a nomad model for interacting with CSI
VolumeCapabilities as a pre-requisite for implementing NodeStageVolume
and NodeMountVolume correctly.

These fields have a few special characteristics that I've tried to model
here - specificially, we make a basic attempt to avoid printing data
that should be redacted during debug logs (additional mount flags), and
also attempt to make debuggability of other integer fields easier by
implementing the fmt.Stringer and fmt.GoStringer interfaces as
necessary.

We do not currnetly implement a CSI Protobuf -> Nomad implementation
transformation as this is currently not needed by any used RPCs.
2020-03-23 13:58:29 -04:00
Danielle Lancashire de5d373001 csi: Setup gRPC Clients with a logger 2020-03-23 13:58:29 -04:00
Danielle Lancashire c16812280c csi: Add NodeGetCapabilities RPC 2020-03-23 13:58:29 -04:00
Danielle Lancashire 05525c98ae plugins_csi: Add GetControllerCapabilities RPC 2020-03-23 13:58:28 -04:00
Danielle Lancashire 72ee2d4c1c csi: Add initial plumbing for controller rpcs 2020-03-23 13:58:28 -04:00
Danielle Lancashire 426c26d7c0 CSI Plugin Registration (#6555)
This changeset implements the initial registration and fingerprinting
of CSI Plugins as part of #5378. At a high level, it introduces the
following:

* A `csi_plugin` stanza as part of a Nomad task configuration, to
  allow a task to expose that it is a plugin.

* A new task runner hook: `csi_plugin_supervisor`. This hook does two
  things. When the `csi_plugin` stanza is detected, it will
  automatically configure the plugin task to receive bidirectional
  mounts to the CSI intermediary directory. At runtime, it will then
  perform an initial heartbeat of the plugin and handle submitting it to
  the new `dynamicplugins.Registry` for further use by the client, and
  then run a lightweight heartbeat loop that will emit task events
  when health changes.

* The `dynamicplugins.Registry` for handling plugins that run
  as Nomad tasks, in contrast to the existing catalog that requires
  `go-plugin` type plugins and to know the plugin configuration in
  advance.

* The `csimanager` which fingerprints CSI plugins, in a similar way to
  `drivermanager` and `devicemanager`. It currently only fingerprints
  the NodeID from the plugin, and assumes that all plugins are
  monolithic.

Missing features

* We do not use the live updates of the `dynamicplugin` registry in
  the `csimanager` yet.

* We do not deregister the plugins from the client when they shutdown
  yet, they just become indefinitely marked as unhealthy. This is
  deliberate until we figure out how we should manage deploying new
  versions of plugins/transitioning them.
2020-03-23 13:58:28 -04:00