This changeset implements the remaining controller detach RPCs: server-to-client and client-to-controller. The tests also uncovered a bug in our RPC for claims which is fixed here; the volume claim RPC is used for both claiming and releasing a claim on a volume. We should only submit a controller publish RPC when the claim is new and not when it's being released.
In order to correctly fingerprint dynamic plugins on client restarts,
we need to persist a handle to the plugin (that is, connection info)
to the client state store.
The dynamic registry will sync automatically to the client state
whenever it receives a register/deregister call.
* nomad/structs/csi: new RemoteID() uses the ExternalID if set
* nomad/csi_endpoint: pass RemoteID to volume request types
* client/pluginmanager/csimanager/volume: pass RemoteID to NodePublishVolume
Fix some docstring typos and fix noisy log message during client restarts.
A log for the common case where the plugin socket isn't ready yet
isn't actionable by the operator so having it at info is just noise.
* client/allocrunner/csi_hook: tag errors
* nomad/client_csi_endpoint: tag errors
* nomad/client_rpc: remove an unnecessary error tag
* nomad/state/state_store: ControllerRequired fix intent
We use ControllerRequired to indicate that a volume should use the
publish/unpublish workflow, rather than that it has a controller. We
need to check both RequiresControllerPlugin and SupportsAttachDetach
from the fingerprint to check that.
* nomad/csi_endpoint: tag errors
* nomad/csi_endpoint_test: longer error messages, mock fingerprints
Derive a provider name and version for plugins (and the volumes that
use them) from the CSI identity API `GetPluginInfo`. Expose the vendor
name as `Provider` in the API and CLI commands.
* structs: CSIInfo include AllocID, CSIPlugins no Jobs
* state_store: eliminate plugin Jobs, delete an empty plugin
* nomad/structs/csi: detect empty plugins correctly
* client/allocrunner/taskrunner/plugin_supervisor_hook: option AllocID
* client/pluginmanager/csimanager/instance: allocID
* client/pluginmanager/csimanager/fingerprint: set AllocID
* client/node_updater: split controller and node plugins
* api/csi: remove Jobs
The CSI Plugin API will map plugins to allocations, which allows
plugins to be defined by jobs in many configurations. In particular,
multiple plugins can be defined in the same job, and multiple jobs can
be used to define a single plugin.
Because we now map the allocation context directly from the node, it's
no longer necessary to track the jobs associated with a plugin
directly.
* nomad/csi_endpoint_test: CreateTestPlugin & register via fingerprint
* client/dynamicplugins: lift AllocID into the struct from Options
* api/csi_test: remove Jobs test
* nomad/structs/csi: CSIPlugins has an array of allocs
* nomad/state/state_store: implement CSIPluginDenormalize
* nomad/state/state_store: CSIPluginDenormalize npe on missing alloc
* nomad/csi_endpoint_test: defer deleteNodes for clarity
* api/csi_test: disable this test awaiting mocks:
https://github.com/hashicorp/nomad/issues/7123
Currently the handling of CSINode RPCs does not correctly handle
forwarding RPCs to Nodes.
This commit fixes this by introducing a shim RPC
(nomad/client_csi_enpdoint) that will correctly forward the request to
the owning node, or submit the RPC to the client.
In the process it also cleans up handling a little bit by adding the
`CSIControllerQuery` embeded struct for required forwarding state.
The CSIControllerQuery embeding the requirement of a `PluginID` also
means we could move node targetting into the shim RPC if wanted in the
future.
CSI Plugins that manage devices need not just access to the CSI
directory, but also to manage devices inside `/dev`.
This commit introduces a `/dev:/dev` mount to the container so that they
may do so.
This commit is the initial implementation of claiming volumes from the
server and passes through any publishContext information as appropriate.
There's nothing too fancy here.
The CSI Spec requires us to attach and stage volumes based on different
types of usage information when it may effect how they are bound. Here
we pass through some basic usage options in the CSI Hook (specifically
the volume aliases ReadOnly field), and the attachment/access mode from
the volume. We pass the attachment/access mode seperately from the
volume as it simplifies some handling and doesn't necessarily force
every attachment to use the same mode should more be supported (I.e if
we let each `volume "foo" {}` specify an override in the future).
This commit introduces initial support for unmounting csi volumes.
It takes a relatively simplistic approach to performing
NodeUnpublishVolume calls, optimising for cleaning up any leftover state
rather than terminating early in the case of errors.
This is because it happens during an allocation's shutdown flow and may
not always have a corresponding call to `NodePublishVolume` that
succeeded.
This commit implements support for creating driver mounts for CSI
Volumes.
It works by fetching the created mounts from the allocation resources
and then iterates through the volume requests, creating driver mount
configs as required.
It's a little bit messy primarily because there's _so_ much terminology
overlap and it's a bit difficult to follow.
This commit is an initial (read: janky) approach to forwarding state
from an allocrunner hook to a taskrunner using a similar `hookResources`
approach that tr's use internally.
It should eventually probably be replaced with something a little bit
more message based, but for things that only come from pre-run hooks,
and don't change, it's probably fine for now.
This commit introduces the first stage of volume mounting for an
allocation. The csimanager.VolumeMounter interface manages the blocking
and actual minutia of the CSI implementation allowing this hook to do
the minimal work of volume retrieval and creating mount info.
In the future the `CSIVolume.Get` request should be replaced by
`CSIVolume.Claim(Batch?)` to minimize the number of RPCs and to handle
external triggering of a ControllerPublishVolume request as required.
We also need to ensure that if pre-run hooks fail, we still get a full
unwinding of any publish and staged volumes to ensure that there are no hanging
references to volumes. That is not handled in this commit.
As part of introducing support for CSI, AllocRunner hooks need to be
able to communicate with Nomad Servers for validation of and interaction
with storage volumes. Here we create a small RPCer interface and pass
the client (rpc client) to the AR in preparation for making these RPCs.
This changeset implements the minimal structs on the client-side we
need to compile the work-in-progress implementation of the
server-to-controller RPCs. It doesn't include implementing the
`ClientCSI.DettachVolume` RPC on the client.
When providing paths to plugins, the path needs to be in the scope of
the plugins container, rather than that of the host.
Here we enable that by providing the mount point through the plugin
registration and then use it when constructing request target paths.
This commit introduces support for staging volumes when a plugin
implements the STAGE_UNSTAGE_VOLUME capability.
See the following for further reference material:
4731db0e0b/spec.md (nodestagevolume)
This commit adds helpers that create and validate the staging directory
for a given volume. It is currently missing usage options as the
interfaces are not yet in place for those.
The staging directory is only required when a volume has the
STAGE_UNSTAGE Volume capability and has to live within the plugin root
as the plugin needs to be able to create mounts inside it from within
the container.
This changeset is some pre-requisite boilerplate that is required for
introducing CSI volume management for client nodes.
It extracts out fingerprinting logic from the csi instance manager.
This change is to facilitate reusing the csimanager to also manage the
node-local CSI functionality, as it is the easiest place for us to
guaruntee health checking and to provide additional visibility into the
running operations through the fingerprinter mechanism and goroutine.
It also introduces the VolumeMounter interface that will be used to
manage staging/publishing unstaging/unpublishing of volumes on the host.
This commit introduces a new set of endpoints to a Nomad Client:
ClientCSI.
ClientCSI is responsible for mediating requests from a Nomad Server to
a CSI Plugin running on a Nomad Client. It should only really be used to
make controller RPCs.
This changeset implements the initial registration and fingerprinting
of CSI Plugins as part of #5378. At a high level, it introduces the
following:
* A `csi_plugin` stanza as part of a Nomad task configuration, to
allow a task to expose that it is a plugin.
* A new task runner hook: `csi_plugin_supervisor`. This hook does two
things. When the `csi_plugin` stanza is detected, it will
automatically configure the plugin task to receive bidirectional
mounts to the CSI intermediary directory. At runtime, it will then
perform an initial heartbeat of the plugin and handle submitting it to
the new `dynamicplugins.Registry` for further use by the client, and
then run a lightweight heartbeat loop that will emit task events
when health changes.
* The `dynamicplugins.Registry` for handling plugins that run
as Nomad tasks, in contrast to the existing catalog that requires
`go-plugin` type plugins and to know the plugin configuration in
advance.
* The `csimanager` which fingerprints CSI plugins, in a similar way to
`drivermanager` and `devicemanager`. It currently only fingerprints
the NodeID from the plugin, and assumes that all plugins are
monolithic.
Missing features
* We do not use the live updates of the `dynamicplugin` registry in
the `csimanager` yet.
* We do not deregister the plugins from the client when they shutdown
yet, they just become indefinitely marked as unhealthy. This is
deliberate until we figure out how we should manage deploying new
versions of plugins/transitioning them.