17476 commits

Author SHA1 Message Date
Lang Martin 0847cb513c
csi: volume/plugin list should return an empty array, not nil (#7443)
* nomad/csi_endpoint: return an empty list, not nil

* nomad/csi_endpoint_test: volume list returns non-nil
2020-03-23 21:21:40 -04:00
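The distinction in #7443 between an empty list and nil matters once the response is serialized. A minimal sketch, assuming the list is encoded with Go's `encoding/json`; `VolumeStub` and `encode` are hypothetical names for illustration, not Nomad's types:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// VolumeStub is a hypothetical stand-in for a list item; it is not Nomad's type.
type VolumeStub struct {
	ID string
}

// encode marshals a list of stubs to a JSON string.
func encode(vols []*VolumeStub) string {
	out, err := json.Marshal(vols)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(out)
}

func main() {
	var unset []*VolumeStub         // nil slice
	empty := make([]*VolumeStub, 0) // empty, non-nil slice

	fmt.Println(encode(unset)) // null
	fmt.Println(encode(empty)) // []
}
```

API consumers that iterate over the result can treat `[]` uniformly, while `null` usually needs a special case.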
Lang Martin d994990ef0
csi: the scheduler allows a job with a volume write claim to be updated (#7438)
* nomad/structs/csi: split CanWrite into health, in use

* scheduler/scheduler: expose AllocByID in the state interface

* nomad/state/state_store_test

* scheduler/stack: SetJobID on the matcher

* scheduler/feasible: when a volume writer is in use, check if it's us

* scheduler/feasible: remove SetJob

* nomad/state/state_store: denormalize allocs before Claim

* nomad/structs/csi: return errors on claim, with context

* nomad/csi_endpoint_test: new alloc doesn't look like an update

* nomad/state/state_store_test: change test reference to CanWrite
2020-03-23 21:21:04 -04:00
Drew Bailey 2a0f3badfa
Merge pull request #7444 from hashicorp/rename-auditor
make auditor interface more explicit
2020-03-23 20:23:28 -04:00
Drew Bailey 10f3b6899b
rename struct field to auditor 2020-03-23 20:09:01 -04:00
Mahmood Ali d2fd04745c
Merge pull request #7238 from hashicorp/vendor-hcl-20190228
Update github.com/hashicorp/hcl
2020-03-23 20:00:33 -04:00
Mahmood Ali 10e061e28d
Merge pull request #7384 from hashicorp/vendoring-tweaks-20200318
Vendor package cleanup
2020-03-23 19:48:12 -04:00
Drew Bailey cf5fcf3748
make auditor interface more explicit 2020-03-23 19:32:58 -04:00
Mahmood Ali ec1ad8d457
Merge pull request #7417 from hashicorp/f-allocstatus-lifecycle
cli: show lifecycle info in alloc status
2020-03-23 16:33:17 -04:00
Mahmood Ali 791d2d1555 vendor golang.org/x/crypto/ed25519/internal/edwards25519 2020-03-23 16:29:10 -04:00
Mahmood Ali 4ca15cf28e vendor: remove appengine unused package 2020-03-23 16:28:43 -04:00
Mahmood Ali a53738aebd explicitly set github.com/pkg/errors version 2020-03-23 16:28:42 -04:00
Mahmood Ali bed536812d remove unused packages 2020-03-23 16:28:11 -04:00
Mahmood Ali b03fca2913 update github.com/pkg/errors 2020-03-23 16:27:47 -04:00
Mahmood Ali 0d8430fe36 bad package 2020-03-23 16:27:24 -04:00
Mahmood Ali 61c42034d5 cli: show lifecycle info in alloc status
Display task lifecycle info in `nomad alloc status <alloc_id>` output.
I chose to embed it in the Task header and only add it for tasks with
lifecycle info.

I also chose to display the tasks in the following order:

1. prestart non-sidecar tasks
2. prestart sidecar tasks
3. main tasks

The tasks are sorted lexicographically within each tier.

Sample output:

```
$ nomad alloc status 6ec0eb52
ID                  = 6ec0eb52-e6c8-665c-169c-113d6081309b
Eval ID             = fb0caa98
Name                = lifecycle.cache[0]
[...]

Task "init" (prestart) is "dead"
Task Resources
CPU        Memory       Disk     Addresses
0/500 MHz  0 B/256 MiB  300 MiB
[...]

Task "some-sidecar" (prestart sidecar) is "running"
Task Resources
CPU        Memory          Disk     Addresses
0/500 MHz  68 KiB/256 MiB  300 MiB
[...]

Task "redis" is "running"
Task Resources
CPU         Memory           Disk     Addresses
10/500 MHz  984 KiB/256 MiB  300 MiB
[...]
```
2020-03-23 15:57:24 -04:00
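The task ordering described in commit 61c42034d5 (three tiers, lexicographic within each tier) could be sketched as follows. The `task` struct and the `tier` and `sortTasks` helpers are hypothetical simplifications and are not taken from the Nomad source:

```go
package main

import (
	"fmt"
	"sort"
)

// task is a hypothetical, simplified view of a task for display purposes.
type task struct {
	Name     string
	Prestart bool
	Sidecar  bool
}

// tier returns the display tier:
// 0 = prestart non-sidecar, 1 = prestart sidecar, 2 = main.
func tier(t task) int {
	switch {
	case t.Prestart && !t.Sidecar:
		return 0
	case t.Prestart && t.Sidecar:
		return 1
	default:
		return 2
	}
}

// sortTasks orders tasks by tier, then lexicographically by name.
func sortTasks(tasks []task) {
	sort.Slice(tasks, func(i, j int) bool {
		ti, tj := tier(tasks[i]), tier(tasks[j])
		if ti != tj {
			return ti < tj
		}
		return tasks[i].Name < tasks[j].Name
	})
}

func main() {
	tasks := []task{
		{Name: "redis"},
		{Name: "some-sidecar", Prestart: true, Sidecar: true},
		{Name: "init", Prestart: true},
	}
	sortTasks(tasks)
	for _, t := range tasks {
		fmt.Println(t.Name)
	}
}
```

With the three tasks from the sample output, this prints `init`, `some-sidecar`, `redis`, which matches the order shown there.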
Mahmood Ali c7d07aa20c
Merge pull request #7437 from hashicorp/ci-build-darwin
build darwin binaries in CI
2020-03-23 15:53:06 -04:00
Mahmood Ali 8a8817beb7 ci: fix darwin artifact path 2020-03-23 15:42:44 -04:00
Drew Bailey c040845131
Merge pull request #7436 from hashicorp/b-fix-compilation
fix compilation with correct func
2020-03-23 14:37:08 -04:00
Drew Bailey d0d32d8f06
fix compilation with correct func 2020-03-23 14:32:11 -04:00
Tim Gross 076fbbf08f
Merge pull request #7012 from hashicorp/f-csi-volumes
Container Storage Interface Support
2020-03-23 14:19:46 -04:00
Drew Bailey 768686deea
Merge pull request #7419 from hashicorp/f-event-pkg
Audit config, seams for enterprise audit features
2020-03-23 14:01:52 -04:00
Lang Martin e100444740 csi: add mount_options to volumes and volume requests (#7398)
Add mount_options to both the volume definition on registration and to the volume block in the group where the volume is requested. If both are specified, the options provided in the request replace the options defined in the volume. The options are passed to NodePublishVolume, which causes the node plugin to actually mount the volume on the host.

Individual tasks just bind mount into the host-mounted volume (unchanged behavior). An operator can mount the same volume with different options by specifying it twice in the group context.

closes #7007

* nomad/structs/volumes: add MountOptions to volume request

* jobspec/test-fixtures/basic.hcl: add mount_options to volume block

* jobspec/parse_test: add expected MountOptions

* api/tasks: add mount_options

* jobspec/parse_group: use hcl decode not mapstructure, mount_options

* client/allocrunner/csi_hook: pass MountOptions through

client/allocrunner/csi_hook: add a VolumeMountOptions

client/allocrunner/csi_hook: drop Options

client/allocrunner/csi_hook: use the structs options

* client/pluginmanager/csimanager/interface: UsageOptions.MountOptions

* client/pluginmanager/csimanager/volume: pass MountOptions in capabilities

* plugins/csi/plugin: remove todo 7007 comment

* nomad/structs/csi: MountOptions

* api/csi: add options to the api for parsing, match structs

* plugins/csi/plugin: move VolumeMountOptions to structs

* api/csi: use specific type for mount_options

* client/allocrunner/csi_hook: merge MountOptions here

* rename CSIOptions to CSIMountOptions

* client/allocrunner/csi_hook

* client/pluginmanager/csimanager/volume

* nomad/structs/csi

* plugins/csi/fake/client: add PrevVolumeCapability

* plugins/csi/plugin

* client/pluginmanager/csimanager/volume_test: remove debugging

* client/pluginmanager/csimanager/volume: fix odd merging logic

* api: rename CSIOptions -> CSIMountOptions

* nomad/csi_endpoint: remove a 7007 comment

* command/alloc_status: show mount options in the volume list

* nomad/structs/csi: include MountOptions in the volume stub

* api/csi: add MountOptions to stub

* command/volume_status_csi: clean up csiVolMountOption, add it

* command/alloc_status: csiVolMountOption lives in volume_csi_status

* command/node_status: display mount flags

* nomad/structs/volumes: npe

* plugins/csi/plugin: npe in ToCSIRepresentation

* jobspec/parse_test: expand volume parse test cases

* command/agent/job_endpoint: ApiTgToStructsTG needs MountOptions

* command/volume_status_csi: copy paste error

* jobspec/test-fixtures/basic: hclfmt

* command/volume_status_csi: clean up csiVolMountOption
2020-03-23 13:59:25 -04:00
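The precedence rule from #7398 (options on the volume request replace options on the volume definition) might look roughly like the sketch below. It assumes replacement happens field by field; the commit message does not spell that out. `MountOptions`, its fields, and `effective` are hypothetical names for illustration:

```go
package main

import "fmt"

// MountOptions is a hypothetical simplification; the field names are assumptions.
type MountOptions struct {
	FSType     string
	MountFlags []string
}

// effective returns the options to use for a mount. Fields set on the
// request replace the corresponding fields from the volume definition
// (assumed field-by-field behavior).
func effective(volume, request *MountOptions) *MountOptions {
	out := &MountOptions{}
	if volume != nil {
		out.FSType = volume.FSType
		out.MountFlags = append([]string(nil), volume.MountFlags...)
	}
	if request != nil {
		if request.FSType != "" {
			out.FSType = request.FSType
		}
		if request.MountFlags != nil {
			out.MountFlags = append([]string(nil), request.MountFlags...)
		}
	}
	return out
}

func main() {
	vol := &MountOptions{FSType: "ext4", MountFlags: []string{"noatime"}}
	req := &MountOptions{MountFlags: []string{"ro"}}
	opts := effective(vol, req)
	fmt.Println(opts.FSType, opts.MountFlags) // ext4 [ro]
}
```

Returning a fresh value instead of mutating either input keeps the stored volume definition unchanged when a request overrides it.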
Tim Gross d1f43a5fea csi: improve error messages from scheduler (#7426) 2020-03-23 13:59:25 -04:00
Tim Gross 32b94bf1a4 csi: stub fingerprint on instance manager shutdown (#7388)
Run the plugin fingerprint one last time with a closed client during
instance manager shutdown. This will return quickly and will give us a
correctly-populated `PluginInfo` marked as unhealthy so the Nomad
client can update the server about plugin health.
2020-03-23 13:59:25 -04:00
Tim Gross 5a0bcd39d1 csi: dynamically update plugin registration (#7386)
Allow for faster updates to plugin status when allocations become
terminal by listening for register/deregister events from the dynamic
plugin registry (which in turn are triggered by the plugin supervisor
hook).

The deregistration function closures that we pass up to the CSI plugin
manager don't properly close over the name and type of the
registration, causing monolith-type plugins to deregister only one of
their two plugins on alloc shutdown. Rebind plugin supervisor 
deregistration targets to fix that.

Includes log message and comment improvements
2020-03-23 13:59:25 -04:00
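The closure problem described in #7386 is a common Go pitfall. The sketch below shows the rebinding fix in isolation; `registration` and `deregisterFuncs` are hypothetical names, and the plugin type strings are only illustrative:

```go
package main

import "fmt"

// registration is a hypothetical pair identifying one registered plugin.
type registration struct {
	Type string
	Name string
}

// deregisterFuncs returns one closure per registration. Each closure
// records which registration it removed by appending to *removed.
func deregisterFuncs(regs []registration, removed *[]string) []func() {
	fns := make([]func(), 0, len(regs))
	for _, r := range regs {
		// Rebind so each closure captures its own copy. Before Go 1.22 the
		// loop variable was shared across iterations, so without this line
		// every closure would have seen the last registration.
		r := r
		fns = append(fns, func() {
			*removed = append(*removed, r.Type+"/"+r.Name)
		})
	}
	return fns
}

func main() {
	regs := []registration{
		{Type: "csi-controller", Name: "monolith"},
		{Type: "csi-node", Name: "monolith"},
	}
	var removed []string
	for _, fn := range deregisterFuncs(regs, &removed) {
		fn()
	}
	fmt.Println(removed) // [csi-controller/monolith csi-node/monolith]
}
```

Both halves of the monolith plugin are deregistered because each closure holds its own type and name.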
Lang Martin 6b6ae6c2bd csi: ACLs for plugin endpoints (#7380)
* acl/policy: add PolicyList for global ACLs

* acl/acl: plugin policy

* acl/acl: maxPrivilege is required to allow "list"

* nomad/csi_endpoint: enforce plugin access with PolicyPlugin

* nomad/csi_endpoint: check job ACL swapped params

* nomad/csi_endpoint_test: test alloc filtering

* acl/policy: add namespace csi-register-plugin

* nomad/job_endpoint: check csi-register-plugin ACL on registration

* nomad/job_endpoint_test: add plugin job cases
2020-03-23 13:59:25 -04:00
Lang Martin b596e67f47 csi: implement volume ACLs (#7339)
* acl/policy: add the volume ACL policies

* nomad/csi_endpoint: enforce ACLs for volume access

* nomad/search_endpoint_oss: volume acls

* acl/acl: add plugin read as a global policy

* acl/policy: add PluginPolicy global cap type

* nomad/csi_endpoint: check the global plugin ACL policy

* nomad/mock/acl: PluginPolicy

* nomad/csi_endpoint: fix list rebase

* nomad/core_sched_test: new test since #7358

* nomad/csi_endpoint_test: use correct permissions for list

* nomad/csi_endpoint: allowCSIMount keeps ACL checks together

* nomad/job_endpoint: check mount permission for jobs

* nomad/job_endpoint_test: need plugin read, too
2020-03-23 13:59:25 -04:00
Lang Martin 3621df1dbf csi: volume ids are only unique per namespace (#7358)
* nomad/state/schema: use the namespace compound index

* scheduler/scheduler: CSIVolumeByID interface signature namespace

* scheduler/stack: SetJob on CSIVolumeChecker to capture namespace

* scheduler/feasible: pass the captured namespace to CSIVolumeByID

* nomad/state/state_store: use namespace in csi_volume index

* nomad/fsm: pass namespace to CSIVolumeDeregister & Claim

* nomad/core_sched: pass the namespace in volumeClaimReap

* nomad/node_endpoint_test: namespaces in Claim testing

* nomad/csi_endpoint: pass RequestNamespace to state.*

* nomad/csi_endpoint_test: appropriately failed test

* command/alloc_status_test: appropriately failed test

* node_endpoint_test: avoid notTheNamespace for the job

* scheduler/feasible_test: call SetJob to capture the namespace

* nomad/csi_endpoint: ACL check the req namespace, query by namespace

* nomad/state/state_store: remove deregister namespace check

* nomad/state/state_store: remove unused CSIVolumes

* scheduler/feasible: CSIVolumeChecker SetJob -> SetNamespace

* nomad/csi_endpoint: ACL check

* nomad/state/state_store_test: remove call to state.CSIVolumes

* nomad/core_sched_test: job namespace match so claim gc works
2020-03-23 13:59:25 -04:00
Tim Gross fe926e899e volumes: add task environment interpolation to volume_mount (#7364) 2020-03-23 13:59:25 -04:00
Tim Gross 22e9f679c3 csi: implement controller detach RPCs (#7356)
This changeset implements the remaining controller detach RPCs: server-to-client and client-to-controller. The tests also uncovered a bug in our RPC for claims which is fixed here; the volume claim RPC is used for both claiming and releasing a claim on a volume. We should only submit a controller publish RPC when the claim is new and not when it's being released.
2020-03-23 13:59:25 -04:00
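The bug fix in #7356 comes down to a guard on the claim type. A minimal sketch of that guard, where `claimMode`, its constants, and `needsControllerPublish` are hypothetical names and not Nomad's API:

```go
package main

import "fmt"

// claimMode is a hypothetical enum for the kind of claim request.
type claimMode int

const (
	claimRead claimMode = iota
	claimWrite
	claimRelease
)

// needsControllerPublish reports whether a claim request should trigger a
// controller publish call: only for new claims on volumes that use the
// controller workflow, never for releases.
func needsControllerPublish(mode claimMode, controllerRequired bool) bool {
	if !controllerRequired {
		return false
	}
	return mode != claimRelease
}

func main() {
	fmt.Println(needsControllerPublish(claimWrite, true))   // true
	fmt.Println(needsControllerPublish(claimRelease, true)) // false
}
```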
Tim Gross cd1c6173f4 csi: e2e tests for EBS and EFS plugins (#7343)
This changeset provides two basic e2e tests for CSI plugins targeting
common AWS use cases.

The EBS test launches the EBS plugin (controller + nodes) and registers
an EBS volume as a Nomad CSI volume. We deploy a job that writes to
the volume, stop that job, and reuse the volume for another job which
should be able to read the data written by the first job.

The EFS test launches the EFS plugin (nodes-only) and registers an EFS
volume as a Nomad CSI volume. We deploy a job that writes to the
volume, stop that job, and reuse the volume for another job which
should be able to read the data written by the first job.

The writer jobs mount the CSI volume at a location within the alloc
dir.
2020-03-23 13:59:18 -04:00
Tim Gross 0cd2d3cc29 csi: make claims on volumes idempotent for the same alloc (#7328)
Nomad clients will push node updates during client restart which can
cause an extra claim for a volume by the same alloc. If an alloc
already claims a volume, we can allow it to be treated as a valid
claim and continue.
2020-03-23 13:58:30 -04:00
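The idempotency described in #7328 can be illustrated with a write claim that accepts a repeat request from the allocation that already holds it. This sketch assumes a single-writer volume for simplicity; `volume`, `claimWrite`, and `errInUse` are hypothetical:

```go
package main

import (
	"errors"
	"fmt"
)

// errInUse is returned when a different alloc already holds the write claim.
var errInUse = errors.New("volume already has a writer")

// volume is a hypothetical single-writer volume.
type volume struct {
	writer string // alloc ID holding the write claim, "" if none
}

// claimWrite grants the claim if the volume is free or if the same alloc
// asks again, so a repeated claim after a client restart is not an error.
func (v *volume) claimWrite(allocID string) error {
	switch v.writer {
	case "", allocID:
		v.writer = allocID
		return nil
	default:
		return errInUse
	}
}

func main() {
	v := &volume{}
	fmt.Println(v.claimWrite("alloc-1")) // <nil>
	fmt.Println(v.claimWrite("alloc-1")) // <nil>
	fmt.Println(v.claimWrite("alloc-2")) // volume already has a writer
}
```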
Tim Gross eda7be552c csi: add dynamicplugins registry to client state store (#7330)
In order to correctly fingerprint dynamic plugins on client restarts,
we need to persist a handle to the plugin (that is, connection info)
to the client state store.

The dynamic registry will sync automatically to the client state
whenever it receives a register/deregister call.
2020-03-23 13:58:30 -04:00
Lang Martin 6750c262a4 csi: use ExternalID, when set, to identify volumes for outside RPC calls (#7326)
* nomad/structs/csi: new RemoteID() uses the ExternalID if set

* nomad/csi_endpoint: pass RemoteID to volume request types

* client/pluginmanager/csimanager/volume: pass RemoteID to NodePublishVolume
2020-03-23 13:58:30 -04:00
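The `RemoteID()` behavior named in #7326 is a simple fallback. A minimal sketch, using a hypothetical two-field `Volume` struct rather than the real `nomad/structs/csi` type:

```go
package main

import "fmt"

// Volume is a hypothetical two-field simplification of a CSI volume.
type Volume struct {
	ID         string // ID used inside the cluster
	ExternalID string // ID known to the storage provider, may be empty
}

// RemoteID returns the ExternalID when set and falls back to the ID.
func (v *Volume) RemoteID() string {
	if v.ExternalID != "" {
		return v.ExternalID
	}
	return v.ID
}

func main() {
	a := &Volume{ID: "data", ExternalID: "vol-0123"}
	b := &Volume{ID: "scratch"}
	fmt.Println(a.RemoteID()) // vol-0123
	fmt.Println(b.RemoteID()) // scratch
}
```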
Tim Gross 1cf7ef44ed csi: docstring and log message fixups (#7327)
Fix some docstring typos and fix noisy log message during client restarts.
A log for the common case where the plugin socket isn't ready yet
isn't actionable by the operator so having it at info is just noise.
2020-03-23 13:58:30 -04:00
Lang Martin 99841222ed csi: change the API paths to match CLI command layout (#7325)
* command/agent/csi_endpoint: support type filter in volumes & plugins

* command/agent/http: use /v1/volume/csi & /v1/plugin/csi

* api/csi: use /v1/volume/csi & /v1/plugin/csi

* api/nodes: use /v1/volume/csi & /v1/plugin/csi

* api/nodes: not /volumes/csi, just /volumes

* command/agent/csi_endpoint: fix ot parameter parsing
2020-03-23 13:58:30 -04:00
Lang Martin 80619137ab csi: volumes listed in nomad node status (#7318)
* api/allocations: GetTaskGroup finds the taskgroup struct

* command/node_status: display CSI volume names

* nomad/state/state_store: new CSIVolumesByNodeID

* nomad/state/iterator: new SliceIterator type implements memdb.ResultIterator

* nomad/csi_endpoint: deal with a slice of volumes

* nomad/state/state_store: CSIVolumesByNodeID return a SliceIterator

* nomad/structs/csi: CSIVolumeListRequest takes a NodeID

* nomad/csi_endpoint: use the return iterator

* command/agent/csi_endpoint: parse query params for CSIVolumes.List

* api/nodes: new CSIVolumes to list volumes by node

* command/node_status: use the new list endpoint to print volumes

* nomad/state/state_store: error messages consider the operator

* command/node_status: include the Provider
2020-03-23 13:58:30 -04:00
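A slice-backed iterator like the `SliceIterator` mentioned in #7318 could be sketched as below. The `resultIterator` interface is declared locally and is assumed to mirror the two-method shape of go-memdb's `ResultIterator`; the implementation details are illustrative and not copied from Nomad:

```go
package main

import "fmt"

// resultIterator is a local interface assumed to match the shape of
// go-memdb's ResultIterator; declared here so the sketch has no dependencies.
type resultIterator interface {
	WatchCh() <-chan struct{}
	Next() interface{}
}

// sliceIterator walks over items collected in a slice.
type sliceIterator struct {
	items []interface{}
	idx   int
}

// Add appends an item to the end of the iteration order.
func (s *sliceIterator) Add(item interface{}) {
	s.items = append(s.items, item)
}

// Next returns the next item, or nil when the slice is exhausted.
func (s *sliceIterator) Next() interface{} {
	if s.idx >= len(s.items) {
		return nil
	}
	item := s.items[s.idx]
	s.idx++
	return item
}

// WatchCh returns nil: a plain slice has nothing to watch for changes.
func (s *sliceIterator) WatchCh() <-chan struct{} {
	return nil
}

// Compile-time check that sliceIterator satisfies the interface.
var _ resultIterator = (*sliceIterator)(nil)

func main() {
	it := &sliceIterator{}
	it.Add("vol-a")
	it.Add("vol-b")
	for raw := it.Next(); raw != nil; raw = it.Next() {
		fmt.Println(raw.(string))
	}
}
```

Callers that already consume an iterator can then handle a filtered or merged set of volumes without a separate code path.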
Lang Martin de25fc6cf4 csi: csi-hostpath plugin unimplemented error on controller publish (#7299)
* client/allocrunner/csi_hook: tag errors

* nomad/client_csi_endpoint: tag errors

* nomad/client_rpc: remove an unnecessary error tag

* nomad/state/state_store: ControllerRequired fix intent

We use ControllerRequired to indicate that a volume should use the
publish/unpublish workflow, rather than that it has a controller. We
need to check both RequiresControllerPlugin and SupportsAttachDetach
from the fingerprint to check that.

* nomad/csi_endpoint: tag errors

* nomad/csi_endpoint_test: longer error messages, mock fingerprints
2020-03-23 13:58:30 -04:00
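The `ControllerRequired` intent fix in #7299 says both fingerprint values must be checked. A minimal sketch of that condition, with a hypothetical flattened `fingerprint` struct:

```go
package main

import "fmt"

// fingerprint is a hypothetical flattened view of the two plugin
// fingerprint values named in the commit message.
type fingerprint struct {
	RequiresControllerPlugin bool
	SupportsAttachDetach     bool
}

// controllerRequired reports whether a volume should go through the
// publish/unpublish workflow: the plugin must both have a controller
// and support attach/detach.
func controllerRequired(fp fingerprint) bool {
	return fp.RequiresControllerPlugin && fp.SupportsAttachDetach
}

func main() {
	// A plugin that has a controller but does not implement attach/detach.
	controllerOnly := fingerprint{RequiresControllerPlugin: true}
	fmt.Println(controllerRequired(controllerOnly)) // false

	full := fingerprint{RequiresControllerPlugin: true, SupportsAttachDetach: true}
	fmt.Println(controllerRequired(full)) // true
}
```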
Tim Gross b04d23dae0 csi: ensure volume query is idempotent (#7303)
We denormalize the `CSIVolume` struct when we query it from the state
store by getting the plugin and its health. But unless we copy the
volume, this denormalization gets synced back to the state store
without passing through the fsm (which is invalid).
2020-03-23 13:58:30 -04:00
Tim Gross b57df162ce csi: ensure GET for plugin is idempotent (#7298)
We denormalize the `CSIPlugin` struct when we query it from the state
store by getting the current set of allocations that provide the
plugin. But unless we copy the plugin, this denormalization gets
synced back to the state store and each time we query we'll add
another copy of the current allocations.
2020-03-23 13:58:30 -04:00
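The copy-before-denormalize pattern behind #7303 and #7298 can be shown with an in-memory map standing in for the state store. `Plugin`, `store`, and `PluginByID` are hypothetical simplifications:

```go
package main

import "fmt"

// Plugin is a hypothetical simplification; Allocs is filled in at query time.
type Plugin struct {
	ID     string
	Allocs []string
}

// Copy returns a copy that shares no slice storage with the original.
func (p *Plugin) Copy() *Plugin {
	c := *p
	c.Allocs = append([]string(nil), p.Allocs...)
	return &c
}

// store is a hypothetical in-memory stand-in for the state store.
type store struct {
	plugins map[string]*Plugin
	allocs  map[string][]string // plugin ID -> alloc IDs providing it
}

// PluginByID denormalizes a copy, so the stored object is never mutated
// and repeated queries return the same result.
func (s *store) PluginByID(id string) *Plugin {
	p, ok := s.plugins[id]
	if !ok {
		return nil
	}
	out := p.Copy()
	out.Allocs = append(out.Allocs, s.allocs[id]...)
	return out
}

func main() {
	s := &store{
		plugins: map[string]*Plugin{"ebs": {ID: "ebs"}},
		allocs:  map[string][]string{"ebs": {"alloc-1"}},
	}
	first := s.PluginByID("ebs")
	second := s.PluginByID("ebs")
	fmt.Println(len(first.Allocs), len(second.Allocs), len(s.plugins["ebs"].Allocs)) // 1 1 0
}
```

Without the `Copy()` call, each query would append to the stored plugin and the alloc list would grow on every read.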
Tim Gross de4ad6ca38 csi: add Provider field to CSI CLIs and APIs (#7285)
Derive a provider name and version for plugins (and the volumes that
use them) from the CSI identity API `GetPluginInfo`. Expose the vendor
name as `Provider` in the API and CLI commands.
2020-03-23 13:58:30 -04:00
Lang Martin 887e1f28c9 csi: CLI for volume status, registration/deregistration and plugin status (#7193)
* command/csi: csi, csi_plugin, csi_volume

* helper/funcs: move ExtraKeys from parse_config to UnusedKeys

* command/agent/config_parse: use helper.UnusedKeys

* api/csi: annotate CSIVolumes with hcl fields

* command/csi_plugin: add Synopsis

* command/csi_volume_register: use hcl.Decode style parsing

* command/csi_volume_list

* command/csi_volume_status: list format, cleanup

* command/csi_plugin_list

* command/csi_plugin_status

* command/csi_volume_deregister

* command/csi_volume: add Synopsis

* api/contexts/contexts: add csi search contexts to the constants

* command/commands: register csi commands

* api/csi: fix struct tag for linter

* command/csi_plugin_list: unused struct vars

* command/csi_plugin_status: unused struct vars

* command/csi_volume_list: unused struct vars

* api/csi: add allocs to CSIPlugin

* command/csi_plugin_status: format the allocs

* api/allocations: copy Allocation.Stub in from structs

* nomad/client_rpc: add some error context with Errorf

* api/csi: collapse read & write alloc maps to a stub list

* command/csi_volume_status: cleanup allocation display

* command/csi_volume_list: use Schedulable instead of Healthy

* command/csi_volume_status: use Schedulable instead of Healthy

* command/csi_volume_list: sprintf string

* command/csi: delete csi.go, csi_plugin.go

* command/plugin: refactor csi components to sub-command plugin status

* command/plugin: remove csi

* command/plugin_status: remove csi

* command/volume: remove csi

* command/volume_status: split out csi specific

* helper/funcs: add RemoveEqualFold

* command/agent/config_parse: use helper.RemoveEqualFold

* api/csi: do ,unusedKeys right

* command/volume: refactor csi components to `nomad volume`

* command/volume_register: split out csi specific

* command/commands: use the new top level commands

* command/volume_deregister: hardwired type csi for now

* command/volume_status: csiFormatVolumes rescued from volume_list

* command/plugin_status: avoid a panic on no args

* command/volume_status: avoid a panic on no args

* command/plugin_status: predictVolumeType

* command/volume_status: predictVolumeType

* nomad/csi_endpoint_test: move CreateTestPlugin to testing

* command/plugin_status_test: use CreateTestCSIPlugin

* nomad/structs/structs: add CSIPlugins and CSIVolumes search consts

* nomad/state/state_store: add CSIPlugins and CSIVolumesByIDPrefix

* nomad/search_endpoint: add CSIPlugins and CSIVolumes

* command/plugin_status: move the header to the csi specific

* command/volume_status: move the header to the csi specific

* nomad/state/state_store: CSIPluginByID prefix

* command/status: rename the search context to just Plugins/Volumes

* command/plugin,volume_status: test return ids now

* command/status: rename the search context to just Plugins/Volumes

* command/plugin_status: support -json and -t

* command/volume_status: support -json and -t

* command/plugin_status_csi: comments

* command/*_status: clean up text

* api/csi: fix stale comments

* command/volume: make deregister sound less fearsome

* command/plugin_status: set the id length

* command/plugin_status_csi: more compact plugin health

* command/volume: better error message, comment
2020-03-23 13:58:30 -04:00
Tim Gross 016281135c storage: add volumes to 'nomad alloc status' CLI (#7256)
Adds a stanza for both Host Volumes and CSI Volumes to the CLI
output for `nomad alloc status`. Mostly relies on information already
in the API structs, but in the case where there are CSI Volumes we
need to make extra API calls to get the volume status. To reduce
overhead, these extra calls are hidden behind the `-verbose` flag.
2020-03-23 13:58:30 -04:00
Tim Gross b3bf64485e csi: remove DevDisableBootstrap flag from tests (#7267)
In #7252 we removed the `DevDisableBootstrap` flag to require tests to
honor only `BootstrapExpect`, in order to reduce a source of test
flakiness. This changeset applies the same fix to the CSI tests.
2020-03-23 13:58:30 -04:00
Lang Martin 369b0e54b9 csi: volumes use Schedulable rather than Healthy (#7250)
* structs: add ControllerRequired, volume.Name, no plug.Type

* structs: Healthy -> Schedulable

* state_store: Healthy -> Schedulable

* api: add ControllerRequired to api data types

* api: copy csi structs changes

* nomad/structs/csi: include name and external id

* api/csi: include Name and ExternalID

* nomad/structs/csi: comments for the 3 ids
2020-03-23 13:58:30 -04:00
Lang Martin a4784ef258 csi: add allocation context to fingerprinting results (#7133)
* structs: CSIInfo include AllocID, CSIPlugins no Jobs

* state_store: eliminate plugin Jobs, delete an empty plugin

* nomad/structs/csi: detect empty plugins correctly

* client/allocrunner/taskrunner/plugin_supervisor_hook: option AllocID

* client/pluginmanager/csimanager/instance: allocID

* client/pluginmanager/csimanager/fingerprint: set AllocID

* client/node_updater: split controller and node plugins

* api/csi: remove Jobs

The CSI Plugin API will map plugins to allocations, which allows
plugins to be defined by jobs in many configurations. In particular,
multiple plugins can be defined in the same job, and multiple jobs can
be used to define a single plugin.

Because we now map the allocation context directly from the node, it's
no longer necessary to track the jobs associated with a plugin
directly.

* nomad/csi_endpoint_test: CreateTestPlugin & register via fingerprint

* client/dynamicplugins: lift AllocID into the struct from Options

* api/csi_test: remove Jobs test

* nomad/structs/csi: CSIPlugins has an array of allocs

* nomad/state/state_store: implement CSIPluginDenormalize

* nomad/state/state_store: CSIPluginDenormalize npe on missing alloc

* nomad/csi_endpoint_test: defer deleteNodes for clarity

* api/csi_test: disable this test awaiting mocks:
https://github.com/hashicorp/nomad/issues/7123
2020-03-23 13:58:30 -04:00
Danielle Lancashire 247e86bb35 csi: VolumeCapabilities for ControllerPublishVolume
This commit introduces support for providing VolumeCapabilities during
requests to `ControllerPublishVolume` as this is a required field.
2020-03-23 13:58:30 -04:00
Danielle Lancashire e75f057df3 csi: Fix Controller RPCs
Currently the handling of CSINode RPCs does not correctly forward RPCs
to Nodes.

This commit fixes this by introducing a shim RPC
(nomad/client_csi_endpoint) that will correctly forward the request to
the owning node, or submit the RPC to the client.

In the process it also cleans up handling a little bit by adding the
`CSIControllerQuery` embedded struct for required forwarding state.

Because the embedded `CSIControllerQuery` requires a `PluginID`, we
could also move node targeting into the shim RPC in the future if
wanted.
2020-03-23 13:58:30 -04:00
Danielle Lancashire d5e255f97a client: Rename ClientCSI -> CSIController 2020-03-23 13:58:30 -04:00
Danielle Lancashire 5b05baf9f6 csi: Add /dev mounts to CSI Plugins
CSI Plugins that manage devices need access not just to the CSI
directory, but also to the devices inside `/dev`.

This commit introduces a `/dev:/dev` mount to the container so that they
may do so.
2020-03-23 13:58:30 -04:00