Commit Graph

3282 Commits

Author SHA1 Message Date
Lang Martin 50ff9ccd44
csi: plugin deregistration on plugin job GC (#7502)
* nomad/structs/csi: delete just one plugin type from a node

* nomad/structs/csi: add DeleteAlloc

* nomad/state/state_store: add deleteJobFromPlugin

* nomad/state/state_store: use DeleteAlloc not DeleteNodeType

* move CreateTestCSIPlugin to state to avoid an import cycle

* nomad/state/state_store_test: delete a plugin by deleting its jobs

* nomad/*_test: move CreateTestCSIPlugin to state

* nomad/state/state_store: update one plugin per transaction

* command/plugin_status_test: move CreateTestCSIPlugin

* nomad: csi: handle nils CSIPlugin methods, clarity
2020-03-26 17:07:18 -04:00
Lang Martin 3375c92aa0
csi: make volume registration idempotent (#7490)
If not in use and not changing external ids, it should not be an error to register a volume again.

* nomad/state/state_store: make volume registration idempotent
2020-03-26 12:27:19 -04:00
Lang Martin ea80330aaa
csi: nomad/structs: test volume denormalize without plugin (#7472) 2020-03-26 09:43:59 -04:00
Mahmood Ali b33dbe539b tests: TestCSIPluginEndpoint_ACLNamespaceAlloc is ent
TestCSIPluginEndpoint_ACLNamespaceAlloc uses namespace features not
present in OSS.
2020-03-25 08:45:44 -04:00
Mahmood Ali 281fc9837c tests: relax index checks
TestStateStore_Indexes specifically tests for `nodes` index, but asserts
on the exact number of indexes present in the state.  This is fragile
and will break almost everytime we add a state index.
2020-03-25 08:45:38 -04:00
Mahmood Ali ceed57b48f per-task restart policy 2020-03-24 17:00:41 -04:00
Chris Baker ffd79583f6
Merge pull request #7474 from hashicorp/f-scaling-changes-from-review
more testing for scaling API
2020-03-24 15:32:10 -05:00
Chris Baker c638c2c352 update RPC scaling endpoint tests to use renamed 'scale' policy disposition 2020-03-24 20:18:12 +00:00
Chris Baker 5979d6a81e more testing for ScalingPolicy, mainly around parsing and canonicalization for Min/Max 2020-03-24 19:43:50 +00:00
Chris Baker aa5beafe64 Job.Scale should not result in job update or eval create if args.Count == nil
plus tests
2020-03-24 17:36:06 +00:00
Tim Gross 913da68296
csi: remove client from plugin on client node update (#7462)
Plugins track the client nodes where they are placed. On client
updates, remove the client from the plugin tracking if the client is
no longer running an instance of that controller/node plugin.

Extends the state store tests to ensure deregistration works as
expected and that controllers and nodes are being tracked
independently.
2020-03-24 13:26:31 -04:00
Chris Baker 9e530e167d
Merge pull request #7409 from hashicorp/scaling-api
Scaling API changes
2020-03-24 11:02:09 -05:00
Chris Baker 606c79b320 add acl validation to Scaling.ListPolicies and Scaling.GetPolicy 2020-03-24 14:39:05 +00:00
Chris Baker f6ec5f9624 made count optional during job scaling actions
added ACL protection in Job.Scale
in Job.Scale, only perform a Job.Register if the Count was non-nil
2020-03-24 14:39:05 +00:00
Chris Baker 41b002eecc wip: ACL checking for RPC Job.ScaleStatus 2020-03-24 14:39:05 +00:00
Lang Martin bd22afd003
csi: volume deregister fails for volumes actively in use (#7445)
* nomad/structs/csi: add InUse to CSIVolume

* nomad/state/state_store: block volume deregistration for in use vols
2020-03-24 10:10:44 -04:00
Chris Baker 233db5258a changes to Canonicalize, Validate, and api->struct conversion so that tg.Count, tg.Scaling.Min/Max are well-defined with reasonable defaults.
- tg.Count defaults to tg.Scaling.Min if present (falls back on previous default of 1 if Scaling is absent)
- Validate() enforces tg.Scaling.Min <= tg.Count <= tg.Scaling.Max

modification in ApiScalingPolicyToStructs, api.TaskGroup.Validate so that defaults are handled for TaskGroup.Count and
2020-03-24 13:57:17 +00:00
Chris Baker f9876a487e finished Job.ScaleStatus RPC, need to work on http endpoint 2020-03-24 13:57:16 +00:00
Chris Baker 925b59e1d2 wip: scaling status return, almost done 2020-03-24 13:57:15 +00:00
James Rasell f125b5fb2d scaling: ensure min and max int64s are in toplevel of block. 2020-03-24 13:57:15 +00:00
Chris Baker 42270d862c wip: some tests still failing
updating job scaling endpoints to match RFC, cleaning up the API object as well
2020-03-24 13:57:14 +00:00
Chris Baker abc7a52f56 finished refactoring state store, schema, etc 2020-03-24 13:57:14 +00:00
Chris Baker 116aa98ed7 wip: removed some commented junk from scaling poc 2020-03-24 13:57:13 +00:00
Chris Baker 3d54f1feba wip: added Enabled to ScalingPolicyListStub, removed JobID from body of scaling request 2020-03-24 13:57:12 +00:00
Chris Baker 024d203267 wip: added tests for client methods around group scaling 2020-03-24 13:57:11 +00:00
Chris Baker 179ab68258 wip: added job.scale rpc endpoint, needs explicit test (tested via http now) 2020-03-24 13:57:09 +00:00
Chris Baker 8453e667c2 wip: working on job group scaling endpoint 2020-03-24 13:55:20 +00:00
Chris Baker 6665d0bfb0 wip: added policy get endpoint, added UUID to policy 2020-03-24 13:55:20 +00:00
Chris Baker 9c2560ceeb wip: upsert/delete scaling policies on job upsert/delete 2020-03-24 13:55:18 +00:00
Chris Baker 65d92f1fbf WIP: adding ScalingPolicy to api/structs and state store 2020-03-24 13:55:18 +00:00
Tim Gross fa01a6ea59 csi: fix missing health count from volume list stub 2020-03-24 09:42:59 -04:00
Lang Martin 0847cb513c
csi: volume/plugin list should return an empty array, not nil (#7443)
* nomad/csi_endpoint: return an empty list, not nil

* nomad/csi_endpoint_test: volume list returns non-nil
2020-03-23 21:21:40 -04:00
Lang Martin d994990ef0
csi: the scheduler allows a job with a volume write claim to be updated (#7438)
* nomad/structs/csi: split CanWrite into health, in use

* scheduler/scheduler: expose AllocByID in the state interface

* nomad/state/state_store_test

* scheduler/stack: SetJobID on the matcher

* scheduler/feasible: when a volume writer is in use, check if it's us

* scheduler/feasible: remove SetJob

* nomad/state/state_store: denormalize allocs before Claim

* nomad/structs/csi: return errors on claim, with context

* nomad/csi_endpoint_test: new alloc doesn't look like an update

* nomad/state/state_store_test: change test reference to CanWrite
2020-03-23 21:21:04 -04:00
Tim Gross 076fbbf08f
Merge pull request #7012 from hashicorp/f-csi-volumes
Container Storage Interface Support
2020-03-23 14:19:46 -04:00
Lang Martin e100444740 csi: add mount_options to volumes and volume requests (#7398)
Add mount_options to both the volume definition on registration and to the volume block in the group where the volume is requested. If both are specified, the options provided in the request replace the options defined in the volume. They get passed to the NodePublishVolume, which causes the node plugin to actually mount the volume on the host.

Individual tasks just mount bind into the host mounted volume (unchanged behavior). An operator can mount the same volume with different options by specifying it twice in the group context.

closes #7007

* nomad/structs/volumes: add MountOptions to volume request

* jobspec/test-fixtures/basic.hcl: add mount_options to volume block

* jobspec/parse_test: add expected MountOptions

* api/tasks: add mount_options

* jobspec/parse_group: use hcl decode not mapstructure, mount_options

* client/allocrunner/csi_hook: pass MountOptions through

client/allocrunner/csi_hook: add a VolumeMountOptions

client/allocrunner/csi_hook: drop Options

client/allocrunner/csi_hook: use the structs options

* client/pluginmanager/csimanager/interface: UsageOptions.MountOptions

* client/pluginmanager/csimanager/volume: pass MountOptions in capabilities

* plugins/csi/plugin: remove todo 7007 comment

* nomad/structs/csi: MountOptions

* api/csi: add options to the api for parsing, match structs

* plugins/csi/plugin: move VolumeMountOptions to structs

* api/csi: use specific type for mount_options

* client/allocrunner/csi_hook: merge MountOptions here

* rename CSIOptions to CSIMountOptions

* client/allocrunner/csi_hook

* client/pluginmanager/csimanager/volume

* nomad/structs/csi

* plugins/csi/fake/client: add PrevVolumeCapability

* plugins/csi/plugin

* client/pluginmanager/csimanager/volume_test: remove debugging

* client/pluginmanager/csimanager/volume: fix odd merging logic

* api: rename CSIOptions -> CSIMountOptions

* nomad/csi_endpoint: remove a 7007 comment

* command/alloc_status: show mount options in the volume list

* nomad/structs/csi: include MountOptions in the volume stub

* api/csi: add MountOptions to stub

* command/volume_status_csi: clean up csiVolMountOption, add it

* command/alloc_status: csiVolMountOption lives in volume_csi_status

* command/node_status: display mount flags

* nomad/structs/volumes: npe

* plugins/csi/plugin: npe in ToCSIRepresentation

* jobspec/parse_test: expand volume parse test cases

* command/agent/job_endpoint: ApiTgToStructsTG needs MountOptions

* command/volume_status_csi: copy paste error

* jobspec/test-fixtures/basic: hclfmt

* command/volume_status_csi: clean up csiVolMountOption
2020-03-23 13:59:25 -04:00
Lang Martin 6b6ae6c2bd csi: ACLs for plugin endpoints (#7380)
* acl/policy: add PolicyList for global ACLs

* acl/acl: plugin policy

* acl/acl: maxPrivilege is required to allow "list"

* nomad/csi_endpoint: enforce plugin access with PolicyPlugin

* nomad/csi_endpoint: check job ACL swapped params

* nomad/csi_endpoint_test: test alloc filtering

* acl/policy: add namespace csi-register-plugin

* nomad/job_endpoint: check csi-register-plugin ACL on registration

* nomad/job_endpoint_test: add plugin job cases
2020-03-23 13:59:25 -04:00
Lang Martin b596e67f47 csi: implement volume ACLs (#7339)
* acl/policy: add the volume ACL policies

* nomad/csi_endpoint: enforce ACLs for volume access

* nomad/search_endpoint_oss: volume acls

* acl/acl: add plugin read as a global policy

* acl/policy: add PluginPolicy global cap type

* nomad/csi_endpoint: check the global plugin ACL policy

* nomad/mock/acl: PluginPolicy

* nomad/csi_endpoint: fix list rebase

* nomad/core_sched_test: new test since #7358

* nomad/csi_endpoint_test: use correct permissions for list

* nomad/csi_endpoint: allowCSIMount keeps ACL checks together

* nomad/job_endpoint: check mount permission for jobs

* nomad/job_endpoint_test: need plugin read, too
2020-03-23 13:59:25 -04:00
Lang Martin 3621df1dbf csi: volume ids are only unique per namespace (#7358)
* nomad/state/schema: use the namespace compound index

* scheduler/scheduler: CSIVolumeByID interface signature namespace

* scheduler/stack: SetJob on CSIVolumeChecker to capture namespace

* scheduler/feasible: pass the captured namespace to CSIVolumeByID

* nomad/state/state_store: use namespace in csi_volume index

* nomad/fsm: pass namespace to CSIVolumeDeregister & Claim

* nomad/core_sched: pass the namespace in volumeClaimReap

* nomad/node_endpoint_test: namespaces in Claim testing

* nomad/csi_endpoint: pass RequestNamespace to state.*

* nomad/csi_endpoint_test: appropriately failed test

* command/alloc_status_test: appropriately failed test

* node_endpoint_test: avoid notTheNamespace for the job

* scheduler/feasible_test: call SetJob to capture the namespace

* nomad/csi_endpoint: ACL check the req namespace, query by namespace

* nomad/state/state_store: remove deregister namespace check

* nomad/state/state_store: remove unused CSIVolumes

* scheduler/feasible: CSIVolumeChecker SetJob -> SetNamespace

* nomad/csi_endpoint: ACL check

* nomad/state/state_store_test: remove call to state.CSIVolumes

* nomad/core_sched_test: job namespace match so claim gc works
2020-03-23 13:59:25 -04:00
Tim Gross 22e9f679c3 csi: implement controller detach RPCs (#7356)
This changeset implements the remaining controller detach RPCs: server-to-client and client-to-controller. The tests also uncovered a bug in our RPC for claims which is fixed here; the volume claim RPC is used for both claiming and releasing a claim on a volume. We should only submit a controller publish RPC when the claim is new and not when it's being released.
2020-03-23 13:59:25 -04:00
Tim Gross 0cd2d3cc29 csi: make claims on volumes idempotent for the same alloc (#7328)
Nomad clients will push node updates during client restart which can
cause an extra claim for a volume by the same alloc. If an alloc
already claims a volume, we can allow it to be treated as a valid
claim and continue.
2020-03-23 13:58:30 -04:00
Lang Martin 6750c262a4 csi: use `ExternalID`, when set, to identify volumes for outside RPC calls (#7326)
* nomad/structs/csi: new RemoteID() uses the ExternalID if set

* nomad/csi_endpoint: pass RemoteID to volume request types

* client/pluginmanager/csimanager/volume: pass RemoteID to NodePublishVolume
2020-03-23 13:58:30 -04:00
Lang Martin 80619137ab csi: volumes listed in `nomad node status` (#7318)
* api/allocations: GetTaskGroup finds the taskgroup struct

* command/node_status: display CSI volume names

* nomad/state/state_store: new CSIVolumesByNodeID

* nomad/state/iterator: new SliceIterator type implements memdb.ResultIterator

* nomad/csi_endpoint: deal with a slice of volumes

* nomad/state/state_store: CSIVolumesByNodeID return a SliceIterator

* nomad/structs/csi: CSIVolumeListRequest takes a NodeID

* nomad/csi_endpoint: use the return iterator

* command/agent/csi_endpoint: parse query params for CSIVolumes.List

* api/nodes: new CSIVolumes to list volumes by node

* command/node_status: use the new list endpoint to print volumes

* nomad/state/state_store: error messages consider the operator

* command/node_status: include the Provider
2020-03-23 13:58:30 -04:00
Lang Martin de25fc6cf4 csi: csi-hostpath plugin unimplemented error on controller publish (#7299)
* client/allocrunner/csi_hook: tag errors

* nomad/client_csi_endpoint: tag errors

* nomad/client_rpc: remove an unnecessary error tag

* nomad/state/state_store: ControllerRequired fix intent

We use ControllerRequired to indicate that a volume should use the
publish/unpublish workflow, rather than that it has a controller. We
need to check both RequiresControllerPlugin and SupportsAttachDetach
from the fingerprint to check that.

* nomad/csi_endpoint: tag errors

* nomad/csi_endpoint_test: longer error messages, mock fingerprints
2020-03-23 13:58:30 -04:00
Tim Gross b04d23dae0 csi: ensure volume query is idempotent (#7303)
We denormalize the `CSIVolume` struct when we query it from the state
store by getting the plugin and its health. But unless we copy the
volume, this denormalization gets synced back to the state store
without passing through the fsm (which is invalid).
2020-03-23 13:58:30 -04:00
Tim Gross b57df162ce csi: ensure GET for plugin is idempotent (#7298)
We denormalize the `CSIPlugin` struct when we query it from the state
store by getting the current set of allocations that provide the
plugin. But unless we copy the plugin, this denormalization gets
synced back to the state store and each time we query we'll add
another copy of the current allocations.
2020-03-23 13:58:30 -04:00
Tim Gross de4ad6ca38 csi: add Provider field to CSI CLIs and APIs (#7285)
Derive a provider name and version for plugins (and the volumes that
use them) from the CSI identity API `GetPluginInfo`. Expose the vendor
name as `Provider` in the API and CLI commands.
2020-03-23 13:58:30 -04:00
Lang Martin 887e1f28c9 csi: CLI for volume status, registration/deregistration and plugin status (#7193)
* command/csi: csi, csi_plugin, csi_volume

* helper/funcs: move ExtraKeys from parse_config to UnusedKeys

* command/agent/config_parse: use helper.UnusedKeys

* api/csi: annotate CSIVolumes with hcl fields

* command/csi_plugin: add Synopsis

* command/csi_volume_register: use hcl.Decode style parsing

* command/csi_volume_list

* command/csi_volume_status: list format, cleanup

* command/csi_plugin_list

* command/csi_plugin_status

* command/csi_volume_deregister

* command/csi_volume: add Synopsis

* api/contexts/contexts: add csi search contexts to the constants

* command/commands: register csi commands

* api/csi: fix struct tag for linter

* command/csi_plugin_list: unused struct vars

* command/csi_plugin_status: unused struct vars

* command/csi_volume_list: unused struct vars

* api/csi: add allocs to CSIPlugin

* command/csi_plugin_status: format the allocs

* api/allocations: copy Allocation.Stub in from structs

* nomad/client_rpc: add some error context with Errorf

* api/csi: collapse read & write alloc maps to a stub list

* command/csi_volume_status: cleanup allocation display

* command/csi_volume_list: use Schedulable instead of Healthy

* command/csi_volume_status: use Schedulable instead of Healthy

* command/csi_volume_list: sprintf string

* command/csi: delete csi.go, csi_plugin.go

* command/plugin: refactor csi components to sub-command plugin status

* command/plugin: remove csi

* command/plugin_status: remove csi

* command/volume: remove csi

* command/volume_status: split out csi specific

* helper/funcs: add RemoveEqualFold

* command/agent/config_parse: use helper.RemoveEqualFold

* api/csi: do ,unusedKeys right

* command/volume: refactor csi components to `nomad volume`

* command/volume_register: split out csi specific

* command/commands: use the new top level commands

* command/volume_deregister: hardwired type csi for now

* command/volume_status: csiFormatVolumes rescued from volume_list

* command/plugin_status: avoid a panic on no args

* command/volume_status: avoid a panic on no args

* command/plugin_status: predictVolumeType

* command/volume_status: predictVolumeType

* nomad/csi_endpoint_test: move CreateTestPlugin to testing

* command/plugin_status_test: use CreateTestCSIPlugin

* nomad/structs/structs: add CSIPlugins and CSIVolumes search consts

* nomad/state/state_store: add CSIPlugins and CSIVolumesByIDPrefix

* nomad/search_endpoint: add CSIPlugins and CSIVolumes

* command/plugin_status: move the header to the csi specific

* command/volume_status: move the header to the csi specific

* nomad/state/state_store: CSIPluginByID prefix

* command/status: rename the search context to just Plugins/Volumes

* command/plugin,volume_status: test return ids now

* command/status: rename the search context to just Plugins/Volumes

* command/plugin_status: support -json and -t

* command/volume_status: support -json and -t

* command/plugin_status_csi: comments

* command/*_status: clean up text

* api/csi: fix stale comments

* command/volume: make deregister sound less fearsome

* command/plugin_status: set the id length

* command/plugin_status_csi: more compact plugin health

* command/volume: better error message, comment
2020-03-23 13:58:30 -04:00
Tim Gross b3bf64485e csi: remove DevDisableBootstrap flag from tests (#7267)
In #7252 we removed the `DevDisableBootstrap` flag to require tests to
honor only `BootstrapExpect`, in order to reduce a source of test
flakiness. This changeset applies the same fix to the CSI tests.
2020-03-23 13:58:30 -04:00
Lang Martin 369b0e54b9 csi: volumes use `Schedulable` rather than `Healthy` (#7250)
* structs: add ControllerRequired, volume.Name, no plug.Type

* structs: Healthy -> Schedulable

* state_store: Healthy -> Schedulable

* api: add ControllerRequired to api data types

* api: copy csi structs changes

* nomad/structs/csi: include name and external id

* api/csi: include Name and ExternalID

* nomad/structs/csi: comments for the 3 ids
2020-03-23 13:58:30 -04:00
Lang Martin a4784ef258 csi add allocation context to fingerprinting results (#7133)
* structs: CSIInfo include AllocID, CSIPlugins no Jobs

* state_store: eliminate plugin Jobs, delete an empty plugin

* nomad/structs/csi: detect empty plugins correctly

* client/allocrunner/taskrunner/plugin_supervisor_hook: option AllocID

* client/pluginmanager/csimanager/instance: allocID

* client/pluginmanager/csimanager/fingerprint: set AllocID

* client/node_updater: split controller and node plugins

* api/csi: remove Jobs

The CSI Plugin API will map plugins to allocations, which allows
plugins to be defined by jobs in many configurations. In particular,
multiple plugins can be defined in the same job, and multiple jobs can
be used to define a single plugin.

Because we now map the allocation context directly from the node, it's
no longer necessary to track the jobs associated with a plugin
directly.

* nomad/csi_endpoint_test: CreateTestPlugin & register via fingerprint

* client/dynamicplugins: lift AllocID into the struct from Options

* api/csi_test: remove Jobs test

* nomad/structs/csi: CSIPlugins has an array of allocs

* nomad/state/state_store: implement CSIPluginDenormalize

* nomad/state/state_store: CSIPluginDenormalize npe on missing alloc

* nomad/csi_endpoint_test: defer deleteNodes for clarity

* api/csi_test: disable this test awaiting mocks:
https://github.com/hashicorp/nomad/issues/7123
2020-03-23 13:58:30 -04:00
Danielle Lancashire e75f057df3 csi: Fix Controller RPCs
Currently the handling of CSINode RPCs does not correctly handle
forwarding RPCs to Nodes.

This commit fixes this by introducing a shim RPC
(nomad/client_csi_enpdoint) that will correctly forward the request to
the owning node, or submit the RPC to the client.

In the process it also cleans up handling a little bit by adding the
`CSIControllerQuery` embeded struct for required forwarding state.

The CSIControllerQuery embeding the requirement of a `PluginID` also
means we could move node targetting into the shim RPC if wanted in the
future.
2020-03-23 13:58:30 -04:00
Tim Gross 8bc5641438 csi: volume claim garbage collection (#7125)
When an alloc is marked terminal (and after node unstage/unpublish
have been called), the client syncs the terminal alloc state with the
server via `Node.UpdateAlloc RPC`.

For each job that has a terminal alloc, the `Node.UpdateAlloc` RPC
handler at the server will emit an eval for a new core job to garbage
collect CSI volume claims. When this eval is handled on the core
scheduler, it will call a `volumeReap` method to release the claims
for all terminal allocs on the job.

The volume reap will issue a `ControllerUnpublishVolume` RPC for any
node that has no alloc claiming the volume. Once this returns (or
is skipped), the volume reap will send a new `CSIVolume.Claim` RPC
that releases the volume claim for that allocation in the state store,
making it available for scheduling again.

This same `volumeReap` method will be called from the core job GC,
which gives us a second chance to reclaim volumes during GC if there
were controller RPC failures.
2020-03-23 13:58:30 -04:00
Danielle Lancashire 9d4307a3ef csi_endpoint: Provide AllocID in req, and return Volume
Currently, the client has to ship an entire allocation to the server as
part of performing a VolumeClaim, this has a few problems:

Firstly, it means the client is sending significantly more data than is
required (an allocation contains the entire contents of a Nomad job,
alongside other irrelevant state) which has a non-zero (de)serialization
cost.

Secondly, because the allocation was never re-fetched from the state
store, it means that we were potentially open to issues caused by stale
state on a misbehaving or malicious client.

The change removes both of those issues at the cost of a couple of more
state store lookups, but they should be relatively cheap.

We also now provide the CSIVolume in the response for a claim, so the
client can perform a Claim without first going ahead and fetching all of
the volumes.
2020-03-23 13:58:30 -04:00
Danielle Lancashire c3b1154703 csi: Validate Volumes during registration
This PR implements some intitial support for doing deeper validation of
a volume during its registration with the server. This allows us to
validate the capabilities before users attempt to use the volumes during
most cases, and also prevents registering volumes without first setting
up a plugin, which should help to catch typos and the like during
registration.

This does have the downside of requiring users to wait for (1) instance
of a plugin to be running in their cluster before they can register
volumes.
2020-03-23 13:58:30 -04:00
Tim Gross b03b78b212 csi: server-to-controller publish/unpublish RPCs (#7124)
Nomad servers need to make requests to CSI controller plugins running
on a client for publish/unpublish. The RPC needs to look up the client
node based on the plugin, load balancing across controllers, and then
perform the required client RPC to that node (via server forwarding if
neccessary).
2020-03-23 13:58:30 -04:00
Tim Gross b9b315f8d1 csi: stub methods for server-to-controller RPC calls (#7117) 2020-03-23 13:58:30 -04:00
Danielle Lancashire 77bcaa8183 csi_endpoint: Support No ACLs and restrict Nodes
This commit refactors the ACL code for the CSI endpoint to support
environments that run without acls enabled (e.g developer environments)
and also provides an easy way to restrict which endpoints may be
accessed with a client's SecretID to limit the blast radius of a
malicious client on the state of the environment.
2020-03-23 13:58:30 -04:00
Danielle Lancashire 22e8317a53 csi: Disable validation of volume topology 2020-03-23 13:58:30 -04:00
Tim Gross 01c704ab9d csi: add PublishContext to CSIVolumeClaimResponse (#7113)
The `ControllerPublishVolumeResponse` CSI RPC includes the publish
context intended to be passed by the orchestrator as an opaque value
to the node plugins. This changeset adds it to our response to a
volume claim request to proxy the controller's response back to the
client node.
2020-03-23 13:58:29 -04:00
Tim Gross fb1aad66ee csi: implement releasing volume claims for terminal allocs (#7076)
When an alloc is marked terminal, and after node unstage/unpublish
have been called, the client will sync the terminal alloc state with
the server via `Node.UpdateAlloc` RPC.

This changeset implements releasing the volume claim for each volume
associated with the terminal alloc. It doesn't yet implement the RPC
call we need to make to the `ControllerUnpublishVolume` CSI RPC.
2020-03-23 13:58:29 -04:00
Tim Gross d4cd272de3 csi: implement VolumeClaimRPC (#7048)
When the client receives an allocation which includes a CSI volume,
the alloc runner will block its main `Run` loop. The alloc runner will
issue a `VolumeClaim` RPC to the Nomad servers. This changeset
implements the portions of the `VolumeClaim` RPC endpoint that have
not been previously completed.
2020-03-23 13:58:29 -04:00
Lang Martin 421d7ed2e4 nomad: csi_endpoint send register & deregister requests to raft (#7059) 2020-03-23 13:58:29 -04:00
Lang Martin 7b675f89ac csi: fix index maintenance for CSIVolume and CSIPlugin tables (#7049)
* state_store: csi volumes/plugins store the index in the txn

* nomad: csi_endpoint_test require index checks need uint64()

* nomad: other tests using int 0 not uint64(0)

* structs: pass index into New, but not other struct methods

* state_store: csi plugin indexes, use new struct interface

* nomad: csi_endpoint_test check index/query meta (on explicit 0)

* structs: NewCSIVolume takes an index arg now

* scheduler/test: NewCSIVolume takes an index arg now
2020-03-23 13:58:29 -04:00
Lang Martin a0a6766740 CSI: Scheduler knows about CSI constraints and availability (#6995)
* structs: piggyback csi volumes on host volumes for job specs

* state_store: CSIVolumeByID always includes plugins, matches usecase

* scheduler/feasible: csi volume checker

* scheduler/stack: add csi volumes

* contributing: update rpc checklist

* scheduler: add volumes to State interface

* scheduler/feasible: introduce new checker collection tgAvailable

* scheduler/stack: taskGroupCSIVolumes checker is transient

* state_store CSIVolumeDenormalizePlugins comment clarity

* structs: remote TODO comment in TaskGroup Validate

* scheduler/feasible: CSIVolumeChecker hasPlugins improve comment

* scheduler/feasible_test: set t.Parallel

* Update nomad/state/state_store.go

Co-Authored-By: Danielle <dani@hashicorp.com>

* Update scheduler/feasible.go

Co-Authored-By: Danielle <dani@hashicorp.com>

* structs: lift ControllerRequired to each volume

* state_store: store plug.ControllerRequired, use it for volume health

* feasible: csi match fast path remove stale host volume copied logic

* scheduler/feasible: improve comments

Co-authored-by: Danielle <dani@builds.terrible.systems>
2020-03-23 13:58:29 -04:00
Tim Gross 8673ea5cba csi: add empty CSI volume publication GC to scheduled core jobs (#7014)
This changeset adds a new core job `CoreJobCSIVolumePublicationGC` to
the leader's loop for scheduling core job evals. Right now this is an
empty method body without even a config file stanza. Later changesets
will implement the logic of volume publication GC.
2020-03-23 13:58:29 -04:00
Lang Martin 88316208a0 csi: server-side plugin state tracking and api (#6966)
* structs: CSIPlugin indexes jobs acting as plugins and node updates

* schema: csi_plugins table for CSIPlugin

* nomad: csi_endpoint use vol.Denormalize, plugin requests

* nomad: csi_volume_endpoint: rename to csi_endpoint

* agent: add CSI plugin endpoints

* state_store_test: use generated ids to avoid t.Parallel conflicts

* contributing: add note about registering new RPC structs

* command: agent http register plugin lists

* api: CSI plugin queries, ControllerHealthy -> ControllersHealthy

* state_store: copy on write for volumes and plugins

* structs: copy on write for volumes and plugins

* state_store: CSIVolumeByID returns an unhealthy volume, denormalize

* nomad: csi_endpoint use CSIVolumeDenormalizePlugins

* structs: remove struct errors for missing objects

* nomad: csi_endpoint return nil for missing objects, not errors

* api: return meta from Register to avoid EOF error

* state_store: CSIVolumeDenormalize keep allocs in their own maps

* state_store: CSIVolumeDeregister error on missing volume

* state_store: CSIVolumeRegister set indexes

* nomad: csi_endpoint use CSIVolumeDenormalizePlugins tests
2020-03-23 13:58:29 -04:00
Lang Martin 61cfc806ad csi_volume_endpoint_test: gen uuids to avoid t.Parallel conflicts 2020-03-23 13:58:29 -04:00
Lang Martin 334979a754 nomad/rpc: indicate missing region in error message 2020-03-23 13:58:29 -04:00
Lang Martin 5b31b140c3 csi: do not use namespace specific identifiers 2020-03-23 13:58:29 -04:00
Lang Martin e922531aaf structs: move the content of csi_volumes into csi 2020-03-23 13:58:29 -04:00
Lang Martin 04b6e7c7fb server: rpc register CSIVolume 2020-03-23 13:58:29 -04:00
Lang Martin 8f33fb9a6d csi volume endpoint: new RPC endpoint for CSI volumes 2020-03-23 13:58:29 -04:00
Lang Martin 4bb4dd98eb state_store: CSIVolume insert, get, delete, claim
state_store: change claim counts

state_store: get volumes by all, by driver

state_store: process volume claims

state_store: csi volume register error on update
2020-03-23 13:58:29 -04:00
Lang Martin 0422b967db schema: csi_volumes schema 2020-03-23 13:58:29 -04:00
Lang Martin 857cd37ab5 fsm: dispatch CSIVolume register, deregister, claim 2020-03-23 13:58:29 -04:00
Lang Martin f9d9faf673 structs: eliminate MaxReaders & MaxWriters 2020-03-23 13:58:29 -04:00
Lang Martin 3a7e1b6d14 client structs: move CSIVolumeAttachmentMode and CSIVolumeAccessMode 2020-03-23 13:58:29 -04:00
Lang Martin 637ce9dfad structs: new CSIVolume, request types 2020-03-23 13:58:29 -04:00
Danielle Lancashire 57ae1d2cd6 csimanager: Fingerprint Node Service capabilities 2020-03-23 13:58:29 -04:00
Danielle Lancashire 564f5cec93 csimanager: Fingerprint controller capabilities 2020-03-23 13:58:29 -04:00
Danielle Lancashire 426c26d7c0 CSI Plugin Registration (#6555)
This changeset implements the initial registration and fingerprinting
of CSI Plugins as part of #5378. At a high level, it introduces the
following:

* A `csi_plugin` stanza as part of a Nomad task configuration, to
  allow a task to expose that it is a plugin.

* A new task runner hook: `csi_plugin_supervisor`. This hook does two
  things. When the `csi_plugin` stanza is detected, it will
  automatically configure the plugin task to receive bidirectional
  mounts to the CSI intermediary directory. At runtime, it will then
  perform an initial heartbeat of the plugin and handle submitting it to
  the new `dynamicplugins.Registry` for further use by the client, and
  then run a lightweight heartbeat loop that will emit task events
  when health changes.

* The `dynamicplugins.Registry` for handling plugins that run
  as Nomad tasks, in contrast to the existing catalog that requires
  `go-plugin` type plugins and to know the plugin configuration in
  advance.

* The `csimanager` which fingerprints CSI plugins, in a similar way to
  `drivermanager` and `devicemanager`. It currently only fingerprints
  the NodeID from the plugin, and assumes that all plugins are
  monolithic.

Missing features

* We do not use the live updates of the `dynamicplugin` registry in
  the `csimanager` yet.

* We do not deregister the plugins from the client when they shutdown
  yet, they just become indefinitely marked as unhealthy. This is
  deliberate until we figure out how we should manage deploying new
  versions of plugins/transitioning them.
2020-03-23 13:58:28 -04:00
Drew Bailey b09abef332
Audit config, seams for enterprise audit features
allow oss to parse sink duration

clean up audit sink parsing

ent eventer config reload

fix typo

SetEnabled to eventer interface

client acl test

rm dead code

fix failing test
2020-03-23 13:47:42 -04:00
Jasmine Dahilig 73a64e4397 change jobspec lifecycle stanza to use sidecar attribute instead of
block_until status
2020-03-21 17:52:57 -04:00
Jasmine Dahilig 1485b342e2 remove deadline code for now 2020-03-21 17:52:56 -04:00
Jasmine Dahilig d54a83afee fix linting errors 2020-03-21 17:52:53 -04:00
Jasmine Dahilig a0fe570317 clean up restore test 2020-03-21 17:52:52 -04:00
Jasmine Dahilig 7ed08eb75a partial test for restore functionality 2020-03-21 17:52:52 -04:00
Jasmine Dahilig 81d051d7e8 fix bug in lifecycle scheduler test mocks 2020-03-21 17:52:51 -04:00
Jasmine Dahilig b7f08c9d13 add appropriate lifecycle deadline default of 120s 2020-03-21 17:52:48 -04:00
Jasmine Dahilig 0cc9212a54 add test cases for scheduler alloc placement with lifecycle resources 2020-03-21 17:52:47 -04:00
Jasmine Dahilig 0d2988652c add lifecycle job mock 2020-03-21 17:52:47 -04:00
Jasmine Dahilig c27223207c update task hook coordinator tests 2020-03-21 17:52:46 -04:00
Mahmood Ali b880607bad update scheduler to account for hooks 2020-03-21 17:52:45 -04:00
Jasmine Dahilig 12393f90e7 add test for lifecycle coordinator 2020-03-21 17:52:42 -04:00
Jasmine Dahilig f6e58d6dad add canonicalize in the right place 2020-03-21 17:52:41 -04:00
Jasmine Dahilig 4498c8c24f add canonicalization 2020-03-21 17:52:39 -04:00
Jasmine Dahilig 67262d841b add validation tests and more validation 2020-03-21 17:52:39 -04:00
Mahmood Ali 214d128bd9 it's running now 2020-03-21 17:52:37 -04:00
Jasmine Dahilig fc13fa9739 change TaskLifecycle RunLevel to Hook and add Deadline time duration 2020-03-21 17:52:37 -04:00
Mahmood Ali 4ebeac721a update structs with lifecycle 2020-03-21 17:52:36 -04:00
Yoan Blanc 67692789b7
vendor: vault api and sdk
Signed-off-by: Yoan Blanc <yoan@dosimple.ch>
2020-03-21 17:57:48 +01:00
Mahmood Ali 53e20e5cc2 Deflake TestRPC_Limits_Streaming test
The test starts enough connections to hit the limit, then closes the
connection and immediately starts one expecting the new one to succeed.

We must wait until the server side recognizes the connection
closing and free up a limits slot.  The current test attempts to achieve
that by waiting to get an error on conn.Read, however, this error is
returned from local client without waiting for server update.

As such, I change the logic so it retries on connection rejection but
force the first non-EOF failure to be a deadline error.
2020-03-20 17:21:43 -04:00
Mahmood Ali 0da7130a1a Protect against args being modified 2020-03-18 08:11:16 -04:00
Mahmood Ali 52fd31af80 server: node connections must not be forwarded
This fixes a bug where a forwarded node update request may be assumed
to be the actual direct client connection if the server just lost
leadership.

When a nomad non-leader server receives a Node.UpdateStatus request, it
forwards the RPC request to the leader, and holds on the request
Yamux connection in a cache to allow for server<->client forwarding.

When the leader handles the request, it must differentiate between a
forwarded connection vs the actual connection.  This is done in
https://github.com/hashicorp/nomad/blob/v0.10.4/nomad/node_endpoint.go#L412

Now, consider if the non-leader server forwards to the connection to a
recently deposed nomad leader, which in turn forwards the RPC request to
the new leader.

Without this change, the deposed leader will mistake the forwarded
connection for the actual client connection and cache it mapped to the
client ID.  If the server attempts to connect to that client, it will
attempt to start a connection/session to the other server instead and
the call will hang forever.

This change ensures that we only add node connection mapping if the
request is not a forwarded request, regardless of circumstances.
2020-03-17 16:39:01 -04:00
Mahmood Ali 9d88f1d568 tests: deflake deploymentwatcher package
This deflake the tests in the deploymentwatcher package.  The package
uses a mock deployment watcher backend, where the watcher in a
background goroutine calls UpdateDeploymentStatus .  If the mock isn't
configured to expect the call, the background goroutine will fail.  One
UpdateDeploymentStatus call is made at the end of the background
goroutine, which may occur after the test completes, thus explaining the
flakiness.
2020-03-12 15:42:01 -04:00
Michael Schurter 2dcc85bed1 jobspec: fixup vault_grace deprecation
Followup to #7170

- Moved canonicalization of VaultGrace back into `api/` package.
- Fixed tests.
- Made docs styling consistent.
2020-03-10 14:58:49 -07:00
Michael Schurter b72b3e765c
Merge pull request #7170 from fredrikhgrelland/consul_template_upgrade
Update consul-template to v0.24.1 and remove deprecated vault grace
2020-03-10 14:15:47 -07:00
Mahmood Ali 005bd37758 tests: deflake TestServer_ReconcileMember
TestServer_ReconcileMember assumes that S3 isn't the leader:
`reconcileMembers` call would fail when attempting to remove itself!
2020-03-06 14:14:41 -05:00
Mahmood Ali 17ee94b52b fix typo 2020-03-03 16:55:54 -05:00
Mahmood Ali acbfeb5815 Simplify Bootstrap logic in tests
This change updates tests to honor `BootstrapExpect` exclusively when
forming test clusters and removes test only knobs, e.g.
`config.DevDisableBootstrap`.

Background:

Test cluster creation is fragile.  Test servers don't follow the
BootstapExpected route like production clusters.  Instead they start as
single node clusters and then get rejoin and may risk causing brain
split or other test flakiness.

The test framework expose few knobs to control those (e.g.
`config.DevDisableBootstrap` and `config.Bootstrap`) that control
whether a server should bootstrap the cluster.  These flags are
confusing and it's unclear when to use: their usage in multi-node
cluster isn't properly documented.  Furthermore, they have some bad
side-effects as they don't control Raft library: If
`config.DevDisableBootstrap` is true, the test server may not
immediately attempt to bootstrap a cluster, but after an election
timeout (~50ms), Raft may force a leadership election and win it (with
only one vote) and cause a split brain.

The knobs are also confusing as Bootstrap is an overloaded term.  In
BootstrapExpect, we refer to bootstrapping the cluster only after N
servers are connected.  But in tests and the knobs above, it refers to
whether the server is a single node cluster and shouldn't wait for any
other server.

Changes:

This commit makes two changes:

First, it relies on `BootstrapExpected` instead of `Bootstrap` and/or
`DevMode` flags.  This change is relatively trivial.

Introduce a `Bootstrapped` flag to track if the cluster is bootstrapped.
This allows us to keep `BootstrapExpected` immutable.  Previously, the
flag was a config value but it gets set to 0 after cluster bootstrap
completes.
2020-03-02 13:47:43 -05:00
Fredrik Hoem Grelland edb3bd0f3f Update consul-template to v0.24.1 and remove deprecated vault_grace (#7170) 2020-02-23 16:24:53 +01:00
Seth Hoenig 0f99cdd0d9
Merge pull request #7192 from hashicorp/b-connect-stanza-ignore
consul/connect: in-place update sidecar service registrations on changes
2020-02-21 09:24:53 -06:00
Seth Hoenig 07b9b24ceb
nomad: note why AddressMode is not part of CSD hash
Co-Authored-By: Michael Schurter <mschurter@hashicorp.com>
2020-02-21 09:24:42 -06:00
Seth Hoenig 54b5173eca consul/connect: in-place update sidecar service registrations on changes
Fix a bug where consul service definitions would not be updated if changes
were made to the service in the Nomad job. Currently this only fixes the
bug for cases where the fix is a matter of updating consul agent's service
registration. There is related bug where destructive changes are required
(see #6877) which will be fixed in another PR.

The enable_tag_override configuration setting for the parent service is
applied to the sidecar service.

Fixes #6459
2020-02-19 13:07:04 -06:00
Mahmood Ali 98ad59b1de update rest of consul packages 2020-02-16 16:25:04 -06:00
Mahmood Ali f492ab6d9e implement MinQuorum 2020-02-16 16:04:59 -06:00
Mahmood Ali 3dcc65d58d Update consul autopilot dependency 2020-02-16 15:41:43 -06:00
Mahmood Ali cf53ee57cd remove unused dropButLastChannel 2020-02-13 18:56:53 -05:00
Mahmood Ali fd51982018 tests: Avoid StartAsLeader raft config flag
It's being deprecated
2020-02-13 18:56:53 -05:00
Mahmood Ali 367133a399 Use latest raft patterns 2020-02-13 18:56:52 -05:00
Seth Hoenig 543354aabe
Merge pull request #7106 from hashicorp/f-ctag-override
client: enable configuring enable_tag_override for services
2020-02-13 12:34:48 -06:00
Michael Schurter 8c332a3757
Merge pull request #7102 from hashicorp/test-limits
Fix some race conditions and flaky tests
2020-02-13 10:19:11 -08:00
Mahmood Ali bc70beeb4a
Merge pull request #7044 from hashicorp/f-use-multiplexv2
rpc: Use MultiplexV2 for connections
2020-02-13 12:07:20 -05:00
Drew Bailey 24a5d36fcf
Merge pull request #7112 from hashicorp/f-include-pro-tag
include pro tag in serveral oss.go files
2020-02-13 11:26:41 -05:00
Seth Hoenig 2829b4cd23
Merge pull request #7129 from hashicorp/b-consistent-ct-name
command: use consistent CONSUL_HTTP_TOKEN name
2020-02-12 12:27:46 -06:00
Seth Hoenig 7f33b92e0b command: use consistent CONSUL_HTTP_TOKEN name
Consul CLI uses CONSUL_HTTP_TOKEN, so Nomad should use the same.
Note that consul-template uses CONSUL_TOKEN, which Nomad also uses,
so be careful to preserve any reference to that in the consul-template
context.
2020-02-12 10:42:33 -06:00
Seth Hoenig ce50345b7a nomad: assert consul token is unset on job register in tests 2020-02-12 10:17:42 -06:00
Seth Hoenig 02151dee45 nomad: unset consul token on job register 2020-02-12 09:58:51 -06:00
Drew Bailey 6bd6c6638c
include pro tag in serveral oss.go files 2020-02-10 15:56:14 -05:00
Seth Hoenig 0e44094d1a client: enable configuring enable_tag_override for services
Consul provides a feature of Service Definitions where the tags
associated with a service can be modified through the Catalog API,
overriding the value(s) configured in the agent's service configuration.

To enable this feature, the flag enable_tag_override must be configured
in the service definition.

Previously, Nomad did not allow configuring this flag, and thus the default
value of false was used. Now, it is configurable.

Because Nomad itself acts as a state machine around the the service definitions
of the tasks it manages, it's worth describing what happens when this feature
is enabled and why.

Consider the basic case where there is no Nomad, and your service is provided
to consul as a boring JSON file. The ultimate source of truth for the definition
of that service is the file, and is stored in the agent. Later, Consul performs
"anti-entropy" which synchronizes the Catalog (stored only the leaders). Then
with enable_tag_override=true, the tags field is available for "external"
modification through the Catalog API (rather than directly configuring the
service definition file, or using the Agent API). The important observation
is that if the service definition ever changes (i.e. the file is changed &
config reloaded OR the Agent API is used to modify the service), those
"external" tag values are thrown away, and the new service definition is
once again the source of truth.

In the Nomad case, Nomad itself is the source of truth over the Agent in
the same way the JSON file was the source of truth in the example above.
That means any time Nomad sets a new service definition, any externally
configured tags are going to be replaced. When does this happen? Only on
major lifecycle events, for example when a task is modified because of an
updated job spec from the 'nomad job run <existing>' command. Otherwise,
Nomad's periodic re-sync's with Consul will now no longer try to restore
the externally modified tag values (as long as enable_tag_override=true).

Fixes #2057
2020-02-10 08:00:55 -06:00
Michael Schurter c5073f61a7 test: add timeout to ease debugging 2020-02-07 15:50:53 -08:00
Michael Schurter 9905dec6a3 test: workaround limits race 2020-02-07 15:50:53 -08:00
Michael Schurter 14c5ef3a8d test: fix race around reused default rpc addr
The default RPC addr was a global which is fine for normal runtime use
when it only has a single user.

However many tests modify it and cause races. Follow our convention of
returning defaults from funcs instead of using globals.
2020-02-07 15:50:53 -08:00
Mahmood Ali e106d373b2 rpc: Use MultiplexV2 for connections
MultiplexV2 is a new connection multiplex header that supports multiplex both
RPC and streaming requests over the same Yamux connection.

MultiplexV2 was added in 0.8.0 as part of
https://github.com/hashicorp/nomad/pull/3892 .  So Nomad 0.11 can expect it to
be supported.  Though, some more rigorous testing is required before merging
this.

I want to call out some implementation details:

First, the current connection pool reuses the Yamux stream for multiple RPC calls,
and doesn't close them until an error is encountered.  This commit doesn't
change it, and sets the `RpcNomad` byte only at stream creation.

Second, the StreamingRPC session gets closed by callers and cannot be reused.
Every StreamingRPC opens a new Yamux session.
2020-02-03 19:31:39 -05:00
Drew Bailey 9a65556211
add state store test to ensure PlacedCanaries is updated 2020-02-03 13:58:01 -05:00
Drew Bailey f51a3d1f37
nomad state store must be modified through raft, rm local state change 2020-02-03 13:57:34 -05:00
Drew Bailey 74779f23e6
keep placed canaries aligned with alloc status 2020-02-03 13:57:33 -05:00
Michael Schurter 9bedd0202e sentinel: copy jobs to prevent mutation
It's unclear whether Sentinel code can mutate values passed to the eval,
so ensure it cannot by copying the job.
2020-02-03 08:48:51 -05:00
Seth Hoenig 6bfa50acdc nomad: remove unused default schedular variable
This is from a merge conflict resolution that went the wrong direction.

I assumed the block had been added, but really it had been removed. Now,
it is removed once again.
2020-01-31 19:06:37 -06:00
Seth Hoenig d3cd6afd7e nomad: min cluster version for connect ACLs is now v0.10.4 2020-01-31 19:06:19 -06:00
Seth Hoenig 587a5d4a8d nomad: make TaskGroup.UsesConnect helper a public helper 2020-01-31 19:05:11 -06:00
Seth Hoenig ee89a754f1 nomad: fix leftover missed refactoring in consul policy checking 2020-01-31 19:05:06 -06:00
Seth Hoenig 4ee55fcd6c nomad,client: apply more comment/style PR tweaks 2020-01-31 19:04:52 -06:00
Seth Hoenig be7c671919 nomad,client: apply smaller PR suggestions
Apply smaller suggestions like doc strings, variable names, etc.

Co-Authored-By: Nick Ethier <nethier@hashicorp.com>
Co-Authored-By: Michael Schurter <mschurter@hashicorp.com>
2020-01-31 19:04:40 -06:00
Seth Hoenig 78a7d1e426 comments: cleanup some leftover debug comments and such 2020-01-31 19:04:35 -06:00
Seth Hoenig 8219c78667 nomad: handle SI token revocations concurrently
Be able to revoke SI token accessors concurrently, and also
ratelimit the requests being made to Consul for the various
ACL API uses.
2020-01-31 19:04:14 -06:00
Seth Hoenig 2c7ac9a80d nomad: fixup token policy validation 2020-01-31 19:04:08 -06:00
Seth Hoenig 9df33f622f nomad: proxy requests for Service Identity tokens between Clients and Consul
Nomad jobs may be configured with a TaskGroup which contains a Service
definition that is Consul Connect enabled. These service definitions end
up establishing a Consul Connect Proxy Task (e.g. envoy, by default). In
the case where Consul ACLs are enabled, a Service Identity token is required
for these tasks to run & connect, etc. This changeset enables the Nomad Server
to recieve RPC requests for the derivation of SI tokens on behalf of instances
of Consul Connect using Tasks. Those tokens are then relayed back to the
requesting Client, which then injects the tokens in the secrets directory of
the Task.
2020-01-31 19:03:53 -06:00
Seth Hoenig 93cf770edb client: enable nomad client to request and set SI tokens for tasks
When a job is configured with Consul Connect aware tasks (i.e. sidecar),
the Nomad Client should be able to request from Consul (through Nomad Server)
Service Identity tokens specific to those tasks.
2020-01-31 19:03:38 -06:00
Seth Hoenig 2b66ce93bb nomad: ensure a unique ClusterID exists when leader (gh-6702)
Enable any Server to lookup the unique ClusterID. If one has not been
generated, and this node is the leader, generate a UUID and attempt to
apply it through raft.

The value is not yet used anywhere in this changeset, but is a prerequisite
for gh-6701.
2020-01-31 19:03:26 -06:00