open-nomad

Commit Graph

Author	SHA1	Message	Date
Tim Gross	aa8927abb4	volumes: return better error messages for unsupported task drivers (#8030 ) When an allocation runs for a task driver that can't support volume mounts, the mounting will fail in a way that can be hard to understand. With host volumes this usually means failing silently, whereas with CSI the operator gets inscrutable internals exposed in the `nomad alloc status`. This changeset adds a MountConfig field to the task driver Capabilities response. We validate this when the `csi_hook` or `volume_hook` fires and return a user-friendly error. Note that we don't currently have a way to get driver capabilities up to the server, except through attributes. Validating this when the user initially submits the jobspec would be even better than what we're doing here (and could be useful for all our other capabilities), but that's out of scope for this changeset. Also note that the MountConfig enum starts with "supports all" in order to support community plugins in a backwards compatible way, rather than cutting them off from volume mounting unexpectedly.	2020-05-21 09:18:02 -04:00
Tim Gross	065fa7af8b	stats_hook: log normal shutdown condition as debug, not error (#8028 ) The `stats_hook` writes an Error log every time an allocation becomes terminal. This is a normal condition, not an error. A real error condition like a failure to collect the stats is logged later. It just creates log noise, and this is a particularly bad operator experience for heavy batch workloads.	2020-05-20 10:28:30 -04:00
Mahmood Ali	751f337f1c	Update hcl2 vendoring The hcl2 library has moved from http://github.com/hashicorp/hcl2 to https://github.com/hashicorp/hcl/tree/hcl2. This updates Nomad's vendoring to start using hcl2 library. Also updates some related libraries (e.g. `github.com/zclconf/go-cty/cty` and `github.com/apparentlymart/go-textseg`).	2020-05-19 15:00:03 -04:00
Tim Gross	24aa32c503	csi: use a blocking initial connection with timeout The plugin supervisor lazily connects to plugins, but this means we only get "Unavailable" back from the gRPC call in cases where the plugin can never be reached (for example, if the Nomad client has the wrong permissions for the socket). This changeset improves the operator experience by switching to a blocking `DialWithContext`. It eagerly connects so that we can validate the connection is real and get a "failed to open" error in case where Nomad can't establish the initial connection.	2020-05-14 15:59:19 -04:00
Mahmood Ali	543f08c1ae	Deflake TestTaskTemplateManager_BlockedEvents test This change deflakes TestTaskTemplateManager_BlockedEvents test, because it is expecting a number of events without accounting for transitional state. The test TestTaskTemplateManager_BlockedEvents attempts to ensure that a template rendering emits blocked events for missing template ksys. It works by setting a template that requires keys 0,1,2,3,4 and then eventually sets keys 0,1,2,3 and ensures that we get a final event indicating that keys 3 and 4 are still missing. The test waits to get a blocked event for the final state, but it can fail if receives a blocked event for a transitional state (e.g. one reporting 2,3,4,5 are missing). This fixes the test by ensuring that it waits until the final message before assertion. Also, it clarifies the intent of the test with stricter assertions and additional comments.	2020-05-09 14:09:39 -04:00
Tim Gross	083b35d651	csi: checkpoint volume claim garbage collection (#7782 ) Adds a `CSIVolumeClaim` type to be tracked as current and past claims on a volume. Allows for a client RPC failure during node or controller detachment without having to keep the allocation around after the first garbage collection eval. This changeset lays groundwork for moving the actual detachment RPCs into a volume watching loop outside the GC eval.	2020-04-23 11:06:23 -04:00
Anthony Scalisi	9664c6b270	fix spelling errors (#6985 )	2020-04-20 09:28:19 -04:00
Drew Bailey	8bfee62b70	Run task shutdown_delay regardless of service registration task shutdown_delay will currently only run if there are registered services for the task. This implementation detail isn't explicity stated anywhere and is defined outside of the service stanza. This change moves shutdown_delay to be evaluated after prekill hooks are run, outside of any task runner hooks. just use time.sleep	2020-04-10 11:06:26 -04:00
Nick Ethier	44ad5d96d8	ar/bridge: use cni.IsCNINotInitialized helper	2020-04-06 21:44:01 -04:00
Nick Ethier	58fe326090	ar/bridge: better cni status err handling	2020-04-06 21:21:42 -04:00
Nick Ethier	6a286777c7	ar/bridge: ensure cni configuration is always loaded	2020-04-06 21:02:26 -04:00
Nick Ethier	5166806993	Merge pull request #7600 from hashicorp/b-5767 tr/service_hook: prevent Update from running before Poststart finish	2020-04-06 16:52:42 -04:00
Nick Ethier	567609e101	tr/service_hook: reset initialized flag during deregister	2020-04-06 16:05:36 -04:00
Drew Bailey	0d550049e9	ensure shutdown delay can be removed	2020-04-06 11:33:04 -04:00
Drew Bailey	9874e7b21d	Group shutdown delay fixes Group shutdown delay updates were not properly handled in Update hook. This commit also ensures that plan output is displayed.	2020-04-06 11:29:12 -04:00
Seth Hoenig	60c9b73eba	Merge pull request #7602 from hashicorp/b-connect-bootstrap-tls-config connect: set consul TLS options on envoy bootstrap	2020-04-03 08:50:36 -06:00
Tim Gross	f6b3d38eb8	CSI: move node unmount to server-driven RPCs (#7596 ) If a volume-claiming alloc stops and the CSI Node plugin that serves that alloc's volumes is missing, there's no way for the allocrunner hook to send the `NodeUnpublish` and `NodeUnstage` RPCs. This changeset addresses this issue with a redesign of the client-side for CSI. Rather than unmounting in the alloc runner hook, the alloc runner hook will simply exit. When the server gets the `Node.UpdateAlloc` for the terminal allocation that had a volume claim, it creates a volume claim GC job. This job will made client RPCs to a new node plugin RPC endpoint, and only once that succeeds, move on to making the client RPCs to the controller plugin. If the node plugin is unavailable, the GC job will fail and be requeued.	2020-04-02 16:04:56 -04:00
Nick Ethier	3b5d2f8eb8	tr/service_hook: update hook fields during update when poststart hasn't finished	2020-04-02 12:48:19 -04:00
Seth Hoenig	e7fcd281ae	connect: set consul TLS options on envoy bootstrap Fixes #6594 #6711 #6714 #7567 e2e testing is still TBD in #6502 Before, we only passed the Nomad agent's configured Consul HTTP address onto the `consul connect envoy ...` bootstrap command. This meant any Consul setup with TLS enabled would not work with Nomad's Connect integration. This change now sets CLI args and Environment Variables for configuring TLS options for communicating with Consul when doing the envoy bootstrap, as described in https://www.consul.io/docs/commands/connect/envoy.html#usage	2020-04-02 10:30:50 -06:00
Nick Ethier	fa271ff1b3	tr/service_hook: prevent Update from running before Poststart has finished	2020-04-02 12:17:36 -04:00
Seth Hoenig	0266f056b8	connect: enable proxy.passthrough configuration Enable configuration of HTTP and gRPC endpoints which should be exposed by the Connect sidecar proxy. This changeset is the first "non-magical" pass that lays the groundwork for enabling Consul service checks for tasks running in a network namespace because they are Connect-enabled. The changes here provide for full configuration of the connect { sidecar_service { proxy { expose { paths = [{ path = <exposed endpoint> protocol = <http or grpc> local_path_port = <local endpoint port> listener_port = <inbound mesh port> }, ... ] } } } stanza. Everything from `expose` and below is new, and partially implements the precedent set by Consul: https://www.consul.io/docs/connect/registration/service-registration.html#expose-paths-configuration-reference Combined with a task-group level network port-mapping in the form: port "exposeExample" { to = -1 } it is now possible to "punch a hole" through the network namespace to a specific HTTP or gRPC path, with the anticipated use case of creating Consul checks on Connect enabled services. A future PR may introduce more automagic behavior, where we can do things like 1) auto-fill the 'expose.path.local_path_port' with the default value of the 'service.port' value for task-group level connect-enabled services. 2) automatically generate a port-mapping 3) enable an 'expose.checks' flag which automatically creates exposed endpoints for every compatible consul service check (http/grpc checks on connect enabled services).	2020-03-31 17:15:27 -06:00
Tim Gross	14b4712f01	csi: annotate remaining missing cancellation contexts (#7552 )	2020-03-30 16:46:43 -04:00
Mahmood Ali	884d18f068	Merge pull request #7383 from hashicorp/b-health-detect-failing-tasks health: detect failing tasks	2020-03-25 06:30:05 -04:00
Mahmood Ali	a5b024fdea	tests: restart restartpolicy for all tasks in tests	2020-03-24 21:52:48 -04:00
Mahmood Ali	7565ac34c0	tests: populate task restart policy properly	2020-03-24 21:44:37 -04:00
Mahmood Ali	5ed346bf05	tests: update AR task restart policy	2020-03-24 17:00:42 -04:00
Mahmood Ali	ceed57b48f	per-task restart policy	2020-03-24 17:00:41 -04:00
Lang Martin	e100444740	csi: add mount_options to volumes and volume requests (#7398 ) Add mount_options to both the volume definition on registration and to the volume block in the group where the volume is requested. If both are specified, the options provided in the request replace the options defined in the volume. They get passed to the NodePublishVolume, which causes the node plugin to actually mount the volume on the host. Individual tasks just mount bind into the host mounted volume (unchanged behavior). An operator can mount the same volume with different options by specifying it twice in the group context. closes #7007 * nomad/structs/volumes: add MountOptions to volume request * jobspec/test-fixtures/basic.hcl: add mount_options to volume block * jobspec/parse_test: add expected MountOptions * api/tasks: add mount_options * jobspec/parse_group: use hcl decode not mapstructure, mount_options * client/allocrunner/csi_hook: pass MountOptions through client/allocrunner/csi_hook: add a VolumeMountOptions client/allocrunner/csi_hook: drop Options client/allocrunner/csi_hook: use the structs options * client/pluginmanager/csimanager/interface: UsageOptions.MountOptions * client/pluginmanager/csimanager/volume: pass MountOptions in capabilities * plugins/csi/plugin: remove todo 7007 comment * nomad/structs/csi: MountOptions * api/csi: add options to the api for parsing, match structs * plugins/csi/plugin: move VolumeMountOptions to structs * api/csi: use specific type for mount_options * client/allocrunner/csi_hook: merge MountOptions here * rename CSIOptions to CSIMountOptions * client/allocrunner/csi_hook * client/pluginmanager/csimanager/volume * nomad/structs/csi * plugins/csi/fake/client: add PrevVolumeCapability * plugins/csi/plugin * client/pluginmanager/csimanager/volume_test: remove debugging * client/pluginmanager/csimanager/volume: fix odd merging logic * api: rename CSIOptions -> CSIMountOptions * nomad/csi_endpoint: remove a 7007 comment * command/alloc_status: show mount options in the volume list * nomad/structs/csi: include MountOptions in the volume stub * api/csi: add MountOptions to stub * command/volume_status_csi: clean up csiVolMountOption, add it * command/alloc_status: csiVolMountOption lives in volume_csi_status * command/node_status: display mount flags * nomad/structs/volumes: npe * plugins/csi/plugin: npe in ToCSIRepresentation * jobspec/parse_test: expand volume parse test cases * command/agent/job_endpoint: ApiTgToStructsTG needs MountOptions * command/volume_status_csi: copy paste error * jobspec/test-fixtures/basic: hclfmt * command/volume_status_csi: clean up csiVolMountOption	2020-03-23 13:59:25 -04:00
Tim Gross	5a0bcd39d1	csi: dynamically update plugin registration (#7386 ) Allow for faster updates to plugin status when allocations become terminal by listening for register/deregister events from the dynamic plugin registry (which in turn are triggered by the plugin supervisor hook). The deregistration function closures that we pass up to the CSI plugin manager don't properly close over the name and type of the registration, causing monolith-type plugins to deregister only one of their two plugins on alloc shutdown. Rebind plugin supervisor deregistration targets to fix that. Includes log message and comment improvements	2020-03-23 13:59:25 -04:00
Tim Gross	fe926e899e	volumes: add task environment interpolation to volume_mount (#7364 )	2020-03-23 13:59:25 -04:00
Tim Gross	1cf7ef44ed	csi: docstring and log message fixups (#7327 ) Fix some docstring typos and fix noisy log message during client restarts. A log for the common case where the plugin socket isn't ready yet isn't actionable by the operator so having it at info is just noise.	2020-03-23 13:58:30 -04:00
Lang Martin	de25fc6cf4	csi: csi-hostpath plugin unimplemented error on controller publish (#7299 ) * client/allocrunner/csi_hook: tag errors * nomad/client_csi_endpoint: tag errors * nomad/client_rpc: remove an unnecessary error tag * nomad/state/state_store: ControllerRequired fix intent We use ControllerRequired to indicate that a volume should use the publish/unpublish workflow, rather than that it has a controller. We need to check both RequiresControllerPlugin and SupportsAttachDetach from the fingerprint to check that. * nomad/csi_endpoint: tag errors * nomad/csi_endpoint_test: longer error messages, mock fingerprints	2020-03-23 13:58:30 -04:00
Tim Gross	de4ad6ca38	csi: add Provider field to CSI CLIs and APIs (#7285 ) Derive a provider name and version for plugins (and the volumes that use them) from the CSI identity API `GetPluginInfo`. Expose the vendor name as `Provider` in the API and CLI commands.	2020-03-23 13:58:30 -04:00
Lang Martin	a4784ef258	csi add allocation context to fingerprinting results (#7133 ) * structs: CSIInfo include AllocID, CSIPlugins no Jobs * state_store: eliminate plugin Jobs, delete an empty plugin * nomad/structs/csi: detect empty plugins correctly * client/allocrunner/taskrunner/plugin_supervisor_hook: option AllocID * client/pluginmanager/csimanager/instance: allocID * client/pluginmanager/csimanager/fingerprint: set AllocID * client/node_updater: split controller and node plugins * api/csi: remove Jobs The CSI Plugin API will map plugins to allocations, which allows plugins to be defined by jobs in many configurations. In particular, multiple plugins can be defined in the same job, and multiple jobs can be used to define a single plugin. Because we now map the allocation context directly from the node, it's no longer necessary to track the jobs associated with a plugin directly. * nomad/csi_endpoint_test: CreateTestPlugin & register via fingerprint * client/dynamicplugins: lift AllocID into the struct from Options * api/csi_test: remove Jobs test * nomad/structs/csi: CSIPlugins has an array of allocs * nomad/state/state_store: implement CSIPluginDenormalize * nomad/state/state_store: CSIPluginDenormalize npe on missing alloc * nomad/csi_endpoint_test: defer deleteNodes for clarity * api/csi_test: disable this test awaiting mocks: https://github.com/hashicorp/nomad/issues/7123	2020-03-23 13:58:30 -04:00
Danielle Lancashire	5b05baf9f6	csi: Add /dev mounts to CSI Plugins CSI Plugins that manage devices need not just access to the CSI directory, but also to manage devices inside `/dev`. This commit introduces a `/dev:/dev` mount to the container so that they may do so.	2020-03-23 13:58:30 -04:00
Danielle Lancashire	1b70fb1398	hook resources: Init with empty resources during setup	2020-03-23 13:58:30 -04:00
Danielle Lancashire	511b7775a6	csi: Claim CSI Volumes during csi_hook.Prerun This commit is the initial implementation of claiming volumes from the server and passes through any publishContext information as appropriate. There's nothing too fancy here.	2020-03-23 13:58:30 -04:00
Danielle Lancashire	da4f6b60a2	csi: Pass through usage options to the csimanager The CSI Spec requires us to attach and stage volumes based on different types of usage information when it may effect how they are bound. Here we pass through some basic usage options in the CSI Hook (specifically the volume aliases ReadOnly field), and the attachment/access mode from the volume. We pass the attachment/access mode seperately from the volume as it simplifies some handling and doesn't necessarily force every attachment to use the same mode should more be supported (I.e if we let each `volume "foo" {}` specify an override in the future).	2020-03-23 13:58:30 -04:00
Danielle Lancashire	a62a90e03c	csi: Unpublish volumes during ar.Postrun This commit introduces initial support for unmounting csi volumes. It takes a relatively simplistic approach to performing NodeUnpublishVolume calls, optimising for cleaning up any leftover state rather than terminating early in the case of errors. This is because it happens during an allocation's shutdown flow and may not always have a corresponding call to `NodePublishVolume` that succeeded.	2020-03-23 13:58:30 -04:00
Danielle Lancashire	6665bdec2e	taskrunner/volume_hook: Cleanup arg order of prepareHostVolumes	2020-03-23 13:58:30 -04:00
Danielle Lancashire	8692ca86bb	taskrunner/volume_hook: Mounts for CSI Volumes This commit implements support for creating driver mounts for CSI Volumes. It works by fetching the created mounts from the allocation resources and then iterates through the volume requests, creating driver mount configs as required. It's a little bit messy primarily because there's _so_ much terminology overlap and it's a bit difficult to follow.	2020-03-23 13:58:30 -04:00
Danielle Lancashire	7a33864edf	volume_hook: Loosen validation in host volume prep	2020-03-23 13:58:30 -04:00
Danielle Lancashire	d8334cf884	allocrunner: Push state from hooks to taskrunners This commit is an initial (read: janky) approach to forwarding state from an allocrunner hook to a taskrunner using a similar `hookResources` approach that tr's use internally. It should eventually probably be replaced with something a little bit more message based, but for things that only come from pre-run hooks, and don't change, it's probably fine for now.	2020-03-23 13:58:30 -04:00
Danielle Lancashire	3ef41fbb86	csi_hook: Stage/Mount volumes as required This commit introduces the first stage of volume mounting for an allocation. The csimanager.VolumeMounter interface manages the blocking and actual minutia of the CSI implementation allowing this hook to do the minimal work of volume retrieval and creating mount info. In the future the `CSIVolume.Get` request should be replaced by `CSIVolume.Claim(Batch?)` to minimize the number of RPCs and to handle external triggering of a ControllerPublishVolume request as required. We also need to ensure that if pre-run hooks fail, we still get a full unwinding of any publish and staged volumes to ensure that there are no hanging references to volumes. That is not handled in this commit.	2020-03-23 13:58:30 -04:00
Danielle Lancashire	4a2492ecb1	client: Pass an RPC Client to AllocRunners As part of introducing support for CSI, AllocRunner hooks need to be able to communicate with Nomad Servers for validation of and interaction with storage volumes. Here we create a small RPCer interface and pass the client (rpc client) to the AR in preparation for making these RPCs.	2020-03-23 13:58:30 -04:00
Danielle Lancashire	3bff9fefae	csi: Provide plugin-scoped paths during RPCs When providing paths to plugins, the path needs to be in the scope of the plugins container, rather than that of the host. Here we enable that by providing the mount point through the plugin registration and then use it when constructing request target paths.	2020-03-23 13:58:29 -04:00
Danielle Lancashire	1a10433b97	csi: Add VolumeManager (#6920 ) This changeset is some pre-requisite boilerplate that is required for introducing CSI volume management for client nodes. It extracts out fingerprinting logic from the csi instance manager. This change is to facilitate reusing the csimanager to also manage the node-local CSI functionality, as it is the easiest place for us to guaruntee health checking and to provide additional visibility into the running operations through the fingerprinter mechanism and goroutine. It also introduces the VolumeMounter interface that will be used to manage staging/publishing unstaging/unpublishing of volumes on the host.	2020-03-23 13:58:29 -04:00
Danielle Lancashire	de5d373001	csi: Setup gRPC Clients with a logger	2020-03-23 13:58:29 -04:00
Danielle Lancashire	426c26d7c0	CSI Plugin Registration (#6555 ) This changeset implements the initial registration and fingerprinting of CSI Plugins as part of #5378. At a high level, it introduces the following: * A `csi_plugin` stanza as part of a Nomad task configuration, to allow a task to expose that it is a plugin. * A new task runner hook: `csi_plugin_supervisor`. This hook does two things. When the `csi_plugin` stanza is detected, it will automatically configure the plugin task to receive bidirectional mounts to the CSI intermediary directory. At runtime, it will then perform an initial heartbeat of the plugin and handle submitting it to the new `dynamicplugins.Registry` for further use by the client, and then run a lightweight heartbeat loop that will emit task events when health changes. * The `dynamicplugins.Registry` for handling plugins that run as Nomad tasks, in contrast to the existing catalog that requires `go-plugin` type plugins and to know the plugin configuration in advance. * The `csimanager` which fingerprints CSI plugins, in a similar way to `drivermanager` and `devicemanager`. It currently only fingerprints the NodeID from the plugin, and assumes that all plugins are monolithic. Missing features * We do not use the live updates of the `dynamicplugin` registry in the `csimanager` yet. * We do not deregister the plugins from the client when they shutdown yet, they just become indefinitely marked as unhealthy. This is deliberate until we figure out how we should manage deploying new versions of plugins/transitioning them.	2020-03-23 13:58:28 -04:00
Mahmood Ali	07a30580ac	health: fail health if any task is pending Fixes a bug where an allocation is considered healthy if some of the tasks are being restarted and as such, their checks aren't tracked by consul agent client. Here, we fix the immediate case by ensuring that an alloc is healthy only if tasks are running and the registered checks at the time are healthy. Previously, health tracker tracked task "health" independently from checks and leads to problems when a task restarts. Consider the following series of events: 1. all tasks start running -> `tracker.tasksHealthy` is true 2. one task has unhealthy checks and get restarted 3. remaining checks are healthy -> `tracker.checksHealthy` is true 4. propagate health status now that `tracker.tasksHealthy` and `tracker.checksHealthy`. This change ensures that we accurately use the latest status of tasks and checks regardless of their status changes. Also, ensures that we only consider check health after tasks are considered healthy, otherwise we risk trusting incomplete checks. This approach accomodates task dependencies well. Service jobs can have prestart short-lived tasks that will terminate before main process runs. These dead tasks that complete successfully will not negate health status.	2020-03-22 11:13:41 -04:00

1 2 3 4 5 ...

445 Commits