Commit graph

140 commits

Author SHA1 Message Date
Thomas Wunderlich 960e192359
Quick and dirty hack to get interpolated dns values working 2022-04-28 17:09:53 -04:00
James Rasell 042bf0fa57
client: hookup service wrapper for use within client hooks. 2022-03-21 10:29:57 +01:00
James Rasell 7cd28a6fb6
client: refactor common service registration objects from Consul.
This commit performs refactoring to pull out common service
registration objects into a new `client/serviceregistration`
package. This new package will form the base point for all
client specific service registration functionality.

The Consul specific implementation is not moved as it also
includes non-service registration implementations; this reduces
the blast radius of the changes as well.
2022-03-15 09:38:30 +01:00
Seth Hoenig db2347a86c cleanup: prevent leaks from time.After
This PR replaces use of time.After with a safe helper function
that creates a time.Timer to use instead. The new function returns
both a time.Timer and a Stop function that the caller must handle.

Unlike time.NewTimer, the helper function does not panic if the duration
set is <= 0.
2022-02-02 14:32:26 -06:00
Alessandro De Blasis e647549ecf
metrics: added mapped_file metric (#11500)
Signed-off-by: Alessandro De Blasis <alex@deblasis.net>
Co-authored-by: Nate <37554478+servusdei2018@users.noreply.github.com>
2022-01-10 15:35:19 -05:00
Tim Gross a0cf5db797
provide -no-shutdown-delay flag for job/alloc stop (#11596)
Some operators use very long group/task `shutdown_delay` settings to
safely drain network connections to their workloads after service
deregistration. But during incident response, they may want to cause
that drain to be skipped so they can quickly shed load.

Provide a `-no-shutdown-delay` flag on the `nomad alloc stop` and
`nomad job stop` commands that bypasses the delay. This sets a new
desired transition state on the affected allocations that the
allocation/task runner will identify during pre-kill on the client.

Note (as documented here) that using this flag will almost always
result in failed inbound network connections for workloads as the
tasks will exit before clients receive updated service discovery
information and won't be gracefully drained.
2021-12-13 14:54:53 -05:00
James Rasell b6813f1221
chore: fix incorrect docstring formatting. 2021-08-30 11:08:12 +02:00
Ben Buzbee e247f8806b Don't treat a failed recover + successful destroy as a successful
recover

This code just seems incorrect. As it stands today it reports a
successful restore if RecoverTask fails and then DestroyTask succeeds.

This can result in a really annoying bug where it then calls RecoverTask
again, whereby it will probably get ErrTaskNotFound and call DestroyTask
once more.

I think the only reason this has not been noticed so far is because most
drivers like Docker will return Success, then nomad will call
RecoverTask, get an error (not found) and call DestroyTask again, and
get a ErrTasksNotFound err.
2021-07-03 01:46:36 +00:00
Mahmood Ali 067fd86a8c
drivers: Capture exit code when task is killed (#10494)
This commit ensures Nomad captures the task code more reliably even when the task is killed. This issue affect to `raw_exec` driver, as noted in https://github.com/hashicorp/nomad/issues/10430 .

We fix this issue by ensuring that the TaskRunner only calls `driver.WaitTask` once. The TaskRunner monitors the completion of the task by calling `driver.WaitTask` which should return the task exit code on completion. However, it also could return a "context canceled" error if the agent/executor is shutdown.

Previously, when a task is to be stopped, the killTask path makes two WaitTask calls, and the second returns "context canceled" occasionally because of a "race" in task shutting down and depending on driver, and how fast it shuts down after task completes.

By having a single WaitTask call and consistently waiting for the task, we ensure we capture the exit code reliably before the executor is shutdown or the contexts expired.

I opted to change the TaskRunner implementation to avoid changing the driver interface or requiring 3rd party drivers to update.

Additionally, the PR ensures that attempts to kill the task terminate when the task "naturally" dies. Without this change, if the task dies at the right moment, the `killTask` call may retry to kill an already-dead task for up to 5 minutes before giving up.
2021-05-04 10:54:00 -04:00
Michael Schurter e62795798d core: propagate remote task handles
Add a new driver capability: RemoteTasks.

When a task is run by a driver with RemoteTasks set, its TaskHandle will
be propagated to the server in its allocation's TaskState. If the task
is replaced due to a down node or draining, its TaskHandle will be
propagated to its replacement allocation.

This allows tasks to be scheduled in remote systems whose lifecycles are
disconnected from the Nomad node's lifecycle.

See https://github.com/hashicorp/nomad-driver-ecs for an example ECS
remote task driver.
2021-04-27 15:07:03 -07:00
Nick Ethier 110f982eb3 plugins/drivers: fix deprecated fields 2021-04-16 14:13:29 -04:00
Nick Ethier 1e09ca5cd7 tr: set cpuset cpus if reserved 2021-04-15 13:31:51 -04:00
Nick Ethier 155a2ca5fb client/ar: thread through cpuset manager 2021-04-13 13:28:36 -04:00
Mahmood Ali 2fd9eafc28
only publish measured metrics (#10376) 2021-04-13 11:39:33 -04:00
Mahmood Ali 95d85b9cac oversubscription: set the linux memory limit
Use the MemoryMaxMB as the LinuxResources limit. This is intended to ease
drivers implementation and adoption of the features: drivers that use
`resources.LinuxResources.MemoryLimitBytes` don't need to be updated.

Drivers that use NomadResources will need to updated to track the new
field value. Given that tasks aren't guaranteed to use up the excess
memory limit, this is a reasonable compromise.
2021-03-30 16:55:58 -04:00
Adrian Todorov 47e1cb11df
driver/docker: add extra labels ( job name, task and task group name) 2021-03-08 08:59:52 -05:00
Jasmine Dahilig d6110cbed4
lifecycle: add poststop hook (#8194) 2020-11-12 08:01:42 -08:00
Chris Baker 1d35578bed removed backwards-compatible/untagged metrics deprecated in 0.7 2020-10-13 20:18:39 +00:00
Seth Hoenig ed13e5723f consul/connect: dynamically select envoy sidecar at runtime
As newer versions of Consul are released, the minimum version of Envoy
it supports as a sidecar proxy also gets bumped. Starting with the upcoming
Consul v1.9.X series, Envoy v1.11.X will no longer be supported. Current
versions of Nomad hardcode a version of Envoy v1.11.2 to be used as the
default implementation of Connect sidecar proxy.

This PR introduces a change such that each Nomad Client will query its
local Consul for a list of Envoy proxies that it supports (https://github.com/hashicorp/consul/pull/8545)
and then launch the Connect sidecar proxy task using the latest supported version
of Envoy. If the `SupportedProxies` API component is not available from
Consul, Nomad will fallback to the old version of Envoy supported by old
versions of Consul.

Setting the meta configuration option `meta.connect.sidecar_image` or
setting the `connect.sidecar_task` stanza will take precedence as is
the current behavior for sidecar proxies.

Setting the meta configuration option `meta.connect.gateway_image`
will take precedence as is the current behavior for connect gateways.

`meta.connect.sidecar_image` and `meta.connect.gateway_image` may make
use of the special `${NOMAD_envoy_version}` variable interpolation, which
resolves to the newest version of Envoy supported by the Consul agent.

Addresses #8585 #7665
2020-10-13 09:14:12 -05:00
Nick Ethier e39574be59
docker: support group allocated ports and host_networks (#8623)
* docker: support group allocated ports

* docker: add new ports driver config to specify which group ports are mapped

* docker: update port mapping docs
2020-08-11 18:30:22 -04:00
Nick Ethier 0bc0403cc3 Task DNS Options (#7661)
Co-Authored-By: Tim Gross <tgross@hashicorp.com>
Co-Authored-By: Seth Hoenig <shoenig@hashicorp.com>
2020-06-18 11:01:31 -07:00
Mahmood Ali 751f337f1c Update hcl2 vendoring
The hcl2 library has moved from http://github.com/hashicorp/hcl2 to https://github.com/hashicorp/hcl/tree/hcl2.

This updates Nomad's vendoring to start using hcl2 library.  Also
updates some related libraries (e.g. `github.com/zclconf/go-cty/cty` and
`github.com/apparentlymart/go-textseg`).
2020-05-19 15:00:03 -04:00
Drew Bailey 8bfee62b70
Run task shutdown_delay regardless of service registration
task shutdown_delay will currently only run if there are registered
services for the task. This implementation detail isn't explicity stated
anywhere and is defined outside of the service stanza.

This change moves shutdown_delay to be evaluated after prekill hooks are
run, outside of any task runner hooks.

just use time.sleep
2020-04-10 11:06:26 -04:00
Mahmood Ali ceed57b48f per-task restart policy 2020-03-24 17:00:41 -04:00
Danielle Lancashire d8334cf884 allocrunner: Push state from hooks to taskrunners
This commit is an initial (read: janky) approach to forwarding state
from an allocrunner hook to a taskrunner using a similar `hookResources`
approach that tr's use internally.

It should eventually probably be replaced with something a little bit
more message based, but for things that only come from pre-run hooks,
and don't change, it's probably fine for now.
2020-03-23 13:58:30 -04:00
Danielle Lancashire 1a10433b97 csi: Add VolumeManager (#6920)
This changeset is some pre-requisite boilerplate that is required for
introducing CSI volume management for client nodes.

It extracts out fingerprinting logic from the csi instance manager.
This change is to facilitate reusing the csimanager to also manage the
node-local CSI functionality, as it is the easiest place for us to
guaruntee health checking and to provide additional visibility into the
running operations through the fingerprinter mechanism and goroutine.

It also introduces the VolumeMounter interface that will be used to
manage staging/publishing unstaging/unpublishing of volumes on the host.
2020-03-23 13:58:29 -04:00
Danielle Lancashire 426c26d7c0 CSI Plugin Registration (#6555)
This changeset implements the initial registration and fingerprinting
of CSI Plugins as part of #5378. At a high level, it introduces the
following:

* A `csi_plugin` stanza as part of a Nomad task configuration, to
  allow a task to expose that it is a plugin.

* A new task runner hook: `csi_plugin_supervisor`. This hook does two
  things. When the `csi_plugin` stanza is detected, it will
  automatically configure the plugin task to receive bidirectional
  mounts to the CSI intermediary directory. At runtime, it will then
  perform an initial heartbeat of the plugin and handle submitting it to
  the new `dynamicplugins.Registry` for further use by the client, and
  then run a lightweight heartbeat loop that will emit task events
  when health changes.

* The `dynamicplugins.Registry` for handling plugins that run
  as Nomad tasks, in contrast to the existing catalog that requires
  `go-plugin` type plugins and to know the plugin configuration in
  advance.

* The `csimanager` which fingerprints CSI plugins, in a similar way to
  `drivermanager` and `devicemanager`. It currently only fingerprints
  the NodeID from the plugin, and assumes that all plugins are
  monolithic.

Missing features

* We do not use the live updates of the `dynamicplugin` registry in
  the `csimanager` yet.

* We do not deregister the plugins from the client when they shutdown
  yet, they just become indefinitely marked as unhealthy. This is
  deliberate until we figure out how we should manage deploying new
  versions of plugins/transitioning them.
2020-03-23 13:58:28 -04:00
Mahmood Ali e1f53347e9 tr: proceed to mark other tasks as dead if alloc fails 2020-03-21 17:52:58 -04:00
Jasmine Dahilig b9a258ed7b incorporate lifecycle into restart tracker 2020-03-21 17:52:40 -04:00
Mahmood Ali d7354b8920 Add a coordinator for alloc runners 2020-03-21 17:52:38 -04:00
Seth Hoenig 93cf770edb client: enable nomad client to request and set SI tokens for tasks
When a job is configured with Consul Connect aware tasks (i.e. sidecar),
the Nomad Client should be able to request from Consul (through Nomad Server)
Service Identity tokens specific to those tasks.
2020-01-31 19:03:38 -06:00
Mahmood Ali b1b714691c address review comments 2020-01-15 08:57:05 -05:00
Mahmood Ali 4e5d867644 client: stop using alloc.TaskResources
Now that alloc.Canonicalize() is called in all alloc sources in the
client (i.e. on state restore and RPC fetching), we no longer need to
check alloc.TaskResources.

alloc.AllocatedResources is always non-nil through alloc runner.
Though, early on, we check for alloc validity, so NewTaskRunner and
TaskEnv must still check.  `TestClient_AddAllocError` test validates
that behavior.
2020-01-09 09:25:07 -05:00
Chris Dickson 4d8ba272d1 client: expose allocated CPU per task (#6784) 2019-12-09 15:40:22 -05:00
Preetha be4a51d5b8
Merge pull request #6349 from hashicorp/b-host-stats
client: Return empty values when host stats fail
2019-11-20 10:13:02 -06:00
Michael Schurter 9fed8d1bed client: fix panic from 0.8 -> 0.10 upgrade
makeAllocTaskServices did not do a nil check on AllocatedResources
which causes a panic when upgrading directly from 0.8 to 0.10. While
skipping 0.9 is not supported we intend to fix serious crashers caused
by such upgrades to prevent cluster outages.

I did a quick audit of the client package and everywhere else that
accesses AllocatedResources appears to be properly guarded by a nil
check.
2019-11-01 07:47:03 -07:00
Danielle Lancashire 4f2343e1c0
client: Return empty values when host stats fail
Currently, there is an issue when running on Windows whereby under some
circumstances the Windows stats API's will begin to return errors (such
as internal timeouts) when a client is under high load, and potentially
other forms of resource contention / system states (and other unknown
cases).

When an error occurs during this collection, we then short circuit
further metrics emission from the client until the next interval.

This can be problematic if it happens for a sustained number of
intervals, as our metrics aggregator will begin to age out older
metrics, and we will eventually stop emitting various types of metrics
including `nomad.client.unallocated.*` metrics.

However, when metrics collection fails on Linux, gopsutil will in many cases
(e.g cpu.Times) silently return 0 values, rather than an error.

Here, we switch to returning empty metrics in these failures, and
logging the error at the source. This brings the behaviour into line
with Linux/Unix platforms, and although making aggregation a little
sadder on intermittent failures, will result in more desireable overall
behaviour of keeping metrics available for further investigation if
things look unusual.
2019-09-19 01:22:07 +02:00
Mahmood Ali f66169cd6a
Merge pull request #6065 from hashicorp/b-nil-driver-exec
Check if driver handle is nil before execing
2019-08-02 09:48:28 -05:00
Mahmood Ali a4670db9b7 Check if driver handle is nil before execing
Defend against tr.getDriverHandle being nil.  Exec handler checks if
task is running, but it may be stopped between check and driver handler
fetching.
2019-08-02 10:07:41 +08:00
Nick Ethier 66c514a388
Add network lifecycle management
Adds a new Prerun and Postrun hooks to manage set up of network namespaces
on linux. Work still needs to be done to make the code platform agnostic and
support Docker style network initalization.
2019-07-31 01:03:17 -04:00
Jasmine Dahilig 2157f6ddf1
add formatting for hcl parsing error messages (#5972) 2019-07-19 10:04:39 -07:00
Mahmood Ali f10201c102 run post-run/post-stop task runner hooks
Handle when prestart failed while restoring a task, to prevent
accidentally leaking consul/logmon processes.
2019-07-02 18:38:32 +08:00
Mahmood Ali 4afd7835e3 Fail alloc if alloc runner prestart hooks fail
When an alloc runner prestart hook fails, the task runners aren't invoked
and they remain in a pending state.

This leads to terrible results, some of which are:
* Lockup in GC process as reported in https://github.com/hashicorp/nomad/pull/5861
* Lockup in shutdown process as TR.Shutdown() waits for WaitCh to be closed
* Alloc not being restarted/rescheduled to another node (as it's still in
  pending state)
* Unexpected restart of alloc on a client restart, potentially days/weeks after
  alloc expected start time!

Here, we treat all tasks to have failed if alloc runner prestart hook fails.
This fixes the lockups, and permits the alloc to be rescheduled on another node.

While it's desirable to retry alloc runner in such failures, I opted to treat it
out of scope.  I'm afraid of some subtles about alloc and task runners and their
idempotency that's better handled in a follow up PR.

This might be one of the root causes for
https://github.com/hashicorp/nomad/issues/5840 .
2019-07-02 18:35:47 +08:00
Mahmood Ali 7bfad051b9 address review comments 2019-07-02 14:53:50 +08:00
Mahmood Ali 3d89ae0f1e task runner to avoid running task if terminal
This change fixes a bug where nomad would avoid running alloc tasks if
the alloc is client terminal but the server copy on the client isn't
marked as running.

Here, we fix the case by having task runner uses the
allocRunner.shouldRun() instead of only checking the server updated
alloc.

Here, we preserve much of the invariants such that `tr.Run()` is always
run, and don't change the overall alloc runner and task runner
lifecycles.

Fixes https://github.com/hashicorp/nomad/issues/5883
2019-06-27 11:27:34 +08:00
Chris Baker a2dc351fd0 formatting and clarity 2019-06-18 14:00:57 +00:00
Chris Baker e0170e1c67 metrics: add namespace label to allocation metrics 2019-06-17 20:50:26 +00:00
Mahmood Ali 2acf30fdd3 Fallback to alloc.TaskResources for old allocs
When a client is running against an old server (e.g. running 0.8),
`alloc.AllocatedResources` may be nil, and we need to check the
deprecated `alloc.TaskResources` instead.

Fixes https://github.com/hashicorp/nomad/issues/5810
2019-06-11 10:32:53 -04:00
Danielle Lancashire 27583ed8c1 client: Pass servers contacted ch to allocrunner
This fixes an issue where batch and service workloads would never be
restarted due to indefinitely blocking on a nil channel.

It also raises the restoration logging message to `Info` to simplify log
analysis.
2019-05-22 13:47:35 +02:00
Michael Schurter 2fe0768f3b docs: changelog entry for #5669 and fix comment 2019-05-14 10:54:00 -07:00