Commit graph

440 commits

Author SHA1 Message Date
Seth Hoenig 5dd8aa3e27 client: enforce max_kill_timeout client configuration
This PR fixes a bug where client configuration max_kill_timeout was
not being enforced. The feature was introduced in 9f44780 but seems
to have been removed during the major drivers refactoring.

We can make sure the value is enforced by pluming it through the DriverHandler,
which now uses the lesser of the task.killTimeout or client.maxKillTimeout.
Also updates Event.SetKillTimeout to require both the task.killTimeout and
client.maxKillTimeout so that we don't make the mistake of using the wrong
value - as it was being given only the task.killTimeout before.
2022-07-06 15:29:38 -05:00
Grant Griffiths 99896da443
CSI: make plugin health_timeout configurable in csi_plugin stanza (#13340)
Signed-off-by: Grant Griffiths <ggriffiths@purestorage.com>
2022-06-14 10:04:16 -04:00
Derek Strickland 12f3ee46ea
alloc_runner: stop sidecar tasks last (#13055)
alloc_runner: stop sidecar tasks last
2022-06-07 11:35:19 -04:00
Michael Schurter 2965dc6a1a
artifact: fix numerous go-getter security issues
Fix numerous go-getter security issues:

- Add timeouts to http, git, and hg operations to prevent DoS
- Add size limit to http to prevent resource exhaustion
- Disable following symlinks in both artifacts and `job run`
- Stop performing initial HEAD request to avoid file corruption on
  retries and DoS opportunities.

**Approach**

Since Nomad has no ability to differentiate a DoS-via-large-artifact vs
a legitimate workload, all of the new limits are configurable at the
client agent level.

The max size of HTTP downloads is also exposed as a node attribute so
that if some workloads have large artifacts they can specify a high
limit in their jobspecs.

In the future all of this plumbing could be extended to enable/disable
specific getters or artifact downloading entirely on a per-node basis.
2022-05-24 16:29:39 -04:00
Seth Hoenig 65f7abf2f4 cli: update default redis and use nomad service discovery
Closes #12927
Closes #12958

This PR updates the version of redis used in our examples from 3.2 to 7.
The old version is very not supported anymore, and we should be setting
a good example by using a supported version.

The long-form example job is now fixed so that the service stanza uses
nomad as the service discovery provider, and so now the job runs without
a requirement of having Consul running and configured.
2022-05-17 10:24:19 -05:00
Seth Hoenig 26b5c01431
Merge pull request #12817 from twunderlich-grapl/fix-network-interpolation
Fix network.dns interpolation
2022-05-17 09:31:32 -05:00
Eng Zer Jun 97d1bc735c
test: use T.TempDir to create temporary test directory (#12853)
* test: use `T.TempDir` to create temporary test directory

This commit replaces `ioutil.TempDir` with `t.TempDir` in tests. The
directory created by `t.TempDir` is automatically removed when the test
and all its subtests complete.

Prior to this commit, temporary directory created using `ioutil.TempDir`
needs to be removed manually by calling `os.RemoveAll`, which is omitted
in some tests. The error handling boilerplate e.g.
	defer func() {
		if err := os.RemoveAll(dir); err != nil {
			t.Fatal(err)
		}
	}
is also tedious, but `t.TempDir` handles this for us nicely.

Reference: https://pkg.go.dev/testing#T.TempDir
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

* test: fix TestLogmon_Start_restart on Windows

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

* test: fix failing TestConsul_Integration

t.TempDir fails to perform the cleanup properly because the folder is
still in use

testing.go:967: TempDir RemoveAll cleanup: unlinkat /tmp/TestConsul_Integration2837567823/002/191a6f1a-5371-cf7c-da38-220fe85d10e5/web/secrets: device or resource busy

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2022-05-12 11:42:40 -04:00
Seth Hoenig 96ec19788d cgroups: make sure cgroup still exists after task restart
This PR modifies raw_exec and exec to ensure the cgroup for a task
they are driving still exists during a task restart. These drivers
have the same bug but with different root cause.

For raw_exec, we were removing the cgroup in 2 places - the cpuset
manager, and in the unix containment implementation (the thing that
uses freezer cgroup to clean house). During a task restart, the
containment would remove the cgroup, and when the task runner hooks
went to start again would block on waiting for the cgroup to exist,
which will never happen, because it gets created by the cpuset manager
which only runs as an alloc pre-start hook. The fix here is to simply
not delete the cgroup in the containment implementation; killing the
PIDs is enough. The removal happens in the cpuset manager later anyway.

For exec, it's the same idea, except DestroyTask is called on task
failure, which in turn calls into libcontainer, which in turn deletes
the cgroup. In this case we do not have control over the deletion of
the cgroup, so instead we hack the cgroup back into life after the
call to DestroyTask.

All of this only applies to cgroups v2.
2022-05-05 09:51:03 -05:00
Thomas Wunderlich 245d2a463b
Fix formatting 2022-04-29 10:02:20 -04:00
Thomas Wunderlich c86e287de9
Remove debug log lines 2022-04-28 19:14:31 -04:00
Thomas Wunderlich 960e192359
Quick and dirty hack to get interpolated dns values working 2022-04-28 17:09:53 -04:00
Tim Gross 3d630a3629
CSI: enforce one plugin supervisor loop via sync.Once (#12785)
We enforce exactly one plugin supervisor loop by checking whether
`running` is set and returning early. This works but is fairly
subtle. It can briefly result in two goroutines where one quickly
exits before doing any work. Clarify the intent by using
`sync.Once`. The goroutine we've spawned only exits when the entire
task runner is being torn down, and not when the task driver restarts
the workload, so it should never be re-run.
2022-04-26 10:38:50 -04:00
Tim Gross 766025cde7
CSI: plugin supervisor prestart should not mark itself done (#12752)
The task runner hook `Prestart` response object includes a `Done`
field that's intended to tell the client not to run the hook
again. The plugin supervisor creates mount points for the task during
prestart and saves these mounts in the hook resources. But if a client
restarts the hook resources will not be populated. If the plugin task
restarts at any time after the client restarts, it will fail to have
the correct mounts and crash loop until restart attempts run out.

Fix this by not returning `Done` in the response, just as we do for
the `volume_mount_hook`.
2022-04-22 13:07:47 -04:00
Gowtham 1ff8b5f759
Add Concurrent Download Support for artifacts (#11531)
* add concurrent download support - resolves #11244

* format imports

* mark `wg.Done()` via `defer`

* added tests for successful and failure cases and resolved some goleak

* docs: add changelog for #11531

* test typo fixes and improvements

Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
2022-04-20 10:15:56 -07:00
Seth Hoenig d1bda4a954 ci: fixup task runner chroot test
This PR is 2 fixes for the flaky TestTaskRunner_TaskEnv_Chroot test.

And also the TestTaskRunner_Download_ChrootExec test.

- Use TinyChroot to stop copying gigabytes of junk, which causes GHA
to fail to create the environment in time.

- Pre-create cgroups on V2 systems. Normally the cgroup directory is
managed by the cpuset manager, but that is not active in taskrunner tests,
so create it by hand in the test framework.
2022-04-19 10:37:46 -05:00
James Rasell 9bc16b1333
client: account for service provider namespace updates in hooks. (#12479)
When a service is updated, the service hooks update a number of
internal fields which helps generate the new workload. This also
needs to update the namespace for the service provider. It is
possible for these to be different, and in the case of Nomad and
Consul running OSS, this is to be expected.
2022-04-06 19:26:22 +02:00
James Rasell 431c153cd9
client: add Nomad template service functionality to runner. (#12458)
This change modifies the template task runner to utilise the
new consul-template which includes Nomad service lookup template
funcs.

In order to provide security and auth to consul-template, we use
a custom HTTP dialer which is passed to consul-template when
setting up the runner. This method follows Vault implementation.

Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
2022-04-06 19:17:05 +02:00
Seth Hoenig 9670adb6c6 cleanup: purge github.com/pkg/errors 2022-04-01 19:24:02 -05:00
Michael Schurter 33fe04ff6a
template: fix comments and docs
Review notes from @lgfa29

Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
2022-03-29 09:25:23 -07:00
Michael Schurter 7a28fcb8af template: disallow writeToFile by default
Resolves #12095 by WONTFIXing it.

This approach disables `writeToFile` as it allows arbitrary host
filesystem writes and is only a small quality of life improvement over
multiple `template` stanzas.

This approach has the significant downside of leaving people who have
altered their `template.function_denylist` *still vulnerable!* I added
an upgrade note, but we should have implemented the denylist as a
`map[string]bool` so that new funcs could be denied without overriding
custom configurations.

This PR also includes a bug fix that broke enabling all consul-template
funcs. We repeatedly failed to differentiate between a nil (unset)
denylist and an empty (allow all) one.
2022-03-28 17:05:42 -07:00
James Rasell 96d8512c85
test: move remaining tests to use ci.Parallel. 2022-03-24 08:45:13 +01:00
James Rasell a646333263
Merge branch 'main' into f-1.3-boogie-nights 2022-03-23 09:41:25 +01:00
Tim Gross e687a21da9
CSI: set plugin CSI_ENDPOINT env var only if unset by user (#12257)
* Use unix:// prefix for CSI_ENDPOINT variable by default
* Some plugins have strict validation over the format of the
  `CSI_ENDPOINT` variable, and unfortunately not all plugins
  agree. Allow the user to override the `CSI_ENDPOINT` to workaround
  those cases.
* Update all demos and tests with CSI_ENDPOINT
2022-03-21 11:48:47 -04:00
James Rasell 042bf0fa57
client: hookup service wrapper for use within client hooks. 2022-03-21 10:29:57 +01:00
Seth Hoenig 2631659551 ci: swap ci parallelization for unconstrained gomaxprocs 2022-03-15 12:58:52 -05:00
James Rasell 7cd28a6fb6
client: refactor common service registration objects from Consul.
This commit performs refactoring to pull out common service
registration objects into a new `client/serviceregistration`
package. This new package will form the base point for all
client specific service registration functionality.

The Consul specific implementation is not moved as it also
includes non-service registration implementations; this reduces
the blast radius of the changes as well.
2022-03-15 09:38:30 +01:00
Tim Gross f4dfaec589
CSI: set plugin socket path on restore (#12149)
The Prestart hook for task runner hooks doesn't get called when we
restore a task, because the task is already running. The Postrun hook
for CSI plugin supervisors needs the socket path to have been
populated so that the client has a valid path.
2022-03-01 10:22:52 -05:00
Tim Gross 246db87a74
CSI: allow for concurrent plugin allocations (#12078)
The dynamic plugin registry assumes that plugins are singletons, which
matches the behavior of other Nomad plugins. But because dynamic
plugins like CSI are implemented by allocations, we need to handle the
possibility of multiple allocations for a given plugin type + ID, as
well as behaviors around interleaved allocation starts and stops.

Update the data structure for the dynamic registry so that more recent
allocations take over as the instance manager singleton, but we still
preserve the previous running allocations so that restores work
without racing.

Multiple allocations can run on a client for the same plugin, even if
only during updates. Provide each plugin task a unique path for the
control socket so that the tasks don't interfere with each other.
2022-02-23 15:23:07 -05:00
Michael Schurter 48aaa2c7d9
Merge pull request #11975 from hashicorp/f-connect-debugging
connect: write envoy bootstrap debugging info
2022-02-18 13:56:22 -08:00
Seth Hoenig 6550c90198 connect: bootstrap envoy using -proxy-id
This PR modifies the Consul CLI arguments used to bootstrap envoy for
Connect sidecars to make use of '-proxy-id' instead of '-sidecar-for'.

Nomad registers the sidecar service, so we know what ID it has. The
'-sidecar-for' was intended for use when you only know the name of the
service for which the sidecar is being created.

The improvement here is that using '-proxy-id' does not require an underlying
request for listing Consul services. This will make make the interaction
between Nomad and Consul more efficient.

Closes #10452
2022-02-18 14:58:23 -06:00
Michael Schurter 27b8112123 connect: write envoy bootstrap debugging info
When Consul Connect just works, it's wonderful. When it doesn't work it
can be exceeding difficult to debug: operators have to check task
events, Nomad logs, Consul logs, Consul APIs, and even then critical
information is missing.

Using Consul to generate a bootstrap config for Envoy is notoriously
difficult. Nomad doesn't even log stderr, so operators are left trying
to piece together what went wrong.

This patch attempts to provide *maximal* context which unfortunately
includes secrets. **Secrets are always restricted to the secrets/
directory.** This makes debugging a little harder, but allows operators
to know exactly what operation Nomad was trying to perform.

What's added:

- stderr is sent to alloc/logs/envoy_bootstrap.stderr.0
- the CLI is written to secrets/.envoy_bootstrap.cmd
- the environment is written to secrets/.envoy_bootstrap.env as JSON

Accessing this information is unfortunately awkward:
```
nomad alloc exec -task connect-proxy-count-countdash b36a cat secrets/.envoy_bootstrap.env
nomad alloc exec -task connect-proxy-count-countdash b36a cat secrets/.envoy_bootstrap.cmd
nomad alloc fs b36a alloc/logs/envoy_bootstrap.stderr.0
```

The above assumes an alloc id that starts with `b36a` and a Connect
sidecar proxy for a service named `count-countdash`.

If the alloc is unable to start successfully, the debugging files are
only accessible from the host filesystem.
2022-02-18 12:02:36 -08:00
Tim Gross 27bb2da5ee
CSI: make gRPC client creation more robust (#12057)
Nomad communicates with CSI plugin tasks via gRPC. The plugin
supervisor hook uses this to ping the plugin for health checks which
it emits as task events. After the first successful health check the
plugin supervisor registers the plugin in the client's dynamic plugin
registry, which in turn creates a CSI plugin manager instance that has
its own gRPC client for fingerprinting the plugin and sending mount
requests.

If the plugin manager instance fails to connect to the plugin on its
first attempt, it exits. The plugin supervisor hook is unaware that
connection failed so long as its own pings continue to work. A
transient failure during plugin startup may mislead the plugin
supervisor hook into thinking the plugin is up (so there's no need to
restart the allocation) but no fingerprinter is started.

* Refactors the gRPC client to connect on first use. This provides the
  plugin manager instance the ability to retry the gRPC client
  connection until success.
* Add a 30s timeout to the plugin supervisor so that we don't poll
  forever waiting for a plugin that will never come back up.

Minor improvements:
* The plugin supervisor hook creates a new gRPC client for every probe
  and then throws it away. Instead, reuse the client as we do for the
  plugin manager.
* The gRPC client constructor has a 1 second timeout. Clarify that this
  timeout applies to the connection and not the rest of the client
  lifetime.
2022-02-15 16:57:29 -05:00
James Rasell 15205b5408
Merge pull request #12052 from hashicorp/b-taskrunner-track-deregistered-call
client: track service deregister call so it's only called once.
2022-02-14 09:01:26 +01:00
Tim Gross 8ffe7aa76f
csi: provide CSI_ENDPOINT env var to plugins (#12050)
The CSI specification says:
> The CO SHALL provide the listen-address for the Plugin by way of the
`CSI_ENDPOINT` environment variable.

Note that plugins without filesystem isolation won't have the plugin
dir bind-mounted to their alloc dir, but we can provide a path to the
socket anyways.

Refactor to use opts struct for plugin supervisor hook config.
The parameter list for configuring the plugin supervisor hook has
grown enough where is makes sense to use an options struct similiar to
many of the other task runner hooks (ex. template).
2022-02-11 08:46:21 -05:00
James Rasell 222592a07e
client: track service deregister call so it's only called once.
In certain task lifecycles the taskrunner service deregister call
could be called three times for a task that is exiting. Whilst
each hook caller of deregister has its own purpose, we should try
and ensure it is only called once during the shutdown lifecycle of
a task.

This change therefore tracks when deregister has been called, so
that subsequent calls are noop. In the event the task is
restarting, the deregister value is reset to ensure proper
operation.
2022-02-11 09:29:38 +01:00
Luiz Aoqui 3bf6036487 Version 1.2.6
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJiBIXqAAoJELC0QQl2hbZ2M8cP/A7LENJbFSph25M1aGItra5j
 BphSX//Sq/v9ZzO44rOGNYQGfTpFT8STJgj2GC50qR/ilF4KX4D0oZlDyu/6D0NG
 ouN9RUjnFd6IEDQrjqqqhr3F69Z95SWVfi1rfgn/pIgOYkVEXfi6DXaulVVyd2ZT
 J0G5w5ryl5d8PhuL7TWw4zbhZRQn0hVspZv/1s3/I9aG6Sew8SMweeOxbN9lBr7E
 H19Amdjh6ugRuPgU7YMpKDVrZQRv9Wt7BUP/uc0u3LiW9z3Ko8ZKnCRKErtL5Kc3
 HDZsWe+t3va4Uekzd0HULNcYU4kwjogdRYRzX5kRsOyXelrZkQIqYFiKrk1wVbq/
 cYM5DUak6eUQBGhgi3UY0fklBFq4GDGpiwEzn7rvQb0PRSuVyykgbZ12fzyIu8dp
 tWbR/WOEg9F+jva6HkR2kDIcr5mDmny3Pxi5aUT6lMk1111nCzOjDzhLkQVtfsex
 FDMByXxM4oWAK3ouq2OIdxDL2c742A2933C4/30KWE7Xy7twsvkGw52irw66VO3V
 4PHP880cDvEDaEh15mY/8FlaAE7t/gsCUuYLxGwl33TaXSRBLc9vVNrrp89q53TD
 ZcvXTBpHUOWa6ZlHF/4f8LW44rowM6bU0Wili7NaWOKx86dnUJMG4sqJifNgcpS/
 7lXogv98CYLbMy4X4if0
 =NY1Z
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEElFaq1Z5DKdB91i+lKfRZwNnLtXMFAmIFbbkACgkQKfRZwNnL
 tXOr/g/+N2ZBMK8ohEvtdXLl7WXrVhgJfUSVbdD5Kfshul9CPn3yWRxJzqtEN2Pf
 55ozeWLpoziP9y9LviJ7rDidXcTmDFutbFdGJ3L+ZLdLILsNOq1A+lbuwO3fJngZ
 5aiPoJLsw4sqj6uHaM6Cls2f145O92nT7GXEHCxuvGHeSf3NkcR+zRY5nPrLTIrA
 uxYefCOzP6C2I+W7dL4Oj5R5EZd4UDi1WiL8pGzwm24LcagZN2ctctolAeF9OlJX
 M58UUv9b4GObe617u8MeH0LIlyZiNwn9JqrV33dKVTyrkBIYfYxkzdzMKf1csVYk
 kQb13KPdPTASBAGTl+sxeXXnw/bg09JXGcvREX5lLyQqY8xGwTv2FpTmybKWLiss
 Bg6BbejrgtCPBik0EAHWV0+kVzhi9bPfUYwTXLDCzMtrbyCyPoWchruel2sm41U1
 ezRDzlSvf6nrXf7sAv6umJICck4Bc5Gol+8W7fxvWqnY9rQ3ds2v7E5lXZMBbOmE
 JSi+EDWBJjBAXehE6pLxeVsvlHMRWN007Z2UeD4neGIgG7xFJLq6nKeUKoiNIpgk
 hKBL8iwHyuJfrBB/dcPzI9NV+jL6OZ/oI1RWxSj0MX/B4VXZp8HrqZA5JxzQolUg
 KIxqe4iX3WIkQv+UU4WiELvs4O7fujB4KWz3iQokhwDxqGUpffk=
 =5EG2
 -----END PGP SIGNATURE-----

Merge tag 'v1.2.6' into merge-release-1.2.6-branch

Version 1.2.6
2022-02-10 14:55:34 -05:00
Seth Hoenig de078e7ac6
client: fix race condition in use of go-getter
go-getter creates a circular dependency between a Client and Getter,
which means each is inherently thread-unsafe if you try to re-use
on or the other.

This PR fixes Nomad to no longer make use of the default Getter objects
provided by the go-getter package. Nomad must create a new Client object
on every artifact download, as the Client object controls the Src and Dst
among other things. When Caling Client.Get, the Getter modifies its own
Client reference, creating the circular reference and race condition.

We can still achieve most of the desired connection caching behavior by
re-using a shared HTTP client with transport pooling enabled.
2022-02-09 19:48:28 -05:00
Karthick Ramachandran 0600bc32e2
improve error message on service length (#12012) 2022-02-04 19:39:34 -05:00
Seth Hoenig db2347a86c cleanup: prevent leaks from time.After
This PR replaces use of time.After with a safe helper function
that creates a time.Timer to use instead. The new function returns
both a time.Timer and a Stop function that the caller must handle.

Unlike time.NewTimer, the helper function does not panic if the duration
set is <= 0.
2022-02-02 14:32:26 -06:00
Seth Hoenig 04f84bcdfe deps: import libtime the normal way
Previously we copied this library by hand to avoid vendor-ing a bunch of
files related to minimock. Now that we no longer vendor, just import the
library normally.

Also we might use more of the library for handling `time.After` uses,
for which this library provides a Context-based solution.
2022-01-31 14:49:05 -06:00
James Rasell 7205b3f08e
Merge pull request #11402 from hashicorp/document-client-initial-vault-renew
taskrunner: add clarifying initial vault token renew comment.
2022-01-13 16:21:58 +01:00
Alessandro De Blasis e647549ecf
metrics: added mapped_file metric (#11500)
Signed-off-by: Alessandro De Blasis <alex@deblasis.net>
Co-authored-by: Nate <37554478+servusdei2018@users.noreply.github.com>
2022-01-10 15:35:19 -05:00
grembo edd3b8a20c
Un-break templates when using vault stanza change_mode noop (#11783)
Templates in nomad jobs make use of the vault token defined in
the vault stanza when issuing credentials like client certificates.

When using change_mode "noop" in the vault stanza, consul-template
is not informed in case a vault token is re-issued (which can
happen from time to time for various reasons, as described
in https://www.nomadproject.io/docs/job-specification/vault).

As a result, consul-template will keep using the old vault token
to renew credentials and - once the token expired - stop renewing
credentials. The symptom of this problem is a vault_token
file that is newer than the issued credential (e.g., TLS certificate)
in a job's /secrets directory.

This change corrects this, so that h.updater.updatedVaultToken(token)
is called, which will inform stakeholders about the new
token and make sure, the new token is used by consul-template.

Example job template fragment:

    vault {
        policies = ["nomad-job-policy"]
        change_mode = "noop"
    }

    template {
      data = <<-EOH
        {{ with secret "pki_int/issue/nomad-job"
        "common_name=myjob.service.consul" "ttl=90m"
        "alt_names=localhost" "ip_sans=127.0.0.1"}}
        {{ .Data.certificate }}
        {{ .Data.private_key }}
        {{ .Data.issuing_ca }}
        {{ end }}
      EOH
      destination = "${NOMAD_SECRETS_DIR}/myjob.crt"
      change_mode = "noop"
    }

This fix does not alter the meaning of the three change modes of vault

- "noop" - Take no action
- "restart" - Restart the job
- "signal" - send a signal to the task

as the switch statement following line 232 contains the necessary
logic.

It is assumed that "take no action" was never meant to mean "don't tell
consul-template about the new vault token".

Successfully tested in a staging cluster consisting of multiple
nomad client nodes.
2022-01-10 14:41:38 -05:00
Derek Strickland 0a8e03f0f7
Expose Consul template configuration parameters (#11606)
This PR exposes the following existing`consul-template` configuration options to Nomad jobspec authors in the `{job.group.task.template}` stanza.

- `wait`

It also exposes the following`consul-template` configuration to Nomad operators in the `{client.template}` stanza.

- `max_stale`
- `block_query_wait`
- `consul_retry`
- `vault_retry` 
- `wait` 

Finally, it adds the following new Nomad-specific configuration to the `{client.template}` stanza that allows Operators to set bounds on what `jobspec` authors configure.

- `wait_bounds`

Co-authored-by: Tim Gross <tgross@hashicorp.com>
Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
2022-01-10 10:19:07 -05:00
Tim Gross 265e488ab4
task runner: fix goroutine leak in prestart hook (#11741)
The task runner prestart hooks take a `joincontext` so they have the
option to exit early if either of two contexts are canceled: from
killing the task or client shutdown. Some tasks exit without being
shutdown from the server, so neither of the joined contexts ever gets
canceled and we leak the `joincontext` (48 bytes) and its internal
goroutine. This primarily impacts batch jobs and any task that fails
or completes early such as non-sidecar prestart lifecycle tasks.
Cancel the `joincontext` after the prestart call exits to fix the
leak.
2021-12-23 11:50:51 -05:00
James Rasell 45f4689f9c
chore: fixup inconsistent method receiver names. (#11704) 2021-12-20 11:44:21 +01:00
Tim Gross a0cf5db797
provide -no-shutdown-delay flag for job/alloc stop (#11596)
Some operators use very long group/task `shutdown_delay` settings to
safely drain network connections to their workloads after service
deregistration. But during incident response, they may want to cause
that drain to be skipped so they can quickly shed load.

Provide a `-no-shutdown-delay` flag on the `nomad alloc stop` and
`nomad job stop` commands that bypasses the delay. This sets a new
desired transition state on the affected allocations that the
allocation/task runner will identify during pre-kill on the client.

Note (as documented here) that using this flag will almost always
result in failed inbound network connections for workloads as the
tasks will exit before clients receive updated service discovery
information and won't be gracefully drained.
2021-12-13 14:54:53 -05:00
James Rasell e3537a06bb
taskrunner: add clarifying initial vault token renew comment. 2021-10-28 17:09:22 +02:00
Michael Schurter fd68bbc342 test: update tests to properly use AllocDir
Also use t.TempDir when possible.
2021-10-19 10:49:07 -07:00
Michael Schurter 10c3bad652 client: never embed alloc_dir in chroot
Fixes #2522

Skip embedding client.alloc_dir when building chroot. If a user
configures a Nomad client agent so that the chroot_env will embed the
client.alloc_dir, Nomad will happily infinitely recurse while building
the chroot until something horrible happens. The best case scenario is
the filesystem's path length limit is hit. The worst case scenario is
disk space is exhausted.

A bad agent configuration will look something like this:

```hcl
data_dir = "/tmp/nomad-badagent"

client {
  enabled = true

  chroot_env {
    # Note that the source matches the data_dir
    "/tmp/nomad-badagent" = "/ohno"
    # ...
  }
}
```

Note that `/ohno/client` (the state_dir) will still be created but not
`/ohno/alloc` (the alloc_dir).
While I cannot think of a good reason why someone would want to embed
Nomad's client (and possibly server) directories in chroots, there
should be no cause for harm. chroots are only built when Nomad runs as
root, and Nomad disables running exec jobs as root by default. Therefore
even if client state is copied into chroots, it will be inaccessible to
tasks.

Skipping the `data_dir` and `{client,server}.state_dir` is possible, but
this PR attempts to implement the minimum viable solution to reduce risk
of unintended side effects or bugs.

When running tests as root in a vm without the fix, the following error
occurs:

```
=== RUN   TestAllocDir_SkipAllocDir
    alloc_dir_test.go:520:
                Error Trace:    alloc_dir_test.go:520
                Error:          Received unexpected error:
                                Couldn't create destination file /tmp/TestAllocDir_SkipAllocDir1457747331/001/nomad/test/testtask/nomad/test/testtask/.../nomad/test/testtask/secrets/.nomad-mount: open /tmp/TestAllocDir_SkipAllocDir1457747331/001/nomad/test/.../testtask/secrets/.nomad-mount: file name too long
                Test:           TestAllocDir_SkipAllocDir
--- FAIL: TestAllocDir_SkipAllocDir (22.76s)
```

Also removed unused Copy methods on AllocDir and TaskDir structs.

Thanks to @eveld for not letting me forget about this!
2021-10-18 09:22:01 -07:00