Commit graph

121 commits

Author SHA1 Message Date
Kris Hicks 0cf9cae656
Apply some suggested fixes from staticcheck (#9598) 2020-12-10 07:29:18 -08:00
Kris Hicks 0a3a748053
Add gosimple linter (#9590) 2020-12-09 11:05:18 -08:00
Tim Gross d286d941dc docker: kill signal API should include timeout context
When the Docker driver kills as task, we send a request via the Docker API for
dockerd to fire the signal. We send that signal and then block for the
`kill_timeout` waiting for the container to exit. But if the Docker API
blocks, we will block indefinitely because we haven't configured the API call
with the same timeout.

This changeset is a minimal intervention to add the timeout to the Docker API
call _only_ when we have the `kill_timeout` set. Future work should examine
whether we should be threading contexts through other `go-dockerclient` API
calls.
2020-12-02 16:51:57 -05:00
Nick Ethier c9bd7e89ca command: use correct port mapping syntax in examples 2020-11-23 10:25:30 -06:00
Shishir Mahajan 572c398187 Fix review comments. 2020-11-11 12:30:00 -08:00
Shishir Mahajan c30fea5cd3 Add cpuset_cpus to docker driver. 2020-11-11 12:30:00 -08:00
Tim Gross 0ef0b17b82
docker: disallow volume mounts from host by default (#9321)
The default behavior for `docker.volumes.enabled` is intended to be `false`,
but the HCL schema defaults to `true` if the value is unset. Set the default
literal value to `true`.

Additionally, Docker driver mounts of type "volume" (but not "bind") are not
being properly sandboxed with that setting. Disable Docker mounts with type
"volume" entirely whenever the `docker.volumes.enabled` flag is set to
false. Note this is unrelated to the `volume_mount` feature, which is
constrained to preconfigured host volumes or whatever is mounted by a CSI
plugin.

This changeset includes updates to unit tests that should have been failing
under the documented behavior but were not.
2020-11-11 10:03:46 -05:00
Yoan Blanc 891accb89a
use allow/deny instead of the colored alternatives (#9019)
Signed-off-by: Yoan Blanc <yoan@dosimple.ch>
2020-10-12 08:47:05 -04:00
Seth Hoenig a8869bd304 docs: document docker signal fix, add tests
This PR adds a version specific upgrade note about the docker stop
signal behavior. Also adds test for the signal logic in docker driver.

Closes #8932 which was fixed in #8933
2020-10-02 10:06:43 -05:00
Jeffrey 'jf' Lim b84d63c4ba drivers/docker/driver.go: change default signal for docker driver to SIGTERM? 2020-09-20 03:09:07 +08:00
Nick Ethier 1849a20b66
docker: use Nomad managed resolv.conf when DNS options are set (#8600) 2020-08-17 10:22:08 -04:00
James Rasell dab8282be5
Merge pull request #8589 from hashicorp/f-gh-5718
driver/docker: allow configurable pull context timeout setting.
2020-08-14 16:07:59 +02:00
James Rasell bc42cd2e5e
driver/docker: allow configurable pull context timeout setting.
Pulling large docker containers can take longer than the default
context timeout. Without a way to change this it is very hard for
users to utilise Nomad properly without hacky work arounds.

This change adds an optional pull_timeout config parameter which
gives operators the possibility to account for increase pull times
where needed. The infra docker image also has the option to set a
custom timeout to keep consistency.
2020-08-12 08:58:07 +01:00
Nick Ethier e39574be59
docker: support group allocated ports and host_networks (#8623)
* docker: support group allocated ports

* docker: add new ports driver config to specify which group ports are mapped

* docker: update port mapping docs
2020-08-11 18:30:22 -04:00
Nick Ethier 0bc0403cc3 Task DNS Options (#7661)
Co-Authored-By: Tim Gross <tgross@hashicorp.com>
Co-Authored-By: Seth Hoenig <shoenig@hashicorp.com>
2020-06-18 11:01:31 -07:00
Niam Jen Wei d2de515f0c
Fix docker driver MemorySwap value
Fixes an incorrect value being assigned to MemorySwap when `memory_hard_limit` flag is being used.

Issue raised in https://github.com/hashicorp/nomad/issues/8153
2020-06-12 20:11:28 +01:00
Seth Hoenig 675f50b502 driver/docker: use pointer parameter on driver because locks 2020-06-01 09:35:17 -05:00
Seth Hoenig ad91ba865c driver/docker: enable setting hard/soft memory limits
Fixes #2093

Enable configuring `memory_hard_limit` in the docker config stanza for tasks.
If set, this field will be passed to the container runtime as `--memory`, and
the `memory` configuration from the task resource configuration will be passed
as `--memory_reservation`, creating hard and soft memory limits for tasks using
the docker task driver.
2020-06-01 09:22:45 -05:00
Mahmood Ali 2588b3bc98 cleanup driver eventor goroutines
This fixes few cases where driver eventor goroutines are leaked during
normal operations, but especially so in tests.

This change makes few modifications:

First, it switches drivers to use `Context`s to manage shutdown events.
Previously, it relied on callers invoking `.Shutdown()` function that is
specific to internal drivers only and require casting.  Using `Contexts`
provide a consistent idiomatic way to manage lifecycle for both internal
and external drivers.

Also, I discovered few places where we don't clean up a temporary driver
instance in the plugin catalog code, where we dispense a driver to
inspect and validate the schema config without properly cleaning it up.
2020-05-26 11:04:04 -04:00
Mahmood Ali 9721fd22f9 docker: Fix docker image gc tracking
This fixes a bug where docker images may not be GCed.  The cause of the
bug is that we track the task using `task.ID+task.Name` on task start
but remove on plain `task.ID`.

This haromize the two paths by using `task.ID`, as it's unique enough
and it's also used in the `loadImage` path (path when loading an image
from a local tarball instead of dockerhub).
2020-05-13 12:33:17 -04:00
Mahmood Ali 04a3cfbeff
Merge pull request #7932 from hashicorp/f-docker-custom-runtimes
Docker runtimes
2020-05-12 11:59:36 -04:00
Mahmood Ali 182b95f7b1 use allow_runtimes for consistency
Other allow lists use allow_ prefix (e.g. allow_caps, allow_privileged).
2020-05-12 11:03:08 -04:00
Mahmood Ali 54565e3836
Apply suggestions from code review
Co-authored-by: Tim Gross <tgross@hashicorp.com>
2020-05-12 10:56:47 -04:00
Mahmood Ali 0d692f0931 Add a knob to restrict docker runtimes 2020-05-12 10:14:43 -04:00
Juan Larriba a0df437c62
Run Linux Images (LCOW) and Windows Containers side by side (#7850)
Makes it possible to run Linux Containers On Windows with Nomad alongside Windows Containers. Fingerprint prevents only to run Nomad in Windows 10 with Linux Containers
2020-05-04 13:08:47 -04:00
Mahmood Ali dff071c3b9 driver/docker: protect against nil container
Protect against a panic when we attempt to start a container with a name
that conflicts with an existing one.  If the existing one is being
deleted while nomad first attempts to create the container, the
createContainer will fail with `container already exists`, but we get
nil container reference from the `containerByName` lookup, and cause a
crash.

I'm not certain how we get into the state, except for being very
unlucky.  I suspect that this case may be the result of a concurrent
restart or the docker engine API not being fully consistent (e.g. an
earlier call purged the container, but docker didn't free up resources
yet to create a new container with the same name immediately yet).

If that's the case, then re-attempting creation will hopefully succeed,
or we'd at least fail enough times for the alloc to be rescheduled to
another node.
2020-04-19 15:34:45 -04:00
Ben Buzbee 769a3cd8b3 Rename OCIRuntime to Runtime; allow gpu conflicts is they are the same runtime; add conflict test 2020-04-03 12:15:11 -07:00
Ben Buzbee d4f26d1eee Support custom docker runtimes
This enables customers who want to use gvisor and have it configured on their clients.
2020-04-03 11:07:37 -07:00
Mahmood Ali db4c263180
Merge pull request #7554 from benbuzbee/benbuz/fix-seccomp-file
Parse security_opts before sending them to docker daemon
2020-03-31 11:54:17 -04:00
Ben Buzbee 4f6ea87ec4 Parse security_opts before sending them to docker daemon
Fixes #6720

Copy the parsing function from the docker CLI. Docker daemon expects to see JSON for seccomp file not a path.
2020-03-31 08:34:41 -07:00
Mahmood Ali 452a057a8c driver/docker: fix memory swapping
MemorySwappiness can only be set in non-Windows options: https://ci.appveyor.com/project/hashicorp/nomad/builds/31832149

Also fixes https://github.com/hashicorp/nomad/issues/6085
2020-03-30 16:51:16 -04:00
Mahmood Ali 8f57f78087 vendors: update fsouza/go-docker-client to v.1.6.3 2020-03-30 15:10:53 -04:00
Mahmood Ali fd5d033e32
Revert "vendor: fsouza/go-docker-client v1.6.3" 2020-03-23 10:48:47 -04:00
Yoan Blanc ed8dcccb54
docker: disable swap in Windows only
Signed-off-by: Yoan Blanc <yoan@dosimple.ch>
2020-03-23 08:35:09 +01:00
Yoan Blanc d9ea68e807
fixup! fixup! vendor: fsouza/go-docker-client v1.6.3
Signed-off-by: Yoan Blanc <yoan@dosimple.ch>
2020-03-22 10:04:52 +01:00
Yoan Blanc 8e744d1877
vendor: fsouza/go-docker-client v1.6.3
Signed-off-by: Yoan Blanc <yoan@dosimple.ch>
2020-03-22 09:25:46 +01:00
John Schlederer 8b35c75206 Making pull activity timeout configurable in Docker
* Making pull activity timeout configurable in Docker plugin config, first pass

* Fixing broken function call

* Fixing broken tests

* Fixing linter suggestion

* Adding documentation on new parameter in Docker plugin config

* Adding unit test

* Setting min value for pull_activity_timeout, making pull activity duration a private var
2019-12-18 12:58:53 +01:00
Mahmood Ali 0b7085ba3a driver: allow disabling log collection
Operators commonly have docker logs aggregated using various tools and
don't need nomad to manage their docker logs.  Worse, Nomad uses a
somewhat heavy docker api call to collect them and it seems to cause
problems when a client runs hundreds of log collections.

Here we add a knob to disable log aggregation completely for nomad.
When log collection is disabled, we avoid running logmon and
docker_logger for the docker tasks in this implementation.

The downside here is once disabled, `nomad logs ...` commands and API
no longer return logs and operators must corrolate alloc-ids with their
aggregated log info.

This is meant as a stop gap measure.  Ideally, we'd follow up with at
least two changes:

First, we should optimize behavior when we can such that operators don't
need to disable docker log collection.  Potentially by reverting to
using pre-0.9 syslog aggregation in linux environments, though with
different trade-offs.

Second, when/if logs are disabled, nomad logs endpoints should lookup
docker logs api on demand.  This ensures that the cost of log collection
is paid sparingly.
2019-12-08 14:15:03 -05:00
Tim Gross b1b20cd479
remove misleading networking log line (#6588)
When a job has a task group network, this log line ends up being
misleading if you're trying to debug networking issues. We really only
care about this when there's no port map set, in which case we get the
error returned anyways.
2019-10-30 13:23:33 -04:00
Mahmood Ali 414e01b6a6 only set a single label for now
Other labels aren't strictly necessary here, and we may follow up with a
better way to customize.
2019-10-18 15:31:13 -04:00
Mahmood Ali 3aec7b56ea Only start reconciler once in main driver
driver.SetConfig is not appropriate for starting up reconciler
goroutine.  Some ephemeral driver instances are created for validating
config and we ought not to side-effecting goroutines for those.

We currently lack a lifecycle hook to inject these, so I picked the
`Fingerprinter` function for now, and reconciler should only run after
fingerprinter started.

Use `sync.Once` to ensure that we only start reconciler loop once.
2019-10-18 14:43:23 -04:00
Mahmood Ali e24c3fac56 add docker labels 2019-10-17 10:45:12 -04:00
Mahmood Ali c01c6de481 address code review comments 2019-10-17 08:36:02 -04:00
Danielle Lancashire 4fbcc668d0
volumes: Add support for mount propagation
This commit introduces support for configuring mount propagation when
mounting volumes with the `volume_mount` stanza on Linux targets.

Similar to Kubernetes, we expose 3 options for configuring mount
propagation:

- private, which is equivalent to `rprivate` on Linux, which does not allow the
           container to see any new nested mounts after the chroot was created.

- host-to-task, which is equivalent to `rslave` on Linux, which allows new mounts
                that have been created _outside of the container_ to be visible
                inside the container after the chroot is created.

- bidirectional, which is equivalent to `rshared` on Linux, which allows both
                 the container to see new mounts created on the host, but
                 importantly _allows the container to create mounts that are
                 visible in other containers an don the host_

private and host-to-task are safe, but bidirectional mounts can be
dangerous, as if the code inside a container creates a mount, and does
not clean it up before tearing down the container, it can cause bad
things to happen inside the kernel.

To add a layer of safety here, we require that the user has ReadWrite
permissions on the volume before allowing bidirectional mounts, as a
defense in depth / validation case, although creating mounts should also require
a priviliged execution environment inside the container.
2019-10-14 14:09:58 +02:00
Tim Gross d965a15490 driver/networking: don't recreate existing network namespaces 2019-09-25 14:58:17 -04:00
Mahmood Ali 1d945994d0 docker: remove containers on creation failures
The docker creation API calls may fail with http errors (e.g. timeout)
even if container was successfully created.

Here, we force remove container if we got unexpected failure.  We
already do this in some error handlers, and this commit updates all
paths.

I stopped short from a more aggressive refactoring, as the code is ripe
for refactoring and would rather do that in another PR.
2019-09-18 08:45:59 -04:00
Mahmood Ali 75ede5a685 add exponential backoff for docker api calls 2019-09-18 08:12:54 -04:00
Mahmood Ali ac329a5e07 retry transient docker errors within function 2019-09-13 15:25:31 -04:00
Mahmood Ali e8d73e3d72 docker: defensive against failed starts
This handles a bug where we may start a container successfully, yet we
fail due to retries and startContainer not being idempotent call.

Here, we ensure that when starting a container fails with 500 error,
the retry succeeds if container was started successfully.
2019-09-13 13:02:35 -04:00
Mahmood Ali 87f0457973 fix qemu and update docker with tests 2019-09-04 11:27:51 -04:00