The error returned from the stdlib's `exec` package is always a message with
the exit code of the exec'd process, not any error message that process might
have given us. This results in opaque failures for the Nomad user. Cast to an
`ExitError` so that we can access the output from stderr.
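A minimal sketch of the idea, assuming the command is run via `Cmd.Output` (which records stderr on the returned `*exec.ExitError` when `Cmd.Stderr` is unset). The `runCommand` helper name is hypothetical and is not taken from the Nomad code base.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
	"strings"
)

// runCommand runs a command and, on failure, folds the child's stderr into
// the returned error instead of reporting only the exit status.
// (Hypothetical helper for illustration.)
func runCommand(name string, args ...string) error {
	// Output captures stdout. Because cmd.Stderr is nil here, a failing
	// command has its stderr recorded on the returned *exec.ExitError.
	_, err := exec.Command(name, args...).Output()
	if err == nil {
		return nil
	}
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		if msg := strings.TrimSpace(string(exitErr.Stderr)); msg != "" {
			return fmt.Errorf("%w: %s", err, msg)
		}
	}
	return err
}

func main() {
	// Assumes a POSIX `sh` is available on PATH.
	err := runCommand("sh", "-c", "echo 'disk full' >&2; exit 3")
	fmt.Println(err)
}
```

Wrapping with `%w` keeps the original `ExitError` reachable through `errors.As`, so callers can still inspect the exit code.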
If the docker engine is running on a cgroup-v2 host, then RSS and Max
Usage don't get reported.
Using a heuristic here to avoid adding more API calls to the Docker
Engine to infer cgroups version. Also, opted to avoid coordinating stats
collection with fingerprinting, which adds concurrency complexities.
The test assertion that we don't have a delete future remaining races with the
code it's testing, because the removal of the image and the removal of the
future are not atomic. Move this assertion into a `WaitForResult` to avoid
test flakes which we're seeing on CI on Windows in particular.
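A rough sketch of the polling pattern, using a locally defined `waitForResult` as a hypothetical stand-in for the real test helper (its signature here is an assumption). The `demo` function simulates a cleanup whose second step lands slightly after the first.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
	"time"
)

// waitForResult polls check until it reports success or retries run out.
// Hypothetical stand-in for a helper such as `WaitForResult`.
func waitForResult(check func() (bool, error), retries int, interval time.Duration) error {
	var lastErr error
	for i := 0; i < retries; i++ {
		ok, err := check()
		if ok {
			return nil
		}
		lastErr = err
		time.Sleep(interval)
	}
	if lastErr == nil {
		lastErr = errors.New("condition not met")
	}
	return fmt.Errorf("timed out: %w", lastErr)
}

// demo models a non-atomic cleanup: the pending "delete future" is cleared
// shortly after the image removal has already been observed.
func demo() error {
	var pending int32 = 1
	go func() {
		time.Sleep(20 * time.Millisecond)
		atomic.StoreInt32(&pending, 0)
	}()
	// Asserting immediately would race with the goroutine; polling does not.
	return waitForResult(func() (bool, error) {
		if n := atomic.LoadInt32(&pending); n != 0 {
			return false, fmt.Errorf("%d delete future(s) remaining", n)
		}
		return true, nil
	}, 100, 5*time.Millisecond)
}

func main() {
	fmt.Println("result:", demo())
}
```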
* Fixup uses of `sanity`
* Remove unnecessary comments.
These checks are better explained by earlier comments about
the context of the test. Per @tgross, moved the tests together
to better reinforce the overall shared context.
* Update nomad/fsm_test.go
This PR adds pid_mode and ipc_mode options to the exec and java task
driver config options. By default these will defer to the default_pid_mode
and default_ipc_mode agent plugin options created in #9969. Setting
these values to "host" mode disables isolation for the task. Doing so
is not recommended, but may be necessary to support legacy job configurations.
Closes #9970
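The fallback from task-level to plugin-level settings can be sketched roughly as follows. `resolveMode` is a hypothetical helper, and treating an empty task-level value as "unset" is an assumption made for illustration.

```go
package main

import "fmt"

// resolveMode picks the isolation mode for a task. An empty task-level value
// defers to the plugin-level default; only "private" and "host" are accepted.
// (Hypothetical helper, not the driver's actual code.)
func resolveMode(taskMode, pluginDefault string) (string, error) {
	mode := taskMode
	if mode == "" {
		mode = pluginDefault
	}
	switch mode {
	case "private", "host":
		return mode, nil
	default:
		return "", fmt.Errorf("invalid mode %q: must be \"private\" or \"host\"", mode)
	}
}

func main() {
	pid, _ := resolveMode("", "private")     // task did not set pid_mode
	ipc, _ := resolveMode("host", "private") // task opted out of isolation
	fmt.Println(pid, ipc)
}
```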
This PR adds default_pid_mode and default_ipc_mode options to the exec and java
task drivers. These options default to "private" mode, enabling PID and
IPC isolation for tasks. Setting them to "host" mode disables isolation. Doing
so is not recommended, but may be necessary to support legacy job configurations.
Closes #9969
This has to have been unused because the HasPrefix operation is
backwards, meaning a Command.Env that includes PATH= never would have
worked; the default path was always used.
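To illustrate the class of bug, here is a small sketch with the arguments to `strings.HasPrefix` in both orders. The function names are hypothetical and the surrounding executor code is not reproduced.

```go
package main

import (
	"fmt"
	"strings"
)

// hasPathBackwards shows the swapped-argument form: it asks whether the
// literal "PATH=" starts with the env entry, which is false for any real
// "PATH=/some/dir" value.
func hasPathBackwards(env []string) bool {
	for _, kv := range env {
		if strings.HasPrefix("PATH=", kv) {
			return true
		}
	}
	return false
}

// hasPath is the intended check: does the env entry start with "PATH="?
func hasPath(env []string) bool {
	for _, kv := range env {
		if strings.HasPrefix(kv, "PATH=") {
			return true
		}
	}
	return false
}

func main() {
	env := []string{"HOME=/root", "PATH=/usr/local/bin:/usr/bin"}
	fmt.Println(hasPathBackwards(env), hasPath(env))
}
```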
Introduce a new, more block-friendly syntax for specifying mounts with a new `mount` block type with the target as label:
```hcl
config {
image = "..."
mount {
type = "..."
target = "target-path"
volume_options { ... }
}
}
```
The main benefit here is that by `mount` being a block, it can nest blocks and avoids the compatibility problems noted in https://github.com/hashicorp/nomad/pull/9634/files#diff-2161d829655a3a36ba2d916023e4eec125b9bd22873493c1c2e5e3f7ba92c691R128-R155 .
The intention is for us to promote the `mount` block and quietly deprecate the `mounts` type, while still honoring it to preserve compatibility as much as we can.
This addresses the issue in https://github.com/hashicorp/nomad/issues/9604 .
When the Docker driver kills a task, we send a request via the Docker API for
dockerd to fire the signal. We send that signal and then block for the
`kill_timeout` waiting for the container to exit. But if the Docker API
blocks, we will block indefinitely because we haven't configured the API call
with the same timeout.
This changeset is a minimal intervention to add the timeout to the Docker API
call _only_ when we have the `kill_timeout` set. Future work should examine
whether we should be threading contexts through other `go-dockerclient` API
calls.
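A minimal sketch of the approach, with `killFunc` as a hypothetical stand-in for a Docker API call that accepts a context; the real `go-dockerclient` types are deliberately not used here. The timeout is applied only when one is configured.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// killFunc is a hypothetical stand-in for an API call that honors a context.
type killFunc func(ctx context.Context) error

// killWithTimeout bounds the call with killTimeout only when it is set;
// a zero timeout leaves the call unbounded, matching the prior behavior.
func killWithTimeout(kill killFunc, killTimeout time.Duration) error {
	ctx := context.Background()
	if killTimeout > 0 {
		var cancel context.CancelFunc
		ctx, cancel = context.WithTimeout(ctx, killTimeout)
		defer cancel()
	}
	return kill(ctx)
}

// hungAPI simulates a blocked API: it returns only once ctx is done.
func hungAPI(ctx context.Context) error {
	<-ctx.Done()
	return ctx.Err()
}

// instantAPI simulates an API call that returns immediately.
func instantAPI(ctx context.Context) error {
	return nil
}

func main() {
	fmt.Println(killWithTimeout(hungAPI, 50*time.Millisecond))
	fmt.Println(killWithTimeout(instantAPI, 0))
}
```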
Use targeted ignore comments for the cases where we are bound by
backward compatibility.
I've left some file-based linter ignores in place, especially when the file is
riddled with linter violations (e.g. enum names), or if it's a property of the
file (e.g. package and file names).
I encountered an odd behavior related to RPC_REQUEST_RESPONSE_UNIQUE and
RPC_REQUEST_STANDARD_NAME. Apparently, if they target a `stream` type,
we must separate them into separate lines so that the ignore comment
targets the type specifically.
Fix #9210.
This updates the executor so it honors the User when using `nomad alloc exec`. The bug was that the exec task didn't honor the init command when execing.
When raw_exec is configured with [`no_cgroups`](https://www.nomadproject.io/docs/drivers/raw_exec#no_cgroups), raw_exec shouldn't attempt to create a cgroup.
Prior to this change, we accidentally always required a freezer cgroup to do stats PID tracking. We already have the proper fallback in place for metrics, so we only need to ensure that we don't create a cgroup for the task.
Fixes https://github.com/hashicorp/nomad/issues/8565
The default behavior for `docker.volumes.enabled` is intended to be `false`,
but the HCL schema defaults to `true` if the value is unset. Set the default
literal value to `false`.
Additionally, Docker driver mounts of type "volume" (but not "bind") are not
being properly sandboxed with that setting. Disable Docker mounts with type
"volume" entirely whenever the `docker.volumes.enabled` flag is set to
false. Note this is unrelated to the `volume_mount` feature, which is
constrained to preconfigured host volumes or whatever is mounted by a CSI
plugin.
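The "volume" mount restriction can be sketched as a simple guard. `validateMountType` is a hypothetical name, and the separate sandboxing applied to "bind" mounts is not shown.

```go
package main

import (
	"errors"
	"fmt"
)

// validateMountType rejects Docker mounts of type "volume" when the
// plugin-level volumes setting is disabled. Other mount types pass through
// to whatever checks apply to them elsewhere. (Hypothetical helper.)
func validateMountType(mountType string, volumesEnabled bool) error {
	if mountType == "volume" && !volumesEnabled {
		return errors.New(`mounts of type "volume" are disabled when volumes.enabled is false`)
	}
	return nil
}

func main() {
	fmt.Println(validateMountType("volume", false))
	fmt.Println(validateMountType("volume", true))
	fmt.Println(validateMountType("bind", false))
}
```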
This changeset includes updates to unit tests that should have been failing
under the documented behavior but were not.
Previously, it was required that you `go get github.com/hashicorp/nomad` to be
able to build protos, as the protoc invocation added an include directive that
pointed to `$GOPATH/src`, which is how dependent protos were discovered. As
Nomad now uses Go modules, it won't necessarily be cloned to `$GOPATH`.
(Additionally, if you _had_ go-gotten Nomad at some point, protoc compilation
would have possibly used the _wrong_ protos, as those wouldn't necessarily be
the most up-to-date ones.)
This change modifies the proto files and the `protoc` invocation to handle
discovering dependent protos via protoc plugin modifier statements that are
specific to the protoc plugin being used.
In this change, `make proto` was run to recompile the protos, which results in
changes only to the gzipped `FileDescriptorProto`.
This fixes a bug where pre-0.9 executors fail to recover after an
upgrade.
The bug is that legacyExecutorWrappers didn't get updated with the
ExecStreaming function, and thus failed to implement the Executor
interface. Sadly, this meant that all recovery attempts fail the
runtime check in
b312aacbc9/drivers/shared/executor/utils.go (L103-L110).
Dockerhub is going to rate limit unauthenticated pulls.
Use our HashiCorp internal mirror for builds run through CircleCI.
Co-authored-by: Mahmood Ali <mahmood@hashicorp.com>
In the Docker driver plugin config for garbage collection, the `image_delay`
field was missing from the default we set if the entire `gc` stanza is
missing. This results in a default of 0s and immediate GC of Docker images.
Expanded docker gc config test fields.
This PR adds a version-specific upgrade note about the docker stop
signal behavior. It also adds a test for the signal logic in the docker driver.
Closes #8932, which was fixed in #8933
My latest Vagrant box contains an empty cgroup name that isn't used for
isolation:
```
$ cat /proc/self/cgroup | grep ::
0::/user.slice/user-1000.slice/session-17.scope
```
Symlinking busybox may fail when the test code and the test temporary
directory live on different volumes/partitions; so we should copy
instead. This situation arises in the Vagrant setup, where the code
repository lives on a special file-sharing volume.
Somewhat unrelated, remove `f.Sync()` invocation from a test copyFile
helper function. Sync is useful only for crash recovery, and isn't
necessary in our test setup. The sync invocation is a significant
overhead as it requires the OS to flush any cached writes to disk.
The 'docker.config.infra_image' would default to an amd64 container.
It is possible to reference the correct image for a platform using
the `runtime.GOARCH` variable, eliminating the need to explicitly set
the `infra_image` on non-amd64 platforms.
Also upgrade to Google's pause container version 3.1 from 3.0, which
includes some enhancements around process management.
Fixes #8926
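A small sketch of deriving the image reference from the architecture. The registry path and naming scheme below are assumptions for illustration; only the use of `runtime.GOARCH` and the 3.1 version come from the description above.

```go
package main

import (
	"fmt"
	"runtime"
)

// pauseImage builds an architecture-specific image reference.
// The "gcr.io/google_containers/pause-<arch>" path is an assumed naming
// scheme, not confirmed by the text.
func pauseImage(arch string) string {
	return "gcr.io/google_containers/pause-" + arch + ":3.1"
}

func main() {
	// runtime.GOARCH is the architecture the binary was compiled for,
	// e.g. "amd64" or "arm64".
	fmt.Println(pauseImage(runtime.GOARCH))
}
```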
Pulling large docker containers can take longer than the default
context timeout. Without a way to change this it is very hard for
users to utilise Nomad properly without hacky workarounds.
This change adds an optional pull_timeout config parameter which
gives operators the possibility to account for increased pull times
where needed. The infra docker image also has the option to set a
custom timeout to keep consistency.
* docker: support group allocated ports
* docker: add new ports driver config to specify which group ports are mapped
* docker: update port mapping docs
Fixes #2093
Enable configuring `memory_hard_limit` in the docker config stanza for tasks.
If set, this field will be passed to the container runtime as `--memory`, and
the `memory` configuration from the task resource configuration will be passed
as `--memory-reservation`, creating hard and soft memory limits for tasks using
the docker task driver.
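The mapping can be sketched as below. `memoryLimits` is a hypothetical helper, and the behavior when `memory_hard_limit` is unset (task memory used as the hard limit with no reservation) is an assumption.

```go
package main

import "fmt"

const mib = 1024 * 1024

// memoryLimits returns the hard limit (--memory) and the soft limit
// (--memory-reservation) in bytes. A soft value of 0 means no reservation.
// Assumption: with no hard limit configured, the task memory is the hard limit.
func memoryLimits(taskMemoryMB, memoryHardLimitMB int64) (hard, soft int64) {
	if memoryHardLimitMB > 0 {
		return memoryHardLimitMB * mib, taskMemoryMB * mib
	}
	return taskMemoryMB * mib, 0
}

func main() {
	hard, soft := memoryLimits(256, 512)
	fmt.Printf("--memory=%d --memory-reservation=%d\n", hard, soft)
}
```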
Fix some docker test flakiness where the image cleanup process may
contaminate other tests. A cleanup process may attempt to delete an
image while it's used by another test.
This fixes a few cases where driver eventor goroutines are leaked during
normal operations, but especially so in tests.
This change makes a few modifications:
First, it switches drivers to use `Context`s to manage shutdown events.
Previously, it relied on callers invoking a `.Shutdown()` function that is
specific to internal drivers only and requires casting. Using `Context`s
provides a consistent, idiomatic way to manage lifecycle for both internal
and external drivers.
Also, I discovered a few places where we don't clean up a temporary driver
instance in the plugin catalog code, where we dispense a driver to
inspect and validate the schema config without properly cleaning it up.
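A minimal sketch of context-driven shutdown for an event loop goroutine. `startEventLoop` and `stopsOnCancel` are hypothetical names; the real eventor is not reproduced here.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// startEventLoop handles events until ctx is cancelled or the channel is
// closed, then closes the returned channel so callers can confirm the
// goroutine has exited. (Hypothetical sketch.)
func startEventLoop(ctx context.Context, events <-chan string, handle func(string)) <-chan struct{} {
	done := make(chan struct{})
	go func() {
		defer close(done)
		for {
			select {
			case <-ctx.Done():
				return
			case ev, ok := <-events:
				if !ok {
					return
				}
				handle(ev)
			}
		}
	}()
	return done
}

// stopsOnCancel reports whether the loop exits promptly after cancellation.
func stopsOnCancel() bool {
	ctx, cancel := context.WithCancel(context.Background())
	done := startEventLoop(ctx, make(chan string), func(string) {})
	cancel()
	select {
	case <-done:
		return true
	case <-time.After(time.Second):
		return false
	}
}

func main() {
	fmt.Println("stopped:", stopsOnCancel())
}
```

Because cancellation flows through the context, the owner of a driver instance does not need a type assertion to reach a driver-specific shutdown method.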
When an allocation runs for a task driver that can't support volume mounts,
the mounting will fail in a way that can be hard to understand. With host
volumes this usually means failing silently, whereas with CSI the operator
gets inscrutable internals exposed in the `nomad alloc status`.
This changeset adds a MountConfig field to the task driver Capabilities
response. We validate this when the `csi_hook` or `volume_hook` fires and
return a user-friendly error.
Note that we don't currently have a way to get driver capabilities up to the
server, except through attributes. Validating this when the user initially
submits the jobspec would be even better than what we're doing here (and could
be useful for all our other capabilities), but that's out of scope for this
changeset.
Also note that the MountConfig enum starts with "supports all" in order to
support community plugins in a backwards compatible way, rather than cutting
them off from volume mounting unexpectedly.
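The backwards-compatibility point about the enum's zero value can be sketched like this. The type, constant, and function names are assumptions for illustration; only the "supports all comes first" ordering is taken from the description.

```go
package main

import "fmt"

// MountConfigSupport describes what mount configuration a driver accepts.
// (Names are hypothetical.)
type MountConfigSupport int

const (
	// MountConfigSupportAll is the zero value, so a plugin that never sets
	// the field is treated as supporting mounts.
	MountConfigSupportAll MountConfigSupport = iota
	// MountConfigSupportNone marks a driver that cannot mount volumes.
	MountConfigSupportNone
)

// Capabilities is a trimmed-down stand-in for a driver capabilities response.
type Capabilities struct {
	MountConfigs MountConfigSupport
}

// validateVolumeMount returns a user-facing error when the driver cannot
// support volume mounts.
func validateVolumeMount(driverName string, caps Capabilities) error {
	if caps.MountConfigs == MountConfigSupportNone {
		return fmt.Errorf("task driver %q does not support volume mounts", driverName)
	}
	return nil
}

func main() {
	// A community plugin built before the field existed reports the zero value.
	fmt.Println(validateVolumeMount("community-driver", Capabilities{}))
	fmt.Println(validateVolumeMount("no-mounts-driver", Capabilities{MountConfigs: MountConfigSupportNone}))
}
```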
This fixes a bug where docker images may not be GCed. The cause of the
bug is that we track the task using `task.ID+task.Name` on task start
but remove on plain `task.ID`.
This harmonizes the two paths by using `task.ID`, as it's unique enough
and it's also used in the `loadImage` path (path when loading an image
from a local tarball instead of dockerhub).