Closes#12927Closes#12958
This PR updates the version of redis used in our examples from 3.2 to 7.
The old version is very not supported anymore, and we should be setting
a good example by using a supported version.
The long-form example job is now fixed so that the service stanza uses
nomad as the service discovery provider, and so now the job runs without
a requirement of having Consul running and configured.
This PR introduces support for using Nomad on systems with cgroups v2 [1]
enabled as the cgroups controller mounted on /sys/fs/cgroups. Newer Linux
distros like Ubuntu 21.10 are shipping with cgroups v2 only, causing problems
for Nomad users.
Nomad mostly "just works" with cgroups v2 due to the indirection via libcontainer,
but not so for managing cpuset cgroups. Before, Nomad has been making use of
a feature in v1 where a PID could be a member of more than one cgroup. In v2
this is no longer possible, and so the logic around computing cpuset values
must be modified. When Nomad detects v2, it manages cpuset values in-process,
rather than making use of cgroup heirarchy inheritence via shared/reserved
parents.
Nomad will only activate the v2 logic when it detects cgroups2 is mounted at
/sys/fs/cgroups. This means on systems running in hybrid mode with cgroups2
mounted at /sys/fs/cgroups/unified (as is typical) Nomad will continue to
use the v1 logic, and should operate as before. Systems that do not support
cgroups v2 are also not affected.
When v2 is activated, Nomad will create a parent called nomad.slice (unless
otherwise configured in Client conifg), and create cgroups for tasks using
naming convention <allocID>-<task>.scope. These follow the naming convention
set by systemd and also used by Docker when cgroups v2 is detected.
Client nodes now export a new fingerprint attribute, unique.cgroups.version
which will be set to 'v1' or 'v2' to indicate the cgroups regime in use by
Nomad.
The new cpuset management strategy fixes#11705, where docker tasks that
spawned processes on startup would "leak". In cgroups v2, the PIDs are
started in the cgroup they will always live in, and thus the cause of
the leak is eliminated.
[1] https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.htmlCloses#11289Fixes#11705#11773#11933
Introduce a new more-block friendly syntax for specifying mounts with a new `mount` block type with the target as label:
```hcl
config {
image = "..."
mount {
type = "..."
target = "target-path"
volume_options { ... }
}
}
```
The main benefit here is that by `mount` being a block, it can nest blocks and avoids the compatibility problems noted in https://github.com/hashicorp/nomad/pull/9634/files#diff-2161d829655a3a36ba2d916023e4eec125b9bd22873493c1c2e5e3f7ba92c691R128-R155 .
The intention is for us to promote this `mount` blocks and quietly deprecate the `mounts` type, while still honoring to preserve compatibility as much as we could.
This addresses the issue in https://github.com/hashicorp/nomad/issues/9604 .
In the Docker driver plugin config for garbage collection, the `image_delay`
field was missing from the default we set if the entire `gc` stanza is
missing. This results in a default of 0s and immediate GC of Docker images.
Expanded docker gc config test fields.
Pulling large docker containers can take longer than the default
context timeout. Without a way to change this it is very hard for
users to utilise Nomad properly without hacky work arounds.
This change adds an optional pull_timeout config parameter which
gives operators the possibility to account for increase pull times
where needed. The infra docker image also has the option to set a
custom timeout to keep consistency.
* docker: support group allocated ports
* docker: add new ports driver config to specify which group ports are mapped
* docker: update port mapping docs
Fixes#2093
Enable configuring `memory_hard_limit` in the docker config stanza for tasks.
If set, this field will be passed to the container runtime as `--memory`, and
the `memory` configuration from the task resource configuration will be passed
as `--memory_reservation`, creating hard and soft memory limits for tasks using
the docker task driver.
* Making pull activity timeout configurable in Docker plugin config, first pass
* Fixing broken function call
* Fixing broken tests
* Fixing linter suggestion
* Adding documentation on new parameter in Docker plugin config
* Adding unit test
* Setting min value for pull_activity_timeout, making pull activity duration a private var
Operators commonly have docker logs aggregated using various tools and
don't need nomad to manage their docker logs. Worse, Nomad uses a
somewhat heavy docker api call to collect them and it seems to cause
problems when a client runs hundreds of log collections.
Here we add a knob to disable log aggregation completely for nomad.
When log collection is disabled, we avoid running logmon and
docker_logger for the docker tasks in this implementation.
The downside here is once disabled, `nomad logs ...` commands and API
no longer return logs and operators must corrolate alloc-ids with their
aggregated log info.
This is meant as a stop gap measure. Ideally, we'd follow up with at
least two changes:
First, we should optimize behavior when we can such that operators don't
need to disable docker log collection. Potentially by reverting to
using pre-0.9 syslog aggregation in linux environments, though with
different trade-offs.
Second, when/if logs are disabled, nomad logs endpoints should lookup
docker logs api on demand. This ensures that the cost of log collection
is paid sparingly.
This ensures that `port_map` along with other block like attribute
declarations (e.g. ulimit, labels, etc) can handle various hcl and json
syntax that was supported in 0.8.
In 0.8.7, the following declarations are effectively equivalent:
```
// hcl block
port_map {
http = 80
https = 443
}
// hcl assignment
port_map = {
http = 80
https = 443
}
// json single element array of map (default in API response)
{"port_map": [{"http": 80, "https": 443}]}
// json array of individual maps (supported accidentally iiuc)
{"port_map: [{"http": 80}, {"https": 443}]}
```
We achieve compatbility by using `NewAttr("...", "list(map(string))",
false)` to be serialized to a `map[string]string` wrapper, instead of using
`BlockAttrs` declaration. The wrapper merges the list of maps
automatically, to ease driver development.
This approach is closer to how v0.8.7 implemented the fields [1][2], and
despite its verbosity, seems to perserve 0.8.7 behavior in hcl2.
This is only required for built-in types that have backward
compatibility constraints. External drivers should use `BlockAttrs`
instead, as they see fit.
[1] https://github.com/hashicorp/nomad/blob/v0.8.7/client/driver/docker.go#L216
[2] https://github.com/hashicorp/nomad/blob/v0.8.7/client/driver/docker.go#L698-L700