open-nomad/website/content/docs/configuration/client.mdx

455 lines
18 KiB
Plaintext
Raw Normal View History

---
2020-02-06 23:45:31 +00:00
layout: docs
page_title: client Stanza - Agent Configuration
description: |-
The "client" stanza configures the Nomad agent to accept jobs as assigned by
the Nomad server, join the cluster, and specify driver-specific configuration.
---
# `client` Stanza
2020-02-06 23:45:31 +00:00
<Placement groups={['client']} />
The `client` stanza configures the Nomad agent to accept jobs as assigned by
the Nomad server, join the cluster, and specify driver-specific configuration.
```hcl
client {
enabled = true
servers = ["1.2.3.4:4647", "5.6.7.8:4647"]
}
```
## `client` Parameters
- `alloc_dir` `(string: "[data_dir]/alloc")` - Specifies the directory to use
2016-11-02 23:26:10 +00:00
for allocation data. By default, this is the top-level
2020-02-06 23:45:31 +00:00
[data_dir](/docs/configuration#data_dir) suffixed with
2018-08-30 21:20:32 +00:00
"alloc", like `"/opt/nomad/alloc"`. This must be an absolute path.
2016-11-02 23:26:10 +00:00
- `chroot_env` <code>([ChrootEnv](#chroot_env-parameters): nil)</code> -
Specifies a key-value mapping that defines the chroot environment for jobs
using the Exec and Java drivers.
- `enabled` `(bool: false)` - Specifies if client mode is enabled. All other
client configuration options depend on this value.
- `max_kill_timeout` `(string: "30s")` - Specifies the maximum amount of time a
job is allowed to wait to exit. Individual jobs may customize their own kill
timeout, but it may not exceed this value.
- `disable_remote_exec` `(bool: false)` - Specifies if the client should disable
remote task execution to tasks running on this client.
2016-12-08 18:35:40 +00:00
- `meta` `(map[string]string: nil)` - Specifies a key-value map that annotates
with user-defined metadata.
- `network_interface` `(string: varied)` - Specifies the name of the interface
2017-11-15 20:49:22 +00:00
to force network fingerprinting on. When run in dev mode, this defaults to the
loopback interface. When not in dev mode, the interface attached to the
2019-11-12 22:32:21 +00:00
default route is used. The scheduler chooses from these fingerprinted IP
addresses when allocating ports for tasks. This value support [go-sockaddr/template
format][go-sockaddr/template].
2020-02-06 23:45:31 +00:00
If no non-local IP addresses are found, Nomad could fingerprint link-local IPv6
addresses depending on the client's
2020-03-26 20:21:24 +00:00
[`"fingerprint.network.disallow_link_local"`](#fingerprint-network-disallow_link_local)
2020-02-06 23:45:31 +00:00
configuration value.
- `cpu_total_compute` `(int: 0)` - Specifies an override for the total CPU
2017-03-14 21:15:49 +00:00
compute. This value should be set to `# Cores * Core MHz`. For example, a
2020-02-06 23:45:31 +00:00
quad-core running at 2 GHz would have a total compute of 8000 (4 \* 2000). Most
clients can determine their total CPU compute automatically, and thus in most
2017-03-14 21:15:49 +00:00
cases this should be left unset.
2018-03-28 15:15:33 +00:00
- `memory_total_mb` `(int:0)` - Specifies an override for the total memory. If set,
this value overrides any detected memory.
- `min_dynamic_port` `(int:20000)` - Specifies the minimum dynamic port to be
assigned. Individual ports and ranges of ports may be excluded from dynamic
port assignment via [`reserved`](#reserved-parameters) parameters.
- `max_dynamic_port` `(int:32000)` - Specifies the maximum dynamic port to be
assigned. Individual ports and ranges of ports may be excluded from dynamic
port assignment via [`reserved`](#reserved-parameters) parameters.
- `node_class` `(string: "")` - Specifies an arbitrary string used to logically
group client nodes by user-defined class. This can be used during job
placement as a filter.
- `options` <code>([Options](#options-parameters): nil)</code> - Specifies a
key-value mapping of internal configuration for clients, such as for driver
configuration.
- `reserved` <code>([Reserved](#reserved-parameters): nil)</code> - Specifies
that Nomad should reserve a portion of the node's resources from receiving
tasks. This can be used to target a certain capacity usage for the node. For
example, a value equal to 20% of the node's CPU could be reserved to target
a CPU utilization of 80%.
- `servers` `(array<string>: [])` - Specifies an array of addresses to the Nomad
servers this client should join. This list is used to register the client with
the server nodes and advertise the available resources so that the agent can
receive work. This may be specified as an IP address or DNS, with or without
the port. If the port is omitted, the default port of `4647` is used.
2018-05-31 17:49:19 +00:00
- `server_join` <code>([server_join][server-join]: nil)</code> - Specifies
how the Nomad client will connect to Nomad servers. The `start_join` field
is not supported on the client. The retry_join fields may directly specify
the server address or use go-discover syntax for auto-discovery. See the
documentation for more detail.
2018-05-21 17:17:35 +00:00
- `state_dir` `(string: "[data_dir]/client")` - Specifies the directory to use
2020-02-06 23:45:31 +00:00
to store client state. By default, this is - the top-level
[data_dir](/docs/configuration#data_dir) suffixed with
"client", like `"/opt/nomad/client"`. This must be an absolute path.
2017-02-01 00:30:50 +00:00
- `gc_interval` `(string: "1m")` - Specifies the interval at which Nomad
attempts to garbage collect terminal allocation directories.
2017-02-01 00:30:50 +00:00
- `gc_disk_usage_threshold` `(float: 80)` - Specifies the disk usage percent which
Nomad tries to maintain by garbage collecting terminal allocations.
- `gc_inode_usage_threshold` `(float: 70)` - Specifies the inode usage percent
which Nomad tries to maintain by garbage collecting terminal allocations.
- `gc_max_allocs` `(int: 50)` - Specifies the maximum number of allocations
which a client will track before triggering a garbage collection of terminal
2020-02-06 23:45:31 +00:00
allocations. This will _not_ limit the number of allocations a node can run at
a time, however after `gc_max_allocs` every new allocation will cause terminal
allocations to be GC'd.
- `gc_parallel_destroys` `(int: 2)` - Specifies the maximum number of
parallel destroys allowed by the garbage collector. This value should be
relatively low to avoid high resource usage during garbage collections.
- `no_host_uuid` `(bool: true)` - By default a random node UUID will be
generated, but setting this to `false` will use the system's UUID. Before
Nomad 0.6 the default was to use the system UUID.
- `cni_path` `(string: "/opt/cni/bin")` - Sets the search path that is used for
CNI plugin discovery. Multiple paths can be searched using colon delimited
paths
- `cni_config_dir` `(string: "/opt/cni/config")` - Sets the directory where CNI
network configuration is located. The client will use this path when fingerprinting
CNI networks. Filenames should use the `.conflist` extension.
- `bridge_network_name` `(string: "nomad")` - Sets the name of the bridge to be
created by nomad for allocations running with bridge networking mode on the
client.
- `bridge_network_subnet` `(string: "172.26.64.0/20")` - Specifies the subnet
which the client will use to allocate IP addresses from.
2017-02-27 21:42:37 +00:00
- `artifact` <code>([Artifact](#artifact-parameters): varied)</code> -
Specifies controls on the behavior of task
[`artifact`](/docs/job-specification/artifact) stanzas.
- `template` <code>([Template](#template-parameters): nil)</code> - Specifies
2019-11-12 22:32:21 +00:00
controls on the behavior of task
2020-02-06 23:45:31 +00:00
[`template`](/docs/job-specification/template) stanzas.
- `host_volume` <code>([host_volume](#host_volume-stanza): nil)</code> - Exposes
paths from the host as volumes that can be mounted into jobs.
- `host_network` <code>([host_network](#host_network-stanza): nil)</code> - Registers
additional host networks with the node that can be selected when port mapping.
2021-05-04 17:58:23 +00:00
- `cgroup_parent` `(string: "/nomad")` - Specifies the cgroup parent for which cgroup
subsystems managed by Nomad will be mounted under. Currently this only applies to the
`cpuset` subsystems. This field is ignored on non Linux platforms.
### `chroot_env` Parameters
2020-02-06 23:45:31 +00:00
Drivers based on [isolated fork/exec](/docs/drivers/exec) implement file
system isolation using chroot on Linux. The `chroot_env` map allows the chroot
environment to be configured using source paths on the host operating system.
The mapping format is:
```text
source_path -> dest_path
```
The following example specifies a chroot which contains just enough to run the
`ls` utility:
```hcl
client {
chroot_env {
"/bin/ls" = "/bin/ls"
"/etc/ld.so.cache" = "/etc/ld.so.cache"
"/etc/ld.so.conf" = "/etc/ld.so.conf"
"/etc/ld.so.conf.d" = "/etc/ld.so.conf.d"
"/etc/passwd" = "/etc/passwd"
"/lib" = "/lib"
"/lib64" = "/lib64"
}
}
```
When `chroot_env` is unspecified, the `exec` driver will use a default chroot
environment with the most commonly used parts of the operating system. Please
2020-02-06 23:45:31 +00:00
see the [Nomad `exec` driver documentation](/docs/drivers/exec#chroot) for
the full list.
client: never embed alloc_dir in chroot Fixes #2522 Skip embedding client.alloc_dir when building chroot. If a user configures a Nomad client agent so that the chroot_env will embed the client.alloc_dir, Nomad will happily infinitely recurse while building the chroot until something horrible happens. The best case scenario is the filesystem's path length limit is hit. The worst case scenario is disk space is exhausted. A bad agent configuration will look something like this: ```hcl data_dir = "/tmp/nomad-badagent" client { enabled = true chroot_env { # Note that the source matches the data_dir "/tmp/nomad-badagent" = "/ohno" # ... } } ``` Note that `/ohno/client` (the state_dir) will still be created but not `/ohno/alloc` (the alloc_dir). While I cannot think of a good reason why someone would want to embed Nomad's client (and possibly server) directories in chroots, there should be no cause for harm. chroots are only built when Nomad runs as root, and Nomad disables running exec jobs as root by default. Therefore even if client state is copied into chroots, it will be inaccessible to tasks. Skipping the `data_dir` and `{client,server}.state_dir` is possible, but this PR attempts to implement the minimum viable solution to reduce risk of unintended side effects or bugs. When running tests as root in a vm without the fix, the following error occurs: ``` === RUN TestAllocDir_SkipAllocDir alloc_dir_test.go:520: Error Trace: alloc_dir_test.go:520 Error: Received unexpected error: Couldn't create destination file /tmp/TestAllocDir_SkipAllocDir1457747331/001/nomad/test/testtask/nomad/test/testtask/.../nomad/test/testtask/secrets/.nomad-mount: open /tmp/TestAllocDir_SkipAllocDir1457747331/001/nomad/test/.../testtask/secrets/.nomad-mount: file name too long Test: TestAllocDir_SkipAllocDir --- FAIL: TestAllocDir_SkipAllocDir (22.76s) ``` Also removed unused Copy methods on AllocDir and TaskDir structs. Thanks to @eveld for not letting me forget about this!
2021-10-15 23:56:14 +00:00
As of Nomad 1.2, Nomad will never attempt to embed the `alloc_dir` in the
chroot as doing so would cause infinite recursion.
### `reserved` Parameters
- `cpu` `(int: 0)` - Specifies the amount of CPU to reserve, in MHz.
- `cores` `(int: 0)` - Specifies the number of CPU cores to reserve.
- `memory` `(int: 0)` - Specifies the amount of memory to reserve, in MB.
- `disk` `(int: 0)` - Specifies the amount of disk to reserve, in MB.
- `reserved_ports` `(string: "")` - Specifies a comma-separated list of ports to
reserve on all fingerprinted network devices. Ranges can be specified by using
a hyphen separated the two inclusive ends.
### `artifact` Parameters
- `http_read_timeout` `(string: "30m")` - Specifies the maximum duration in
which an HTTP download request must complete before it is canceled. Set to
`0` to not enforce a limit.
- `http_max_size` `(string: "100GB")` - Specifies the maximum size allowed for
artifacts downloaded via HTTP. Set to `0` to not enforce a limit.
- `gcs_timeout` `(string: "30m")` - Specifies the maximum duration in which a
Google Cloud Storate operation must complete before it is canceled. Set to
`0` to not enforce a limit.
- `git_timeout` `(string: "30m")` - Specifies the maximum duration in which a
Git operation must complete before it is canceled. Set to `0` to not enforce
a limit.
- `hg_timeout` `(string: "30m")` - Specifies the maximum duration in which a
Mercurial operation must complete before it is canceled. Set to `0` to not
enforce a limit.
- `s3_timeout` `(string: "30m")` - Specifies the maximum duration in which an
S3 operation must complete before it is canceled. Set to `0` to not enforce a
limit.
### `template` Parameters
- `function_denylist` `([]string: ["plugin", "writeToFile"])` - Specifies a
list of template rendering functions that should be disallowed in job specs.
By default the `plugin` and `writeToFile` functions are disallowed as they
allow unrestricted root access to the host.
- `disable_file_sandbox` `(bool: false)` - Allows templates access to arbitrary
files on the client host via the `file` function. By default, templates can
access files only within the [task working directory].
- `max_stale` `(string: "")` - # This is the maximum interval to allow "stale"
data. By default, only the Consul leader will respond to queries. Requests to
a follower will forward to the leader. In large clusters with many requests,
this is not as scalable. This option allows any follower to respond to a query,
so long as the last-replicated data is within this bound. Higher values result
in less cluster load, but are more likely to have outdated data.
- `wait` `(Code: nil)` - Defines the minimum and maximum amount of time to wait
for the Consul cluster to reach a consistent state before rendering a template.
This is useful to enable in systems where network connectivity to Consul is degraded,
because it will reduce the number of times a template is rendered. This configuration is
also exposed in the _task template stanza_ to allow overrides per task.
```hcl
wait {
min = "5s"
max = "10s"
}
```
- `wait_bounds` `(Code: nil)` - Defines client level lower and upper bounds for
per-template `wait` configuration. If the individual template configuration has
a `min` lower than `wait_bounds.min` or a `max` greater than the `wait_bounds.max`,
the bounds will be enforced, and the template `wait` will be adjusted before being
sent to `consul-template`.
```hcl
wait_bounds {
min = "5s"
max = "10s"
}
```
- `block_query_wait` `(string: "60s")` - This is amount of time in seconds to wait
for the results of a blocking query. Many endpoints in Consul support a feature known as
"blocking queries". A blocking query is used to wait for a potential change
using long polling.
- `consul_retry` `(Code: nil)` - This controls the retry behavior when an error is
returned from Consul. Consul Template is highly fault tolerant, meaning it does
not exit in the face of failure. Instead, it uses exponential back-off and retry
functions to wait for the cluster to become available, as is customary in distributed
systems.
```hcl
consul_retry {
# This specifies the number of attempts to make before giving up. Each
# attempt adds the exponential backoff sleep time. Setting this to
# zero will implement an unlimited number of retries.
attempts = 12
# This is the base amount of time to sleep between retry attempts. Each
# retry sleeps for an exponent of 2 longer than this base. For 5 retries,
# the sleep times would be: 250ms, 500ms, 1s, 2s, then 4s.
backoff = "250ms"
# This is the maximum amount of time to sleep between retry attempts.
# When max_backoff is set to zero, there is no upper limit to the
# exponential sleep between retry attempts.
# If max_backoff is set to 10s and backoff is set to 1s, sleep times
# would be: 1s, 2s, 4s, 8s, 10s, 10s, ...
max_backoff = "1m"
}
```
- `vault_retry` `(Code: nil)` - This controls the retry behavior when an error is
returned from Vault. Consul Template is highly fault tolerant, meaning it does
not exit in the face of failure. Instead, it uses exponential back-off and retry
functions to wait for the cluster to become available, as is customary in distributed
systems.
```hcl
vault_retry {
# This specifies the number of attempts to make before giving up. Each
# attempt adds the exponential backoff sleep time. Setting this to
# zero will implement an unlimited number of retries.
attempts = 12
# This is the base amount of time to sleep between retry attempts. Each
# retry sleeps for an exponent of 2 longer than this base. For 5 retries,
# the sleep times would be: 250ms, 500ms, 1s, 2s, then 4s.
backoff = "250ms"
# This is the maximum amount of time to sleep between retry attempts.
# When max_backoff is set to zero, there is no upper limit to the
# exponential sleep between retry attempts.
# If max_backoff is set to 10s and backoff is set to 1s, sleep times
# would be: 1s, 2s, 4s, 8s, 10s, 10s, ...
max_backoff = "1m"
}
```
### `host_volume` Stanza
The `host_volume` stanza is used to make volumes available to jobs.
2019-11-12 22:32:21 +00:00
The key of the stanza corresponds to the name of the volume for use in the
2020-02-06 23:45:31 +00:00
`source` parameter of a `"host"` type [`volume`](/docs/job-specification/volume)
and ACLs.
```hcl
client {
host_volume "ca-certificates" {
path = "/etc/ssl/certs"
read_only = true
}
}
```
#### `host_volume` Parameters
- `path` `(string: "", required)` - Specifies the path on the host that should
be used as the source when this volume is mounted into a task. The path must
exist on client startup.
- `read_only` `(bool: false)` - Specifies whether the volume should only ever be
allowed to be mounted `read_only`, or if it should be writeable.
### `host_network` Stanza
The `host_network` stanza is used to register additional host networks with
the node that can be used when port mapping.
The key of the stanza corresponds to the name of the network used in the
[`host_network`](/docs/job-specification/network#host-networks).
```hcl
client {
host_network "public" {
cidr = "203.0.113.0/24"
reserved_ports = "22,80"
}
}
```
#### `host_network` Parameters
- `cidr` `(string: "")` - Specifies a cidr block of addresses to match against.
2020-09-30 13:48:40 +00:00
If an address is found on the node that is contained by this cidr block, the
host network will be registered with it.
- `interface` `(string: "")` - Filters searching of addresses to a specific interface.
- `reserved_ports` `(string: "")` - Specifies a comma-separated list of ports to
reserve on all fingerprinted network devices. Ranges can be specified by using
a hyphen separating the two inclusive ends.
## `client` Examples
### Common Setup
This example shows the most basic configuration for a Nomad client joined to a
cluster.
```hcl
client {
enabled = true
2018-05-25 20:04:32 +00:00
server_join {
retry_join = [ "1.1.1.1", "2.2.2.2" ]
retry_max = 3
retry_interval = "15s"
}
}
```
### Reserved Resources
This example shows a sample configuration for reserving resources to the client.
This is useful if you want to allocate only a portion of the client's resources
to jobs.
```hcl
client {
enabled = true
reserved {
cpu = 500
memory = 512
disk = 1024
reserved_ports = "22,80,8500-8600"
}
}
```
### Custom Metadata, Network Speed, and Node Class
This example shows a client configuration which customizes the metadata, network
2019-11-12 22:32:21 +00:00
speed, and node class. The scheduler can use this information while processing
[constraints][metadata_constraint]. The metadata is completely user configurable;
the values below are for illustrative purposes only.
```hcl
client {
enabled = true
node_class = "prod"
meta {
2020-12-10 16:33:13 +00:00
owner = "ops"
cached_binaries = "redis,apache,nginx,jq,cypress,nodejs"
rack = "rack-12-1"
}
}
```
2019-11-12 22:32:21 +00:00
add plugin content (docs) (#5186) * call out pluggable drivers in task drivers section and link/add info to plugin stanza * fix hyphenation * removing page and nav that tells users drivers are not pluggable * show new syntax for configuring raw_exec plugin on client * enabled option value for raw_exec is boolean * add plugin options section and mark client options as soon to be deprecated * fix typos * add plugin options for rkt task drivers and place deprecation warning in client options * add some plugin options with plugin configuration example + mark client options as soon to be deprecated * modify deprecation warning * replace colon with - for options * add docker plugin options * update links within docker task driver to point to plugin options * fix typo and clarify config options for lxc task driver * replace raw_exec plugin syntax example with docker example * create external section * restructure lxc docs and add backward incompatibility warning * update lxc driver doc * add redirect for lxc driver doc * call out plugin options and mark client config options for drivers as deprecated * add placeholder for lxc driver binary download * update data_dir/plugins reference with plugin_dir reference * Update website/source/docs/external/lxc.html.md Co-Authored-By: Omar-Khawaja <Omar-Khawaja@users.noreply.github.com> * corrections * remove lxc from built-in drivers navigation * reorganize doc structure and fix redirect * add detail about 0.9 changes * implement suggestions/fixes * removed extraneous punctuation * add official lxc driver link
2019-01-29 20:53:05 +00:00
[plugin-options]: #plugin-options
2020-02-06 23:45:31 +00:00
[plugin-stanza]: /docs/configuration/plugin
[server-join]: /docs/configuration/server_join 'Server Join'
[metadata_constraint]: /docs/job-specification/constraint#user-specified-metadata 'Nomad User-Specified Metadata Constraint Example'
2020-10-15 14:20:56 +00:00
[task working directory]: /docs/runtime/environment#task-directories 'Task directories'
[go-sockaddr/template]: https://godoc.org/github.com/hashicorp/go-sockaddr/template