

---
layout: docs
page_title: Upgrade Guides
sidebar_title: Specific Version Details
description: |-
  Specific versions of Nomad may have additional information about the upgrade
  process beyond the standard flow.
---
# Upgrade Guides
The [upgrading page](/docs/upgrade) covers the details of doing a standard
upgrade. However, specific versions of Nomad may have more details provided for
their upgrades as a result of new features or changed behavior. This page is
used to document those details separately from the standard upgrade flow.
## Nomad 1.0.2
#### Dynamic secrets trigger template changes on client restart
Nomad 1.0.2 changed the behavior of template `change_mode` triggers when a
client node restarts. In Nomad 1.0.1 and earlier, the first rendering of a
template after a client restart would not trigger the `change_mode`. For
dynamic secrets such as the Vault PKI secrets engine, this resulted in the
secret being updated but not restarting or signalling the task. When the
secret's lease expired at some later time, the task workload might fail
because of the stale secret. For example, a web server's SSL certificate would
be expired and browsers would be unable to connect.
In Nomad 1.0.2, when a client node is restarted any task with Vault secrets
that are generated or have expired will have its `change_mode` triggered. If
`change_mode = "restart"` this will result in the task being restarted, to
avoid the task failing unexpectedly at some point in the future. This change
only impacts tasks using dynamic Vault secrets engines such as [PKI][pki], or
when secrets are rotated. Secrets that don't change in Vault will not trigger
a `change_mode` on client restart.
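As a rough sketch of the kind of task this affects, the fragment below renders a certificate from the Vault PKI secrets engine and restarts the task whenever the rendered secret changes. The task name, image, Vault policy name, PKI mount path, role, and common name are hypothetical placeholders, not values taken from this guide.

```hcl
task "web" {
  driver = "docker"

  config {
    image = "nginx:1.19" # hypothetical image
  }

  vault {
    # Hypothetical policy granting access to the PKI role used below.
    policies = ["pki-web"]
  }

  template {
    # Hypothetical PKI mount ("pki") and role ("web").
    data = <<EOT
{{ with secret "pki/issue/web" "common_name=web.example.com" }}
{{ .Data.certificate }}
{{ end }}
EOT

    destination = "secrets/cert.pem"

    # With Nomad 1.0.2 and later, this is also triggered when the secret is
    # re-issued after a client restart.
    change_mode = "restart"
  }
}
```

Workloads that can reload their certificate without restarting may prefer `change_mode = "signal"` together with `change_signal`.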
## Nomad 1.0.1
#### Envoy worker threads
Nomad v1.0.0 changed the default behavior around the number of worker threads
created by Envoy when used as a sidecar for Consul Connect. In Nomad
v1.0.1, the same default setting of [`--concurrency=1`][envoy_concurrency] is set for Envoy when used
as a Connect gateway. As before, the [`meta.connect.proxy_concurrency`][proxy_concurrency]
property can be set in client configuration to override the default value.
## Nomad 1.0.0
### HCL2 for Job specification
Nomad v1.0.0 adopts HCL2 for parsing the job spec. HCL2 extends HCL with more
expression and reuse support, but adds some stricter schema for HCL blocks
(a.k.a. stanzas). Check [HCL](/docs/job-specification/hcl2) for more details.
### Signal used when stopping Docker tasks
When stopping tasks running with the Docker task driver, Nomad documents that a
`SIGTERM` will be issued (unless configured with `kill_signal`). However, recent
versions of Nomad would issue `SIGINT` instead. Starting again with Nomad v1.0.0
`SIGTERM` will be sent by default when stopping Docker tasks.
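A task that relied on the unintended `SIGINT` behavior could request it explicitly. The fragment below is a minimal sketch; the task name and image are hypothetical.

```hcl
task "server" {
  driver = "docker"

  # Override the default SIGTERM sent when the task is stopped.
  kill_signal = "SIGINT"

  config {
    image = "redis:3.2" # hypothetical image
  }
}
```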
### Deprecated metrics have been removed
Nomad v0.7.0 added support for tagged metrics and deprecated untagged metrics.
There was support for configuring backwards-compatible metrics. This support has
been removed with v1.0.0, and all metrics will be emitted with tags.
### Null characters in region, datacenter, job name/ID, task group name, and task names
Starting with Nomad v1.0.0, jobs will fail validation if any of the following
contain a null character: the job ID or name, the task group name, or the task
name. Any jobs containing null characters in these fields should be modified
before an upgrade to v1.0.0. Similarly, client and server config validation will
prohibit either the region or the datacenter from containing null characters.
### EC2 CPU characteristics may be different
Starting with Nomad v1.0.0, the AWS fingerprinter uses data derived from the
official AWS EC2 API to determine default CPU performance characteristics,
including core count and core speed. This data should be accurate for each
instance type per region. Previously, Nomad used a hand-made lookup table that
was not region aware and may have contained inaccurate or incomplete data. As
part of this change, the AWS fingerprinter no longer sets the `cpu.modelname`
attribute.
As before, `cpu_total_compute` can be used to override the discovered CPU
resources available to the Nomad client.
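If the fingerprinted values differ from what you expect after the upgrade, the override can be set in the client configuration. This is a minimal sketch; the value shown is a hypothetical figure for a 4-core, 2000 MHz instance.

```hcl
client {
  enabled = true

  # Total compute in MHz (number of cores * core MHz).
  # Hypothetical value: 4 cores * 2000 MHz.
  cpu_total_compute = 8000
}
```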
### Inclusive language
Starting with Nomad v1.0.0, the terms `blacklist` and `whitelist` have been
deprecated from client configuration and driver configuration. The existing
configuration values are permitted but will be removed in a future version of
Nomad. The specific configuration values replaced are:
- Client `driver.blacklist` is replaced with `driver.denylist`.
- Client `driver.whitelist` is replaced with `driver.allowlist`.
- Client `env.blacklist` is replaced with `env.denylist`.
- Client `fingerprint.blacklist` is replaced with `fingerprint.denylist`.
- Client `fingerprint.whitelist` is replaced with `fingerprint.allowlist`.
- Client `user.blacklist` is replaced with `user.denylist`.
- Client `template.function_blacklist` is replaced with
`template.function_denylist`.
- Docker driver `docker.caps.whitelist` is replaced with
`docker.caps.allowlist`.
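As an illustration of the renamed values, a client configuration using the new names might look like the sketch below. The specific drivers, environment variables, and users listed are hypothetical examples.

```hcl
client {
  options {
    # Previously "driver.whitelist"
    "driver.allowlist" = "docker,exec"

    # Previously "env.blacklist"
    "env.denylist" = "AWS_SECRET_ACCESS_KEY,VAULT_TOKEN"

    # Previously "user.blacklist"
    "user.denylist" = "root"
  }

  template {
    # Previously function_blacklist
    function_denylist = ["plugin"]
  }
}
```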
### Consul Connect
Nomad 1.0's Consul Connect integration works best with Consul 1.9 or later. The
ideal upgrade path is:
1. Create a new Nomad client image with Nomad 1.0 and Consul 1.9 or later.
2. Add new hosts based on the image.
3. [Drain][drain-cli] and shut down old Nomad client nodes.

While in-place upgrades and older versions of Consul are supported by Nomad 1.0,
Envoy proxies will drop and stop accepting connections while the Nomad agent is
restarting. Nomad 1.0 with Consul 1.9 does not have this limitation.
#### Envoy proxy versions
Nomad v1.0.0 changes the behavior around the selection of Envoy version used for
Connect sidecar proxies. Previously, Nomad always defaulted to Envoy v1.11.2 if
neither the `meta.connect.sidecar_image` parameter nor the `sidecar_task` stanza was
explicitly configured. Likewise the same version of Envoy would be used for
Connect ingress gateways if `meta.connect.gateway_image` was unset. Starting
with Nomad v1.0.0, each Nomad Client will query Consul for a list of supported
Envoy versions. Nomad will make use of the latest version of Envoy supported by
the Consul agent when launching Envoy as a Connect sidecar proxy. If the version
of the Consul agent is older than v1.7.8, v1.8.4, or v1.9.0, Nomad will fall back
to the v1.11.2 version of Envoy. As before, if the `meta.connect.sidecar_image`,
`meta.connect.gateway_image`, or `sidecar_task` stanza are set, those settings
take precedence.
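To pin the Envoy version for a single service rather than rely on the version reported by Consul, a job can set the image in a `sidecar_task` stanza. The sketch below assumes a hypothetical service name, port, and Envoy image tag.

```hcl
service {
  name = "api"  # hypothetical service
  port = "9001" # hypothetical port

  connect {
    sidecar_service {}

    sidecar_task {
      config {
        # Hypothetical tag; takes precedence over the version Nomad would
        # otherwise select by querying Consul.
        image = "envoyproxy/envoy:v1.16.0"
      }
    }
  }
}
```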
When upgrading Nomad Clients from a previous version to v1.0.0 and above, it is
recommended to also upgrade the Consul agents to v1.7.8, v1.8.4, or v1.9.0 or
newer. Upgrading Nomad and Consul to versions that support the new behavior
while also doing a full [node drain][] at the time of the upgrade for each node
will ensure Connect workloads are properly rescheduled onto nodes in such a way
that the Nomad Clients, Consul agents, and Envoy sidecar tasks maintain
compatibility with one another.
#### Envoy worker threads
Nomad v1.0.0 changes the default behavior around the number of worker threads
created by the Envoy sidecar proxy when using Consul Connect. Previously, the
Envoy [`--concurrency`][envoy_concurrency] argument was left unset, which caused
Envoy to spawn as many worker threads as logical cores available on the CPU. The
`--concurrency` value now defaults to `1` and can be configured by setting the
[`meta.connect.proxy_concurrency`][proxy_concurrency] property in client
configuration.
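A minimal sketch of overriding the default in the client configuration follows. It assumes the property is written as a quoted key in the client `meta` block, and the value `2` is a hypothetical choice.

```hcl
client {
  meta {
    # Number of Envoy worker threads for Connect proxies on this client.
    # Hypothetical value; the default is 1.
    "connect.proxy_concurrency" = "2"
  }
}
```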
## Nomad 0.12.8
### Docker volume mounts
Nomad 0.12.8 includes security fixes for the handling of Docker volume mounts:
- The `docker.volumes.enabled` flag now defaults to `false` as documented.
- Docker driver mounts of type "volume" (but not "bind") were not sandboxed and
could mount arbitrary locations from the client host. The
`docker.volumes.enabled` configuration will now disable Docker mounts with
type "volume" when set to `false` (the default).
This change impacts jobs that use Docker `mounts` with type "volume", as shown
below. This job will fail when placed unless `docker.volumes.enabled = true`.
```hcl
mounts = [
  {
    type = "volume"
    target = "/path/in/container"
    source = "docker_volume"
    volume_options = {
      driver_config = {
        name = "local"
        options = [
          {
            device = "/"
            o = "ro,bind"
            type = "ext4"
          }
        ]
      }
    }
  }
]
```
## Nomad 0.12.6
### Artifact and Template Paths
Nomad 0.12.6 includes security fixes for privilege escalation vulnerabilities
in handling of job `template` and `artifact` stanzas:
- The `template.source` and `template.destination` fields are now protected by
the file sandbox introduced in 0.9.6. These paths are now restricted to fall
inside the task directory by default. An operator can opt-out of this
protection with the [`template.disable_file_sandbox`][] field in the client
configuration.
- The paths for `template.source`, `template.destination`, and
`artifact.destination` are validated on job submission to ensure the paths do
not escape the file sandbox. It was possible to use interpolation to bypass
this validation. The client now interpolates the paths before checking if they
are in the file sandbox.
~> **Warning:** Due to a [bug][gh-9148] in Nomad v0.12.6, the
`template.destination` and `artifact.destination` paths do not support
absolute paths, including the interpolated `NOMAD_SECRETS_DIR`,
`NOMAD_TASK_DIR`, and `NOMAD_ALLOC_DIR` variables. This bug is fixed in
v0.12.9. To work around the bug, use a relative path.
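The workaround can be sketched as follows. The task name, artifact URL, and file names are hypothetical; the point is that each `destination` is relative to the task directory rather than built from an interpolated absolute path.

```hcl
task "app" {
  artifact {
    source = "https://example.com/app.tar.gz" # hypothetical URL

    # Relative path instead of "${NOMAD_TASK_DIR}/app"
    destination = "local/app"
  }

  template {
    source = "local/app/config.tpl" # hypothetical template file

    # Relative path instead of "${NOMAD_SECRETS_DIR}/config.env"
    destination = "secrets/config.env"
  }
}
```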
## Nomad 0.12.0
### `mbits` and Task Network Resource deprecation
Starting in Nomad 0.12.0 the `mbits` field of the network resource block has
been deprecated and is no longer considered when making scheduling decisions.
This is in part because we felt that `mbits` didn't accurately account for network
bandwidth as a resource.
Additionally the use of the `network` block inside of a task's `resource` block
is also deprecated. Users are advised to move their `network` block to the
`group` block. Recent networking features have only been added to group based
network configuration. If any use case or feature which was available with task
network resource is not fulfilled with group network configuration, please open
an issue detailing the missing capability.
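A migrated job might look roughly like the sketch below, with the port declared at the group level and referenced from the task. The group, task, port label, and image are hypothetical.

```hcl
group "web" {
  # Group-level network block replaces the task-level one.
  network {
    port "http" {}
  }

  task "server" {
    driver = "docker"

    config {
      image = "nginx:1.19" # hypothetical image
      ports = ["http"]
    }

    resources {
      cpu    = 500
      memory = 256
      # The deprecated form placed a network block here, for example:
      #   network {
      #     mbits = 10
      #     port "http" {}
      #   }
    }
  }
}
```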
### Enterprise Licensing
Enterprise binaries for Nomad are now publicly available via
[releases.hashicorp.com](https://releases.hashicorp.com/nomad/). By default all
enterprise features are enabled for 6 hours. During that time enterprise users
should apply their license with the [`nomad license put ...`](/docs/commands/license/put) command.
Once the 6 hour demonstration period expires, Nomad will shut down. If restarted,
Nomad will shut down in a very short amount of time unless a valid license is
applied.
~> **Warning:** Due to a [bug][gh-8457] in Nomad v0.12.0, existing clusters
that are upgraded will **not** have 6 hours to apply a license. The minimal
grace period should be sufficient to apply a valid license, but enterprise
users are encouraged to delay upgrading until Nomad v0.12.1 is released and
fixes the issue.
### Docker access host filesystem
Nomad 0.12.0 disables Docker tasks' access to the host filesystem by default.
Prior to Nomad 0.12, Docker tasks could mount and then manipulate any host file,
which posed a security risk.

Operators must now explicitly allow tasks to access the host filesystem. [Host
Volumes](/docs/configuration/client#host_volume-stanza) provide fine-grained
access to individual paths.
To restore pre-0.12.0 behavior, you can enable [Docker
`volume`](/docs/drivers/docker#enabled-1) to allow binding host paths, by adding
the following to the Nomad client config file:
```hcl
plugin "docker" {
  config {
    volumes {
      enabled = true
    }
  }
}
```
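As an alternative to re-enabling Docker volumes globally, a host volume exposes a single path. The following is a sketch only; the volume name, paths, and group and task names are hypothetical. The first fragment belongs in the client agent configuration and the second in the job specification.

```hcl
# Client agent configuration
client {
  host_volume "certs" {
    path      = "/etc/ssl/certs" # hypothetical host path
    read_only = true
  }
}
```

```hcl
# Job specification
group "web" {
  volume "certs" {
    type      = "host"
    source    = "certs"
    read_only = true
  }

  task "server" {
    driver = "docker"

    volume_mount {
      volume      = "certs"
      destination = "/etc/ssl/certs" # path inside the task
      read_only   = true
    }
  }
}
```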
### QEMU images
Nomad 0.12.0 restricts the paths QEMU tasks can load an image from. A QEMU
task may still download an image to its allocation directory and load it from
there, but images outside the allocation directories must be explicitly allowed
by operators in the client agent configuration file.
For example, you may allow loading QEMU images from `/mnt/qemu-images` by
adding the following to the agent configuration file:
```hcl
plugin "qemu" {
  config {
    image_paths = ["/mnt/qemu-images"]
  }
}
```
## Nomad 0.11.7
### Docker volume mounts
Nomad 0.11.7 includes a security fix for the handling of Docker volume
mounts. Docker driver mounts of type "volume" (but not "bind") were not
sandboxed and could mount arbitrary locations from the client host. The
`docker.volumes.enabled` configuration will now disable Docker mounts with
type "volume" when set to `false`.
This change impacts jobs that use Docker `mounts` with type "volume", as
shown below. This job will fail when placed unless `docker.volumes.enabled = true`.
```hcl
mounts = [
  {
    type = "volume"
    target = "/path/in/container"
    source = "docker_volume"
    volume_options = {
      driver_config = {
        name = "local"
        options = [
          {
            device = "/"
            o = "ro,bind"
            type = "ext4"
          }
        ]
      }
    }
  }
]
```
## Nomad 0.11.5
### Artifact and Template Paths
Nomad 0.11.5 includes backported security fixes for privilege escalation
vulnerabilities in handling of job `template` and `artifact` stanzas:
- The `template.source` and `template.destination` fields are now protected by
the file sandbox introduced in 0.9.6. These paths are now restricted to fall
inside the task directory by default. An operator can opt-out of this
protection with the
[`template.disable_file_sandbox`](/docs/configuration/client#template-parameters)
field in the client configuration.
- The paths for `template.source`, `template.destination`, and
`artifact.destination` are validated on job submission to ensure the paths
do not escape the file sandbox. It was possible to use interpolation to
bypass this validation. The client now interpolates the paths before
checking if they are in the file sandbox.
~> **Warning:** Due to a [bug][gh-9148] in Nomad v0.11.5, the
`template.destination` and `artifact.destination` paths do not support
absolute paths, including the interpolated `NOMAD_SECRETS_DIR`,
`NOMAD_TASK_DIR`, and `NOMAD_ALLOC_DIR` variables. This bug is fixed in
v0.11.6. To work around the bug, use a relative path.
## Nomad 0.11.3
Nomad 0.11.3 fixes a critical bug causing the nomad agent to become
unresponsive. The issue is due to a [Go 1.14.1 runtime
bug](https://github.com/golang/go/issues/38023) and affects Nomad 0.11.1 and
0.11.2.
## Nomad 0.11.2
### Scheduler Scoring Changes
Prior to Nomad 0.11.2 the scheduler algorithm used a [node's reserved
resources][reserved]
incorrectly during scoring. The result of this bug was that scoring was biased in
favor of nodes with reserved resources vs nodes without reserved resources.
Placements will be more correct but slightly different in v0.11.2 vs earlier
versions of Nomad. Operators do _not_ need to take any actions as the impact of
the bug fix will only minimally affect scoring.
Feasibility (whether a node is capable of running a job at all) is _not_
affected.
### Periodic Jobs and Daylight Saving Time
Nomad 0.11.2 fixed a long outstanding bug affecting periodic jobs that are
scheduled to run during Daylight Saving Time transitions.
Nomad 0.11.2 provides a more defined behavior: Nomad evaluates the cron
expression with respect to the specified time zone during the transition. A 2:30am
nightly job with `America/New_York` time zone will not run on the day daylight
saving time starts; similarly, a 1:30am nightly job will run twice on the day
daylight saving time ends. See the [Daylight Saving Time][dst] documentation
for details.
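The 2:30am case described above corresponds to a `periodic` stanza along these lines. This is a sketch; `prohibit_overlap` is included only as a common companion setting.

```hcl
periodic {
  # Runs nightly at 2:30am in the given time zone. On the day daylight
  # saving time starts, 2:30am does not exist and the job is skipped.
  cron             = "30 2 * * *"
  time_zone        = "America/New_York"
  prohibit_overlap = true
}
```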
## Nomad 0.11.0
### client.template: `vault_grace` deprecation
Nomad 0.11.0 updates
[consul-template](https://github.com/hashicorp/consul-template) to v0.24.1. This
library deprecates the [`vault_grace`][vault_grace] option for templating
included in Nomad. The feature has been ignored since Vault 0.5 and as long as
you are running a more recent version of Vault, you can safely remove
`vault_grace` from your Nomad jobs.
### Rkt Task Driver Removed
The `rkt` task driver has been deprecated and removed from Nomad. While the code
is available in an external repository,
<https://github.com/hashicorp/nomad-driver-rkt>, it will not be maintained as
`rkt` is [no longer being developed upstream](https://github.com/rkt/rkt). We
encourage all `rkt` users to find a new task driver as soon as possible.
## Nomad 0.10.8
### Docker volume mounts
Nomad 0.10.8 includes a security fix for the handling of Docker volume mounts.
Docker driver mounts of type "volume" (but not "bind") were not sandboxed and
could mount arbitrary locations from the client host. The
`docker.volumes.enabled` configuration will now disable Docker mounts with type
"volume" when set to `false`.
This change impacts jobs that use Docker `mounts` with type "volume", as shown
below. This job will fail when placed unless `docker.volumes.enabled = true`.
```hcl
mounts = [
  {
    type = "volume"
    target = "/path/in/container"
    source = "docker_volume"
    volume_options = {
      driver_config = {
        name = "local"
        options = [
          {
            device = "/"
            o = "ro,bind"
            type = "ext4"
          }
        ]
      }
    }
  }
]
```
## Nomad 0.10.6
### Artifact and Template Paths
Nomad 0.10.6 includes backported security fixes for privilege escalation
vulnerabilities in handling of job `template` and `artifact` stanzas:
- The `template.source` and `template.destination` fields are now protected by
the file sandbox introduced in 0.9.6. These paths are now restricted to fall
inside the task directory by default. An operator can opt-out of this
protection with the
[`template.disable_file_sandbox`](/docs/configuration/client#template-parameters)
field in the client configuration.
- The paths for `template.source`, `template.destination`, and
`artifact.destination` are validated on job submission to ensure the paths
do not escape the file sandbox. It was possible to use interpolation to
bypass this validation. The client now interpolates the paths before
checking if they are in the file sandbox.
~> **Warning:** Due to a [bug][gh-9148] in Nomad v0.10.6, the
`template.destination` and `artifact.destination` paths do not support
absolute paths, including the interpolated `NOMAD_SECRETS_DIR`,
`NOMAD_TASK_DIR`, and `NOMAD_ALLOC_DIR` variables. This bug is fixed in
v0.10.7. To work around the bug, use a relative path.
## Nomad 0.10.4
### Same-Node Scheduling Penalty Removed
Nomad 0.10.4 includes a fix to the scheduler that removes the same-node penalty
for allocations that have not previously failed. In earlier versions of Nomad,
the node where an allocation was running was penalized from receiving updated
versions of that allocation, resulting in a higher chance of the allocation
being placed on a new node. This was changed so that the penalty only applies to
nodes where the previous allocation has failed or been rescheduled, to reduce
the risk of correlated failures on a host. Scheduling weighs a number of
factors, but this change should reduce movement of allocations that are being
updated from a healthy state. You can view the placement metrics for an
allocation with `nomad alloc status -verbose`.
### Additional Environment Variable Filtering
Nomad will by default prevent certain environment variables set in the client
process from being passed along into launched tasks. The `CONSUL_HTTP_TOKEN`
environment variable has been added to the default list. More information can
be found in the `env.blacklist` [configuration](/docs/configuration/client#env-blacklist).
## Nomad 0.10.3
### mTLS Certificate Validation
Nomad 0.10.3 includes a fix for a privilege escalation vulnerability in
validating TLS certificates for RPC with mTLS. Nomad RPC endpoints validated
that TLS client certificates had not expired and were signed by the same CA as
the Nomad node, but did not correctly check the certificate's name for the role
and region as described in the [Securing Nomad with TLS][tls-guide] guide. This
allows trusted operators with a client certificate signed by the CA to send RPC
calls as a Nomad client or server node, bypassing access control and accessing
any secrets available to a client.
Nomad clusters configured for mTLS following the [Securing Nomad with
TLS][tls-guide] guide or the [Vault PKI Secrets Engine
Integration][tls-vault-guide] guide should already have certificates that will
pass validation. Before upgrading to Nomad 0.10.3, operators using mTLS with
`verify_server_hostname = true` should confirm that the common name or SAN of
all Nomad client node certs is `client.<region>.nomad`, and that the common name
or SAN of all Nomad server node certs is `server.<region>.nomad`.
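For reference, a client agent `tls` stanza with hostname verification enabled might look like the sketch below. The file paths are hypothetical; the certificate referenced by `cert_file` is the one whose common name or SAN should be `client.<region>.nomad`.

```hcl
tls {
  http = true
  rpc  = true

  # Hypothetical file locations.
  ca_file   = "/etc/nomad.d/tls/nomad-ca.pem"
  cert_file = "/etc/nomad.d/tls/client.pem"
  key_file  = "/etc/nomad.d/tls/client-key.pem"

  verify_server_hostname = true
  verify_https_client    = true
}
```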
### Connection Limits Added
Nomad 0.10.3 introduces the [limits][] agent configuration parameters for
mitigating denial of service attacks from users who are not authenticated via
mTLS. The default limits stanza is:
```hcl
limits {
  https_handshake_timeout = "5s"
  http_max_conns_per_client = 100
  rpc_handshake_timeout = "5s"
  rpc_max_conns_per_client = 100
}
```
If your Nomad agent's endpoints are protected from unauthenticated users via
other mechanisms these limits may be safely disabled by setting them to `0`.
However the defaults were chosen to be safe for a wide variety of Nomad
deployments and may protect against accidental abuses of the Nomad API that
could cause unintended resource usage.
## Nomad 0.10.2
### Preemption Panic Fixed
Nomad 0.9.7 and 0.10.2 fix a [server crashing bug][gh-6787] present in scheduler
preemption since 0.9.0. Users unable to immediately upgrade Nomad can [disable
preemption][preemption-api] to avoid the panic.
### Dangling Docker Container Cleanup
Nomad 0.10.2 addresses an issue occurring in heavily loaded clients, where
containers are started without being properly managed by Nomad. Nomad 0.10.2
introduced a reaper that detects and kills such containers.
Operators may opt to run the reaper in dry-run mode or disable it through the
client configuration.
For more information, see [Docker Dangling containers][dangling-containers].
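A sketch of running the reaper in dry-run mode through the Docker plugin configuration follows. The `period` and `creation_grace` values shown are illustrative.

```hcl
plugin "docker" {
  config {
    gc {
      dangling_containers {
        # Set enabled = false to turn the reaper off entirely.
        enabled = true

        # Detect and log dangling containers without killing them.
        dry_run = true

        period         = "5m"
        creation_grace = "5m"
      }
    }
  }
}
```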
## Nomad 0.10.0
### Deployments
Nomad 0.10 enables rolling deployments for service jobs by default and adds a
default update stanza when a service job is created or updated. This does not
affect jobs with an update stanza.
In pre-0.10 releases, when updating a service job without an update stanza, all
existing allocations are stopped while new allocations start up, and this may
cause a service degradation or an outage. You can regain this behavior and
disable deployments by setting `max_parallel` to 0.
For more information, see [`update` stanza][update].
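A job that needs the pre-0.10 behavior can opt out with an `update` stanza such as this minimal sketch:

```hcl
update {
  # Disables deployments; allocations are replaced all at once,
  # as in pre-0.10 releases.
  max_parallel = 0
}
```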
## Nomad 0.9.5
### Template Rendering
Nomad 0.9.5 includes security fixes for privilege escalation vulnerabilities in
handling of job `template` stanzas:
- The client host's environment variables are now cleaned before rendering the
template. If a template includes the `env` function, the job should include an
[`env`](/docs/job-specification/env) stanza to allow access to the variable in
the template.
- The `plugin` function is no longer permitted by default and will raise an
error if used in a template. An operator can opt in to permitting this function
with the new
[`template.function_blacklist`](/docs/configuration/client#template-parameters)
field in the client configuration.
- The `file` function has been changed to restrict paths to fall inside the task
directory by default. Paths that used the `NOMAD_TASK_DIR` environment
variable to prefix file paths should work unchanged. Relative paths or
symlinks that point outside the task directory will raise an error. An
operator can opt-out of this protection with the new
[`template.disable_file_sandbox`](/docs/configuration/client#template-parameters)
field in the client configuration.
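The client-side settings and the job-side `env` stanza mentioned above can be sketched as follows. The variable name, value, and file names are hypothetical, and the comment about the default deny list reflects an assumption that `plugin` is the only function blocked by default.

```hcl
# Client agent configuration
client {
  template {
    # Assumed default is ["plugin"]; an empty list permits `plugin` again.
    function_blacklist = []

    # Keep the sandbox on unless templates must read outside the task directory.
    disable_file_sandbox = false
  }
}
```

```hcl
# Job specification
task "app" {
  env {
    REGION = "us-east-1" # hypothetical variable made visible to the template
  }

  template {
    data        = "region={{ env \"REGION\" }}"
    destination = "local/app.conf"
  }
}
```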
## Nomad 0.9.0
### Preemption
Nomad 0.9 adds preemption support for system jobs. If a system job is submitted
that has a higher priority than other running jobs on the node, and the node
does not have capacity remaining, Nomad may preempt those lower priority
allocations to place the system job. See [preemption][preemption] for more
details.
### Task Driver Plugins
All task drivers have become [plugins][plugins] in Nomad 0.9.0. There are two
user visible differences between 0.8 and 0.9 drivers:
- [LXC][lxc] is now community supported and distributed independently.
- Task driver [`config`][task-config] stanzas are no longer validated by
the [`nomad job validate`][validate] command. This is a regression that will
be fixed in a future release.
There is a new method for client driver configuration options, but existing
`client.options` settings are supported in 0.9. See [plugin
configuration][plugin-stanza] for details.
#### LXC
LXC is now an external plugin and must be installed separately. See [the LXC
driver's documentation][lxc] for details.
### Structured Logging
Nomad 0.9.0 switches to structured logging. Any log processing on the pre-0.9
log output will need to be updated to match the structured output.
Structured log lines have the format:
```
# <Timestamp> [<Level>] <Component>: <Message>: <KeyN>=<ValueN> ...
2019-01-29T05:52:09.221Z [INFO ] client.plugin: starting plugin manager: plugin-type=device
```
Values containing whitespace will be quoted:
```
... starting plugin: task=redis args="[/opt/gopath/bin/nomad logmon]"
```
### HCL2 Transition
Nomad 0.9.0 begins a transition to [HCL2][hcl2], the next version of the
HashiCorp configuration language. While Nomad has begun integrating HCL2, users
will need to continue to use HCL1 in Nomad 0.9.0 as the transition is
incomplete.
If you interpolate variables in your [`task.config`][task-config] containing
consecutive dots in their name, you will need to change your job specification
to use the `env` map. See the following example:
```hcl
env {
  # Note the multiple consecutive dots
  image...version = "3.2"
  # Valid in both v0.8 and v0.9
  image.version = "3.2"
}
# v0.8 task config stanza:
task {
  driver = "docker"
  config {
    image = "redis:${image...version}"
  }
}
# v0.9 task config stanza:
task {
  driver = "docker"
  config {
    image = "redis:${env["image...version"]}"
  }
}
```
This only affects users who interpolate unusual variables with multiple
consecutive dots in their task `config` stanza. All other interpolation is
unchanged.
Since HCL2 uses dotted object notation for interpolation users should transition
away from variable names with multiple consecutive dots.
### Downgrading clients
Due to the large refactor of the Nomad client in 0.9, downgrading to a previous
version of the client after upgrading it to Nomad 0.9 is not supported. To
downgrade safely, users should erase the Nomad client's data directory.
### `port_map` Environment Variable Changes
Before Nomad 0.9.0 ports mapped via a task driver's `port_map` stanza could be
interpolated via the `NOMAD_PORT_<label>` environment variables.
However, in Nomad 0.9.0 no parameters in a driver's `config` stanza, including
its `port_map`, are available for interpolation. This means `{{ env NOMAD_PORT_<label> }}` in a `template` stanza or `HTTP_PORT = "${NOMAD_PORT_http}"` in an `env` stanza will now interpolate the _host_ ports,
not the container's.
Nomad 0.10 introduced Task Group Networking which natively supports port mapping
without relying on task driver specific `port_map` fields. The
[`to`](/docs/job-specification/network#to) field on group network port stanzas
will be interpolated properly. Please see the
[`network`](/docs/job-specification/network/) stanza documentation for details.
## Nomad 0.8.0
### Raft Protocol Version Compatibility
When upgrading to Nomad 0.8.0 from a version lower than 0.7.0, users will need
to set the [`raft_protocol`](/docs/configuration/server#raft_protocol) option in
their `server` stanza to 1 in order to maintain backwards compatibility with the
old servers during the upgrade. After the servers have been migrated to version
0.8.0, `raft_protocol` can be moved up to 2 and the servers restarted to match
the default.
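During the upgrade window, the `server` stanza would carry the pinned protocol version, roughly as in this sketch:

```hcl
server {
  enabled = true

  # Pin to 1 while pre-0.8 servers are still in the cluster,
  # then raise to 2 and restart once all servers run 0.8.0.
  raft_protocol = 1
}
```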
The Raft protocol must be stepped up in this way; only adjacent version numbers
are compatible (for example, version 1 cannot talk to version 3). Here is a
table of the Raft Protocol versions supported by each Nomad version:
<table>
<thead>
<tr>
<th>Version</th>
<th>Supported Raft Protocols</th>
</tr>
</thead>
<tbody>
<tr>
<td>0.6 and earlier</td>
<td>0</td>
</tr>
<tr>
<td>0.7</td>
<td>1</td>
</tr>
<tr>
<td>0.8 and later</td>
<td>1, 2, 3</td>
</tr>
</tbody>
</table>
In order to enable all
[Autopilot](https://learn.hashicorp.com/tutorials/nomad/autopilot) features, all
servers in a Nomad cluster must be running with Raft protocol version 3 or
later.
#### Upgrading to Raft Protocol 3
This section provides details on upgrading to Raft Protocol 3 in Nomad 0.8 and
higher. Raft protocol version 3 requires Nomad running 0.8.0 or newer on all
servers in order to work. See [Raft Protocol Version
Compatibility](/docs/upgrade/upgrade-specific#raft-protocol-version-compatibility)
for more details. Also the format of `peers.json` used for outage recovery is
different when running with the latest Raft protocol. See [Manual Recovery Using
peers.json](https://learn.hashicorp.com/tutorials/nomad/outage-recovery#manual-recovery-using-peersjson)
for a description of the required format.
Please note that the Raft protocol is different from Nomad's internal protocol
as shown in commands like `nomad server members`. To see the version of the Raft
protocol in use on each server, use the `nomad operator raft list-peers`
command.
The easiest way to upgrade servers is to have each server leave the cluster,
upgrade its `raft_protocol` version in the `server` stanza, and then add it
back. Make sure the new server joins successfully and that the cluster is stable
before rolling the upgrade forward to the next server. It's also possible to
stand up a new set of servers, and then slowly stand down each of the older
servers in a similar fashion.
When using Raft protocol version 3, servers are identified by their `node-id`
instead of their IP address when Nomad makes changes to its internal Raft quorum
configuration. This means that once a cluster has been upgraded with servers all
running Raft protocol version 3, it will no longer allow servers running any
older Raft protocol versions to be added. If running a single Nomad server,
restarting it in-place will result in that server not being able to elect itself
as a leader. To avoid this, either set the Raft protocol back to 2, or use
[Manual Recovery Using
peers.json](https://learn.hashicorp.com/tutorials/nomad/outage-recovery#manual-recovery-using-peersjson)
to map the server to its node ID in the Raft quorum configuration.
### Node Draining Improvements
Node draining via the [`node drain`][drain-cli] command or the [drain
API][drain-api] has been substantially changed in Nomad 0.8. In Nomad 0.7.1 and
earlier draining a node would immediately stop all allocations on the node
being drained. Nomad 0.8 now supports a [`migrate`][migrate] stanza in job
specifications to control how many allocations may be migrated at once and the
default will be used for existing jobs.
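A `migrate` stanza that spells out what are understood to be the default values might look like the sketch below. Adjust `max_parallel` to control how many allocations in a group are migrated at once.

```hcl
migrate {
  # Number of allocations that may be migrated at the same time.
  max_parallel = 1

  # How replacement allocations are judged healthy.
  health_check     = "checks"
  min_healthy_time = "10s"
  healthy_deadline = "5m"
}
```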
The `drain` command now blocks until the drain completes. To get the Nomad 0.7.1
and earlier drain behavior use the command: `nomad node drain -enable -force -detach <node-id>`.
See the [`migrate` stanza documentation][migrate] and [Decommissioning Nodes
guide](https://learn.hashicorp.com/tutorials/nomad/node-drain) for details.
### Periods in Environment Variable Names No Longer Escaped
_Applications which expect periods in environment variable names to be replaced
with underscores must be updated._
In Nomad 0.7, periods (`.`) in environment variable names were replaced with an
underscore in both the [`env`](/docs/job-specification/env) and
[`template`](/docs/job-specification/template) stanzas.
In Nomad 0.8, periods are _not_ replaced and will be included in environment
variables verbatim.
For example, consider the following stanza:
```text
env {
registry.consul.addr = "${NOMAD_IP_http}:8500"
}
```
In Nomad 0.7 this would be exposed to the task as
`registry_consul_addr=127.0.0.1:8500`. In Nomad 0.8 it will now appear exactly
as specified: `registry.consul.addr=127.0.0.1:8500`.
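
If an application cannot be updated and still expects the underscore form, one possible workaround is to name the variable with underscores in the job file itself. This is a sketch under that assumption, not a required migration step:

```hcl
env {
  # Written with underscores explicitly, so the task sees the same name
  # under Nomad 0.7 and Nomad 0.8.
  registry_consul_addr = "${NOMAD_IP_http}:8500"
}
```
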
### Client APIs Unavailable on Older Nodes
Because Nomad 0.8 uses a new RPC mechanism to route node-specific APIs like
[`nomad alloc fs`](/docs/commands/alloc/fs) through servers to the node,
the 0.8 CLI cannot run these commands against clients older than 0.8.
To access these commands on older clients, either continue to use a pre-0.8
version of the CLI or upgrade all clients to 0.8.
### CLI Command Changes
Nomad 0.8 has changed the organization of CLI commands to be based on
subcommands. An example of this change is the change from `nomad alloc-status`
to `nomad alloc status`. All commands have been made to be backwards compatible,
but operators should update any usage of the old style commands to the new style
as the old style will be deprecated in future versions of Nomad.
### RPC Advertise Address
The behavior of the [advertised RPC address](/docs/configuration#rpc-1) has
changed so that it is only used to advertise the RPC address of servers to
client nodes. Server-to-server communication is done using the advertised Serf
address. Existing clusters should not be affected, but the advertised RPC
address may need to be updated to allow clients to connect over a NAT.
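
As a minimal sketch, a server behind a NAT might advertise a separate RPC address for clients as shown below. The address is a placeholder from a documentation-only IP range and is an assumption for illustration.

```hcl
advertise {
  # Address that client nodes use to reach this server's RPC endpoint,
  # for example the externally routable side of the NAT (placeholder value).
  rpc = "203.0.113.10"
}
```
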
## Nomad 0.6.0
### Default `advertise` address changes
When no `advertise` address was specified and Nomad's `bind_addr` was loopback
or `0.0.0.0`, Nomad attempted to resolve the local hostname to use as an
advertise address.
Many hosts cannot properly resolve their hostname, so Nomad 0.6 defaults
`advertise` to the first private IP on the host (e.g. `10.1.2.3`).
If you manually configure `advertise` addresses no changes are necessary.
### Nomad Clients
The change to the default advertised IP also affects clients that do not specify
which `network_interface` to use. If you have several routable IPs, it is advised
to configure the client's [network
interface](/docs/configuration/client#network_interface) such that tasks bind to
the correct address.
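
For example, a client with several routable IPs could pin the interface explicitly. The interface name `eth1` is a hypothetical value; substitute the interface that carries the address tasks should bind to.

```hcl
client {
  enabled = true

  # Hypothetical interface name chosen for illustration.
  network_interface = "eth1"
}
```
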
## Nomad 0.5.5
### Docker `load` changes
Nomad 0.5.5 has a backward incompatible change in the `docker` driver's
configuration. Prior to 0.5.5 the `load` configuration option accepted a list
of images to load; in 0.5.5 it has been changed to a single string. No
functionality was changed. Even if more than one item was specified prior to
0.5.5, only the first item was used.
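
The change can be sketched as follows. The task name, image name, and archive file name are hypothetical and used only to show the shape of the option before and after.

```hcl
task "example" {
  driver = "docker"

  config {
    image = "example/image:1.0" # hypothetical image name

    # Before 0.5.5 this option was written as a list:
    #   load = ["image.tar"]
    # From 0.5.5 onward it is a single string:
    load = "image.tar"
  }
}
```
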
To do a zero-downtime deploy with jobs that use the `load` option:
- Upgrade servers to version 0.5.5 or later.
- Deploy new client nodes on the same version as the servers.
- Resubmit jobs with the `load` option fixed and a constraint to only run on
version 0.5.5 or later:
```hcl
constraint {
attribute = "${attr.nomad.version}"
operator = "version"
value = ">= 0.5.5"
}
```
- Drain and shut down old client nodes.
### Validation changes
Due to internal job serialization and validation changes you may run into
issues using 0.5.5 command line tools such as `nomad run` and `nomad validate`
with 0.5.4 or earlier agents.
It is recommended you upgrade agents before or alongside your command line
tools.
## Nomad 0.4.0
Nomad 0.4.0 has backward incompatible changes in the logic for Consul
deregistration. When a Task which was started by Nomad v0.3.x is uncleanly shut
down, the Nomad 0.4 Client will no longer clean up any stale services. If an
in-place upgrade of the Nomad client to 0.4 prevents the Task from gracefully
shutting down and deregistering its Consul-registered services, the Nomad Client
will not clean up the remaining Consul services registered with the 0.3
Executor.
We recommend draining a node before upgrading to 0.4.0 and then re-enabling the
node once the upgrade is complete.
## Nomad 0.3.1
Nomad 0.3.1 removes artifact downloading from driver configurations and places
it as a first-class element of the task. As such, jobs will have to be rewritten
in the proper format and resubmitted to Nomad. Nomad clients will properly
re-attach to existing tasks, but job definitions must be updated before they can
be dispatched to clients running 0.3.1.
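
A rewritten task might look roughly like the sketch below, with the download declared in an `artifact` block at the task level rather than inside the driver `config`. The task name, URL, and command path are hypothetical, and the exact set of supported `artifact` options for 0.3.1 should be confirmed against the job specification for that release.

```hcl
task "example" {
  driver = "exec"

  # The artifact is now declared on the task itself.
  artifact {
    source = "https://example.com/example-binary" # hypothetical URL
  }

  config {
    # Assumed location of the downloaded file within the task directory.
    command = "local/example-binary"
  }
}
```
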
## Nomad 0.3.0
Nomad 0.3.0 has made several substantial changes to job files, including a new
`log` block and variable interpolation syntax (`${var}`), a modified `restart`
policy syntax, and minimum resources for tasks as well as validation. These
changes require a slight change to the default upgrade flow.
After upgrading the version of the servers, all previously submitted jobs must
be resubmitted with the updated job syntax using a Nomad 0.3.0 binary.
- All instances of `$var` must be converted to the new syntax of `${var}`
- All tasks must provide their required resources for CPU, memory and disk as
well as required network usage if ports are required by the task.
- Restart policies must be updated to indicate whether it is desired for the
task to restart on failure or to fail using `mode = "delay"` or `mode = "fail"` respectively.
- Service names that include periods will fail validation. To fix, remove any
periods from the service name before running the job.
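
The requirements above can be sketched in a single job fragment. All names, images, and numeric values are hypothetical and chosen for illustration; only the `${var}` syntax, the required resources, the `mode` setting, and the period-free service name come from the list above.

```hcl
job "example" {
  datacenters = ["dc1"]

  group "web" {
    # Restart policy with an explicit mode: "delay" to keep restarting,
    # "fail" to stop after the attempts are exhausted.
    restart {
      attempts = 3
      interval = "10m"
      delay    = "25s"
      mode     = "delay"
    }

    task "frontend" {
      driver = "docker"

      config {
        image = "example/frontend:1.0"
      }

      env {
        # New interpolation syntax: ${var} instead of $var.
        DATACENTER = "${node.datacenter}"
      }

      # Service name contains no periods.
      service {
        name = "web-frontend"
        port = "http"
      }

      # CPU, memory, and disk are required, plus network when ports are used.
      resources {
        cpu    = 500
        memory = 256
        disk   = 300

        network {
          mbits = 10
          port "http" {}
        }
      }
    }
  }
}
```
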
After updating the Servers and job files, Nomad Clients can be upgraded by first
draining the node so no tasks are running on it. This can be verified by running
`nomad node status <node-id>` and verifying that there are no tasks in the
`running` state. Once that is done, the client can be killed, the `data_dir`
deleted, and Nomad 0.3.0 launched.
[dangling-containers]: /docs/drivers/docker#dangling-containers
[drain-api]: /api-docs/nodes#drain-node
[drain-cli]: /docs/commands/node/drain
[dst]: /docs/job-specification/periodic#daylight-saving-time
[envoy_concurrency]: https://www.envoyproxy.io/docs/envoy/latest/operations/cli#cmdoption-concurrency
[gh-6787]: https://github.com/hashicorp/nomad/issues/6787
[gh-8457]: https://github.com/hashicorp/nomad/issues/8457
[gh-9148]: https://github.com/hashicorp/nomad/issues/9148
[hcl2]: https://github.com/hashicorp/hcl2
[limits]: /docs/configuration#limits
[lxc]: /docs/drivers/external/lxc
[migrate]: /docs/job-specification/migrate
[plugin-stanza]: /docs/configuration/plugin
[plugins]: /docs/drivers/external
[preemption-api]: /api-docs/operator#update-scheduler-configuration
[preemption]: /docs/internals/scheduling/preemption
[proxy_concurrency]: /docs/job-specification/sidecar_task#proxy_concurrency
[reserved]: /docs/configuration/client#reserved-parameters
[task-config]: /docs/job-specification/task#config
[tls-guide]: https://learn.hashicorp.com/tutorials/nomad/security-enable-tls
[tls-vault-guide]: https://learn.hashicorp.com/tutorials/nomad/vault-pki-nomad
[update]: /docs/job-specification/update
[validate]: /docs/commands/job/validate
[vault_grace]: /docs/job-specification/template
[node drain]: https://www.nomadproject.io/docs/upgrade#5-upgrade-clients
[`template.disable_file_sandbox`]: /docs/configuration/client#template-parameters
[pki]: https://www.vaultproject.io/docs/secrets/pki