---
layout: docs
page_title: 'Upgrading'
sidebar_current: 'guides-upgrade'
description: |-
  Learn how to upgrade Nomad.
---

# Upgrading

Nomad is designed to be flexible and resilient when upgrading from one Nomad
version to the next. Upgrades should cause neither a Nomad nor a service
outage. However, there are some restrictions to be aware of before upgrading:

- Nomad strives to be backward compatible for at least 1 point release, so
  Nomad v0.10 hosts work with v0.9 hosts. Upgrading 2 point releases (e.g.,
  v0.8 to v0.10) may work but is untested and unsupported.

- Nomad does _not_ support downgrading at this time. Downgrading clients
  requires draining allocations and removing the [data directory][data_dir].
  Downgrading servers safely requires re-provisioning the cluster.

- New features are unlikely to work correctly until all nodes have been
  upgraded.

- Check the [version upgrade details page][upgrade-specific] for important
  changes and backward incompatibilities.

- When upgrading a Nomad client, if it takes longer than the
  [`heartbeat_grace`][heartbeat_grace] period (10s by default) to restart, all
  allocations on that node may be rescheduled.

Nomad supports upgrading in place or by rolling in new servers:

- In Place: The Nomad binary can be updated on existing hosts. Running
  allocations will continue running uninterrupted.

- Rolling: New hosts containing the new Nomad version may be added followed by
  the removal of old hosts. The old nodes must be drained to migrate running
  allocations to the new nodes.

This guide describes both approaches.

## Upgrade Process

Once you have checked the [upgrade details for the new
version][upgrade-specific], the upgrade process is as simple as updating the
binary on each host and restarting the Nomad service.

At a high level, upgrading Nomad consists of the following steps:

- **Add the new version**
- **Check cluster health**
- **Remove the old version**
- **Check cluster health**
- **Upgrade clients**

### 1. Add the new version to the existing cluster

While it is possible to upgrade Nomad client nodes before servers, this guide
recommends upgrading servers first, as many new client features will not work
until servers are upgraded.

In a [federated cluster](/nomad/tutorials/manage-clusters/federation),
new features are not guaranteed to work until all agents in a region and the
server nodes in the authoritative region are upgraded.

Whether you are replacing Nomad in place on existing systems or bringing up new
servers, you should make changes incrementally, verifying cluster health at each
step of the upgrade.

On a single server, install the new version of Nomad. You can do this by
joining a new server to the cluster or by replacing or upgrading the binary
locally and restarting the Nomad service.

Note that if you have [`leave_on_terminate`][] or [`leave_on_interrupt`][] set,
you should ensure you're using the expected signal for your upgrade process. For
example, if you have `leave_on_terminate` set and you intend to update a
server in place, you should send `SIGINT` rather than `SIGTERM` when shutting
down the server before restarting it.
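
The signal choice described above can be sketched as a small helper. This is only an illustration: `restart_signal` is a hypothetical function name, and it assumes you pass in the values of `leave_on_terminate` and `leave_on_interrupt` from your own agent configuration as the strings `true` or `false`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: pick a shutdown signal that does NOT trigger a
# graceful leave, given the agent's leave_on_terminate ($1) and
# leave_on_interrupt ($2) settings, each passed as "true" or "false".
restart_signal() {
  local leave_on_terminate="$1" leave_on_interrupt="$2"
  if [ "$leave_on_terminate" != "true" ]; then
    echo "TERM"   # SIGTERM will not make the agent leave
  elif [ "$leave_on_interrupt" != "true" ]; then
    echo "INT"    # SIGTERM would leave, SIGINT will not
  else
    # Both signals would cause a leave; nothing safe to suggest.
    echo "both signals trigger a leave; adjust the configuration first" >&2
    return 1
  fi
}
```

For example, with `leave_on_terminate` set and `leave_on_interrupt` unset, `restart_signal true false` prints `INT`. Assuming a single agent process named `nomad` on the host, you could then stop it with `kill -s "$(restart_signal true false)" "$(pgrep -x nomad)"`.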

### 2. Check cluster health

[Monitor the Nomad logs][monitor] on the remaining servers to check that the
new server has joined the cluster correctly.

Run `nomad agent-info` on the new servers and check that the `last_log_index`
is of a similar value to the other servers. This step ensures that changes have
been replicated to the new server.

```shell-session
ubuntu@nomad-server-10-1-1-4:~$ nomad agent-info
nomad
  bootstrap = false
  known_regions = 1
  leader = false
  server = true
raft
  applied_index = 53460
  commit_index = 53460
  fsm_pending = 0
  last_contact = 54.512216ms
  last_log_index = 53460
  last_log_term = 1
  last_snapshot_index = 49511
  last_snapshot_term = 1
  num_peers = 2
...
```

Continue with the upgrades across the servers, making sure to upgrade a single
Nomad server at a time. You can check the state of the servers with
[`nomad server members`][server-members], and the state of the client nodes
with [`nomad node status`][node-status].
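
As a rough sketch, the `last_log_index` comparison can be scripted by parsing the text output shown above. The function names `last_log_index` and `index_close` are hypothetical, and the default tolerance of 100 entries is an arbitrary illustration rather than a Nomad recommendation.

```shell
#!/usr/bin/env bash
# Read `nomad agent-info` text output on stdin and print last_log_index.
# Leading indentation is ignored by awk's default field splitting.
last_log_index() {
  awk '$1 == "last_log_index" && $2 == "=" { print $3; exit }'
}

# Succeed when two indexes are within a tolerance of each other.
# $1 and $2 are the indexes; $3 is the tolerance (assumed default: 100).
index_close() {
  local a="$1" b="$2" tolerance="${3:-100}"
  local diff=$(( a > b ? a - b : b - a ))
  [ "$diff" -le "$tolerance" ]
}
```

On each server you could run `nomad agent-info | last_log_index`, then compare the new server's value with an existing server's value using `index_close`.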

### 3. Remove the old versions from servers

If you are doing an in-place upgrade on existing servers, this step is not
necessary because the version was changed in place.

If you are doing an upgrade by adding new servers and removing old servers
from the fleet, you need to ensure that each old server has left the fleet
safely.

1. Stop the service on the existing host.

2. On another server, run `nomad server members` and check the status. If
   the server is now in a `left` state, you are safe to continue.

3. If the server is not in a `left` state, run
   `nomad server force-leave <server id>` to remove the server from the
   cluster.

Monitor the logs of the other hosts in the Nomad cluster over this period.
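
The check in steps 2 and 3 above can be sketched as follows. The helper names `server_status` and `remove_if_not_left` are hypothetical. The sketch assumes that `nomad server members` prints a header row containing a `Status` column and that the first column is the server name; verify this against the output of your Nomad version before relying on it.

```shell
#!/usr/bin/env bash
# Read `nomad server members` output on stdin and print the Status of the
# server whose Name matches $1. The Status column is located through the
# header row, so its exact position is not hard-coded.
server_status() {
  awk -v name="$1" '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "Status") col = i; next }
    col && $1 == name { print $col; exit }
  '
}

# Force the named server out of the cluster unless it already shows "left".
# This runs real nomad commands, so call it only from a healthy server.
remove_if_not_left() {
  local name="$1" status
  status=$(nomad server members | server_status "$name")
  if [ "$status" != "left" ]; then
    nomad server force-leave "$name"
  fi
}
```

Only `server_status` is a pure text filter. `remove_if_not_left` changes cluster membership, so review the status it reports before letting it run unattended.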

### 4. Check cluster health

Repeat the actions from step 2 above to confirm cluster health.

### 5. Upgrade clients

Following the successful upgrade of the servers, you can now update your
clients using a process similar to the one used for the servers. You may either
upgrade clients in place or start new nodes on the new version. See the
[Workload Migration Guide](/nomad/tutorials/manage-clusters/node-drain) for
instructions on how to migrate running allocations from the old nodes to the
new nodes with the [`nomad node drain`](/nomad/docs/commands/node/drain)
command.
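
For a rolling client upgrade, draining the old nodes one at a time might look like the sketch below. `drain_nodes` and the `DRY_RUN` variable are hypothetical names introduced for illustration. The node IDs are whatever `nomad node status` reports for your old clients.

```shell
#!/usr/bin/env bash
# Drain each listed node in turn. With DRY_RUN=1 the commands are only
# printed, which lets you review the plan before running it.
drain_nodes() {
  local node
  for node in "$@"; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "nomad node drain -enable -yes $node"
    else
      # Stop at the first failure so later nodes are not drained blindly.
      nomad node drain -enable -yes "$node" || return 1
    fi
  done
}
```

Running `DRY_RUN=1 drain_nodes <node id> <node id>` prints the commands without contacting the cluster. Draining sequentially keeps capacity available on the remaining nodes while allocations migrate.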

## Done

You are now running the latest Nomad version. You can verify that all
clients joined by running `nomad node status` and checking that all the clients
are in a `ready` state.
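
A minimal sketch of that final check, assuming the status is the last column of the `nomad node status` table and the node ID is the first. `nodes_not_ready` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Read `nomad node status` output on stdin and print the ID and status of
# every node that is not "ready". The header row and blank lines are skipped.
nodes_not_ready() {
  awk 'NR > 1 && NF > 0 && $NF != "ready" { print $1, $NF }'
}
```

Running `nomad node status | nodes_not_ready` should print nothing once every client has rejoined in a `ready` state.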

## Upgrading to Nomad Enterprise

Before upgrading servers to Nomad Enterprise versions 1.6.0 and later,
you should validate your enterprise license with the
[`nomad license inspect` command](/nomad/docs/commands/license/inspect)
using the binary that you are upgrading to.
See the [licensing FAQ](/nomad/docs/enterprise/license/faq)
for more information.

After that, the process of upgrading to a Nomad Enterprise version is identical
to upgrading between versions of open source Nomad. Follow the same guidance
above and, as always, check the [specific version
details](/nomad/docs/upgrade/upgrade-specific) page before starting the
upgrade, as some version differences may require specific steps.

[data_dir]: /nomad/docs/configuration#data_dir
[heartbeat_grace]: /nomad/docs/configuration/server#heartbeat_grace
[monitor]: /nomad/docs/commands/monitor
[node-status]: /nomad/docs/commands/node/status
[server-members]: /nomad/docs/commands/server/members
[upgrade-specific]: /nomad/docs/upgrade/upgrade-specific

## Upgrading to Raft Protocol 3

This section provides details on upgrading to Raft Protocol 3. Raft
protocol version 3 requires all servers to run Nomad 0.8.0 or newer.
Raft protocol version 2 will be removed in Nomad 1.4.0.

To see the version of the Raft protocol in use on each server, use the
`nomad operator raft list-peers` command.

Note that the format of `peers.json` used for outage recovery is
different when running with the latest Raft protocol. See [Manual
Recovery Using
peers.json](/nomad/tutorials/manage-clusters/outage-recovery#manual-recovery-using-peersjson)
for a description of the required format.

When using Raft protocol version 3, servers are identified by their
`node-id` instead of their IP address when Nomad makes changes to its
internal Raft quorum configuration. This means that once a cluster has
been upgraded with servers all running Raft protocol version 3, it
will no longer allow servers running any older Raft protocol versions
to be added.

### Upgrading a Production Cluster to Raft Version 3

For production Raft clusters with 3 or more members, the easiest way
to upgrade servers is to have each server leave the cluster, upgrade
its [`raft_protocol`] version in the `server` block (if upgrading to
a version lower than v1.3.0), and then add it back. Make sure the new
server joins successfully and that the cluster is stable before
rolling the upgrade forward to the next server. It's also possible to
stand up a new set of servers, and then slowly stand down each of the
older servers in a similar fashion.

For in-place Raft protocol upgrades, perform the following for each
server, leaving the leader until last to reduce the chance of leader
elections that will slow down the process:

* Stop the server.
* Run `nomad server force-leave $server_name`.
* If the upgrade is for a Nomad version lower than v1.3.0, update the
  [`raft_protocol`] in the server's configuration file to `3`.
* Restart the server.
* Run `nomad operator raft list-peers` to verify that the
  `RaftProtocol` for the server is now `3`.
* On the server, run `nomad agent-info` and check that the
  `last_log_index` is of a similar value to the other servers. This
  step ensures that Raft is healthy and changes are replicating to the
  new server.
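
The `RaftProtocol` verification in the list above can be sketched as a filter over the `nomad operator raft list-peers` table. `servers_below_v3` is a hypothetical helper name, and the sketch assumes the output has a header row with a `RaftProtocol` column and the node name in the first column.

```shell
#!/usr/bin/env bash
# Read `nomad operator raft list-peers` output on stdin and print the node
# name of every server whose RaftProtocol column is not 3. The column is
# located through the header row rather than by a fixed position.
servers_below_v3() {
  awk '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "RaftProtocol") col = i; next }
    col && NF > 0 && $col != "3" { print $1 }
  '
}
```

After restarting each server, `nomad operator raft list-peers | servers_below_v3` lists the servers that still need upgrading. Empty output means every listed peer reports protocol version 3.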

### Upgrading a Single Server Cluster to Raft Version 3

If you are running a single Nomad server, restarting it in-place will
result in that server not being able to elect itself as a leader. To
avoid this, create a new [`peers.json`][peers-json] file before
restarting the server with the new configuration. If you have `jq`
installed you can run the following script on the server's host to
write the correct `peers.json` file:
```
#!/usr/bin/env bash
NOMAD_DATA_DIR=$(nomad agent-info -json | jq -r '.config.DataDir')
NOMAD_ADDR=$(nomad agent-info -json | jq -r '.stats.nomad.leader_addr')
NODE_ID=$(cat "$NOMAD_DATA_DIR/server/node-id")
cat <<EOF > "$NOMAD_DATA_DIR/server/raft/peers.json"
[
{
"id": "$NODE_ID",
"address": "$NOMAD_ADDR",
"non_voter": false
}
]
EOF
```

After running this script, if the upgrade is for a Nomad version lower
than v1.3.0, update the [`raft_protocol`] in the server's
configuration to `3` and restart the server.

[peers-json]: /nomad/tutorials/manage-clusters/outage-recovery#manual-recovery-using-peersjson
[`raft_protocol`]: /nomad/docs/configuration/server#raft_protocol
[`leave_on_interrupt`]: /nomad/docs/configuration#leave_on_interrupt
[`leave_on_terminate`]: /nomad/docs/configuration#leave_on_terminate