docs: add drain guide

Michael Schurter 2018-04-12 09:59:50 -07:00
parent 0b6fbb8e16
commit 0a282e6c88
2 changed files with 251 additions and 0 deletions


@@ -0,0 +1,247 @@
---
layout: "guides"
page_title: "Decommissioning Nodes"
sidebar_current: "guides-decommissioning-nodes"
description: |-
  Decommissioning nodes is a normal part of cluster operations for a variety of
  reasons: server maintenance, operating system upgrades, etc. Nomad offers a
  number of parameters for controlling how running jobs are migrated off of
  draining nodes.
---
# Decommissioning Nomad Client Nodes
Decommissioning nodes is a normal part of cluster operations for a variety of
reasons: server maintenance, operating system upgrades, etc. Nomad offers a
number of parameters for controlling how running jobs are migrated off of
draining nodes.
## Configuring How Jobs are Migrated
In Nomad 0.8 a [`migrate`][migrate] stanza was added to jobs to allow control
over how allocations for a job are migrated off of a draining node. For
example, consider a job that runs a web service and has a Consul health check:
```hcl
job "webapp" {
datacenters = ["dc1"]
migrate {
max_parallel = 2
health_check = "checks"
min_healthy_time = "15s"
healthy_deadline = "5m"
}
group "webapp" {
count = 9
task "webapp" {
driver = "docker"
config {
image = "hashicorp/http-echo:0.2.3"
args = ["-text", "ok"]
port_map {
http = 5678
}
}
resources {
network {
mbits = 10
port "http" {}
}
}
service {
name = "webapp"
port = "http"
check {
name = "http-ok"
type = "http"
path = "/"
interval = "10s"
timeout = "2s"
}
}
}
}
}
```
The above `migrate` stanza ensures that during node drains only 2 allocations
for this job are stopped and migrated at a time.

When the job is run it may be placed on multiple nodes. In the following
example the 9 `webapp` allocations are spread across 2 nodes:
```text
$ nomad run webapp.nomad
==> Monitoring evaluation "5129bc74"
Evaluation triggered by job "webapp"
Allocation "5b4d6db5" created: node "46f1c6c4", group "webapp"
Allocation "670a715f" created: node "f7476465", group "webapp"
Allocation "78b6b393" created: node "46f1c6c4", group "webapp"
Allocation "85743ff5" created: node "f7476465", group "webapp"
Allocation "edf71a5d" created: node "f7476465", group "webapp"
Allocation "56f770c0" created: node "46f1c6c4", group "webapp"
Allocation "9a51a484" created: node "46f1c6c4", group "webapp"
Allocation "f6f6e64c" created: node "f7476465", group "webapp"
Allocation "fefe81d0" created: node "f7476465", group "webapp"
Evaluation status changed: "pending" -> "complete"
==> Evaluation "5129bc74" finished with status "complete"
```
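
The spread of allocations can also be verified after the fact. As a minimal
sketch, assuming the job and cluster from this example (output omitted since it
varies by cluster), placement could be inspected with:

```text
# Show the job's allocations and the nodes they were placed on
$ nomad job status webapp

# Show client nodes along with a count of their running allocations
$ nomad node status -allocs
```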
If one of those nodes needed to be decommissioned, perhaps because of a
hardware issue, then an operator would issue a node drain to migrate the
allocations off of it:
```text
$ nomad node drain -enable -yes 46f1
2018-04-11T23:41:56Z: Ctrl-C to stop monitoring: will not cancel the node drain
2018-04-11T23:41:56Z: Node "46f1c6c4-a0e5-21f6-fd5c-d76c3d84e806" drain strategy set
2018-04-11T23:41:57Z: Alloc "5b4d6db5-3fcb-eb7d-0415-23eefcd78b6a" marked for migration
2018-04-11T23:41:57Z: Alloc "56f770c0-f8aa-4565-086d-01faa977f82d" marked for migration
2018-04-11T23:41:57Z: Alloc "56f770c0-f8aa-4565-086d-01faa977f82d" draining
2018-04-11T23:41:57Z: Alloc "5b4d6db5-3fcb-eb7d-0415-23eefcd78b6a" draining
2018-04-11T23:42:03Z: Alloc "56f770c0-f8aa-4565-086d-01faa977f82d" status running -> complete
2018-04-11T23:42:03Z: Alloc "5b4d6db5-3fcb-eb7d-0415-23eefcd78b6a" status running -> complete
2018-04-11T23:42:22Z: Alloc "78b6b393-d29c-d8f8-e8e8-28931c0013ee" marked for migration
2018-04-11T23:42:22Z: Alloc "78b6b393-d29c-d8f8-e8e8-28931c0013ee" draining
2018-04-11T23:42:27Z: Alloc "78b6b393-d29c-d8f8-e8e8-28931c0013ee" status running -> complete
2018-04-11T23:42:29Z: Alloc "9a51a484-8c43-aa4e-d60a-46cfd1450780" marked for migration
2018-04-11T23:42:29Z: Alloc "9a51a484-8c43-aa4e-d60a-46cfd1450780" draining
2018-04-11T23:42:29Z: Node "46f1c6c4-a0e5-21f6-fd5c-d76c3d84e806" drain complete
2018-04-11T23:42:34Z: Alloc "9a51a484-8c43-aa4e-d60a-46cfd1450780" status running -> complete
2018-04-11T23:42:34Z: All allocations on node "46f1c6c4-a0e5-21f6-fd5c-d76c3d84e806" have stopped.
```
There are a couple of important events to notice in the output. First, only 2
allocations are migrated initially:
```text
2018-04-11T23:41:57Z: Alloc "5b4d6db5-3fcb-eb7d-0415-23eefcd78b6a" marked for migration
2018-04-11T23:41:57Z: Alloc "56f770c0-f8aa-4565-086d-01faa977f82d" marked for migration
```
This is because `max_parallel = 2` in the job specification. The next
allocation on the draining node waits to be migrated:
```text
2018-04-11T23:42:22Z: Alloc "78b6b393-d29c-d8f8-e8e8-28931c0013ee" marked for migration
```
Note that this occurs 25 seconds after the initial migrations. The 25 second
delay is because a replacement allocation took 10 seconds to become healthy and
then the `min_healthy_time = "15s"` setting meant node draining waited an
additional 15 seconds. If the replacement allocation had failed within that
time, the node drain would not have continued until a healthy replacement could
be made.
### Scheduling Eligibility
Now that the example drain has finished we can inspect the state of the drained
node:
```text
$ nomad node status
ID DC Name Class Drain Eligibility Status
46f1c6c4 dc1 nomad-5 <none> false ineligible ready
96b52ad8 dc1 nomad-6 <none> false eligible ready
f7476465 dc1 nomad-4 <none> false eligible ready
```
While node `46f1` has `Drain = false`, notice that its `Eligibility =
ineligible`. Node scheduling eligibility is a new field in Nomad 0.8. When a
node is ineligible for scheduling the scheduler will not consider it for new
placements.

While draining, a node will always be ineligible for scheduling. Once draining
completes it will remain ineligible to prevent refilling a newly drained node.
However, by default canceling a drain with the `-disable` option will reset a
node to be eligible for scheduling. To cancel a drain and preserve the node's
ineligible status, use the `-keep-ineligible` option.
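
For example, an in-progress drain could be canceled while keeping the node out
of scheduling with a command along these lines (a sketch only; the node ID is
the one used throughout this guide and output is omitted):

```text
$ nomad node drain -disable -keep-ineligible 46f1
```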
Scheduling eligibility can be toggled independently of node drains by using the
[`nomad node eligibility`][eligibility] command.
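
For instance, once maintenance on a drained node is complete, it could be made
eligible for new placements again. A sketch, using the node ID from this guide
(output omitted):

```text
$ nomad node eligibility -enable 46f1
```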
### Node Drain Deadline
Sometimes a drain is unable to proceed and complete normally. This could be
caused by not enough capacity existing in the cluster to replace the drained
allocations or by replacement allocations failing to start successfully in a
timely fashion.

Operators may specify a deadline on a node drain to prevent drains from getting
stuck. Once the deadline is reached, all remaining allocations on the node are
stopped regardless of `migrate` stanza parameters.

The default deadline is 1 hour and may be changed with the
[`-deadline`][deadline] command line option. The [`-force`][force] option is
like an instant deadline: all allocations are immediately stopped. The
[`-no-deadline`][no-deadline] option disables the deadline so a drain may
continue indefinitely.

Like all other drain parameters, a drain's deadline can be updated by making
subsequent `nomad node drain ...` calls with updated values.
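
As a sketch of how this might look in practice, assuming the node ID from this
guide (monitoring output omitted), a drain could be started with a custom
deadline and later updated to have no deadline at all:

```text
# Start a drain that forcibly stops remaining allocations after 30 minutes
$ nomad node drain -enable -deadline 30m -yes 46f1

# Update the same drain so it may continue indefinitely
$ nomad node drain -enable -no-deadline -yes 46f1
```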
## Node Drains and Non-Service Jobs
So far we have only seen how draining works with service jobs. Batch and
system jobs have different behaviors during node drains.
### Draining Batch Jobs
Node drains only migrate batch jobs once the drain's deadline has been reached.
For node drains without a deadline the drain will not complete until all batch
jobs on the node have completed (or failed).

The goal of this behavior is to avoid losing the progress a batch job has made
by forcing it to exit early.
### Keeping System Jobs Running
Node drains only stop system jobs once all other allocations have exited. This
way if a node is running a log shipping daemon or metrics collector as a system
job, it will continue to run as long as there are other services running.

The [`-ignore-system`][ignore-system] option leaves system jobs running even
after all other allocations have exited. This is useful when system jobs are
used to monitor Nomad itself or other system properties.
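
For example, a drain that leaves such system jobs in place could be started
with a command along these lines (a sketch using the node ID from this guide;
output omitted):

```text
$ nomad node drain -enable -ignore-system -yes 46f1
```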
## Draining Multiple Nodes
A common operation is to decommission an entire class of nodes at once. Prior
to Nomad 0.8 this was a problematic operation as the first node to begin
draining might migrate all of its allocations to the next node about to be
drained. In pathological cases this could repeat on each node to be drained and
cause allocations to be rescheduled repeatedly.

As of Nomad 0.8 an operator can avoid this churn by marking nodes ineligible
for scheduling before draining them using the [`nomad node
eligibility`][eligibility] command:
```text
$ nomad node eligibility -disable 46f1
Node "46f1c6c4-a0e5-21f6-fd5c-d76c3d84e806" scheduling eligibility set: ineligible for scheduling
$ nomad node eligibility -disable 96b5
Node "96b52ad8-e9ad-1084-c14f-0e11f10772e4" scheduling eligibility set: ineligible for scheduling
$ nomad node status
ID DC Name Class Drain Eligibility Status
46f1c6c4 dc1 nomad-5 <none> false ineligible ready
96b52ad8 dc1 nomad-6 <none> false ineligible ready
f7476465 dc1 nomad-4 <none> false eligible ready
```
Now that both `nomad-5` and `nomad-6` are ineligible for scheduling, they can
be drained without risking placing allocations on an _about-to-be-drained_
node.
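
With both nodes ineligible, each drain could then be issued in turn. A sketch
of the commands, assuming the node IDs shown above (monitoring output omitted):

```text
$ nomad node drain -enable -yes 46f1
$ nomad node drain -enable -yes 96b5
```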
[deadline]: /docs/commands/node/drain.html#deadline
[eligibility]: /docs/commands/node/eligibility.html
[force]: /docs/commands/node/drain.html#force
[ignore-system]: /docs/commands/node/drain.html#ignore-system
[migrate]: /docs/job-specification/migrate.html
[no-deadline]: /docs/commands/node/drain.html#no-deadline


@@ -57,6 +57,10 @@
</ul>
</li>
<li<%= sidebar_current("guides-decommissioning-nodes") %>>
<a href="/guides/node-draining.html">Decommissioning Nodes</a>
</li>
<li<%= sidebar_current("guides-namespaces") %>>
<a href="/guides/namespaces.html">Namespaces</a>
</li>