docs: add drain guide
commit 0a282e6c88 (parent 0b6fbb8e16)

@@ -0,0 +1,247 @@
---
layout: "guides"
page_title: "Decommissioning Nodes"
sidebar_current: "guides-decommissioning-nodes"
description: |-
  Decommissioning nodes is a normal part of cluster operations for a variety of
  reasons: server maintenance, operating system upgrades, etc. Nomad offers a
  number of parameters for controlling how running jobs are migrated off of
  draining nodes.
---

# Decommissioning Nomad Client Nodes

Decommissioning nodes is a normal part of cluster operations for a variety of
reasons: server maintenance, operating system upgrades, etc. Nomad offers a
number of parameters for controlling how running jobs are migrated off of
draining nodes.

## Configuring How Jobs are Migrated

In Nomad 0.8 a [`migrate`][migrate] stanza was added to jobs to allow control
over how allocations for a job are migrated off of a draining node. For
example, consider a job that runs a web service and has a Consul health check:

```hcl
job "webapp" {
  datacenters = ["dc1"]

  # Stop and migrate at most 2 allocations of this job at a time during node drains.
  migrate {
    max_parallel     = 2
    health_check     = "checks"
    min_healthy_time = "15s"
    healthy_deadline = "5m"
  }

  group "webapp" {
    count = 9

    task "webapp" {
      driver = "docker"
      config {
        image = "hashicorp/http-echo:0.2.3"
        args  = ["-text", "ok"]
        port_map {
          http = 5678
        }
      }

      resources {
        network {
          mbits = 10
          port "http" {}
        }
      }

      service {
        name = "webapp"
        port = "http"
        check {
          name     = "http-ok"
          type     = "http"
          path     = "/"
          interval = "10s"
          timeout  = "2s"
        }
      }
    }
  }
}
```

The above `migrate` stanza ensures that during a node drain only 2 allocations
of this job are stopped and migrated at a time.

When the job is run it may be placed on multiple nodes. In the following
example the 9 `webapp` allocations are spread across 2 nodes:

```text
$ nomad run webapp.nomad
==> Monitoring evaluation "5129bc74"
    Evaluation triggered by job "webapp"
    Allocation "5b4d6db5" created: node "46f1c6c4", group "webapp"
    Allocation "670a715f" created: node "f7476465", group "webapp"
    Allocation "78b6b393" created: node "46f1c6c4", group "webapp"
    Allocation "85743ff5" created: node "f7476465", group "webapp"
    Allocation "edf71a5d" created: node "f7476465", group "webapp"
    Allocation "56f770c0" created: node "46f1c6c4", group "webapp"
    Allocation "9a51a484" created: node "46f1c6c4", group "webapp"
    Allocation "f6f6e64c" created: node "f7476465", group "webapp"
    Allocation "fefe81d0" created: node "f7476465", group "webapp"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "5129bc74" finished with status "complete"
```

If one of those nodes needed to be decommissioned, perhaps because of a
hardware issue, then an operator would issue a node drain to migrate its
allocations off:

```text
$ nomad node drain -enable -yes 46f1
2018-04-11T23:41:56Z: Ctrl-C to stop monitoring: will not cancel the node drain
2018-04-11T23:41:56Z: Node "46f1c6c4-a0e5-21f6-fd5c-d76c3d84e806" drain strategy set
2018-04-11T23:41:57Z: Alloc "5b4d6db5-3fcb-eb7d-0415-23eefcd78b6a" marked for migration
2018-04-11T23:41:57Z: Alloc "56f770c0-f8aa-4565-086d-01faa977f82d" marked for migration
2018-04-11T23:41:57Z: Alloc "56f770c0-f8aa-4565-086d-01faa977f82d" draining
2018-04-11T23:41:57Z: Alloc "5b4d6db5-3fcb-eb7d-0415-23eefcd78b6a" draining
2018-04-11T23:42:03Z: Alloc "56f770c0-f8aa-4565-086d-01faa977f82d" status running -> complete
2018-04-11T23:42:03Z: Alloc "5b4d6db5-3fcb-eb7d-0415-23eefcd78b6a" status running -> complete
2018-04-11T23:42:22Z: Alloc "78b6b393-d29c-d8f8-e8e8-28931c0013ee" marked for migration
2018-04-11T23:42:22Z: Alloc "78b6b393-d29c-d8f8-e8e8-28931c0013ee" draining
2018-04-11T23:42:27Z: Alloc "78b6b393-d29c-d8f8-e8e8-28931c0013ee" status running -> complete
2018-04-11T23:42:29Z: Alloc "9a51a484-8c43-aa4e-d60a-46cfd1450780" marked for migration
2018-04-11T23:42:29Z: Alloc "9a51a484-8c43-aa4e-d60a-46cfd1450780" draining
2018-04-11T23:42:29Z: Node "46f1c6c4-a0e5-21f6-fd5c-d76c3d84e806" drain complete
2018-04-11T23:42:34Z: Alloc "9a51a484-8c43-aa4e-d60a-46cfd1450780" status running -> complete
2018-04-11T23:42:34Z: All allocations on node "46f1c6c4-a0e5-21f6-fd5c-d76c3d84e806" have stopped.
```

There are a couple of important events to notice in the output. First, only 2
allocations are migrated initially:

```
2018-04-11T23:41:57Z: Alloc "5b4d6db5-3fcb-eb7d-0415-23eefcd78b6a" marked for migration
2018-04-11T23:41:57Z: Alloc "56f770c0-f8aa-4565-086d-01faa977f82d" marked for migration
```

This is because `max_parallel = 2` in the job specification. The next
allocation on the draining node waits to be migrated:

```
2018-04-11T23:42:22Z: Alloc "78b6b393-d29c-d8f8-e8e8-28931c0013ee" marked for migration
```

Note that this occurs 25 seconds after the initial migrations. The 25 second
delay is because a replacement allocation took 10 seconds to become healthy
and then `min_healthy_time = "15s"` meant node draining waited an additional
15 seconds. If the replacement allocation had failed within that time, the
node drain would not have continued until a replacement could be successfully
made.

### Scheduling Eligibility

Now that the example drain has finished we can inspect the state of the drained
node:

```text
$ nomad node status
ID        DC   Name     Class   Drain  Eligibility  Status
46f1c6c4  dc1  nomad-5  <none>  false  ineligible   ready
96b52ad8  dc1  nomad-6  <none>  false  eligible     ready
f7476465  dc1  nomad-4  <none>  false  eligible     ready
```

While node `46f1` has `Drain = false`, notice that its `Eligibility =
ineligible`. Node scheduling eligibility is a new field in Nomad 0.8. When a
node is ineligible for scheduling the scheduler will not consider it for new
placements.

While draining, a node will always be ineligible for scheduling. Once draining
completes it will remain ineligible to prevent refilling a newly drained node.

However, by default canceling a drain with the `-disable` option will reset a
node to be eligible for scheduling. To cancel a drain while preserving the
node's ineligible status, use the `-keep-ineligible` option.
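
For example, to cancel the drain on node `46f1` while leaving it ineligible
for scheduling, an operator could run something along these lines (reusing the
shortened node ID from the example above):

```text
$ nomad node drain -disable -keep-ineligible -yes 46f1
```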

Scheduling eligibility can be toggled independently of node drains by using the
[`nomad node eligibility`][eligibility] command.
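
For instance, once a drained node has been repaired, it could be made eligible
for new placements again with a command along the lines of:

```text
$ nomad node eligibility -enable 46f1
```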

### Node Drain Deadline

Sometimes a drain is unable to proceed and complete normally. This could be
caused by there not being enough capacity in the cluster to replace the drained
allocations, or by replacement allocations failing to start successfully in a
timely fashion.

Operators may specify a deadline on node drains to prevent drains from getting
stuck. Once the deadline is reached, all remaining allocations on the node are
stopped regardless of `migrate` stanza parameters.

The default deadline is 1 hour and may be changed with the
[`-deadline`][deadline] command line option. The [`-force`][force] option is
like an instant deadline: all allocations are immediately stopped. The
[`-no-deadline`][no-deadline] option disables the deadline so a drain may
continue indefinitely.
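
As a rough sketch of how these options are passed (assuming a duration string
such as `30m` for `-deadline`, and reusing the node ID prefix from the earlier
example), a drain could be started with a custom deadline, an immediate
deadline, or no deadline at all:

```text
$ nomad node drain -enable -deadline 30m -yes 46f1
$ nomad node drain -enable -force -yes 46f1
$ nomad node drain -enable -no-deadline -yes 46f1
```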

Like all other drain parameters, a drain's deadline can be updated by making
subsequent `nomad node drain ...` calls with updated values.
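
For example, if a drain started with a 30 minute deadline turned out to need
more time, a follow-up call along these lines could extend it:

```text
$ nomad node drain -enable -deadline 2h -yes 46f1
```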

## Node Drains and Non-Service Jobs

So far we have only seen how draining works with service jobs. Both batch and
system jobs have different behaviors during node drains.

### Draining Batch Jobs

Node drains only migrate batch jobs once the drain's deadline has been reached.
For node drains without a deadline the drain will not complete until all batch
jobs on the node have completed (or failed).

The goal of this behavior is to avoid losing the progress a batch job has made
by forcing it to exit early.

### Keeping System Jobs Running

Node drains only stop system jobs once all other allocations have exited. This
way if a node is running a log shipping daemon or metrics collector as a system
job, it will continue to run as long as there are other services running.

The [`-ignore-system`][ignore-system] option leaves system jobs running even
after all other allocations have exited. This is useful when system jobs are
used to monitor Nomad itself or other system properties.
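
A drain that intentionally leaves system jobs in place might therefore look
something like this (again using the shortened node ID from the example above):

```text
$ nomad node drain -enable -ignore-system -yes 46f1
```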

## Draining Multiple Nodes

A common operation is to decommission an entire class of nodes at once. Prior
to Nomad 0.8 this was a problematic operation as the first node to begin
draining may migrate all of its allocations to the next node about to be
drained. In pathological cases this could repeat on each node to be drained and
cause allocations to be rescheduled repeatedly.

As of Nomad 0.8 an operator can avoid this churn by marking nodes ineligible
for scheduling before draining them using the [`nomad node
eligibility`][eligibility] command:

```text
$ nomad node eligibility -disable 46f1
Node "46f1c6c4-a0e5-21f6-fd5c-d76c3d84e806" scheduling eligibility set: ineligible for scheduling

$ nomad node eligibility -disable 96b5
Node "96b52ad8-e9ad-1084-c14f-0e11f10772e4" scheduling eligibility set: ineligible for scheduling

$ nomad node status
ID        DC   Name     Class   Drain  Eligibility  Status
46f1c6c4  dc1  nomad-5  <none>  false  ineligible   ready
96b52ad8  dc1  nomad-6  <none>  false  ineligible   ready
f7476465  dc1  nomad-4  <none>  false  eligible     ready
```

Now that both `nomad-5` and `nomad-6` are ineligible for scheduling, they can
be drained without risking placing allocations on an _about-to-be-drained_
node.
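
A sketch of the remaining steps, reusing the shortened node IDs from above,
could then simply drain each node in turn:

```text
$ nomad node drain -enable -yes 46f1
$ nomad node drain -enable -yes 96b5
```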

[deadline]: /docs/commands/node/drain.html#deadline
[eligibility]: /docs/commands/node/eligibility.html
[force]: /docs/commands/node/drain.html#force
[ignore-system]: /docs/commands/node/drain.html#ignore-system
[migrate]: /docs/job-specification/migrate.html
[no-deadline]: /docs/commands/node/drain.html#no-deadline

@@ -57,6 +57,10 @@
        </ul>
      </li>

      <li<%= sidebar_current("guides-decommissioning-nodes") %>>
        <a href="/guides/node-draining.html">Decommissioning Nodes</a>
      </li>

      <li<%= sidebar_current("guides-namespaces") %>>
        <a href="/guides/namespaces.html">Namespaces</a>
      </li>