From 0f81725359622f3ac954b8c418794d60710a01e3 Mon Sep 17 00:00:00 2001 From: Michael Schurter Date: Thu, 29 Mar 2018 17:52:35 -0700 Subject: [PATCH] docs: node drain, eligibility; jobspec migrate --- .../docs/commands/job/deployments.html.md.erb | 4 +- website/source/docs/commands/node.html.md.erb | 6 +- .../docs/commands/node/drain.html.md.erb | 65 +++++++++++++- .../commands/node/eligibility.html.md.erb | 71 ++++++++++++++++ .../docs/job-specification/migrate.html.md | 84 +++++++++++++++++++ website/source/layouts/docs.erb | 6 ++ 6 files changed, 230 insertions(+), 6 deletions(-) create mode 100644 website/source/docs/commands/node/eligibility.html.md.erb create mode 100644 website/source/docs/job-specification/migrate.html.md diff --git a/website/source/docs/commands/job/deployments.html.md.erb b/website/source/docs/commands/job/deployments.html.md.erb index b234f48a4..28451002a 100644 --- a/website/source/docs/commands/job/deployments.html.md.erb +++ b/website/source/docs/commands/job/deployments.html.md.erb @@ -8,8 +8,8 @@ description: > # Command: job deployments -The `job dispatch` command is used to display the deployments for a particular -job. +The `job deployments` command is used to display the deployments for a +particular job. ## Usage diff --git a/website/source/docs/commands/node.html.md.erb b/website/source/docs/commands/node.html.md.erb index 58c7be7b9..878ff0057 100644 --- a/website/source/docs/commands/node.html.md.erb +++ b/website/source/docs/commands/node.html.md.erb @@ -18,9 +18,11 @@ Run `nomad node -h` for help on that subcommand. 
The following subcommands are available: * [`node config`][config] - View or modify client configuration details -* [`node drain`][drain] - Toggle drain mode on a given node +* [`node drain`][drain] - Set drain mode on a given node +* [`node eligibility`][eligibility] - Toggle scheduling eligibility on a given node * [`node status`][status] - Display status information about nodes [config]: /docs/commands/node/config.html "View or modify client configuration details" -[drain]: /docs/commands/node/drain.html "Toggle drain mode on a given node" +[drain]: /docs/commands/node/drain.html "Set drain mode on a given node" +[eligibility]: /docs/commands/node/eligibility.html "Toggle scheduling eligibility on a given node" [status]: /docs/commands/node/status.html "Display status information about nodes" diff --git a/website/source/docs/commands/node/drain.html.md.erb b/website/source/docs/commands/node/drain.html.md.erb index ca8944daf..8822c9620 100644 --- a/website/source/docs/commands/node/drain.html.md.erb +++ b/website/source/docs/commands/node/drain.html.md.erb @@ -10,7 +10,20 @@ description: > The `node drain` command is used to toggle drain mode on a given node. Drain mode prevents any new tasks from being allocated to the node, and begins -migrating all existing allocations away. +migrating all existing allocations away. Allocations will be migrated according +to their [`migrate`][migrate] stanza until the drain's deadline is reached. + +By default the `node drain` command blocks until a node is done draining and +all allocations have terminated. Canceling the `node drain` command *will not* +cancel the drain. Drains may be canceled by using the `-disable` parameter +below. + +When draining more than one node at a time, it is recommended you first disable +[scheduling eligibility][eligibility] on all nodes you are about to drain. 
For +example, if you are decommissioning an entire class of nodes, first run `node +eligibility -disable` on all of their node IDs, and then run `node drain +-enable`. This will ensure allocations drained from the first node are not +placed on another node about to be drained. The [node status](/docs/commands/node/status.html) command compliments this nicely by providing the current drain status of a given node. @@ -37,6 +50,19 @@ operation is desired. * `-enable`: Enable node drain mode. * `-disable`: Disable node drain mode. +* `-deadline`: Set the deadline by which all allocations must be moved off the + node. Remaining allocations after the deadline are force removed from the + node. Defaults to 1 hour. +* `-detach`: Return immediately instead of entering monitor mode. +* `-force`: Force remove allocations off the node immediately. +* `-no-deadline`: No deadline allows the allocations to drain off the node + without being force stopped after a certain deadline. +* `-ignore-system`: Ignore system allows the drain to complete without stopping + system job allocations. By default system jobs are stopped last. +* `-keep-ineligible`: Keep ineligible will maintain the node's scheduling + ineligibility even if the drain is being disabled. This is useful when an + existing drain is being canceled but additional scheduling on the node is not + desired. * `-self`: Drain the local node. * `-yes`: Automatic yes to prompts. @@ -45,11 +71,46 @@ operation is desired. Enable drain mode on node with ID prefix "4d2ba53b": ``` -$ nomad node drain -enable 4d2ba53b +$ nomad node drain -enable f4e8a9e5 +Are you sure you want to enable drain mode for node "f4e8a9e5-30d8-3536-1e6f-cda5c869c35e"? 
[y/N] y +2018-03-30T23:13:16Z: Ctrl-C to stop monitoring: will not cancel the node drain +2018-03-30T23:13:16Z: Node "f4e8a9e5-30d8-3536-1e6f-cda5c869c35e" drain strategy set +2018-03-30T23:13:17Z: Alloc "1877230b-64d3-a7dd-9c31-dc5ad3c93e9a" marked for migration +2018-03-30T23:13:17Z: Alloc "1877230b-64d3-a7dd-9c31-dc5ad3c93e9a" draining +2018-03-30T23:13:17Z: Alloc "1877230b-64d3-a7dd-9c31-dc5ad3c93e9a" status running -> complete +2018-03-30T23:13:29Z: Alloc "3fce5308-818c-369e-0bb7-f61f0a1be9ed" marked for migration +2018-03-30T23:13:29Z: Alloc "3fce5308-818c-369e-0bb7-f61f0a1be9ed" draining +2018-03-30T23:13:30Z: Alloc "3fce5308-818c-369e-0bb7-f61f0a1be9ed" status running -> complete +2018-03-30T23:13:41Z: Alloc "9a98c5aa-a719-2f34-ecfc-0e6268b5d537" marked for migration +2018-03-30T23:13:41Z: Alloc "9a98c5aa-a719-2f34-ecfc-0e6268b5d537" draining +2018-03-30T23:13:41Z: Node "f4e8a9e5-30d8-3536-1e6f-cda5c869c35e" drain complete +2018-03-30T23:13:42Z: Alloc "9a98c5aa-a719-2f34-ecfc-0e6268b5d537" status running -> complete +2018-03-30T23:13:42Z: All allocations on node "f4e8a9e5-30d8-3536-1e6f-cda5c869c35e" have stopped. ``` Enable drain mode on the local node: ``` $ nomad node drain -enable -self +... ``` + +Enable drain mode but do not stop system jobs: + +``` +$ nomad node drain -enable -ignore-system 4d2ba53b +... +``` + +Disable drain mode but keep the node ineligible for scheduling. Useful for +inspecting the current state of a misbehaving node without Nomad trying to +start or migrate allocations: + +``` +$ nomad node drain -disable -keep-ineligible 4d2ba53b +... 
+``` + + +[eligibility]: /docs/commands/node/eligibility.html +[migrate]: /docs/job-specification/migrate.html diff --git a/website/source/docs/commands/node/eligibility.html.md.erb b/website/source/docs/commands/node/eligibility.html.md.erb new file mode 100644 index 000000000..00e81ed3c --- /dev/null +++ b/website/source/docs/commands/node/eligibility.html.md.erb @@ -0,0 +1,71 @@ +--- +layout: "docs" +page_title: "Commands: node eligibility" +sidebar_current: "docs-commands-node-eligibility" +description: > + The node eligibility command is used to configure a node's scheduling + eligibility. +--- + +# Command: node eligibility + +The `node eligibility` command is used to toggle scheduling eligibility for a +given node. By default nodes are eligible for scheduling, meaning they can +receive placements and run new allocations. Nodes that have their scheduling +eligibility disabled are ineligible for new placements. + +The [`node drain`][drain] command automatically disables eligibility. Disabling +a drain restores eligibility by default. + +Disabling scheduling eligibility is useful when draining a set of nodes: first +disable eligibility on each node that will be drained. Then drain each node. +If you just drain each node, allocations may get rescheduled multiple times as +they get placed on nodes about to be drained! + +Disabling scheduling eligibility may also be useful when investigating poorly +behaved nodes. It allows operators to investigate the current state of a node +without the risk of additional work being assigned to it. + +## Usage + +``` +nomad node eligibility [options] +``` + +A `-self` flag can be used to toggle eligibility of the local node. If this is +not supplied, a node ID or prefix must be provided. If there is an exact match, +the eligibility will be adjusted for that node. Otherwise, a list of matching +nodes and information will be displayed. 
+ +It is also required to pass one of `-enable` or `-disable`, depending on which +operation is desired. + +## General Options + +<%= partial "docs/commands/_general_options" %> + +## Eligibility Options + +* `-enable`: Enable scheduling eligibility. +* `-disable`: Disable scheduling eligibility. +* `-self`: Set eligibility for the local node. +* `-yes`: Automatic yes to prompts. + +## Examples + +Enable scheduling eligibility on node with ID prefix "574545c5": + +``` +$ nomad node eligibility -enable 574545c5 +Node "574545c5-c2d7-e352-d505-5e2cb9fe169f" scheduling eligibility set: eligible for scheduling +``` + +Disable scheduling eligibility on the local node: + +``` +$ nomad node eligibility -disable -self +Node "574545c5-c2d7-e352-d505-5e2cb9fe169f" scheduling eligibility set: ineligible for scheduling +``` + + +[drain]: /docs/commands/node/drain.html diff --git a/website/source/docs/job-specification/migrate.html.md b/website/source/docs/job-specification/migrate.html.md new file mode 100644 index 000000000..37974f1c7 --- /dev/null +++ b/website/source/docs/job-specification/migrate.html.md @@ -0,0 +1,84 @@ +--- +layout: "docs" +page_title: "migrate Stanza - Job Specification" +sidebar_current: "docs-job-specification-migrate" +description: |- + The "migrate" stanza specifies the group's migrate strategy. The migrate + strategy is used to control the job's behavior when it is being migrated off + of a draining node. +--- + +# `migrate` Stanza + + + + + + + +
Placement + job -> **migrate** + + job -> group -> **migrate** +
+ +The `migrate` stanza specifies the group's strategy for migrating off of +[draining][drain] nodes. If omitted, a default migration strategy is applied. +If specified at the job level, the configuration will apply to all groups +within the job. Only service jobs with a count greater than 1 support migrate +stanzas. + +```hcl +job "docs" { + migrate { + max_parallel = 1 + health_check = "checks" + min_healthy_time = "10s" + healthy_deadline = "5m" + } +} +``` + +When one or more nodes are draining, only `max_parallel` allocations will be +stopped at a time. Node draining will not continue until replacement +allocations have been healthy for their `min_healthy_time` or +`healthy_deadline` is reached. + +Note that a node's drain [deadline][deadline] will override the `migrate` +stanza for allocations on that node. The `migrate` stanza is for job authors to +define how their services should be migrated, while the node drain deadline is +for system operators to put hard limits on how long a drain may take. + +## `migrate` Parameters + +- `max_parallel` `(int: 1)` - Specifies the number of allocations that can be + migrated at the same time. This number must be less than the total + [`count`][count] for the group as `count - max_parallel` will be left running + during migrations. + +- `health_check` `(string: "checks")` - Specifies the mechanism by which + allocation health is determined. The potential values are: + + - "checks" - Specifies that the allocation should be considered healthy when + all of its tasks are running and their associated [checks][checks] are + healthy, and unhealthy if any of the tasks fail or not all checks become + healthy. This is a superset of "task_states" mode. + + - "task_states" - Specifies that the allocation should be considered healthy when + all its tasks are running and unhealthy if tasks fail. 
+ +- `min_healthy_time` `(string: "10s")` - Specifies the minimum time the + allocation must be in the healthy state before it is marked as healthy and + unblocks further allocations from being migrated. This is specified using a + label suffix like "30s" or "15m". + +- `healthy_deadline` `(string: "5m")` - Specifies the deadline in which the + allocation must be marked as healthy after which the allocation is + automatically transitioned to unhealthy. This is specified using a label + suffix like "2m" or "1h". + + +[checks]: /docs/job-specification/service.html#check-parameters +[count]: /docs/job-specification/group.html#count +[drain]: /docs/commands/node/drain.html +[deadline]: /docs/commands/node/drain.html#deadline diff --git a/website/source/layouts/docs.erb b/website/source/layouts/docs.erb index 87b0cd50b..4fd16fcb7 100644 --- a/website/source/layouts/docs.erb +++ b/website/source/layouts/docs.erb @@ -53,6 +53,9 @@ > meta + > + migrate + > network @@ -324,6 +327,9 @@ > drain + > + eligibility + > status