open-nomad/website/pages/docs/job-specification/restart.mdx
Chris Baker a6e0d17433
update restart documentation (#7603)
* update `restart` documentation

#7288 added support for task-specific `restart` policy. this PR updates the docs to reflect that.

* added an explicit example of task-specific restart policy

* Update website/pages/docs/job-specification/restart.mdx
2020-04-13 16:29:43 -05:00

125 lines
3.2 KiB
Plaintext

---
layout: docs
page_title: restart Stanza - Job Specification
sidebar_title: restart
description: The "restart" stanza configures a group's behavior on task failure.
---
# `restart` Stanza
<Placement
groups={[
['job', 'group', 'restart'],
['job', 'group', 'task', 'restart']
]}
/>
The `restart` stanza configures a tasks's behavior on task failure. Restarts
happen on the client that is running the task.
```hcl
job "docs" {
group "example" {
restart {
attempts = 3
delay = "30s"
}
}
}
```
If specified at the group level, the configuration is inherited by all
tasks in the group. If present on the task, the policy is merged with
the restart policy from the encapsulating task group.
For example, assuming that the task group restart policy is:
```hcl
restart {
interval = "30m"
attempts = 2
delay = "15s"
mode = "fail"
}
```
and the individual task restart policy is:
```hcl
restart {
attempts = 5
}
```
then the effective restart policy for the task will be:
```hcl
restart {
interval = "30m"
attempts = 5
delay = "15s"
mode = "fail"
}
```
## `restart` Parameters
- `attempts` `(int: <varies>)` - Specifies the number of restarts allowed in the
configured interval. Defaults vary by job type, see below for more
information.
- `delay` `(string: "15s")` - Specifies the duration to wait before restarting a
task. This is specified using a label suffix like "30s" or "1h". A random
jitter of up to 25% is added to the delay.
- `interval` `(string: <varies>)` - Specifies the duration which begins when the
first task starts and ensures that only `attempts` number of restarts happens
within it. If more than `attempts` number of failures happen, behavior is
controlled by `mode`. This is specified using a label suffix like "30s" or
"1h". Defaults vary by job type, see below for more information.
- `mode` `(string: "fail")` - Controls the behavior when the task fails more
than `attempts` times in an interval. For a detailed explanation of these
values and their behavior, please see the [mode values section](#mode-values).
### `restart` Parameter Defaults
The values for many of the `restart` parameters vary by job type. Here are the
defaults by job type:
- The default batch restart policy is:
```hcl
restart {
attempts = 3
delay = "15s"
interval = "24h"
mode = "fail"
}
```
- The default service and system job restart policy is:
```hcl
restart {
interval = "30m"
attempts = 2
delay = "15s"
mode = "fail"
}
```
### `mode` Values
This section details the specific values for the "mode" parameter in the Nomad
job specification for constraints. The mode is always specified as a string
```hcl
restart {
mode = "..."
}
```
- `"delay"` - Instructs the scheduler to delay the next restart until the next
`interval` is reached.
- `"fail"` - Instructs the scheduler to not attempt to restart the task on
failure. This is the default behavior. This mode is useful for non-idempotent jobs which are unlikely to
succeed after a few failures. Failed jobs will be restarted according to
the [`reschedule`](/docs/job-specification/reschedule) stanza.