---
layout: docs
page_title: reschedule Block - Job Specification
description: >-
  The "reschedule" block specifies the group's rescheduling strategy upon
  allocation failures. Nomad attempts to reschedule failed allocations onto
  another node only after any local
  [restarts](/nomad/docs/job-specification/restart) have been exhausted.
---
# `reschedule` Block

<Placement
  groups={[
    ['job', 'reschedule'],
    ['job', 'group', 'reschedule'],
  ]}
/>

The `reschedule` block specifies the group's rescheduling strategy. If specified
at the job level, the configuration applies to all groups within the job. If the
`reschedule` block is present on both the job and the group, the two are merged,
with the group block taking precedence over the job block.

Nomad will attempt to schedule the allocation on another node if any of its
task statuses becomes `failed`. The scheduler prefers to create a replacement
allocation on a node that was not used by a previous allocation.

```hcl
job "docs" {
  group "example" {
    reschedule {
      attempts       = 15
      interval       = "1h"
      delay          = "30s"
      delay_function = "exponential"
      max_delay      = "120s"
      unlimited      = false
    }
  }
}
```
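
The merge between a job-level and a group-level `reschedule` block described above can be sketched as follows. This is an illustrative fragment, not a complete job: the group names `api` and `worker` are hypothetical, task blocks are omitted, and the comments assume the merge happens field by field.

```hcl
job "docs" {
  # Job-level policy: applies to every group in the job unless a group
  # sets its own values.
  reschedule {
    attempts  = 5
    interval  = "1h"
    delay     = "30s"
    unlimited = false
  }

  # Hypothetical group with no reschedule block: it uses the job-level
  # values as they are.
  group "api" {
  }

  # Hypothetical group that overrides part of the policy. The values set
  # here take precedence; assuming a field-by-field merge, attempts,
  # interval, and unlimited still come from the job-level block.
  group "worker" {
    reschedule {
      delay          = "1m"
      delay_function = "constant"
    }
  }
}
```
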
~> The reschedule block does not apply to `system` or `sysbatch` jobs because
they run on every node.
## `reschedule` Parameters

- `attempts` `(int: <varies>)` - Specifies the number of reschedule attempts
  allowed in the configured interval. Defaults vary by job type; see below
  for more information.

- `interval` `(string: <varies>)` - Specifies the sliding window which begins
  when the first reschedule attempt starts and ensures that only `attempts`
  reschedules happen within it. If more than `attempts` failures happen within
  this interval, Nomad will not reschedule the allocation again.

- `delay` `(string: <varies>)` - Specifies the duration to wait before attempting
  to reschedule a failed task. This is specified using a label suffix like "30s"
  or "1h". The delay cannot be less than 5 seconds.

- `delay_function` `(string: <varies>)` - Specifies the function that is used to
  calculate subsequent reschedule delays. The initial delay is specified by the
  `delay` parameter. `delay_function` has three possible values, described below.

  - `constant` - The delay between reschedule attempts stays constant at the
    `delay` value.
  - `exponential` - The delay between reschedule attempts doubles after each
    attempt.
  - `fibonacci` - The delay between reschedule attempts is calculated by adding
    the two most recent delays applied. For example, if `delay` is set to 5
    seconds, the next five reschedule attempts will be delayed by 5 seconds,
    5 seconds, 10 seconds, 15 seconds, and 25 seconds respectively.

- `max_delay` `(string: <varies>)` - Specifies an upper bound on the delay; the
  delay does not increase beyond this value. This parameter is used when
  `delay_function` is `exponential` or `fibonacci`, and is ignored when
  `constant` delay is used.

- `unlimited` `(boolean: <varies>)` - Enables unlimited reschedule attempts. If
  this is set to true, the `attempts` and `interval` fields are not used.

Information about reschedule attempts is displayed in the CLI and API for
allocations. Rescheduling is enabled by default for service and batch jobs
with the options shown below.
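
To make the three `delay_function` values concrete, the fragment below sets each one on a separate group. It is a sketch rather than a complete job: the group names are hypothetical, task blocks are omitted, and the delay sequences in the comments are worked out from the parameter descriptions above, assuming each replacement allocation fails again promptly.

```hcl
job "docs" {
  # Constant: every reschedule attempt waits the same 30s.
  # max_delay is ignored for this function, so it is not set.
  group "constant" {
    reschedule {
      delay          = "30s"
      delay_function = "constant"
      unlimited      = true
    }
  }

  # Exponential: the delay doubles each time (30s, 60s, 120s, 240s)
  # and is then held at the 5m upper bound.
  group "exponential" {
    reschedule {
      delay          = "30s"
      delay_function = "exponential"
      max_delay      = "5m"
      unlimited      = true
    }
  }

  # Fibonacci: each delay is the sum of the previous two
  # (5s, 5s, 10s, 15s, 25s) and is then held at the 30s upper bound.
  group "fibonacci" {
    reschedule {
      delay          = "5s"
      delay_function = "fibonacci"
      max_delay      = "30s"
      unlimited      = true
    }
  }
}
```
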
### `reschedule` Parameter Defaults
The values for the `reschedule` parameters vary by job type. Below are the
defaults by job type:
- The default batch reschedule policy is:

  ```hcl
  reschedule {
    attempts       = 1
    interval       = "24h"
    unlimited      = false
    delay          = "5s"
    delay_function = "constant"
  }
  ```
- The default service reschedule policy is:

  ```hcl
  reschedule {
    delay          = "30s"
    delay_function = "exponential"
    max_delay      = "1h"
    unlimited      = true
  }
  ```
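
As an illustration of overriding these defaults, a service job that should stop rescheduling after a few attempts might look like the sketch below. The specific values are hypothetical, and the comments assume that any parameter left unset keeps the service default listed above.

```hcl
job "docs" {
  type = "service"

  group "example" {
    reschedule {
      # Turn off the service default of unlimited rescheduling so that
      # attempts and interval are taken into account.
      unlimited = false
      attempts  = 3
      interval  = "30m"

      # delay, delay_function, and max_delay are not set here and are
      # assumed to keep the service defaults.
    }
  }
}
```
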
### Disabling rescheduling
To disable rescheduling, set the `attempts` parameter to zero and `unlimited` to false.

```hcl
job "docs" {
  group "example" {
    reschedule {
      attempts  = 0
      unlimited = false
    }
  }
}
```