---
layout: docs
page_title: spread Stanza - Job Specification
description: >-
  The "spread" stanza is used to spread placements across certain node
  attributes such as datacenter. Spread may be specified at the job or group
  levels for ultimate flexibility. More than one spread stanza may be
  specified, with relative weights between each.
---

# `spread` Stanza

<Placement
  groups={[
    ['job', 'spread'],
    ['job', 'group', 'spread'],
  ]}
/>

The `spread` stanza allows operators to increase the failure tolerance of their
applications by specifying a node attribute that allocations should be spread
over. This allows operators to spread allocations over attributes such as
datacenter, availability zone, or even rack in a physical datacenter. By
default, when using spread, the scheduler will attempt to place allocations
equally among the available values of the given target.

```hcl
job "docs" {
  # Spread allocations over all datacenters
  spread {
    attribute = "${node.datacenter}"
  }

  group "example" {
    # Spread allocations over each rack based on desired percentage
    spread {
      attribute = "${meta.rack}"

      target "r1" {
        percent = 60
      }

      target "r2" {
        percent = 40
      }
    }
  }
}
```

Nodes are scored according to how closely they match the desired target
percentage defined in the spread stanza. Spread scores are combined with other
scoring factors such as bin packing. A job or task group can have more than one
spread criterion, with weights to express relative preference.

Spread criteria are treated as a soft preference by the Nomad scheduler. If no
nodes match a given spread criterion, placement is still successful. To avoid
scoring every node for every placement, allocations may not be perfectly
spread. Spread works best on attributes with a similar number of nodes:
identically configured racks or similarly configured datacenters.

Spread may be expressed on [attributes][interpolation] or [client
metadata][client-meta]. Additionally, spread may be specified at the [job][job]
and [group][group] levels for ultimate flexibility. Job-level spread criteria
are inherited by all task groups in the job.

## `spread` Parameters

- `attribute` `(string: "")` - Specifies the name or reference of the attribute
  to use. This can be any of the [Nomad interpolated
  values](/docs/runtime/interpolation#interpreted_node_vars).

- `target` <code>([target](#target-parameters): &lt;required&gt;)</code> -
  Specifies one or more target percentages for each value of the `attribute` in
  the spread stanza. If this is omitted, Nomad will spread allocations evenly
  across all values of the attribute.

- `weight` `(integer: 0)` - Specifies a weight for the spread stanza. The
  weight is used during scoring and must be an integer between 0 and 100.
  Weights can be used when there is more than one spread or affinity stanza to
  express relative preference across them.
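
For example, a job that prioritizes datacenter fault tolerance over rack
placement might weight the two criteria unevenly. A minimal sketch, assuming
clients expose a `rack` key in their [client metadata][client-meta]:

```hcl
# The datacenter spread counts more toward the final placement score
# (weight 70) than the rack spread (weight 30).
spread {
  attribute = "${node.datacenter}"
  weight    = 70
}

spread {
  attribute = "${meta.rack}"
  weight    = 30
}
```
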
## `target` Parameters

- `value` `(string: "")` - Specifies a target value of the attribute from a
  `spread` stanza.

- `percent` `(integer: 0)` - Specifies the percentage associated with the
  target value.

## `spread` Examples

The following examples show different ways to use the `spread` stanza.

### Even Spread Across Data Center

This example shows a spread stanza across the node's `datacenter` attribute. If
we have two datacenters `us-east1` and `us-west1`, and a task group of
`count = 10`, Nomad will attempt to place 5 allocations in each datacenter.

```hcl
spread {
  attribute = "${node.datacenter}"
  weight    = 100
}
```

### Spread With Target Percentages

This example shows a spread stanza that specifies one target percentage. If we
have three datacenters `us-east1`, `us-east2`, and `us-west1`, and a task group
of `count = 10`, Nomad will attempt to place 5 of the allocations in `us-east1`
and will spread the remaining 5 allocations between the other two datacenters.

```hcl
spread {
  attribute = "${node.datacenter}"
  weight    = 100

  target "us-east1" {
    percent = 50
  }
}
```

This example shows a spread stanza that specifies target percentages for two
different datacenters. If we have two datacenters `us-east1` and `us-west1`,
and a task group of `count = 10`, Nomad will attempt to place 6 allocations in
`us-east1` and 4 in `us-west1`.

```hcl
spread {
  attribute = "${node.datacenter}"
  weight    = 100

  target "us-east1" {
    percent = 60
  }

  target "us-west1" {
    percent = 40
  }
}
```

### Spread Across Multiple Attributes

This example shows spread stanzas with multiple attributes. Consider a Nomad
cluster where there are two datacenters `us-east1` and `us-west1`, and each
datacenter has nodes with `${meta.rack}` being `r1` or `r2`. With the following
spread stanzas used on a job with `count = 12`, Nomad will attempt to place 6
allocations in each datacenter. Within a datacenter, Nomad will attempt to
place 3 allocations on nodes in rack `r1` and 3 allocations on nodes in rack
`r2`.

```hcl
spread {
  attribute = "${node.datacenter}"
  weight    = 50
}

spread {
  attribute = "${meta.rack}"
  weight    = 50
}
```
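
Because job-level spread criteria are inherited by every task group, a single
spread stanza can cover multiple groups. A minimal sketch (the `api` and
`worker` group names and counts are hypothetical):

```hcl
job "docs" {
  # This job-level spread is inherited by both task groups below, so
  # each group's allocations are spread across datacenters.
  spread {
    attribute = "${node.datacenter}"
    weight    = 100
  }

  group "api" {
    count = 4
    # ...
  }

  group "worker" {
    count = 8
    # ...
  }
}
```
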
[job]: /docs/job-specification/job 'Nomad job Job Specification'
[group]: /docs/job-specification/group 'Nomad group Job Specification'
[client-meta]: /docs/configuration/client#meta 'Nomad meta Job Specification'
[task]: /docs/job-specification/task 'Nomad task Job Specification'
[interpolation]: /docs/runtime/interpolation 'Nomad interpolation'
[node-variables]: /docs/runtime/interpolation#node-variables- 'Nomad interpolation-Node variables'
[constraint]: /docs/job-specification/constraint 'Nomad Constraint job Specification'