open-nomad/website/content/docs/job-specification
Tim Gross b0c3b99b03
scheduler: fix quadratic performance with spread blocks (#11712)
When the scheduler picks a node for each evaluation, the
`LimitIterator` provides at most 2 eligible nodes for the
`MaxScoreIterator` to choose from. This keeps scheduling fast while
producing acceptable results because the results are binpacked.

Jobs with a `spread` block (or node affinity) remove this limit in
order to produce correct spread scoring. This means that every
allocation within a job with a `spread` block is evaluated against
_all_ eligible nodes. Operators of large clusters have reported that
jobs with `spread` blocks that are eligible on a large number of nodes
can take longer to evaluate than the nack timeout (60s), whereas typical
evaluations are processed in milliseconds.

In practice, it's not necessary to evaluate every eligible node for
every allocation on large clusters, because the `RandomIterator` at
the base of the scheduler stack produces enough variation in each pass
that the likelihood of an uneven spread is negligible. Note that
feasibility is checked before the limit, so this only impacts the
number of _eligible_ nodes available for scoring, not the total number
of nodes.

This changeset sets the iterator limit for "large" `spread` block and
node affinity jobs to be equal to the number of desired
allocations. This brings an example problematic job evaluation down
from ~3min to ~10s. The included tests ensure that we have acceptable
spread results across a variety of large cluster topologies.
2021-12-21 10:10:01 -05:00
hcl2 docs: fix jobspec hcl2 locals example. 2021-05-21 15:20:46 +02:00
affinity.mdx feat(website): migrates to new nav data format (#10264) 2021-03-31 08:43:17 -05:00
artifact.mdx git example - suggest providing real repo 2021-05-03 08:12:10 -04:00
check_restart.mdx feat(website): migrates to new nav data format (#10264) 2021-03-31 08:43:17 -05:00
connect.mdx feat(website): migrates to new nav data format (#10264) 2021-03-31 08:43:17 -05:00
constraint.mdx feat(website): migrates to new nav data format (#10264) 2021-03-31 08:43:17 -05:00
csi_plugin.mdx docs: update csi_plugin example (#10821) 2021-06-28 08:28:03 -04:00
device.mdx feat(website): migrates to new nav data format (#10264) 2021-03-31 08:43:17 -05:00
dispatch_payload.mdx feat(website): migrates to new nav data format (#10264) 2021-03-31 08:43:17 -05:00
env.mdx docs: note env and meta map assignment syntax (#11095) 2021-08-29 14:35:09 -04:00
ephemeral_disk.mdx feat(website): migrates to new nav data format (#10264) 2021-03-31 08:43:17 -05:00
expose.mdx feat(website): migrates to new nav data format (#10264) 2021-03-31 08:43:17 -05:00
gateway.mdx Mesh Gateway doc enhancements (#11354) 2021-12-20 17:10:44 -05:00
group.mdx docs: clarify that a default update strategy is used when update strategy is omitted 2021-05-10 08:27:22 -04:00
index.mdx feat(website): migrates to new nav data format (#10264) 2021-03-31 08:43:17 -05:00
job.mdx docs: add Nomad version requirement note for sysbatch (#11231) 2021-09-29 15:14:51 -04:00
lifecycle.mdx feat(website): migrates to new nav data format (#10264) 2021-03-31 08:43:17 -05:00
logs.mdx feat(website): migrates to new nav data format (#10264) 2021-03-31 08:43:17 -05:00
meta.mdx docs: add some extra documentation around client host environment variables (#11208) 2021-09-21 17:23:30 -04:00
migrate.mdx feat(website): migrates to new nav data format (#10264) 2021-03-31 08:43:17 -05:00
multiregion.mdx feat(website): migrates to new nav data format (#10264) 2021-03-31 08:43:17 -05:00
network.mdx docs: document that network mode is only supported on Linux (#11192) 2021-10-01 23:17:20 -04:00
parameterized.mdx mention sysbatch in addition to batch (#11587) 2021-12-06 19:12:03 -05:00
periodic.mdx mention sysbatch in addition to batch (#11587) 2021-12-06 19:12:03 -05:00
proxy.mdx feat(website): migrates to new nav data format (#10264) 2021-03-31 08:43:17 -05:00
reschedule.mdx core: implement system batch scheduler 2021-08-03 10:30:47 -04:00
resources.mdx add a section about memory oversubscription (#10573) 2021-05-13 13:35:51 -04:00
restart.mdx feat(website): migrates to new nav data format (#10264) 2021-03-31 08:43:17 -05:00
scaling.mdx feat(website): extract /plugins /tools docs (#11584) 2021-12-09 14:25:18 -05:00
service.mdx docs: clarify default check.initial_status behavior 2021-06-03 10:02:25 -04:00
sidecar_service.mdx documentation for disable_default_tcp_check 2021-05-07 13:16:39 -04:00
sidecar_task.mdx feat(website): migrates to new nav data format (#10264) 2021-03-31 08:43:17 -05:00
spread.mdx scheduler: fix quadratic performance with spread blocks (#11712) 2021-12-21 10:10:01 -05:00
task.mdx feat(website): migrates to new nav data format (#10264) 2021-03-31 08:43:17 -05:00
template.mdx docs: add more references and examples to the template block (#11691) 2021-12-16 14:14:01 -05:00
update.mdx docs: clarify that a default update strategy is used when update strategy is omitted 2021-05-10 08:27:22 -04:00
upstreams.mdx consul/connect: fix upstream mesh gateway default mode setting 2021-06-04 08:53:12 -05:00
vault.mdx feat(website): migrates to new nav data format (#10264) 2021-03-31 08:43:17 -05:00
volume.mdx docs: mount_flags takes a slice of strings (#11583) 2021-11-29 10:07:34 -05:00
volume_mount.mdx feat(website): migrates to new nav data format (#10264) 2021-03-31 08:43:17 -05:00