f13f022176
Also remove old version indicators. --------- Co-authored-by: Tim Gross <tgross@hashicorp.com>
95 lines
4.4 KiB
Plaintext
95 lines
4.4 KiB
Plaintext
---
|
|
layout: docs
|
|
page_title: Schedulers
|
|
description: Learn about Nomad's various schedulers.
|
|
---
|
|
|
|
# Schedulers
|
|
|
|
Nomad has four scheduler types that can be used when creating your job:
|
|
`service`, `batch`, `system` and `sysbatch`. Here we will describe the differences
|
|
between each of these schedulers.
|
|
|
|
## Service
|
|
|
|
The `service` scheduler is designed for scheduling long lived services that
|
|
should never go down. As such, the `service` scheduler ranks a large portion
|
|
of the nodes that meet the job's constraints and selects the optimal node to
|
|
place a task group on. The `service` scheduler uses a best fit scoring algorithm
|
|
influenced by Google's work on [Borg]. Ranking this larger set of candidate
|
|
nodes increases scheduling time but provides greater guarantees about the
|
|
optimality of a job placement, which given the service workload is highly
|
|
desirable.
|
|
|
|
Service jobs are intended to run until explicitly stopped by an operator. If a
|
|
service task exits it is considered a failure and handled according to the job's
|
|
[restart] and [reschedule] blocks.
|
|
|
|
## Batch
|
|
|
|
Batch jobs are much less sensitive to short term performance fluctuations and
|
|
are short lived, finishing in a few minutes to a few days. Although the `batch`
|
|
scheduler is very similar to the `service` scheduler, it makes certain
|
|
optimizations for the batch workload. The main distinction is that after finding
|
|
the set of nodes that meet the job's constraints it uses the power of two
|
|
choices described in Berkeley's [Sparrow] scheduler to limit the number of nodes
|
|
that are ranked.
|
|
|
|
Batch jobs are intended to run until they exit successfully. Batch tasks that
|
|
exit with an error are handled according to the job's [restart] and [reschedule]
|
|
blocks.
|
|
|
|
## System
|
|
|
|
The `system` scheduler is used to register jobs that should be run on all
|
|
clients that meet the job's constraints. The `system` scheduler is also invoked
|
|
when clients join the cluster or transition into the ready state. This means
|
|
that all registered `system` jobs will be re-evaluated and their tasks will be
|
|
placed on the newly available nodes if the constraints are met.
|
|
|
|
This scheduler type is extremely useful for deploying and managing tasks that
|
|
should be present on every node in the cluster. Since these tasks are
|
|
managed by Nomad, they can take advantage of job updating,
|
|
service discovery, and more.
|
|
|
|
The system scheduler will preempt eligible lower priority tasks running on a
|
|
node if there isn't enough capacity to place a system job. See [preemption]
|
|
for details on how tasks that get preempted are chosen.
|
|
|
|
Systems jobs are intended to run until explicitly stopped either by an operator
|
|
or [preemption]. If a system task exits it is considered a failure and handled
|
|
according to the job's [restart] block; system jobs do not have rescheduling.
|
|
|
|
## System Batch
|
|
|
|
The `sysbatch` scheduler is used to register jobs that should be run to completion
|
|
on all clients that meet the job's constraints. The `sysbatch` scheduler will
|
|
schedule jobs similarly to the `system` scheduler, but like a `batch` job once a
|
|
task exits successfully it is not restarted on that client.
|
|
|
|
This scheduler type is useful for issuing "one off" commands to be run on every
|
|
node in the cluster. Sysbatch jobs can also be created as [periodic] and [parameterized]
|
|
jobs. Since these tasks are managed by Nomad, they can take advantage of job
|
|
updating, service discovery, monitoring, and more.
|
|
|
|
The `sysbatch` scheduler will preempt lower priority tasks running on a node if there
|
|
is not enough capacity to place the job. See preemption details on how tasks that
|
|
get preempted are chosen.
|
|
|
|
Sysbatch jobs are intended to run until successful completion, explicitly stopped
|
|
by an operator, or evicted through [preemption]. Sysbatch tasks that exit with an
|
|
error are handled according to the job's [restart] block.
|
|
|
|
Like the `batch` scheduler, task groups in system batch jobs may have a `count`
|
|
greater than 1 to control how many instances are run. Instances that cannot be
|
|
immediately placed will be scheduled when resources become available,
|
|
potentially on a node that has already run another instance of the same job.
|
|
|
|
[borg]: https://research.google.com/pubs/pub43438.html
|
|
[parameterized]: /nomad/docs/job-specification/parameterized
|
|
[periodic]: /nomad/docs/job-specification/periodic
|
|
[preemption]: /nomad/docs/concepts/scheduling/preemption
|
|
[restart]: /nomad/docs/job-specification/restart
|
|
[reschedule]: /nomad/docs/job-specification/reschedule
|
|
[sparrow]: https://cs.stanford.edu/~matei/papers/2013/sosp_sparrow.pdf
|