---
layout: docs
page_title: resources Block - Job Specification
description: |-
  The "resources" block describes the requirements a task needs to execute.
  Resource requirements include memory, cpu, and more.
---

# `resources` Block

<Placement groups={['job', 'group', 'task', 'resources']} />

The `resources` block describes the requirements a task needs to execute.
Resource requirements include memory, CPU, and more.

```hcl
job "docs" {
  group "example" {
    task "server" {
      resources {
        cpu    = 100
        memory = 256

        device "nvidia/gpu" {
          count = 2
        }
      }
    }
  }
}
```

## `resources` Parameters

- `cpu` `(int: 100)` - Specifies the CPU required to run this task in MHz.

- `cores` <code>(`int`: <optional>)</code> - Specifies the number of CPU cores
  to reserve specifically for the task. This may not be used with `cpu`. The behavior
  of setting `cores` is specific to each task driver (e.g. [docker][docker_cpu], [exec][exec_cpu]).

- `memory` `(int: 300)` - Specifies the memory required in MB.

- `memory_max` <code>(`int`: <optional>)</code> - Optionally, specifies the
  maximum memory, in MB, the task may use if the client has excess memory
  capacity. See [Memory Oversubscription](#memory-oversubscription) for more details.

- `device` <code>([Device][]: <optional>)</code> - Specifies the device
  requirements. This may be repeated to request multiple device types.

## `resources` Examples

The following examples only show the `resources` blocks. Remember that the
`resources` block is only valid in the placements listed above.

### Cores

This example specifies that the task requires 2 reserved cores. With this block, Nomad will find
a client with enough spare capacity to reserve 2 cores exclusively for the task. Unlike the `cpu` field, the
task will not share CPU time with any other tasks managed by Nomad on the client.

```hcl
resources {
  cores = 2
}
```

If `cores` and `cpu` are both defined in the same `resources` block, job validation will fail.

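For instance, a block like the following sketch sets both and would be rejected when the job is validated:

```hcl
# Invalid: `cores` and `cpu` cannot be combined in the same resources block.
resources {
  cores = 2
  cpu   = 100
}
```
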
### Memory

This example specifies that the task requires 2 GB of RAM to operate. 2 GB is the
equivalent of 2000 MB:

```hcl
resources {
  memory = 2000
}
```

### Devices

This example shows a device constraint, as specified in the [device][] block,
which requires two NVIDIA GPUs to be made available:

```hcl
resources {
  device "nvidia/gpu" {
    count = 2
  }
}
```

## Memory Oversubscription

Setting task memory limits requires balancing the risk of interrupting tasks
against the risk of wasting resources. If a task memory limit is set too low,
the task may exceed the limit and be interrupted; if the task memory is too
high, the cluster is left underutilized.

To help maximize cluster memory utilization while allowing a safety margin for
unexpected load spikes, Nomad 1.1 lets job authors set two separate memory
limits:

* `memory`: the reserve limit that represents the task's typical memory usage;
  this number is used by the Nomad scheduler to reserve and place the task.

* `memory_max`: the maximum memory the task may use if the client has excess
  available memory; the task may be terminated if it exceeds this limit.

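As a minimal sketch of how the two limits combine, a task that typically needs about 128 MB could reserve that amount while being allowed to burst up to 256 MB on clients with spare memory (the values are only illustrative):

```hcl
resources {
  memory     = 128 # reserve limit used by the scheduler for placement
  memory_max = 256 # upper bound the task may use when the client has excess memory
}
```
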
If a client's memory becomes contended or low, the operating system will
pressure the running tasks to free up memory. If the contention persists, Nomad
may kill oversubscribed tasks and reschedule them to other clients. The exact
mechanism for memory pressure is specific to the task driver, operating system,
and application runtime.

The `memory_max` limit attribute is currently supported by the official `docker`,
`exec`, and `java` task drivers. Consult the documentation of
community-supported task drivers for their memory oversubscription support.

Memory oversubscription is opt-in. Nomad operators can enable [Memory Oversubscription in the scheduler
configuration](/nomad/api-docs/operator/scheduler#update-scheduler-configuration). Enterprise customers can use [Resource
Quotas](/nomad/tutorials/governance-and-policy/quotas) to limit the memory
oversubscription.

To avoid degrading the cluster experience, we recommend examining and monitoring
resource utilization and considering the following suggestions:

* Set `oom_score_adj` for Linux host services that aren't managed by Nomad, e.g.
  Docker, logging services, and the Nomad agent itself. For systemd services, you can use the [`OOMScoreAdj` field](https://github.com/hashicorp/nomad/blob/v1.0.0/dist/systemd/nomad.service#L25).

* Monitor hosts for memory utilization and set alerts on out-of-memory errors.

* Set the [client `reserved`](/nomad/docs/configuration/client#reserved) with enough
  memory for host services that aren't managed by Nomad as well as a buffer
  for the memory excess. For example, if the client reserved memory is 1 GB,
  the allocations on the host may exceed their soft memory limit by almost
  1 GB in aggregate before the memory becomes contended and allocations get
  killed. A minimal sketch of such a client configuration follows this list.

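The following sketch shows a client agent configuration that reserves 1 GB of memory for host services outside of Nomad; the 1024 MB value is only illustrative:

```hcl
client {
  reserved {
    # Memory (in MB) set aside for host services not managed by Nomad,
    # which also serves as a buffer before oversubscribed allocations are killed.
    memory = 1024
  }
}
```
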
[device]: /nomad/docs/job-specification/device 'Nomad device Job Specification'
[docker_cpu]: /nomad/docs/drivers/docker#cpu
[exec_cpu]: /nomad/docs/drivers/exec#cpu