open-nomad/website/content/docs/job-specification/resources.mdx
Seth Hoenig d60c1393a3
docs: remove cores/memory beta label, update driver cpu docs (#16175)
* docs: remove cores/memory beta label, update driver cpu docs

* docs: fixup cr stuff
2023-02-14 14:43:07 -06:00

---
layout: docs
page_title: resources Block - Job Specification
description: |-
  The "resources" block describes the requirements a task needs to execute.
  Resource requirements include memory, CPU, and more.
---
# `resources` Block
<Placement groups={['job', 'group', 'task', 'resources']} />
The `resources` block describes the requirements a task needs to execute.
Resource requirements include memory, CPU, and more.
```hcl
job "docs" {
  group "example" {
    task "server" {
      resources {
        cpu    = 100
        memory = 256

        device "nvidia/gpu" {
          count = 2
        }
      }
    }
  }
}
```
## `resources` Parameters
- `cpu` `(int: 100)` - Specifies the CPU required to run this task in MHz.
- `cores` <code>(`int`: &lt;optional&gt;)</code> - Specifies the number of CPU cores
to reserve specifically for the task. This may not be used with `cpu`. The behavior
of setting `cores` is specific to each task driver (e.g. [docker][docker_cpu], [exec][exec_cpu]).
- `memory` `(int: 300)` - Specifies the memory required in MB.
- `memory_max` <code>(`int`: &lt;optional&gt;)</code> - Specifies the maximum
memory, in MB, that the task may use if the client has excess memory capacity.
See [Memory Oversubscription](#memory-oversubscription) for more details.
- `device` <code>([Device][]: &lt;optional&gt;)</code> - Specifies the device
requirements. This may be repeated to request multiple device types.
## `resources` Examples
The following examples only show the `resources` blocks. Remember that the
`resources` block is only valid in the placements listed above.
### Cores
This example specifies that the task requires 2 reserved cores. With this block, Nomad will find
a client with enough spare capacity to reserve 2 cores exclusively for the task. Unlike with the `cpu` field, the
task will not share CPU time with any other tasks managed by Nomad on the client.
```hcl
resources {
  cores = 2
}
```
If `cores` and `cpu` are both defined in the same `resources` block, validation of the job will fail.
### Memory
This example specifies the task requires 2 GB of RAM to operate. 2 GB is the
equivalent of 2000 MB:
```hcl
resources {
  memory = 2000
}
```
### Devices
This example shows a device requirement, as specified in the [device][] block,
which requires two NVIDIA GPUs to be made available:
```hcl
resources {
  device "nvidia/gpu" {
    count = 2
  }
}
```
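
Because the `device` block may be repeated to request multiple device types, a
task that needs more than one kind of device might look like the following
sketch. The second device name, `example/fpga`, is a hypothetical placeholder
and not a device plugin that is known to exist:

```hcl
resources {
  # Two GPUs, as in the example above.
  device "nvidia/gpu" {
    count = 2
  }

  # Hypothetical second device type, shown only to illustrate repetition.
  device "example/fpga" {
    count = 1
  }
}
```
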
## Memory Oversubscription
Setting task memory limits requires balancing the risk of interrupting tasks
against the risk of wasting resources. If a task memory limit is set too low,
the task may exceed the limit and be interrupted; if the limit is set too
high, the cluster is left underutilized.
To help maximize cluster memory utilization while allowing a safety margin for
unexpected load spikes, Nomad 1.1 lets job authors set two separate memory
limits:
* `memory`: the reserve limit, representing the task's typical memory usage.
  The Nomad scheduler uses this number to reserve memory and place the task.
* `memory_max`: the maximum memory the task may use if the client has excess
  available memory. The task may be terminated if it exceeds this limit.
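
As a minimal sketch of the two limits together, the following block reserves
memory for typical usage and allows headroom for spikes. The values are
illustrative assumptions, not recommendations:

```hcl
resources {
  # Typical usage; the scheduler reserves this amount when placing the task.
  memory = 256

  # Upper bound the task may grow to when the client has excess memory.
  memory_max = 512
}
```
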
If a client's memory becomes contended or low, the operating system will
pressure the running tasks to free up memory. If the contention persists, Nomad
may kill oversubscribed tasks and reschedule them to other clients. The exact
mechanism for memory pressure is specific to the task driver, operating system,
and application runtime.
The `memory_max` attribute is currently supported by the official `docker`,
`exec`, and `java` task drivers. Consult the documentation of
community-supported task drivers for their memory oversubscription support.
Memory oversubscription is opt-in. Nomad operators can enable [Memory Oversubscription in the scheduler
configuration](/nomad/api-docs/operator/scheduler#update-scheduler-configuration). Enterprise customers can use [Resource
Quotas](/nomad/tutorials/governance-and-policy/quotas) to limit memory
oversubscription.
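
For a cluster that has not yet been bootstrapped, one possible way to opt in is
the `default_scheduler_config` block of the server agent configuration, sketched
below. This block only sets the initial scheduler configuration, so on an
existing cluster the scheduler configuration API linked above is the place to
make the change:

```hcl
# Server agent configuration (not a job specification).
server {
  enabled = true

  # Initial scheduler settings, applied when the cluster is bootstrapped.
  default_scheduler_config {
    memory_oversubscription_enabled = true
  }
}
```
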
To avoid degrading the cluster experience, we recommend examining and monitoring
resource utilization and considering the following suggestions:
* Set `oom_score_adj` for Linux host services that aren't managed by Nomad, such as
  Docker, logging services, and the Nomad agent itself. For systemd services, you can use the [`OOMScoreAdj` field](https://github.com/hashicorp/nomad/blob/v1.0.0/dist/systemd/nomad.service#L25).
* Monitor hosts for memory utilization and set alerts on Out-Of-Memory errors.
* Set the [client `reserved`](/nomad/docs/configuration/client#reserved) block with enough
  memory for host services that aren't managed by Nomad, as well as a buffer
  for excess memory usage. For example, if the client reserved memory is 1 GB,
  the allocations on the host may exceed their soft memory limit by almost
  1 GB in aggregate before the memory becomes contended and allocations get
  killed.
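
The last suggestion could be expressed in the client agent configuration roughly
as follows. The 1024 MB figure is an assumption chosen to mirror the 1 GB
example above; size it for the host services and buffer you actually need:

```hcl
# Client agent configuration (not a job specification).
client {
  enabled = true

  reserved {
    # Memory, in MB, withheld from scheduling for host services and as a
    # buffer for tasks that exceed their `memory` reservation.
    memory = 1024
  }
}
```
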
[device]: /nomad/docs/job-specification/device 'Nomad device Job Specification'
[docker_cpu]: /nomad/docs/drivers/docker#cpu
[exec_cpu]: /nomad/docs/drivers/exec#cpu