304 lines
7.6 KiB
Plaintext
304 lines
7.6 KiB
Plaintext
---
|
|
layout: docs
|
|
page_title: device Stanza - Job Specification
|
|
description: |-
|
|
The "device" stanza is used to require a certain device be made available
|
|
to the task.
|
|
---
|
|
|
|
# `device` Stanza
|
|
|
|
<Placement groups={['job', 'group', 'task', 'resources', 'device']} />
|
|
|
|
The `device` stanza is used to create both a scheduling and runtime requirement
|
|
that the given task has access to the specified devices. A device is a hardware
|
|
device that is attached to the node and may be made available to the task.
|
|
Examples are GPUs, FPGAs, and TPUs.
|
|
|
|
When a `device` stanza is added, Nomad will schedule the task onto a node that
|
|
contains the set of device(s) that meet the specified requirements. The `device` stanza
|
|
allows the operator to specify as little as just the type of device required,
|
|
such as `gpu`, all the way to specifying arbitrary constraints and affinities.
|
|
Once the scheduler has placed the allocation on a suitable node, the Nomad
|
|
Client will invoke the device plugin to retrieve information on how to mount the
|
|
device and what environment variables to expose. For more information on the
|
|
runtime environment, please consult the individual device plugin's documentation.
|
|
|
|
See the [device plugin's documentation][devices] for a list of supported devices.
|
|
|
|
```hcl
|
|
job "docs" {
|
|
group "example" {
|
|
task "server" {
|
|
resources {
|
|
device "nvidia/gpu" {
|
|
count = 2
|
|
|
|
constraint {
|
|
attribute = "${device.attr.memory}"
|
|
operator = ">="
|
|
value = "2 GiB"
|
|
}
|
|
|
|
affinity {
|
|
attribute = "${device.attr.memory}"
|
|
operator = ">="
|
|
value = "4 GiB"
|
|
weight = 75
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
```
|
|
|
|
In the above example, the task is requesting two GPUs, from the Nvidia vendor,
|
|
but is not specifying the specific model required. Instead it is placing a hard
|
|
constraint that the device has at least 2 GiB of memory and that it would prefer
|
|
to use GPUs that have at least 4 GiB. This examples shows how expressive the
|
|
`device` stanza can be.
|
|
|
|
~> Device support is currently limited to Linux, and container based drivers
|
|
due to the ability to isolate devices to specific tasks.
|
|
|
|
## `device` Parameters
|
|
|
|
- `name` `(string: "")` - Specifies the device required. The following inputs
|
|
are valid:
|
|
|
|
- `<device_type>`: If a single value is given, it is assumed to be the device
|
|
type, such as "gpu", or "fpga".
|
|
|
|
- `<vendor>/<device_type>`: If two values are given separated by a `/`, the
|
|
given device type will be selected, constraining on the provided vendor.
|
|
Examples include "nvidia/gpu" or "amd/gpu".
|
|
|
|
- `<vendor>/<device_type>/<model>`: If three values are given separated by a `/`, the
|
|
given device type will be selected, constraining on the provided vendor, and
|
|
model name. Examples include "nvidia/gpu/1080ti" or "nvidia/gpu/2080ti".
|
|
|
|
- `count` `(int: 1)` - Specifies the number of instances of the given device
|
|
that are required.
|
|
|
|
- `constraint` <code>([Constraint][]: nil)</code> - Constraints to restrict
|
|
which devices are eligible. This can be provided multiple times to define
|
|
additional constraints. See below for available attributes.
|
|
|
|
- `affinity` <code>([Affinity][]: nil)</code> - Affinity to specify a preference
|
|
for which devices get selected. This can be provided multiple times to define
|
|
additional affinities. See below for available attributes.
|
|
|
|
## `device` Constraint and Affinity Attributes
|
|
|
|
The set of attributes available for use in a `constraint` or `affinity` are as
|
|
follows:
|
|
|
|
<table>
|
|
<thead>
|
|
<tr>
|
|
<th>Variable</th>
|
|
<th>Description</th>
|
|
<th>Example Value</th>
|
|
</tr>
|
|
</thead>
|
|
<tbody>
|
|
<tr>
|
|
<td>
|
|
<code>{'${device.type}'}</code>
|
|
</td>
|
|
<td>The type of device</td>
|
|
<td>
|
|
<code>"gpu", "tpu", "fpga"</code>
|
|
</td>
|
|
</tr>
|
|
<tr>
|
|
<td>
|
|
<code>{'${device.vendor}'}</code>
|
|
</td>
|
|
<td>The device's vendor</td>
|
|
<td>
|
|
<code>"amd", "nvidia", "intel"</code>
|
|
</td>
|
|
</tr>
|
|
<tr>
|
|
<td>
|
|
<code>{'${device.model}'}</code>
|
|
</td>
|
|
<td>The device's model</td>
|
|
<td>
|
|
<code>"1080ti"</code>
|
|
</td>
|
|
</tr>
|
|
<tr>
|
|
<td>
|
|
<code>
|
|
${'{'}device.attr.<property>{'}'}
|
|
</code>
|
|
</td>
|
|
<td>Property of the device</td>
|
|
<td>
|
|
<code>{'${device.attr.memory} => 8 GiB'}</code>
|
|
</td>
|
|
</tr>
|
|
</tbody>
|
|
</table>
|
|
|
|
For the set of attributes available, please see the individual [device plugin's
|
|
documentation][devices].
|
|
|
|
### Attribute Units and Conversions
|
|
|
|
Devices report their attributes with strict types and can also provide unit
|
|
information. For example, when a GPU is reporting its memory, it can report that
|
|
it is "4096 MiB". Since Nomad has the associated unit information, a constraint
|
|
that requires greater than "3.5 GiB" can match since Nomad can convert between
|
|
these units.
|
|
|
|
The units Nomad supports are as follows:
|
|
|
|
<table>
|
|
<thead>
|
|
<tr>
|
|
<th>Base Unit</th>
|
|
<th>Values</th>
|
|
</tr>
|
|
</thead>
|
|
<tbody>
|
|
<tr>
|
|
<td>
|
|
<code>Byte</code>
|
|
</td>
|
|
<td>
|
|
<code>
|
|
<strong>Base 2</strong>: KiB, MiB, GiB, TiB, PiB, EiB
|
|
</code>
|
|
<br />
|
|
<code>
|
|
<strong>Base 10</strong>: kB, KB (equivalent to kB), MB, GB, TB, PB,
|
|
EB
|
|
</code>
|
|
</td>
|
|
</tr>
|
|
<tr>
|
|
<td>
|
|
<code>Byte Rates</code>
|
|
</td>
|
|
<td>
|
|
<code>
|
|
<strong>Base 2</strong>: KiB/s, MiB/s, GiB/s, TiB/s, PiB/s, EiB/s
|
|
</code>
|
|
<br />
|
|
<code>
|
|
<strong>Base 10</strong>: kB/s, KB/s (equivalent to kB/s), MB/s, GB/s,
|
|
TB/s, PB/s,EB/s
|
|
</code>
|
|
</td>
|
|
</tr>
|
|
<tr>
|
|
<td>
|
|
<code>Hertz</code>
|
|
</td>
|
|
<td>
|
|
<code>MHz, GHz</code>
|
|
</td>
|
|
</tr>
|
|
<tr>
|
|
<td>
|
|
<code>Watts</code>
|
|
</td>
|
|
<td>
|
|
<code>mW, W, kW, MW, GW</code>
|
|
</td>
|
|
</tr>
|
|
</tbody>
|
|
</table>
|
|
|
|
Conversion is only possible within the same base unit.
|
|
|
|
## `device` Examples
|
|
|
|
The following examples only show the `device` stanzas. Remember that the
|
|
`device` stanza is only valid in the placements listed above.
|
|
|
|
### Single Nvidia GPU
|
|
|
|
This example schedules a task with a single Nvidia GPU made available.
|
|
|
|
```hcl
|
|
device "nvidia/gpu" {}
|
|
```
|
|
|
|
### Multiple Nvidia GPU
|
|
|
|
This example schedules a task with a two Nvidia GPU made available.
|
|
|
|
```hcl
|
|
device "nvidia/gpu" {
|
|
count = 2
|
|
}
|
|
```
|
|
|
|
### Single Nvidia GPU with Specific Model
|
|
|
|
This example schedules a task with a single Nvidia GPU made available and uses
|
|
the name to specify the exact model to be used.
|
|
|
|
```hcl
|
|
device "nvidia/gpu/1080ti" {}
|
|
```
|
|
|
|
This is a simplification of the following:
|
|
|
|
```hcl
|
|
device "gpu" {
|
|
count = 1
|
|
|
|
constraint {
|
|
attribute = "${device.vendor}"
|
|
value = "nvidia"
|
|
}
|
|
|
|
constraint {
|
|
attribute = "${device.model}"
|
|
value = "1080ti"
|
|
}
|
|
}
|
|
```
|
|
|
|
### Affinity with Unit Conversion
|
|
|
|
This example uses an affinity to tell the scheduler it would prefer if the GPU
|
|
had at least 1.5 GiB of memory. The following are both equivalent as Nomad can
|
|
do unit conversions.
|
|
|
|
Specified in `GiB`:
|
|
|
|
```hcl
|
|
device "nvidia/gpu" {
|
|
affinity {
|
|
attribute = "${device.attr.memory}"
|
|
operator = ">="
|
|
value = "1.5 GiB"
|
|
weight = 75
|
|
}
|
|
}
|
|
```
|
|
|
|
Specified in `MiB`:
|
|
|
|
```hcl
|
|
device "nvidia/gpu" {
|
|
affinity {
|
|
attribute = "${device.attr.memory}"
|
|
operator = ">="
|
|
value = "1500 MiB"
|
|
weight = 75
|
|
}
|
|
}
|
|
```
|
|
|
|
[affinity]: /docs/job-specification/affinity 'Nomad affinity Job Specification'
|
|
[constraint]: /docs/job-specification/constraint 'Nomad constraint Job Specification'
|
|
[devices]: /docs/devices 'Nomad Device Plugins'
|