---
layout: "docs"
page_title: "device Stanza - Job Specification"
sidebar_current: "docs-job-specification-device"
description: |-
  The "device" stanza is used to require a certain device be made available
  to the task.
---

# `device` Stanza

<table class="table table-bordered table-striped">
  <tr>
    <th width="120">Placement</th>
    <td>
      <code>job -> group -> task -> resources -> **device**</code>
    </td>
  </tr>
</table>

The `device` stanza creates both a scheduling and a runtime requirement that the
given task has access to the specified devices. A device is a piece of hardware
that is attached to the node and may be made available to the task. Examples are
GPUs, FPGAs, and TPUs.

When a `device` stanza is added, Nomad schedules the task onto a node that
contains a set of devices meeting the specified requirements. The `device`
stanza lets the operator specify as little as the type of device required, such
as `gpu`, or as much as arbitrary constraints and affinities. Once the scheduler
has placed the allocation on a suitable node, the Nomad Client invokes the
device plugin to retrieve information on how to mount the device and which
environment variables to expose. For more information on the runtime
environment, please consult the individual device plugin's documentation.

See the [device plugin's documentation][devices] for a list of supported devices.

```hcl
job "docs" {
  group "example" {
    task "server" {
      resources {
        device "nvidia/gpu" {
          count = 2

          constraint {
            attribute = "${device.attr.memory}"
            operator  = ">="
            value     = "2 GiB"
          }

          affinity {
            attribute = "${device.attr.memory}"
            operator  = ">="
            value     = "4 GiB"
            weight    = 75
          }
        }
      }
    }
  }
}
```

In the above example, the task requests two GPUs from the Nvidia vendor but
does not specify a particular model. Instead, it places a hard constraint that
each device has at least 2 GiB of memory and states a preference for GPUs that
have at least 4 GiB. This example shows how expressive the `device` stanza can
be.

~> Device support is currently limited to Linux and container-based drivers,
due to their ability to isolate devices to specific tasks.

## `device` Parameters

- `name` `(string: "")` - Specifies the device required. The following inputs
  are valid:

  * `<device_type>`: If a single value is given, it is assumed to be the device
    type, such as "gpu" or "fpga".

  * `<vendor>/<device_type>`: If two values are given separated by a `/`, the
    given device type will be selected, constraining on the provided vendor.
    Examples include "nvidia/gpu" or "amd/gpu".

  * `<vendor>/<device_type>/<model>`: If three values are given separated by a
    `/`, the given device type will be selected, constraining on the provided
    vendor and model name. Examples include "nvidia/gpu/1080ti" or
    "nvidia/gpu/2080ti".

- `count` `(int: 1)` - Specifies the number of instances of the given device
  that are required.

- `constraint` <code>([Constraint][]: nil)</code> - Constraints to restrict
  which devices are eligible. This can be provided multiple times to define
  additional constraints. See below for available attributes.

- `affinity` <code>([Affinity][]: nil)</code> - Affinity to specify a preference
  for which devices get selected. This can be provided multiple times to define
  additional affinities. See below for available attributes.
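
As a rough illustration of how these parameters combine, the sketch below uses
the `<vendor>/<device_type>` name form, sets `count`, and repeats the
`constraint` block to express a memory range. The thresholds are illustrative
values only, and the sketch assumes the device plugin in use reports a `memory`
attribute with unit information.

```hcl
# Sketch only: the memory thresholds are illustrative assumptions.
device "nvidia/gpu" {
  # Two device instances are required. Omitting count requests one.
  count = 2

  # First constraint: each device must report at least 4 GiB of memory.
  constraint {
    attribute = "${device.attr.memory}"
    operator  = ">="
    value     = "4 GiB"
  }

  # Second constraint: each device must report at most 16 GiB of memory.
  # A device is eligible only if it satisfies every constraint block.
  constraint {
    attribute = "${device.attr.memory}"
    operator  = "<="
    value     = "16 GiB"
  }
}
```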

## `device` Constraint and Affinity Attributes

The attributes available for use in a `constraint` or `affinity` are as
follows:

<table class="table table-bordered table-striped">
  <tr>
    <th>Variable</th>
    <th>Description</th>
    <th>Example Value</th>
  </tr>
  <tr>
    <td><tt>${device.type}</tt></td>
    <td>The type of device</td>
    <td><tt>"gpu", "tpu", "fpga"</tt></td>
  </tr>
  <tr>
    <td><tt>${device.vendor}</tt></td>
    <td>The device's vendor</td>
    <td><tt>"amd", "nvidia", "intel"</tt></td>
  </tr>
  <tr>
    <td><tt>${device.model}</tt></td>
    <td>The device's model</td>
    <td><tt>"1080ti"</tt></td>
  </tr>
  <tr>
    <td><tt>${device.attr.&lt;property&gt;}</tt></td>
    <td>Property of the device</td>
    <td><tt>${device.attr.memory} => 8 GiB</tt></td>
  </tr>
</table>

For the set of attributes a given device exposes, please see the individual
[device plugin's documentation][devices].
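
The variables in the table can be mixed within one stanza. The following is a
minimal sketch, not a recommended configuration: it requests any device of type
`gpu`, hard-requires a vendor, and prefers a model. The vendor, model, and
weight values are illustrative and reuse the example values from the table.

```hcl
# Sketch only: vendor, model, and weight are illustrative assumptions.
device "gpu" {
  # Hard requirement: only devices whose vendor is "nvidia" are eligible.
  constraint {
    attribute = "${device.vendor}"
    value     = "nvidia"
  }

  # Soft preference: favor the "1080ti" model when one is available,
  # but still allow other models from the same vendor.
  affinity {
    attribute = "${device.model}"
    value     = "1080ti"
    weight    = 50
  }
}
```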

### Attribute Units and Conversions

Devices report their attributes with strict types and can also provide unit
information. For example, when a GPU reports its memory, it can report that it
has "4096 MiB". Because Nomad has the associated unit information, a constraint
that requires more than "3.5 GiB" can match, since Nomad can convert between
these units.

The units Nomad supports are as follows:

<table class="table table-bordered table-striped">
  <tr>
    <th>Base Unit</th>
    <th>Values</th>
  </tr>
  <tr>
    <td><tt>Byte</tt></td>
    <td><tt>**Base 2**: KiB, MiB, GiB, TiB, PiB, EiB<br>**Base 10**: kB, KB (equivalent to kB), MB, GB, TB, PB, EB</tt></td>
  </tr>
  <tr>
    <td><tt>Byte Rates</tt></td>
    <td><tt>**Base 2**: KiB/s, MiB/s, GiB/s, TiB/s, PiB/s, EiB/s<br>**Base 10**: kB/s, KB/s (equivalent to kB/s), MB/s, GB/s, TB/s, PB/s, EB/s</tt></td>
  </tr>
  <tr>
    <td><tt>Hertz</tt></td>
    <td><tt>MHz, GHz</tt></td>
  </tr>
  <tr>
    <td><tt>Watts</tt></td>
    <td><tt>mW, W, kW, MW, GW</tt></td>
  </tr>
</table>

Conversion is only possible within the same base unit.
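
To make the unit table concrete, here is a hedged sketch that writes values in
units other than the ones a device might report. It assumes a device that
reports its memory as "4096 MiB", as in the example above, which is about
4.29 GB and therefore satisfies a base 10 threshold of "4 GB". The
`cores_clock` attribute name is an assumption used for illustration; consult
the device plugin's documentation for the attributes it actually exposes.

```hcl
# Sketch only: thresholds and the cores_clock attribute name are assumptions.
device "nvidia/gpu" {
  # Byte base unit: the threshold is written in base 10 (GB) even if the
  # device reports a base 2 value such as "4096 MiB".
  constraint {
    attribute = "${device.attr.memory}"
    operator  = ">="
    value     = "4 GB"
  }

  # Hertz base unit: the threshold is written in GHz even if the device
  # reports MHz. "cores_clock" is a hypothetical attribute name.
  affinity {
    attribute = "${device.attr.cores_clock}"
    operator  = ">="
    value     = "1.5 GHz"
    weight    = 50
  }
}
```

Each value stays within the base unit of the attribute it is compared against.
A memory attribute compared against a Hertz value, for instance, has no
conversion available.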

## `device` Examples

The following examples only show the `device` stanzas. Remember that the
`device` stanza is only valid in the placements listed above.

### Single Nvidia GPU

This example schedules a task with a single Nvidia GPU made available.

```hcl
device "nvidia/gpu" {}
```

### Multiple Nvidia GPUs

This example schedules a task with two Nvidia GPUs made available.

```hcl
device "nvidia/gpu" {
  count = 2
}
```

### Single Nvidia GPU with Specific Model

This example schedules a task with a single Nvidia GPU made available and uses
the name to specify the exact model to be used.

```hcl
device "nvidia/gpu/1080ti" {}
```

This is a simplification of the following:

```hcl
device "gpu" {
  count = 1

  constraint {
    attribute = "${device.vendor}"
    value     = "nvidia"
  }

  constraint {
    attribute = "${device.model}"
    value     = "1080ti"
  }
}
```

### Affinity with Unit Conversion

This example uses an affinity to tell the scheduler that the task prefers a GPU
with at least 1.5 GiB of memory. The following two stanzas are equivalent
because Nomad can do unit conversions.

Specified in `GiB`:

```hcl
device "nvidia/gpu" {
  affinity {
    attribute = "${device.attr.memory}"
    operator  = ">="
    value     = "1.5 GiB"
    weight    = 75
  }
}
```

Specified in `MiB`:

```hcl
device "nvidia/gpu" {
  affinity {
    attribute = "${device.attr.memory}"
    operator  = ">="
    value     = "1536 MiB"
    weight    = 75
  }
}
```

[affinity]: /docs/job-specification/affinity.html "Nomad affinity Job Specification"
[constraint]: /docs/job-specification/constraint.html "Nomad constraint Job Specification"
[devices]: /docs/devices/index.html "Nomad Device Plugins"