Commit Graph

17 Commits

Author SHA1 Message Date
Mahmood Ali c2d3c3e431
nvidia: support disabling the nvidia plugin (#8353) 2020-07-21 10:11:16 -04:00
Mahmood Ali 2588b3bc98 cleanup driver eventor goroutines
This fixes few cases where driver eventor goroutines are leaked during
normal operations, but especially so in tests.

This change makes few modifications:

First, it switches drivers to use `Context`s to manage shutdown events.
Previously, it relied on callers invoking `.Shutdown()` function that is
specific to internal drivers only and require casting.  Using `Contexts`
provide a consistent idiomatic way to manage lifecycle for both internal
and external drivers.

Also, I discovered few places where we don't clean up a temporary driver
instance in the plugin catalog code, where we dispense a driver to
inspect and validate the schema config without properly cleaning it up.
2020-05-26 11:04:04 -04:00
Michael Schurter 18853a1df2
Update devices/gpu/nvidia/README.md
Co-Authored-By: dadgar <alex@hashicorp.com>
2019-01-23 17:44:24 -08:00
Alex Dadgar 41d5f52f63 fix package import 2019-01-23 12:53:33 -08:00
Alex Dadgar 003aa2a69c nvidia device plugin docs 2019-01-23 10:58:46 -08:00
oleksii.shyman e41fbf7577 Add support for docker runtimes
- docker fingerprint issues a docker api system info call to get the
  list of supported OCI runtimes.
  - OCI runtimes are reported as comma separated list of names
  - docker driver is aware of GPU runtime presence
  - docker driver throws an error when user tries to run container with
  GPU, when GPU runtime is not present
  - docker GPU runtime name is configurable
2019-01-15 11:34:47 -08:00
Alex Dadgar 730a6f5b9a lint 2018-12-18 16:48:00 -08:00
Alex Dadgar 4c57d2ec4d Add plugin API versioning to plugin loader and plugins 2018-12-18 16:48:00 -08:00
Mahmood Ali ced76978bb devices/nvidia: memory state as the summary stat 2018-12-10 12:18:24 -05:00
Mahmood Ali c5de71a424 Allow nullable fields in StatValues
In state values, we need to be able to distinguish between zero values
(e.g. `false`) and unset values (e.g. `nil`).

We can alternatively use protobuf `oneOf` and nested map to ensure
consistency of fields that are set together, but the golang
representation does not represent that well and introducing a mismatch
between representations.  Thus, I opted not to use it.
2018-11-14 14:41:19 -05:00
Mahmood Ali 713c9fe683 Move Stat{Object|Value} to plugins/shared/structs
Moving them as they may be useful for other packages/plugins besides
devices.
2018-11-14 09:01:26 -05:00
Alex Dadgar 47fd4d2bf7 remove unused struct field 2018-11-07 11:55:06 -08:00
Alex Dadgar 204ca8230c Device manager
Introduce a device manager that manages the lifecycle of device plugins
on the client. It fingerprints, collects stats, and forwards Reserve
requests to the correct plugin. The manager, also handles device plugins
failing and validates their output.
2018-11-07 10:43:15 -08:00
Nick Ethier bda3b1d3b3
rename NomadConfig to ClientAgentConfig 2018-10-29 21:34:34 -04:00
Nick Ethier 65adb80ebf
plumb NomadConfig into plugins 2018-10-16 22:47:22 -04:00
Alex Dadgar 5fc9a95201 Use Attribute in device fingerprinting 2018-10-13 11:43:06 -07:00
Alex Dadgar 0183fb4e5c nvidia package restructue + build non-linux 2018-10-05 13:56:04 -07:00