When starting an allocation that is preempting other allocs, we create a
new group allocation watcher, and then wait for the allocations to
terminate in the allocation PreRun hooks.
If there's no preempted allocations, then we simply provide a
NoopAllocWatcher.
This PR introduces a device hook that retrieves the device mount
information for an allocation. It also updates the computed node class
computation to take into account devices.
TODO Fix the task runner unit test. The environment variable is being
lost even though it is being properly set in the prestart hook.
This change makes few compromises:
* Looks up the devices associated with tasks at look up time. Given
that `nomad alloc status` is called rarely generally (compared to stats
telemetry and general job reporting), it seems fine. However, the
lookup overhead grows bounded by number of `tasks x total-host-devices`,
which can be significant.
* `client.Client` performs the task devices->statistics lookup. It
passes self to alloc/task runners so they can look up the device statistics
allocated to them.
* Currently alloc/task runners are responsible for constructing the
entire RPC response for stats
* The alternatives for making task runners device statistics aware
don't seem appealing (e.g. having task runners contain reference to hostStats)
* On the alloc aggregation resource usage, I did a naive merging of task device statistics.
* Personally, I question the value of such aggregation, compared to
costs of struct duplication and bloating the response - but opted to be
consistent in the API.
* With naive concatination, device instances from a single device group used by separate tasks in the alloc, would be aggregated in two separate device group statistics.
* Migrated all of the old leader task tests and got them passing
* Refactor and consolidate task killing code in AR to always kill leader
tasks first
* Fixed lots of issues with state restoring
* Fixed deadlock in AR.Destroy if AR.Run had never been called
* Added a new in memory statedb for testing
* Stopping an alloc is implemented via Updates but update hooks are
*not* run.
* Destroying an alloc is a best effort cleanup.
* AllocRunner destroy hooks implemented.
* Disk migration and blocking on a previous allocation exiting moved to
its own package to avoid cycles. Now only depends on alloc broadcaster
instead of also using a waitch.
* AllocBroadcaster now only drops stale allocations and always keeps the
latest version.
* Made AllocDir safe for concurrent use
Lots of internal contexts that are currently unused. Unsure if they
should be used or removed.
Saves a tiny bit of cpu and some IO. Sadly doesn't prevent all IO on
duplicate writes as the transactions are still created and committed.
$ go test -bench=. -benchmem
goos: linux
goarch: amd64
pkg: github.com/hashicorp/nomad/helper/boltdd
BenchmarkWriteDeduplication_On-4 500 4059591 ns/op 23736 B/op 56 allocs/op
BenchmarkWriteDeduplication_Off-4 300 4115319 ns/op 25942 B/op 55 allocs/op