When starting an allocation that is preempting other allocs, we create a
new group allocation watcher, and then wait for the allocations to
terminate in the allocation PreRun hooks.
If there's no preempted allocations, then we simply provide a
NoopAllocWatcher.
The Group Alloc watcher is an implementation of a PrevAllocWatcher that
can wait for multiple previous allocs before terminating.
This is to be used when running an allocation that is preempting upstream
allocations, and thus only supports being ran with a local alloc watcher.
It also currently requires all of its child watchers to correctly handle
context cancellation. Should this be a problem, it should be fairly easy
to implement a replacement using channels rather than a waitgroup.
It obeys the PrevAllocWatcher interface for convenience, but it may be
better to extract Migration capabilities into a seperate interface for
greater clarity.
As of now, FileRotator uses bufio.Write under the hood to write data to
configured output file. Due to the way how bufio handles any occurred io
error - saves it into `err` variable never resetting it automatically -
any operation like `Write`, `Flush` etc will become a no-op, returning the very same,
saved error (eg. Out of disk space) even when the problem is fixed (eg. disk
space is available again).
That automatically means that FileRotator will stop writing any logs,
reporting the same error over and over again, even if it's no longer
valid.
This PR fixes it by resetting the bufio Writer, which resets any errors
and tries to write requested data.
IOPS have been modelled as a resource since Nomad 0.1 but has never
actually been detected and there is no plan in the short term to add
detection. This is because IOPS is a bit simplistic of a unit to define
the performance requirements from the underlying storage system. In its
current state it adds unnecessary confusion and can be removed without
impacting any users. This PR leaves IOPS defined at the jobspec parsing
level and in the api/ resources since these are the two public uses of
the field. These should be considered deprecated and only exist to allow
users to stop using them during the Nomad 0.9.x release. In the future,
there should be no expectation that the field will exist.
Also, LXC requires target paths to be relative. Container paths in LXC
binds should never be absolute paths, so we strip any preceeding `/`,
even if a user sets one.
The previous integration test was broken during the client refactor, and
it seems to be some sort of race with state updating.
I'm going to try and construct a replacement test as part of work on
performance, but for now, the underlying behaviour is still being
tested.
This PR fixes an edge case where we could GC an allocation that was in a
desired stop state but had not terminated yet. This can be hit if the
client hasn't shutdown the allocation yet or if the allocation is still
shutting down (long kill_timeout).
Fixes https://github.com/hashicorp/nomad/issues/4940