The logging package is used by logmon and the legacy mock_driver. Because the
legacy drivers are going away, I'm moving it here to signify its actual
ownership.
As part of deprecating legacy drivers, we're moving the env package to a
new drivers/shared tree, as it is used by the modern docker and rkt
driver packages, and is useful for 3rd party plugins.
Looking at NewTaskRunner I'm unsure whether TaskRunner.TaskResources
(from which req.TaskResources is set) is intended to be nil at times or
if the TODO in NewTaskRunner is intended to ensure it is always non-nil.
The old approach was incomplete. Hook env vars are now:
* persisted and restored between agent restarts
* deterministic (LWW if 2 hooks set the same key)
This PR introduces a device hook that retrieves the device mount
information for an allocation. It also updates the computed node class
computation to take into account devices.
TODO Fix the task runner unit test. The environment variable is being
lost even though it is being properly set in the prestart hook.
The group utility struct does not support asynchronously launched
goroutines (goroutines-inside-of-goroutines), so switch those uses to a
normal go call.
This means watchNodeUpdates and watchNodeEvents may not be shutdown when
Shutdown() exits. During nomad agent shutdown this does not matter.
During tests this means a test may leak those goroutines or be unable to
know when those goroutines have exited.
Since there's no runtime impact and these goroutines do not affect alloc
state syncing it seems ok to risk leaking them.
We were incorrectly returning a 0 duration to the taskrunner when
determining when a task should restart. This would cause tasks to be
restarted immediately, ignoring the restart {} stanza in a users
configuration.
This commit causes us to return the restart duration to the task runner
so it may correctly delay further execution.
This change makes few compromises:
* Looks up the devices associated with tasks at look up time. Given
that `nomad alloc status` is called rarely generally (compared to stats
telemetry and general job reporting), it seems fine. However, the
lookup overhead grows bounded by number of `tasks x total-host-devices`,
which can be significant.
* `client.Client` performs the task devices->statistics lookup. It
passes self to alloc/task runners so they can look up the device statistics
allocated to them.
* Currently alloc/task runners are responsible for constructing the
entire RPC response for stats
* The alternatives for making task runners device statistics aware
don't seem appealing (e.g. having task runners contain reference to hostStats)
* On the alloc aggregation resource usage, I did a naive merging of task device statistics.
* Personally, I question the value of such aggregation, compared to
costs of struct duplication and bloating the response - but opted to be
consistent in the API.
* With naive concatination, device instances from a single device group used by separate tasks in the alloc, would be aggregated in two separate device group statistics.
In state values, we need to be able to distinguish between zero values
(e.g. `false`) and unset values (e.g. `nil`).
We can alternatively use protobuf `oneOf` and nested map to ensure
consistency of fields that are set together, but the golang
representation does not represent that well and introducing a mismatch
between representations. Thus, I opted not to use it.
Tests expect that as soon as eventer shuts down immediately on context
cancellations; but golang does not guarantee priority when multiple
pending channels are ready in a select statement.
The default job here contains some exec task config (for setting
command and args) that aren't used for mock driver. Now, the alloc
runner seems stricter about validating fields and errors on unexpected
fields.
Updating configs in tests so we can have an explicit task config
whenever driver is set explicitly.
Introduce a device manager that manages the lifecycle of device plugins
on the client. It fingerprints, collects stats, and forwards Reserve
requests to the correct plugin. The manager, also handles device plugins
failing and validates their output.
For lifecycle operations such as Restart and Kill, the client should not
expect driver plugins to be well behaved and close their waitCh on
context cancelation. Always wait on the passed in context as well as the
waitCh.
* Migrated all of the old leader task tests and got them passing
* Refactor and consolidate task killing code in AR to always kill leader
tasks first
* Fixed lots of issues with state restoring
* Fixed deadlock in AR.Destroy if AR.Run had never been called
* Added a new in memory statedb for testing
If ar.state.TaskStates has not been set, set it on the copy of ar.state.
That keeps ar.state manipulations in one location and allows AllocState
to only acquire read-locks.
Although the really exciting change is making WaitForRunning return the
allocations that it started. This should cut down test boilerplate
significantly.
The interesting decision in this commit was to expose AR's state and not
a fully materialized Allocation struct. AR.clientAlloc builds an Alloc
that contains the task state, so I considered simply memoizing and
exposing that method.
However, that would lead to AR having two awkwardly similar methods:
- Alloc() - which returns the server-sent alloc
- ClientAlloc() - which returns the fully materialized client alloc
Since ClientAlloc() could be memoized it would be just as cheap to call
as Alloc(), so why not replace Alloc() entirely?
Replacing Alloc() entirely would require Update() to immediately
materialize the task states on server-sent Allocs as there may have been
local task state changes since the server received an Alloc update.
This quickly becomes difficult to reason about: should Update hooks use
the TaskStates? Are state changes caused by TR Update hooks immediately
reflected in the Alloc? Should AR persist its copy of the Alloc? If so,
are its TaskStates canonical or the TaskStates on TR?
So! Forget that. Let's separate the static Allocation from the dynamic
AR & TR state!
- AR.Alloc() is for static Allocation access (often for the Job)
- AR.AllocState() is for the dynamic AR & TR runtime state (deployment
status, task states, etc).
If code needs to know the status of a task: AllocState()
If code needs to know the names of tasks: Alloc()
It should be very easy for a developer to reason about which method they
should call and what they can do with the return values.
Multiple receivers raced for the WaitResult when killing tasks which
could lead to a deadlock if the "wrong" receiver won.
Wrap handlers in an ugly little proxy to avoid this. At first I wanted
to push this into drivers, but the result is tied to the TR's handle
lifecycle -- not the lifecycle of an alloc or task.
"Ask forgiveness, not permission."
Instead of peaking at TaskStates (which are no longer updated on the
AR.Alloc() view of the world) to only read logs for running tasks, just
try to read the logs and improve the error handling if they don't exist.
This should make log streaming less dependent on AR/TR behavior.
Also fixed a race where the log streamer could exit before reading an
error. This caused no logs or errors to be displayed sometimes when an
error occurred.
Driver plugin framework to facilitate development of driver plugins.
Implementing plugins only need to implement the DriverPlugin interface.
The framework proxies this interface to the go-plugin GRPC interface generated
from the driver.proto spec.
A testing harness is provided to allow implementing drivers to test the full
lifecycle of the driver plugin. An example use:
func TestMyDriver(t *testing.T) {
harness := NewDriverHarness(t, &MyDiverPlugin{})
// The harness implements the DriverPlugin interface and can be used as such
taskHandle, err := harness.StartTask(...)
}
* client/executor: refactor client to remove interpolation
* executor: POC libcontainer based executor
* vendor: use hashicorp libcontainer fork
* vendor: add libcontainer/nsenter dep
* executor: updated executor interface to simplify operations
* executor: implement logging pipe
* logmon: new logmon plugin to manage task logs
* driver/executor: use logmon for log management
* executor: fix tests and windows build
* executor: fix logging key names
* executor: fix test failures
* executor: add config field to toggle between using libcontainer and standard executors
* logmon: use discover utility to discover nomad executable
* executor: only call libcontainer-shim on main in linux
* logmon: use seperate path configs for stdout/stderr fifos
* executor: windows fixes
* executor: created reusable pid stats collection utility that can be used in an executor
* executor: update fifo.Open calls
* executor: fix build
* remove executor from docker driver
* executor: Shutdown func to kill and cleanup executor and its children
* executor: move linux specific universal executor funcs to seperate file
* move logmon initialization to a task runner hook
* client: doc fixes and renaming from code review
* taskrunner: use shared config struct for logmon fifo fields
* taskrunner: logmon only needs to be started once per task
Updated to hclog.
It exposed fields that required an unexported lock to access. Created a
getter methodn instead. Only old allocrunner currently used this
feature.
* vendor: bump libcontainer and docker to remove Sirupsen imports
* vendor: fix bad vendoring of archive package
* vendor: fix api changes to cgroups in executor
* vendor: fix docker api changes
* vendor: update github.com/Azure/go-ansiterm to use non capitalized logrus import
* UpdateState: set state, append event, persist, update servers
* EmitEvent: append event, persist, update servers
* AppendEvent: append event, persist
AppendEvent may not even have to persist, but for the sake of
correctness I'm going with that for now.
* Stopping an alloc is implemented via Updates but update hooks are
*not* run.
* Destroying an alloc is a best effort cleanup.
* AllocRunner destroy hooks implemented.
* Disk migration and blocking on a previous allocation exiting moved to
its own package to avoid cycles. Now only depends on alloc broadcaster
instead of also using a waitch.
* AllocBroadcaster now only drops stale allocations and always keeps the
latest version.
* Made AllocDir safe for concurrent use
Lots of internal contexts that are currently unused. Unsure if they
should be used or removed.
Saves a tiny bit of cpu and some IO. Sadly doesn't prevent all IO on
duplicate writes as the transactions are still created and committed.
$ go test -bench=. -benchmem
goos: linux
goarch: amd64
pkg: github.com/hashicorp/nomad/helper/boltdd
BenchmarkWriteDeduplication_On-4 500 4059591 ns/op 23736 B/op 56 allocs/op
BenchmarkWriteDeduplication_Off-4 300 4115319 ns/op 25942 B/op 55 allocs/op