This PR fixes two issues:
1) A close of a nil stopCollection channel when restoring and prestart
fails. The failure will cause the killCh to be triggered which will
close collection before it has been initialized.
2) Fixes a deadlock in which the handleWaitCh is never triggered since
it is not initialized when there is an error in prestart and the killCh
is triggered.
Both fixes are by maintaining the loop invariant that the two channels
are valid after there is a handle.
This PR fixes our vet script and fixes all the missed vet changes.
It also fixes pointers being printed in `nomad stop <job>` and `nomad
node-status <node>`.
This PR introduces a coordinator for doing CRUD on a Docker image. It
should fix racy deletion of images. The issue before was images would be
deleted between prestart and start causing an error.
A change in the behavior of `os.Rename` in Go 1.8 brought to light a
difference in the logic between `{Alloc,Task}Runner` and this test:
AllocRunner builds the alloc dir, moves dirs if necessary, and then lets
TaskRunner call TaskDir.Build().
This test called `TaskDir.Build` *before* `AllocDir.Move`, so in Go 1.8
it failed to `os.Rename over` the empty {data,local} dirs.
I updated the test to behave like the real code, but I defensively added
`os.Remove` calls as a subtle change in call order shouldn't break this
code. `os.Remove` won't remove a non-empty directory, so it's still
safe.
This *might* be a fix for #2295 -- I've been unable to reproduce the
bug. However, this guard seems wise regardless. I should never be
overwriting an intialized created resources with a nil.
This PR fixes a race condition in which the client was not locked while
deriving Vault tokens. This allowed the token to be set which would
cause subsequent Vault requests to fail with permission denied because
the incorrect Vault token was being used.
Further this PR makes the unsetting and unlocking of the client atomic
to avoid an even harder to hit race condition (not sure it was ever hit
but was still incorrect).