Noticed that the protobuf files are out of sync with ones generated by 1.2.0 protoc go plugin.
The cause for these files seem to be related to release processes, e.g. [0.9.0-beta1 preperation](ecec3d38de (diff-da4da188ee496377d456025c2eab4e87)), and [0.9.0-beta3 preperation](b849d84f2f).
This restores the changes to that of the pinned protoc version and fails build if protobuf files are out of sync. Sample failing Travis job is that of the first commit change: https://travis-ci.org/hashicorp/nomad/jobs/506285085
Some of the context uses in TR hooks are useless (Killed during Stop
never seems meaningful).
None of the hooks are interruptable for graceful shutdown which is
unfortunate and probably needs fixing.
Builds upon earlier commit that cleans up restored handles of terminal
allocs by also emitting terminated events and calling exited hooks when
appropriate.
The test is sadly quite complicated and peeks into things (logmon's
reattach config) AR doesn't normally have access to.
However, I couldn't find another way of asserting logmon got cleaned up
without resorting to smaller unit tests. Smaller unit tests risk
re-implementing dependencies in an unrealistic way, so I opted for an
ugly integration test.
This commit is a significant change. TR.Run is now always executed, even
for terminal allocations. This was changed to allow TR.Run to cleanup
(run stop hooks) if a handle was recovered.
This is intended to handle the case of Nomad receiving a
DesiredStatus=Stop allocation update, persisting it, but crashing before
stopping AR/TR.
The commit also renames task runner hook data as it was very easy to
accidently set state on Requests instead of Responses using the old
field names.
Port some integration tests of driver fingerprinting.
Some tests (e.g. `TestFingerprintManager_Run_DriversInBlacklist`) have
been subsituted by more isolated tests in
`client/pluginmanager/drivermanager/manager_test.go`
This code chooses to be conservative as opposed to optimal: when failing
to reattach to logmon simply return a recoverable error instead of
immediately trying to restart logmon.
The recoverable error will cause the task's restart policy to be
applied and a new logmon will be launched upon restart.
Trying to do the optimal approach of simply starting a new logmon
requires error string comparison and should be tested against a task
actively logging to assert the behavior (are writes blocked? dropped?).
There were multiple bugs here:
1. Reattach unmarshalling always returned an error because you can't
unmarshal into a nil pointer.
2. The hook data wasn't being saved because it was put on the request
struct, not the response struct.
3. The plugin configuration should only have reattach *or* a command
set. Not both.
4. Setting Done=true meant the hook was never re-run on agent restart so
reattaching was never attempted.
Track the download status of each artifact independently so that if only
one of many artifacts fails to download, completed artifacts aren't
downloaded again.
0.9.0beta2 contains a regression where artifact download errors would
not cause a task restart and instead immediately fail the task.
This restores the pre-0.9 behavior of retrying all artifact errors and
adds missing tests.
Fixes an issue where if a task was restarted after restating the client,
the task dir environment variables would not be populated. This PR fixes
this for both upgrades from 0.8.X and for normal 0.9 restarts.
* master: (23 commits)
tests: avoid assertion in goroutine
spell check
ci: run checkscripts
tests: deflake TestRktDriver_StartWaitRecoverWaitStop
drivers/rkt: Remove unused github.com/rkt/rkt
drivers/rkt: allow development on non-linux
cli: Hide `nomad docker_logger` from help output
api: test api and structs are in sync
goimports until make check is happy
nil check node resources to prevent panic
tr: use context in as select statement
move pluginutils -> helper/pluginutils
vet
goimports
gofmt
Split hclspec
move hclutils
Driver tests do not use hcl2/hcl, hclspec, or hclutils
move reattach config
loader and singleton
...
Added ability to adjust the number of events the TaskRunner keeps as
there's no way to observe all events otherwise.
Task events differ slightly from 0.8 because 0.9 emits Terminated every
time a task exits instead of only when it exits on its own (not due to
restart or kill).
0.9 does not emit Killing/Killed for restarts like 0.8 which seems fine
as `Restart Signaled/Terminated/Restarting` is more descriptive.
Original v0.8 events emitted:
```
expected := []string{
"Received",
"Task Setup",
"Started",
"Restart Signaled",
"Killing",
"Killed",
"Restarting",
"Started",
"Restart Signaled",
"Killing",
"Killed",
"Restarting",
"Started",
"Restart Signaled",
"Killing",
"Killed",
"Not Restarting",
}
```