This PR makes GetAllocs use a blocking query as well as adding a sanity
check to the clients watchAllocation code to ensure it gets the correct
allocations.
This PR fixes https://github.com/hashicorp/nomad/issues/2119 and
https://github.com/hashicorp/nomad/issues/2153.
The issue was that the client was talking to two different servers, one
to check which allocations to pull and the other to pull those
allocations. However the latter call was not with a blocking query and
thus the client would not retreive the allocations it requested.
The logging has been improved to make the problem more clear as well.
`/usr/bin/yes` could produce output very quickly (100s of MBps on my
laptop) and therefore could cause log files to roll over.
A bash loop with a sleep avoids that issue. The test is slower but
should be much more resilient to the massive timing differences between
workstations and Travis.
Consolidate task environment building in GetTaskEnv since it can
determine what kind of filesystem isolation is used.
This means drivers no longer have to manipulate task environment paths.
AllocRunner's state dropped the Context struct which needs to be
converted to the new AllocDir+TaskDir structs in RestoreState.
TaskRunner added a TaskDirBuilt flag, but it's safe to just let that
default to `false` and rebuild all task dirs once on upgrade.
Client.Shutdown holds the allocLock when destroying alloc runners in dev
mode.
Client.updateAllocStatus can be called during AllocRunner shutdown and
calls getAllocRunners which tries to acquire allocLock.RLock. This
deadlocks since Client.Shutdown already has the write lock.
Switching Client.Shutdown to use getAllocRunners and not hold a lock
during AllocRunner shutdown is the solution.