Race was test only and due to unlocked map access.
Panic was test only and due to checking a field on a struct even when we
knew the struct was nil.
Race output that was fixed:
```
==================
WARNING: DATA RACE
Read at 0x00c000697dd0 by goroutine 768:
runtime.mapaccess2()
/usr/local/go/src/runtime/map.go:439 +0x0
github.com/hashicorp/nomad/nomad.TestLeader_PeriodicDispatcher_Restore_Adds.func8()
/home/schmichael/go/src/github.com/hashicorp/nomad/nomad/leader_test.go:402
+0xe6
github.com/hashicorp/nomad/testutil.WaitForResultRetries()
/home/schmichael/go/src/github.com/hashicorp/nomad/testutil/wait.go:30
+0x5a
github.com/hashicorp/nomad/testutil.WaitForResult()
/home/schmichael/go/src/github.com/hashicorp/nomad/testutil/wait.go:22
+0x57
github.com/hashicorp/nomad/nomad.TestLeader_PeriodicDispatcher_Restore_Adds()
/home/schmichael/go/src/github.com/hashicorp/nomad/nomad/leader_test.go:401
+0xb53
testing.tRunner()
/usr/local/go/src/testing/testing.go:827 +0x162
Previous write at 0x00c000697dd0 by goroutine 569:
runtime.mapassign()
/usr/local/go/src/runtime/map.go:549 +0x0
github.com/hashicorp/nomad/nomad.(*PeriodicDispatch).Add()
/home/schmichael/go/src/github.com/hashicorp/nomad/nomad/periodic.go:224
+0x2eb
github.com/hashicorp/nomad/nomad.(*Server).restorePeriodicDispatcher()
/home/schmichael/go/src/github.com/hashicorp/nomad/nomad/leader.go:394
+0x29a
github.com/hashicorp/nomad/nomad.(*Server).establishLeadership()
/home/schmichael/go/src/github.com/hashicorp/nomad/nomad/leader.go:234
+0x593
github.com/hashicorp/nomad/nomad.(*Server).leaderLoop()
/home/schmichael/go/src/github.com/hashicorp/nomad/nomad/leader.go:117
+0x82e
github.com/hashicorp/nomad/nomad.(*Server).monitorLeadership.func1()
/home/schmichael/go/src/github.com/hashicorp/nomad/nomad/leader.go:72
+0x6c
Goroutine 768 (running) created at:
testing.(*T).Run()
/usr/local/go/src/testing/testing.go:878 +0x650
testing.runTests.func1()
/usr/local/go/src/testing/testing.go:1119 +0xa8
testing.tRunner()
/usr/local/go/src/testing/testing.go:827 +0x162
testing.runTests()
/usr/local/go/src/testing/testing.go:1117 +0x4ee
testing.(*M).Run()
/usr/local/go/src/testing/testing.go:1034 +0x2ee
main.main()
_testmain.go:1150 +0x221
Goroutine 569 (running) created at:
github.com/hashicorp/nomad/nomad.(*Server).monitorLeadership()
/home/schmichael/go/src/github.com/hashicorp/nomad/nomad/leader.go:70
+0x269
==================
```
Similar to previous commits the delayed eval update chan was set and
access from different goroutines causing a race. Passing the chan on the
stack resolves the race.
Race output from `go test -race -run 'Server_RPC$'` in nomad/
```
==================
WARNING: DATA RACE
Write at 0x00c000339150 by goroutine 63:
github.com/hashicorp/nomad/nomad.(*EvalBroker).flush()
/home/schmichael/go/src/github.com/hashicorp/nomad/nomad/eval_broker.go:708
+0x3dc
github.com/hashicorp/nomad/nomad.(*EvalBroker).SetEnabled()
/home/schmichael/go/src/github.com/hashicorp/nomad/nomad/eval_broker.go:174
+0xc4
github.com/hashicorp/nomad/nomad.(*Server).revokeLeadership()
/home/schmichael/go/src/github.com/hashicorp/nomad/nomad/leader.go:718
+0x1fd
github.com/hashicorp/nomad/nomad.(*Server).leaderLoop()
/home/schmichael/go/src/github.com/hashicorp/nomad/nomad/leader.go:122
+0x95d
github.com/hashicorp/nomad/nomad.(*Server).monitorLeadership.func1()
/home/schmichael/go/src/github.com/hashicorp/nomad/nomad/leader.go:72
+0x6c
Previous read at 0x00c000339150 by goroutine 73:
github.com/hashicorp/nomad/nomad.(*EvalBroker).runDelayedEvalsWatcher()
/home/schmichael/go/src/github.com/hashicorp/nomad/nomad/eval_broker.go:771
+0x176
Goroutine 63 (running) created at:
github.com/hashicorp/nomad/nomad.(*Server).monitorLeadership()
/home/schmichael/go/src/github.com/hashicorp/nomad/nomad/leader.go:70
+0x269
Goroutine 73 (running) created at:
github.com/hashicorp/nomad/nomad.(*EvalBroker).SetEnabled()
/home/schmichael/go/src/github.com/hashicorp/nomad/nomad/eval_broker.go:170
+0x173
github.com/hashicorp/nomad/nomad.(*Server).establishLeadership()
/home/schmichael/go/src/github.com/hashicorp/nomad/nomad/leader.go:207
+0x355
github.com/hashicorp/nomad/nomad.(*Server).leaderLoop()
/home/schmichael/go/src/github.com/hashicorp/nomad/nomad/leader.go:117
+0x82e
github.com/hashicorp/nomad/nomad.(*Server).monitorLeadership.func1()
/home/schmichael/go/src/github.com/hashicorp/nomad/nomad/leader.go:72
+0x6c
==================
```
PeriodicDispatch.SetEnabled sets updateCh in one goroutine, and
PeriodicDispatch.run accesses updateCh in another.
The race can be prevented by having SetEnabled pass updateCh to run.
Race detector output from `go test -race -run TestServer_RPC` in nomad/
```
==================
WARNING: DATA RACE
Write at 0x00c0001d3f48 by goroutine 75:
github.com/hashicorp/nomad/nomad.(*PeriodicDispatch).SetEnabled()
/home/schmichael/go/src/github.com/hashicorp/nomad/nomad/periodic.go:468
+0x256
github.com/hashicorp/nomad/nomad.(*Server).revokeLeadership()
/home/schmichael/go/src/github.com/hashicorp/nomad/nomad/leader.go:724
+0x267
github.com/hashicorp/nomad/nomad.(*Server).leaderLoop.func1()
/home/schmichael/go/src/github.com/hashicorp/nomad/nomad/leader.go:131
+0x3c
github.com/hashicorp/nomad/nomad.(*Server).leaderLoop()
/home/schmichael/go/src/github.com/hashicorp/nomad/nomad/leader.go:163
+0x4dd
github.com/hashicorp/nomad/nomad.(*Server).monitorLeadership.func1()
/home/schmichael/go/src/github.com/hashicorp/nomad/nomad/leader.go:72
+0x6c
Previous read at 0x00c0001d3f48 by goroutine 515:
github.com/hashicorp/nomad/nomad.(*PeriodicDispatch).run()
/home/schmichael/go/src/github.com/hashicorp/nomad/nomad/periodic.go:338
+0x177
Goroutine 75 (running) created at:
github.com/hashicorp/nomad/nomad.(*Server).monitorLeadership()
/home/schmichael/go/src/github.com/hashicorp/nomad/nomad/leader.go:70
+0x269
Goroutine 515 (running) created at:
github.com/hashicorp/nomad/nomad.(*PeriodicDispatch).SetEnabled()
/home/schmichael/go/src/github.com/hashicorp/nomad/nomad/periodic.go:176
+0x1bc
github.com/hashicorp/nomad/nomad.(*Server).establishLeadership()
/home/schmichael/go/src/github.com/hashicorp/nomad/nomad/leader.go:231
+0x582
github.com/hashicorp/nomad/nomad.(*Server).leaderLoop()
/home/schmichael/go/src/github.com/hashicorp/nomad/nomad/leader.go:117
+0x82e
github.com/hashicorp/nomad/nomad.(*Server).monitorLeadership.func1()
/home/schmichael/go/src/github.com/hashicorp/nomad/nomad/leader.go:72
+0x6c
==================
```
IOPS have been modelled as a resource since Nomad 0.1 but has never
actually been detected and there is no plan in the short term to add
detection. This is because IOPS is a bit simplistic of a unit to define
the performance requirements from the underlying storage system. In its
current state it adds unnecessary confusion and can be removed without
impacting any users. This PR leaves IOPS defined at the jobspec parsing
level and in the api/ resources since these are the two public uses of
the field. These should be considered deprecated and only exist to allow
users to stop using them during the Nomad 0.9.x release. In the future,
there should be no expectation that the field will exist.
This PR fixes an edge case where we could GC an allocation that was in a
desired stop state but had not terminated yet. This can be hit if the
client hasn't shutdown the allocation yet or if the allocation is still
shutting down (long kill_timeout).
Fixes https://github.com/hashicorp/nomad/issues/4940
`currentExpiration` field is accessed in multiple goroutines: Stats and
renewal, so needs locking.
I don't anticipate high contention, so simple mutex suffices.
This PR introduces a device hook that retrieves the device mount
information for an allocation. It also updates the computed node class
computation to take into account devices.
TODO Fix the task runner unit test. The environment variable is being
lost even though it is being properly set in the prestart hook.
Keep attempting to renew Vault token past locally recorded expiry, just
in case the token was renewed out of band, e.g. on another Nomad server,
until Vault returns an unrecoverable error.
Seems like the stats field is a micro-optimization that doesn't justify
the complexity it introduces. Removing it and computing the stats from
revoking field directly.
Vault's RenewSelf(...) API may return (nil, nil). We failed to check if
secret was nil before attempting to use it.
RenewSelf:
e3eee5b4fb/api/auth_token.go (L138-L155)
Calls ParseSecret:
e3eee5b4fb/api/secret.go (L309-L311)
If anyone has an idea on how to test this I didn't see any options. We
use a real Vault service, so there's no opportunity to mock the
response.
This adds constraints for asserting that a given attribute or value
exists, or does not exist. This acts as a companion to =, or !=
operators, e.g:
```hcl
constraint {
attribute = "${attrs.type}"
operator = "!="
value = "database"
}
constraint {
attribute = "${attrs.type}"
operator = "is_set"
}
```
This test expects 11 repeats of the same message emitted at intervals of
200ms; so we need more than 2 seconds to adjust for time sleep
variations and the like. So raising it to 3s here that should be
enough.
Fixes https://github.com/hashicorp/nomad/issues/4299
Upon investigating this case further, we determined the issue to be a race between applying `JobBatchDeregisterRequest` fsm operation and processing job-deregister evals.
Processing job-deregister evals should wait until the FSM log message finishes applying, by using the snapshot index. However, with `JobBatchDeregister`, any single individual job deregistering was applied accidentally incremented the snapshot index and resulted into processing job-deregister evals. When a Nomad server receives an eval for a job in the batch that is yet to be deleted, we accidentally re-run it depending on the state of allocation.
This change ensures that we delete deregister all of the jobs and inserts all evals in a single transactions, thus blocking processing related evals until deregistering complete.
The old logic for cancelling duplicate blocked evaluations by job id had
the issue where the newer evaluation could have additional node classes
that it is (in)eligible for that we would not capture. This could make
it such that cluster state could change such that the job would make
progress but no evaluation was unblocked.
Fix an issue in which the deployment watcher would fail the deployment
based on the earliest progress deadline of the deployment regardless of
if the task group has finished.
Further fix an issue where the blocked eval optimization would make it
so no evals were created to progress the deployment. To reproduce this
issue, prior to this commit, you can create a job with two task groups.
The first group has count 1 and resources such that it can not be
placed. The second group has count 3, max_parallel=1, and can be placed.
Run this first and then update the second group to do a deployment. It
will place the first of three, but never progress since there exists a
blocked eval. However, that doesn't capture the fact that there are two
groups being deployed.
This commit implements an allocation selection algorithm for finding
allocations to preempt. It currently special cases network resource asks
from others (cpu/memory/disk/iops).
* Migrated all of the old leader task tests and got them passing
* Refactor and consolidate task killing code in AR to always kill leader
tasks first
* Fixed lots of issues with state restoring
* Fixed deadlock in AR.Destroy if AR.Run had never been called
* Added a new in memory statedb for testing
Denormalize jobs in AppendAllocs:
AppendAlloc was originally only ever called for inplace upgrades and new
allocations. Both these code paths would remove the job from the
allocation. Now we use this to also add fields such as FollowupEvalID
which did not normalize the job. This is only a performance enhancement.
Ignore terminal allocs:
Failed allocations are annotated with the followup Eval ID when one is
created to replace the failed allocation. However, in the plan applier,
when we check if allocations fit, these terminal allocations were not
filtered. This could result in the plan being rejected if the node would
be overcommited if the terminal allocations resources were considered.
Fixes three issues:
1. Retrieving the latest evaluation index was not properly selecting the
greatest index. This would undermine checks we had to reduce the number
of evaluations created when the latest eval index was greater than any
alloc change
2. Fix an issue where the blocking query code was using the incorrect
index such that the index was higher than necassary.
3. Special case handling of blocked evaluation since the create/snapshot
index is no particularly useful since they can be reblocked.
Multiplexer continues to create rpc connections even when
the context which is passed to the underlying rpc connections
is cancelled by the server.
This was causing #4413 - when a SIGHUP causes everything to reload,
it uses context to cancel the underlying http/rpc connections
so that they may come up with the new configuration.
The multiplexer was not being cancelled properly so it would
continue to create rpc connections and constantly fail,
causing communication issues with other nomad agents.
Fixes#4413