Before this commit if a task had 2 checks cause restarts at the same
time, both would trigger restarts of the task! This change removes all
checks for a task whenever one of them is restarted.
@dadgar made the excellent observation in #3105 that TaskRunner removes
and re-registers checks on restarts. This means checkWatcher doesn't
need to do *any* internal restart tracking. Individual checks can just
remove themselves and be re-added when the task restarts.
This PR fixes:
* An issue in which a node-drain that contains a complete batch alloc
would cause a replacement
* An issue in which allocations with the same name during a scale
down/stop event wouldn't be properly stopped.
* An issue in which batch allocations from previous job versions may not
have been stopped properly.
Fixes https://github.com/hashicorp/nomad/issues/3210
This PR fixes a scheduling race condition in which the plan results from
one invocation of the scheduler were not being considered by the next
since the Worker was not waiting for the correct index.
Fixes https://github.com/hashicorp/nomad/issues/3198
This PR sorts the output of the endpoint since its results are used as
part of Consul checks to avoid the value changing unnecessarily.
Fixes https://github.com/hashicorp/nomad/issues/3211
This PR removes locking around commonly accessed node attributes that do
not need to be locked. The locking could cause nodes to TTL as the
heartbeat code path was acquiring a lock that could be held for an
excessively long time. An example of this is when Vault is inaccessible,
since the fingerprint is run with a lock held but the Vault
fingerprinter makes the API calls with a large timeout.
Fixes https://github.com/hashicorp/nomad/issues/2689