Enterprise only.
Disable preemption for service and batch jobs by default.
Maintain backward compatibility in a x.y.Z release. Consider switching
the default for new clusters in the future.
Revert plan_apply.go changes from #5411
Since non-Command Raft messages do not update the StateStore index,
SnapshotAfter may unnecessarily block and needlessly fail in idle
clusters where the last Raft message is a non-Command message.
This is trivially reproducible with the dev agent and a job that has 2
tasks, 1 of which fails.
The correct logic would be to SnapshotAfter the previous plan's index to
ensure consistency. New clusters or newly elected leaders will not have
a previous plan, so the index the leader was elected should be used
instead.
Fix a case where `node.StatusUpdatedAt` was manipulated directly in
memory.
This ensures that StatusUpdatedAt is set in raft layer, and ensures that
the field is updated when node drain/eligibility is updated too.
Previous commit could introduce a deadlock if the capacityChangeCh was
full and the receiving side exited before freeing a slot for the sending
side could send. Flush would then block forever waiting to acquire the
lock just to throw the pending update away.
The race is around getting/setting the chan field, not chan operations,
so only lock around getting the chan field.
I assume the mutex was being released before sending on capacityChangeCh
to avoid blocking in the critical section, but:
1. This is race.
2. capacityChangeCh has a *huge* buffer (8096). If it's full things
already seem Very Bad, and a little backpressure seems appropriate.
This fixes a bug in the state store during plan apply. When
denormalizing preempted allocations it incorrectly set the preemptor's
job during the update. This eventually causes a panic downstream in the
client. Added a test assertion that failed before and passes after this fix
* master: (912 commits)
Update redirects.txt
Added redirect for Spark guide link
client: log when server list changes
docs: mention regression in task config validation
fix update to changelog
update CHANGELOG with datacenter config validation https://github.com/hashicorp/nomad/pull/5665
typo: "atleast" -> "at least"
implement nomad exec for rkt
docs: fixed typo
use pty/tty terminology similar to github.com/kr/pty
vendor github.com/kr/pty
drivers: implement streaming exec for executor based drivers
executors: implement streaming exec
executor: scaffolding for executor grpc handling
client: expose allocated memory per task
client improve a comment in updateNetworks
stalebot: Add 'thinking' as an exempt label (#5684)
Added Sparrow link
update links to use new canonical location
Add redirects for restructing done in GH-5667
...