1de395b42c
Fix a bug where the scheduler may panic when preemption is enabled. The conditions are a bit complicated: a higher-priority job schedules multiple allocations that preempt multiple other allocations on the same node, due to port/network/device assignments. The cause of the bug is incidental mutation of internal cached data. `RankedNode` computes and caches proposed allocations in https://github.com/hashicorp/nomad/blob/v1.1.6/scheduler/rank.go#L42-L53 . But the scheduler then mutates the list to remove preemptible allocs in https://github.com/hashicorp/nomad/blob/v1.1.6/scheduler/rank.go#L293-L294, and `RemoveAllocs` mutates the cached slice and sets its tail to `nil`s, triggering a nil-pointer dereference. I fixed the issue by avoiding the mutation in `RemoveAllocs`; the micro-optimization there doesn't seem necessary. Fixes https://github.com/hashicorp/nomad/issues/11342 |
||
---|---|---|
annotate.go | ||
annotate_test.go | ||
context.go | ||
context_test.go | ||
device.go | ||
device_test.go | ||
feasible.go | ||
feasible_test.go | ||
generic_sched.go | ||
generic_sched_test.go | ||
preemption.go | ||
preemption_test.go | ||
propertyset.go | ||
rank.go | ||
rank_test.go | ||
reconcile.go | ||
reconcile_test.go | ||
reconcile_util.go | ||
reconcile_util_test.go | ||
scheduler.go | ||
scheduler_oss.go | ||
scheduler_sysbatch_test.go | ||
scheduler_system.go | ||
scheduler_system_test.go | ||
select.go | ||
select_test.go | ||
spread.go | ||
spread_test.go | ||
stack.go | ||
stack_not_ent.go | ||
stack_test.go | ||
testing.go | ||
util.go | ||
util_test.go |