open-nomad/nomad
Mahmood Ali 1de395b42c
Fix preemption panic (#11346)
Fix a bug where the scheduler may panic when preemption is enabled. The conditions are a bit complicated:
A higher-priority job schedules multiple allocations that preempt multiple other allocations on the same node because of port/network/device assignments.

The cause of the bug is incidental mutation of internal cached data. `RankedNode` computes and caches proposed allocations in https://github.com/hashicorp/nomad/blob/v1.1.6/scheduler/rank.go#L42-L53 . But the scheduler then mutates that list to remove pre-emptable allocs in https://github.com/hashicorp/nomad/blob/v1.1.6/scheduler/rank.go#L293-L294, and `RemoveAllocs` mutates the cached slice, setting its tail to `nil`s and triggering a nil-pointer dereference.

I fixed the issue by avoiding the mutation in `RemoveAllocs` - the micro-optimization there doesn't seem necessary.
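
The failure mode can be illustrated with a minimal, self-contained sketch. This is not the actual Nomad code: `Alloc`, `removeInPlace`, `removeCopy`, and `countNil` are hypothetical names, and the in-place variant only approximates the kind of swap-and-nil filtering described above. It shows how filtering a slice in place leaves `nil` entries in a slice that another caller still holds, while a copying filter leaves the cached slice intact.

```go
package main

import "fmt"

// Alloc is a hypothetical stand-in for an allocation; only the ID matters here.
type Alloc struct{ ID string }

// removeInPlace filters allocs by swapping each removed entry with the last
// live entry and setting the vacated tail slot to nil. It reuses the caller's
// backing array, so any other holder of the same slice sees the nils.
func removeInPlace(allocs []*Alloc, remove map[string]bool) []*Alloc {
	n := len(allocs)
	for i := 0; i < n; i++ {
		if remove[allocs[i].ID] {
			allocs[i], allocs[n-1] = allocs[n-1], nil
			i-- // re-check the element that was swapped into position i
			n--
		}
	}
	return allocs[:n]
}

// removeCopy builds a fresh slice and never writes to the input.
func removeCopy(allocs []*Alloc, remove map[string]bool) []*Alloc {
	out := make([]*Alloc, 0, len(allocs))
	for _, a := range allocs {
		if !remove[a.ID] {
			out = append(out, a)
		}
	}
	return out
}

// countNil reports how many entries of allocs are nil.
func countNil(allocs []*Alloc) int {
	c := 0
	for _, a := range allocs {
		if a == nil {
			c++
		}
	}
	return c
}

func main() {
	// cached plays the role of a list that some other component has cached.
	cached := []*Alloc{{ID: "a"}, {ID: "b"}, {ID: "c"}}
	remove := map[string]bool{"b": true}

	kept := removeCopy(cached, remove)
	fmt.Println("copy:     kept", len(kept), "nil entries in cached:", countNil(cached))

	kept = removeInPlace(cached, remove)
	fmt.Println("in place: kept", len(kept), "nil entries in cached:", countNil(cached))
}
```

After the in-place call, `cached` still has length 3 but its last element is `nil`, so a later loop that reads `a.ID` over the cached slice would panic. The copying variant costs one extra allocation per call, which is the trade-off the fix accepts.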

Fixes https://github.com/hashicorp/nomad/issues/11342
2021-10-19 20:22:03 -04:00
deploymentwatcher gofmt all the files 2021-10-01 10:14:28 -04:00
drainer Node Drain Metadata (#10250) 2021-05-07 13:58:40 -04:00
mock chore: fix incorrect docstring formatting. 2021-08-30 11:08:12 +02:00
state gofmt all the files 2021-10-01 10:14:28 -04:00
stream events: fix wildcard namespace handling (#10935) 2021-09-02 09:36:55 -07:00
structs Fix preemption panic (#11346) 2021-10-19 20:22:03 -04:00
volumewatcher volumewatcher: fix test data race. 2021-06-14 12:11:35 +02:00
acl.go Audit config, seams for enterprise audit features 2020-03-23 13:47:42 -04:00
acl_endpoint.go nomad: only activate one-time auth tokens with 1.1.0 (#10952) 2021-07-27 13:17:55 -04:00
acl_endpoint_test.go one-time token: never return expired tokens 2021-03-10 08:17:56 -05:00
acl_test.go Event Stream: Track ACL changes, unsubscribe on invalidating changes (#9447) 2020-12-01 11:11:34 -05:00
alloc_endpoint.go updated Allocation.List to properly handle ACL checking for namespace=* 2020-11-05 17:26:33 +00:00
alloc_endpoint_test.go documenting test for #9268 2020-11-05 16:19:55 +00:00
autopilot.go implement MinQuorum 2020-02-16 16:04:59 -06:00
autopilot_test.go test: fix flaky TestAutopilot_CleanupDeadServer 2021-09-28 16:38:56 -07:00
blocked_evals.go Add metrics for blocked eval resources (#10454) 2021-04-29 15:03:45 -04:00
blocked_evals_stats.go Add metrics for blocked eval resources (#10454) 2021-04-29 15:03:45 -04:00
blocked_evals_stats_test.go Add metrics for blocked eval resources (#10454) 2021-04-29 15:03:45 -04:00
blocked_evals_system.go blocked_evals system evals indexed by job and node 2019-07-18 10:32:12 -04:00
blocked_evals_test.go Add metrics for blocked eval resources (#10454) 2021-04-29 15:03:45 -04:00
client_agent_endpoint.go json handles were moved to a new package in #10202 2021-04-02 13:31:10 +00:00
client_agent_endpoint_test.go debug: Improve namespace and region support (#11269) 2021-10-12 16:58:41 -04:00
client_alloc_endpoint.go Add gosimple linter (#9590) 2020-12-09 11:05:18 -08:00
client_alloc_endpoint_test.go tests: remove duplicate import statements. 2021-06-11 09:39:22 +02:00
client_csi_endpoint.go CSI: volume snapshot 2021-04-01 11:16:52 -04:00
client_csi_endpoint_test.go CSI: volume snapshot 2021-04-01 11:16:52 -04:00
client_fs_endpoint.go Add gosimple linter (#9590) 2020-12-09 11:05:18 -08:00
client_fs_endpoint_test.go Events/msgtype cleanup (#9117) 2020-10-19 09:30:15 -04:00
client_rpc.go fixup! vendor: explicit use of hashicorp/go-msgpack 2020-03-31 09:48:07 -04:00
client_rpc_test.go Simplify Bootstrap logic in tests 2020-03-02 13:47:43 -05:00
client_stats_endpoint.go server 2018-09-15 16:23:13 -07:00
client_stats_endpoint_test.go Events/msgtype cleanup (#9117) 2020-10-19 09:30:15 -04:00
config.go core: implement system batch scheduler 2021-08-03 10:30:47 -04:00
consul.go consul: correctly check consul acl token namespace when using consul oss 2021-06-08 13:55:57 -05:00
consul_oss_test.go gofmt all the files 2021-10-01 10:14:28 -04:00
consul_policy.go consul: correctly check consul acl token namespace when using consul oss 2021-06-08 13:55:57 -05:00
consul_policy_oss_test.go gofmt all the files 2021-10-01 10:14:28 -04:00
consul_policy_test.go gofmt all the files 2021-10-01 10:14:28 -04:00
consul_test.go consul: correctly check consul acl token namespace when using consul oss 2021-06-08 13:55:57 -05:00
core_sched.go core: implement system batch scheduler 2021-08-03 10:30:47 -04:00
core_sched_test.go CSI: capability block is required for volume registration 2021-04-08 13:02:24 -04:00
csi_endpoint.go CSI ListSnapshots secrets implementation 2021-07-28 11:30:29 -07:00
csi_endpoint_test.go CSI ListSnapshots secrets implementation 2021-07-28 11:30:29 -07:00
deployment_endpoint.go api: add field filters to /v1/{allocations,nodes} 2020-10-14 10:35:22 -07:00
deployment_endpoint_test.go Events/msgtype cleanup (#9117) 2020-10-19 09:30:15 -04:00
deployment_watcher_shims.go consul: plubming for specifying consul namespace in job/group 2021-04-05 10:03:19 -06:00
drainer_int_test.go Migrate all allocs when draining a node (#10411) 2021-04-21 12:11:14 -04:00
drainer_shims.go set node.StatusUpdatedAt in raft 2019-05-21 16:13:32 -04:00
endpoints_oss.go gofmt all the files 2021-10-01 10:14:28 -04:00
eval_broker.go test: fix fake by increasing time window 2021-09-28 12:22:59 -07:00
eval_broker_test.go test: fix fake by increasing time window 2021-09-28 12:22:59 -07:00
eval_endpoint.go chore: fix incorrect docstring formatting. 2021-08-30 11:08:12 +02:00
eval_endpoint_test.go Events/msgtype cleanup (#9117) 2020-10-19 09:30:15 -04:00
event_endpoint.go Event Stream: Track ACL changes, unsubscribe on invalidating changes (#9447) 2020-12-01 11:11:34 -05:00
event_endpoint_test.go events: fix event endpoint tests to ignore heartbeats. 2021-05-24 10:28:19 +02:00
fsm.go chore: fix incorrect docstring formatting. 2021-08-30 11:08:12 +02:00
fsm_not_ent.go gofmt all the files 2021-10-01 10:14:28 -04:00
fsm_registry_oss.go gofmt all the files 2021-10-01 10:14:28 -04:00
fsm_test.go events: fix wildcard namespace handling (#10935) 2021-09-02 09:36:55 -07:00
heartbeat.go goimports 2019-01-22 15:44:31 -08:00
heartbeat_test.go Events/msgtype cleanup (#9117) 2020-10-19 09:30:15 -04:00
job_endpoint.go Update idempotency comment to reflect all jobs 2021-07-07 15:54:56 -05:00
job_endpoint_hook_connect.go fix panic when Connect mesh gateway doesn't have a proxy block (#11257) 2021-10-04 15:52:07 -04:00
job_endpoint_hook_connect_test.go fix panic when Connect mesh gateway doesn't have a proxy block (#11257) 2021-10-04 15:52:07 -04:00
job_endpoint_hook_expose_check.go connect: use deterministic injected dynamic exposed port 2021-04-30 15:18:22 -06:00
job_endpoint_hook_expose_check_test.go connect: use deterministic injected dynamic exposed port 2021-04-30 15:18:22 -06:00
job_endpoint_hooks.go Allow configuring memory oversubscription (#10466) 2021-04-29 22:09:56 -04:00
job_endpoint_oss.go gofmt all the files 2021-10-01 10:14:28 -04:00
job_endpoint_oss_test.go gofmt all the files 2021-10-01 10:14:28 -04:00
job_endpoint_test.go Match idempotency key on all child jobs and return existing job when idempotency keys match. 2021-07-02 14:08:46 -05:00
leader.go nomad: only activate one-time auth tokens with 1.1.0 (#10952) 2021-07-27 13:17:55 -04:00
leader_oss.go gofmt all the files 2021-10-01 10:14:28 -04:00
leader_test.go tests: deflake TestLeader_LeftLeader 2021-08-18 14:19:00 -04:00
merge.go
namespace_endpoint.go Fix some errcheck errors (#9811) 2021-01-14 12:46:35 -08:00
namespace_endpoint_test.go core: open source namespaces 2020-10-22 15:26:32 -07:00
node_endpoint.go Consider all system jobs for a new node (#11054) 2021-08-18 09:50:37 -04:00
node_endpoint_test.go Consider all system jobs for a new node (#11054) 2021-08-18 09:50:37 -04:00
operator_endpoint.go minor tweaks from Ent 2020-07-20 09:25:09 -04:00
operator_endpoint_test.go Events/msgtype cleanup (#9117) 2020-10-19 09:30:15 -04:00
periodic.go periodic: always reset periodic children status 2021-03-25 11:27:09 -04:00
periodic_endpoint.go dispatch-job capability to dispatch periodic jobs 2020-10-27 16:33:01 -04:00
periodic_endpoint_test.go dispatch-job capability to dispatch periodic jobs 2020-10-27 16:33:01 -04:00
periodic_test.go periodic: always reset periodic children status 2021-03-25 11:27:09 -04:00
plan_apply.go plan applier: add trace-level log of plan 2021-06-02 10:25:23 -04:00
plan_apply_not_ent.go gofmt all the files 2021-10-01 10:14:28 -04:00
plan_apply_pool.go
plan_apply_pool_test.go Events/msgtype cleanup (#9117) 2020-10-19 09:30:15 -04:00
plan_apply_test.go reworked Node.Canonicalize() to enforce invariants, fixed a broken test 2021-03-26 18:58:38 +00:00
plan_endpoint.go goimports 2019-01-22 15:44:31 -08:00
plan_endpoint_test.go tests: swap lib/freeport for tweaked helper/freeport 2019-12-09 08:37:32 -06:00
plan_normalization_test.go allow configuration of Docker hostnames in bridge mode (#11173) 2021-09-16 08:13:09 +02:00
plan_queue.go nomad: refactor waitForIndex into SnapshotAfter 2019-05-17 13:30:23 -07:00
plan_queue_test.go nomad: fix test goroutine (#6593) 2019-10-31 08:23:32 -04:00
raft_rpc.go Refactor 2018-02-15 13:59:00 -08:00
regions_endpoint.go server 2018-09-15 16:23:13 -07:00
regions_endpoint_test.go tests: swap lib/freeport for tweaked helper/freeport 2019-12-09 08:37:32 -06:00
rpc.go update tests to make an actual RaftRPC 2021-08-27 10:37:30 -04:00
rpc_test.go update tests to make an actual RaftRPC 2021-08-27 10:37:30 -04:00
scaling_endpoint.go fix #9227: use both job and type query on scaling policy list endpoint 2020-11-10 23:26:35 +00:00
scaling_endpoint_test.go simple test to ensure that scaling endpoint methods support IsRead for 2021-01-05 13:42:18 +00:00
search_endpoint.go api: include ent fuzzy struct types in oss 2021-04-20 11:19:38 -06:00
search_endpoint_oss.go gofmt all the files 2021-10-01 10:14:28 -04:00
search_endpoint_test.go api: fuzzy search results include job name with id in scope 2021-04-16 17:03:36 -06:00
serf.go tweak bootstrap testing 2021-01-04 09:00:40 -05:00
serf_test.go deflake TestNomad_BootstrapExpect and other leader tests 2021-06-10 22:04:10 -04:00
server.go corrected peersInfoContent - was copied from Consul and not updated for Nomad (#11109) 2021-09-01 08:30:49 +02:00
server_setup_oss.go gofmt all the files 2021-10-01 10:14:28 -04:00
server_test.go Simplify Bootstrap logic in tests 2020-03-02 13:47:43 -05:00
stats_fetcher.go server 2018-09-15 16:23:13 -07:00
stats_fetcher_test.go Simplify Bootstrap logic in tests 2020-03-02 13:47:43 -05:00
status_endpoint.go chore: fix incorrect docstring formatting. 2021-08-30 11:08:12 +02:00
status_endpoint_test.go tests: swap lib/freeport for tweaked helper/freeport 2019-12-09 08:37:32 -06:00
system_endpoint.go chore: fix incorrect docstring formatting. 2021-08-30 11:08:12 +02:00
system_endpoint_test.go Events/msgtype cleanup (#9117) 2020-10-19 09:30:15 -04:00
testing.go tests: deflake TestMonitor_Monitor_RemoteServer and cross-region tests 2021-06-10 21:27:55 -04:00
testing_oss.go gofmt all the files 2021-10-01 10:14:28 -04:00
timetable.go vendor: explicit use of hashicorp/go-msgpack 2020-03-31 09:45:21 -04:00
timetable_test.go fixup! vendor: explicit use of hashicorp/go-msgpack 2020-03-31 09:48:07 -04:00
util.go csi: client RPCs should return wrapped errors for checking (#8605) 2020-08-07 11:01:36 -04:00
util_test.go remove unused dropButLastChannel 2020-02-13 18:56:53 -05:00
vault.go Fix some errcheck errors (#9811) 2021-01-14 12:46:35 -08:00
vault_test.go Merge pull request #8524 from hashicorp/b-vault-health-checks 2020-08-11 16:01:07 -04:00
vault_testing.go chore: fix incorrect docstring formatting. 2021-08-30 11:08:12 +02:00
worker.go add create and modify timestamps to evaluations (#5881) 2019-08-07 09:50:35 -07:00
worker_test.go Events/msgtype cleanup (#9117) 2020-10-19 09:30:15 -04:00