Commit graph

105 commits

Author SHA1 Message Date
Tim Gross 4368dcc02f
fix deadlock in plan_apply (#13407)
The plan applier has to get a snapshot with a minimum index for the
plan it's working on in order to ensure consistency. Under heavy raft
loads, we can exceed the timeout. When this happens, we hit a bug
where the plan applier blocks waiting on the `indexCh` forever, and
all schedulers will block in `Plan.Submit`.

Closing the `indexCh` when the `asyncPlanWait` is done with it will
prevent the deadlock without impacting correctness of the previous
snapshot index.

This changeset includes a PoC failing test that works by injecting
a large timeout into the state store. We need to turn this into a test
we can run normally without breaking the state store before we can
merge this PR.

Increase `snapshotMinIndex` timeout to 10s.
This timeout creates backpressure: any concurrent `Plan.Submit`
RPCs will block waiting for results. This sheds load across all
servers and gives raft some CPU to catch up, because schedulers won't
dequeue more work while waiting. Increase it to 10s based on
observations of large production clusters.
2022-06-23 12:06:27 -04:00
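
A minimal Go sketch of the channel-closing pattern described in the commit above. The names `asyncPlanWait` and `indexCh` come from the commit message; the surrounding structure is an assumption, not Nomad's actual plan applier.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// asyncPlanWait applies a plan and reports the resulting raft index on
// indexCh. Closing the channel when done guarantees a receiver can never
// block forever, even when the apply times out and no index is sent.
func asyncPlanWait(ctx context.Context, indexCh chan<- uint64) {
	defer close(indexCh) // the fix: always unblock the reader

	select {
	case <-time.After(50 * time.Millisecond): // stand-in for the raft apply
		indexCh <- 42
	case <-ctx.Done():
		// Timed out: return without sending; close() still unblocks the reader.
	}
}

func main() {
	// The 10s timeout mirrors the increased snapshotMinIndex timeout, which
	// sheds load by holding concurrent Plan.Submit RPCs while raft catches up.
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	indexCh := make(chan uint64, 1)
	go asyncPlanWait(ctx, indexCh)

	// Before the fix, a timed-out asyncPlanWait left this receive blocked
	// forever; with the close, it returns (0, false) instead.
	if idx, ok := <-indexCh; ok {
		fmt.Println("previous plan applied at index", idx)
	} else {
		fmt.Println("plan wait aborted without an index")
	}
}
```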
James Rasell a05114fdac
core: add namespace to plan for node rejected log line. (#12868) 2022-05-05 10:56:40 +02:00
Derek Strickland 5e309f3f33
reconciler: Handle canaries when client disconnects (#12539)
* plan_apply: Allow node updates in disconnected node plans
* plan: Keep the job when persisting unknown allocs
* reconciler: stop unknown allocs when stopping all
* reconcile_util: reorder filtering to handle canaries; skip rescheduling unknown
* heartbeat: Fix bug in node heartbeating
2022-04-21 10:05:58 -04:00
Jorge Marey 96dd3f53c6
Fix in-place updates over ineligible nodes (#12264) 2022-04-06 11:30:40 -04:00
DerekStrickland da9bc350c8 evaluateNodePlan: validate plans for disconnected nodes 2022-04-05 17:12:22 -04:00
Luiz Aoqui b1753d0568
scheduler: detect and log unexpected scheduling collisions (#11793) 2022-01-14 20:09:14 -05:00
Luiz Aoqui 5be6710216
add /s/port-plan-failure redirect and link to it in the plan reject log message 2021-11-02 20:43:54 -04:00
Michael Schurter efe5714840 core: bump rejected plans from debug -> info
As we have continued to see reports of #9506, we need to elevate this log
line, as it is the only way to detect when plans are being *erroneously*
rejected.

Users who see this log line repeatedly should drain and restart the node
named in the log line. This seems to work around the issue.

Please post any details on #9506!
2021-10-31 12:51:42 -07:00
Tim Gross e9777a88ce plan applier: add trace-level log of plan
The plans generated by the scheduler produce high-level output of counts on each
evaluation, but when debugging scheduler issues it'd be nice to have a more
detailed view of the resulting plan. Emitting this log at trace minimizes the
overhead, and producing it in the plan applier makes it easier to find, as it
will always be on the leader.
2021-06-02 10:25:23 -04:00
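
For illustration, a hedged sketch of trace-gated logging with hashicorp/go-hclog (the logging library Nomad uses); the function name and log message here are assumptions, not the actual call site.

```go
package main

import "github.com/hashicorp/go-hclog"

// logPlanDetail emits the full plan only at trace level; the IsTrace guard
// keeps the cost of rendering the payload out of normal operation.
func logPlanDetail(logger hclog.Logger, planDump string) {
	if logger.IsTrace() {
		logger.Trace("plan for evaluation", "plan", planDump)
	}
}

func main() {
	logger := hclog.New(&hclog.LoggerOptions{Name: "nomad.planner", Level: hclog.Trace})
	logPlanDetail(logger, `{"NodeAllocation":{}}`)
}
```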
Chris Baker 770c9cecb5 restored Node.Sanitize() for RPC endpoints
multiple other updates from code review
2021-03-26 17:03:15 +00:00
Chris Baker dd291e69f4 removed deprecated fields from Drain structs and API
node drain: use msgtype on txn so that events are emitted
wip: encoding extension to add Node.Drain field back to API responses

new approach for hiding Node.SecretID in the API, using `json` tag
documented this approach in the contributing guide
refactored the JSON handlers with extensions
modified event stream encoding to use the go-msgpack encoders with the extensions
2021-03-21 15:30:11 +00:00
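
A small illustrative sketch of the `json` tag approach mentioned above for hiding Node.SecretID; this struct is a stand-in, not Nomad's Node type or its actual encoding extensions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Node is a stand-in type showing how a struct tag hides a sensitive field.
type Node struct {
	ID       string `json:"ID"`
	SecretID string `json:"-"` // never serialized into API responses
}

func main() {
	out, _ := json.Marshal(Node{ID: "node-1", SecretID: "s3cr3t"})
	fmt.Println(string(out)) // {"ID":"node-1"}
}
```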
Drew Bailey 9d48818eb8
writetxn can return error, add alloc and job generic events. Add events
table for durability
2020-10-14 12:44:39 -04:00
Drew Bailey 4793bb4e01
Events/deployment events (#9004)
* Node Drain events and Node Events (#8980)

Deployment status updates

handle deployment status updates (paused, failed, resume)

deployment alloc health

generate events from apply plan result

txn err check, slim down deployment event

one ndjson line per index

* consolidate down to node event + type

* fix UpdateDeploymentAllocHealth test invocations

* fix test
2020-10-14 12:44:37 -04:00
Lang Martin 069840bef8
scheduler/reconcile: set FollowupEvalID on lost stop_after_client_disconnect (#8105) (#8138)
* scheduler/reconcile: set FollowupEvalID on lost stop_after_client_disconnect

* scheduler/reconcile: thread followupEvalIDs through to results.stop

* scheduler/reconcile: comment typo

* nomad/_test: correct arguments for plan.AppendStoppedAlloc

* scheduler/reconcile: avoid nil, cleanup handleDelayed(Lost|Reschedules)
2020-06-09 17:13:53 -04:00
Lang Martin ac7c39d3d3
Delayed evaluations for stop_after_client_disconnect can cause unwanted extra followup evaluations around job garbage collection (#8099)
* client/heartbeatstop: reversed time condition for startup grace

* scheduler/generic_sched: use `delayInstead` to avoid a loop

Without protecting the loop that creates followUpEvals, a delayed eval
is allowed to create an immediate subsequent delayed eval. For both
`stop_after_client_disconnect` and the `reschedule` block, a delayed
eval should always produce some immediate result (running or blocked)
and then only after the outcome of that eval produce a second delayed
eval.

* scheduler/reconcile: lostLater are different than delayedReschedules

Just slightly. `lostLater` allocs should be used to create batched
evaluations, but `handleDelayedReschedules` assumes that the
allocations are in the untainted set. When it creates the in-place
updates to those allocations at the end, it causes the allocation to
be treated as running over in the planner, which causes the initial
`stop_after_client_disconnect` evaluation to be retried by the worker.
2020-06-03 09:48:38 -04:00
Jasmine Dahilig 8d980edd2e
add create and modify timestamps to evaluations (#5881) 2019-08-07 09:50:35 -07:00
Michael Schurter 69ba495f0c nomad: expand comments on subtle plan apply behaviors 2019-06-26 08:49:24 -07:00
Michael Schurter e4bc943a68 nomad: SnapshotAfter -> SnapshotMinIndex
Rename SnapshotAfter to SnapshotMinIndex. The old name was not
technically accurate. SnapshotAtOrAfter is more accurate, but wordy and
still lacks context about what precisely it is at or after (the index).

SnapshotMinIndex was chosen as it describes the action (snapshot), a
constraint (minimum), and the object of the constraint (index).
2019-06-24 12:16:46 -07:00
Michael Schurter 0f8164b2f1 nomad: evaluate plans after previous plan index
The previous commit prevented evaluating plans against a state snapshot
which is older than the snapshot at which the plan was created.  This is
correct and prevents failures trying to retrieve referenced objects that
may not exist until the plan's snapshot. However, this is insufficient
to guarantee consistency if the following events occur:

1. P1, P2, and P3 are enqueued with snapshot @ 100
2. Leader evaluates and applies Plan P1 with snapshot @ 100
3. Leader evaluates Plan P2 with snapshot+P1 @ 100
4. P1 commits @ 101
5. Leader evaluates and applies Plan P3 with snapshot+P2 @ 100

Since only the previous plan is optimistically applied to the state
store, the snapshot used to evaluate a plan may not contain the N-2
plan!

To ensure plans are evaluated and applied serially we must consider all
previous plans' committed indexes when evaluating further plans.

Therefore, combined with the last PR, the minimum index at which to
evaluate a plan is:

    max(previousPlanResultIndex, plan.SnapshotIndex)
2019-06-24 12:16:46 -07:00
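
A small sketch of the rule above, with illustrative names: the snapshot used to evaluate a plan must be at or after both the previous plan's committed index and the plan's own snapshot index, i.e. the larger of the two.

```go
package main

import "fmt"

// requiredSnapshotIndex is the lowest raft index the plan applier's snapshot
// must reach: it has to contain the previously applied plan as well as
// everything the scheduling worker saw when it built this plan.
func requiredSnapshotIndex(prevPlanResultIndex, planSnapshotIndex uint64) uint64 {
	if prevPlanResultIndex > planSnapshotIndex {
		return prevPlanResultIndex
	}
	return planSnapshotIndex
}

func main() {
	// In the scenario above, P1 committed at 101 while P3 was built from a
	// snapshot at 100, so the applier must wait for index 101.
	fmt.Println(requiredSnapshotIndex(101, 100)) // 101
}
```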
Michael Schurter e10fea1d7a nomad: include snapshot index when submitting plans
Plan application should use a state snapshot at or after the Raft index
at which the plan was created otherwise it risks being rejected based on
stale data.

This commit adds a Plan.SnapshotIndex which is set by workers when
submitting a plan. SnapshotIndex is set to the Raft index of the snapshot
the worker used to generate the plan.

Plan.SnapshotIndex plays a similar role to PlanResult.RefreshIndex.
While RefreshIndex informs workers their StateStore is behind the
leader's, SnapshotIndex is a way to prevent the leader from using a
StateStore behind the worker's.

Plan.SnapshotIndex should be considered the *lower bound* index for
consistently handling plan application.

Plans must also be committed serially, so Plan N+1 should use a state
snapshot containing Plan N. This is guaranteed for plans *after* the
first plan after a leader election.

The Raft barrier on leader election ensures the leader's statestore has
caught up to the log index at which it was elected. This guarantees its
StateStore is at an index > lastPlanIndex.
2019-06-24 12:16:46 -07:00
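
An illustrative sketch of the handshake described above; the types and helper below are stand-ins for Nomad's worker and state store, not the real structs.

```go
package main

import "fmt"

// Plan carries the raft index of the worker's snapshot so the leader never
// evaluates it against older state.
type Plan struct {
	SnapshotIndex uint64 // lower bound index for evaluating this plan
}

// workerState is a stand-in for the worker's state store.
type workerState struct{ latestIndex uint64 }

func (s *workerState) LatestIndex() uint64 { return s.latestIndex }

// submitPlan stamps the plan with the snapshot index used to generate it
// before handing it to the leader (Plan.Submit in Nomad's RPC layer).
func submitPlan(store *workerState, plan *Plan) {
	plan.SnapshotIndex = store.LatestIndex()
	fmt.Println("submitting plan with SnapshotIndex", plan.SnapshotIndex)
}

func main() {
	submitPlan(&workerState{latestIndex: 100}, &Plan{})
}
```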
Mahmood Ali 392f5bac44 Stop updating allocs.Job on stopping or preemption 2019-06-10 18:30:20 -04:00
Michael Schurter a8fc50cc1b nomad: revert use of SnapshotAfter in planApply
Revert plan_apply.go changes from #5411

Since non-Command Raft messages do not update the StateStore index,
SnapshotAfter may unnecessarily block and needlessly fail in idle
clusters where the last Raft message is a non-Command message.

This is trivially reproducible with the dev agent and a job that has 2
tasks, 1 of which fails.

The correct logic would be to SnapshotAfter the previous plan's index to
ensure consistency. New clusters or newly elected leaders will not have
a previous plan, so the index at which the leader was elected should be used
instead.
2019-06-03 15:34:21 -07:00
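
A tiny sketch of the fallback described above, using illustrative names: with no previous plan (a new cluster or a newly elected leader), the index at which the leader was elected stands in for the previous plan's index.

```go
package main

import "fmt"

// previousPlanIndex picks the index the next plan's snapshot must include.
func previousPlanIndex(prevPlanResultIndex, leaderElectionIndex uint64) uint64 {
	if prevPlanResultIndex == 0 {
		// Nothing applied since this leader took over; its election barrier
		// guarantees the state store has caught up to this index.
		return leaderElectionIndex
	}
	return prevPlanResultIndex
}

func main() {
	fmt.Println(previousPlanIndex(0, 87))   // newly elected leader: 87
	fmt.Println(previousPlanIndex(101, 87)) // steady state: 101
}
```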
Michael Schurter b80a7e0feb nomad: wait for state store to sync in plan apply
Wait for state store to catch up with raft when applying plans.
2019-05-17 14:37:12 -07:00
Arshneet Singh d4e7a5c005 Add comments to functions, and use require instead of assert 2019-04-23 09:57:21 -07:00
Arshneet Singh 0dd4c109e8 Compat tags 2019-04-23 09:18:01 -07:00
Arshneet Singh b977748a4b Add code for plan normalization 2019-04-23 09:18:01 -07:00
Alex Dadgar 4bdccab550 goimports 2019-01-22 15:44:31 -08:00
Alex Dadgar e3cbb2c82e allocs fit checks if devices get oversubscribed 2018-11-07 10:33:22 -08:00
Preetha Appan c33469157d
unit test plan apply with preemptions 2018-11-01 20:06:32 -05:00
Preetha Appan 1415032c13
More review comments 2018-10-30 11:06:32 -05:00
Preetha Appan 3190a2c29b
Fix linting 2018-10-30 11:06:32 -05:00
Preetha Appan 9e4a35fff0
Fix comment 2018-10-30 11:06:32 -05:00
Preetha Appan cc295b90de
Implement preemption for system jobs.
This commit implements an allocation selection algorithm for finding
allocations to preempt. It currently special-cases network resource asks,
handling them differently from the other resources (cpu/memory/disk/iops).
2018-10-30 11:06:32 -05:00
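
A toy sketch of the idea named above: pick lower-priority allocations to preempt until the resource ask is satisfied. Nomad's real selection algorithm is more involved (and treats network asks specially); this only illustrates priority-ordered selection with made-up types.

```go
package main

import (
	"fmt"
	"sort"
)

type alloc struct {
	ID       string
	Priority int
	MemoryMB int
}

// selectPreemptions returns the lowest-priority allocations whose combined
// memory covers the ask, or nil if preemption cannot satisfy it.
func selectPreemptions(running []alloc, neededMB int) []alloc {
	sort.Slice(running, func(i, j int) bool { return running[i].Priority < running[j].Priority })

	var chosen []alloc
	freed := 0
	for _, a := range running {
		if freed >= neededMB {
			break
		}
		chosen = append(chosen, a)
		freed += a.MemoryMB
	}
	if freed < neededMB {
		return nil
	}
	return chosen
}

func main() {
	running := []alloc{
		{ID: "a1", Priority: 50, MemoryMB: 256},
		{ID: "a2", Priority: 10, MemoryMB: 512},
		{ID: "a3", Priority: 20, MemoryMB: 256},
	}
	fmt.Println(selectPreemptions(running, 600)) // picks a2 then a3
}
```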
Alex Dadgar 3c19d01d7a server 2018-09-15 16:23:13 -07:00
Michael Schurter c0542474db drain: initial drainv2 structs and impl 2018-03-21 16:49:48 -07:00
Josh Soref 6aa0a05998 spelling: scheduling 2018-03-11 18:53:11 +00:00
Josh Soref 7da15e6ee5 spelling: minimum 2018-03-11 18:26:53 +00:00
Josh Soref a0452dd7f5 spelling: evaluate 2018-03-11 18:01:29 +00:00
Preetha Appan 2567b51c58
Edge trigger evaluation when allocations client status is failed 2018-01-31 09:56:53 -06:00
Preetha Appan 3c36abfe14
Update eval modify index as part of plan apply. 2017-12-18 10:03:55 -06:00
Preetha Appan d340c3adb1 Always set modify time on allocations, and other changes addressing review comments 2017-11-01 15:13:48 -05:00
Preetha Appan 39d70be009 Add ModifyTime to Allocation and update it both on plan applies and client initiated updates 2017-11-01 15:13:48 -05:00
Alex Dadgar c1cc51dbee sync 2017-10-13 14:36:02 -07:00
Alex Dadgar 4173834231 Enable more linters 2017-09-26 15:26:33 -07:00
Alex Dadgar f91312a49a Fix typo 2017-07-14 09:53:46 -07:00
Alex Dadgar 7a74080079 Log reason a plan gets rejected per node.
This PR adds a log explaining why a plan gets rejected. Should help
debugging.
2017-07-13 17:14:02 -07:00
Alex Dadgar 952a22fb02 plan apply tests 2017-07-07 12:10:04 -07:00
Alex Dadgar 477c713df5 Plan apply handles canaries and success is set via update 2017-07-07 12:10:04 -07:00
Alex Dadgar 09dfa2fc10 Rename CreateDeployments and remove cancelling behavior in state_store 2017-07-07 12:10:04 -07:00
Alex Dadgar b3f4db0930 cancel deployments 2017-07-07 12:01:17 -07:00