open-nomad

Commit Graph

Author	SHA1	Message	Date
Jasmine Dahilig	8d980edd2e	add create and modify timestamps to evaluations (#5881)	2019-08-07 09:50:35 -07:00
Lang Martin	8f7a20839e	worker comment system -> core	2019-07-18 10:32:13 -04:00
Michael Schurter	e4bc943a68	nomad: SnapshotAfter -> SnapshotMinIndex Rename SnapshotAfter to SnapshotMinIndex. The old name was not technically accurate. SnapshotAtOrAfter is more accurate, but wordy and still lacks context about what precisely it is at or after (the index). SnapshotMinIndex was chosen as it describes the action (snapshot), a constraint (minimum), and the object of the constraint (index).	2019-06-24 12:16:46 -07:00
Michael Schurter	e10fea1d7a	nomad: include snapshot index when submitting plans Plan application should use a state snapshot at or after the Raft index at which the plan was created; otherwise it risks being rejected based on stale data. This commit adds a Plan.SnapshotIndex which is set by workers when submitting a plan. SnapshotIndex is set to the Raft index of the snapshot the worker used to generate the plan. Plan.SnapshotIndex plays a similar role to PlanResult.RefreshIndex. While RefreshIndex informs workers their StateStore is behind the leader's, SnapshotIndex is a way to prevent the leader from using a StateStore behind the worker's. Plan.SnapshotIndex should be considered the lower bound index for consistently handling plan application. Plans must also be committed serially, so Plan N+1 should use a state snapshot containing Plan N. This is guaranteed for plans after the first plan after a leader election. The Raft barrier on leader election ensures the leader's StateStore has caught up to the log index at which it was elected. This guarantees its StateStore is at an index > lastPlanIndex.	2019-06-24 12:16:46 -07:00
Michael Schurter	0e39927782	nomad: emit more detailed error Avoid returning context.DeadlineExceeded as it lacks helpful information and is often ignored or handled specially by callers.	2019-05-17 14:37:42 -07:00
Michael Schurter	9732bc37ff	nomad: refactor waitForIndex into SnapshotAfter Generalize wait for index logic in the state store for reuse elsewhere. Also begin plumbing in a context to combine handling of timeouts and shutdown.	2019-05-17 13:30:23 -07:00
Arshneet Singh	4cf4324b8f	Remove allowPlanOptimization from schedulers	2019-04-23 09:18:02 -07:00
Arshneet Singh	b977748a4b	Add code for plan normalization	2019-04-23 09:18:01 -07:00
Alex Dadgar	4bdccab550	goimports	2019-01-22 15:44:31 -08:00
Alex Dadgar	6d8bb3a7bd	Duplicate blocked evals cancelling improved The old logic for cancelling duplicate blocked evaluations by job id had the issue where the newer evaluation could have additional node classes that it is (in)eligible for that we would not capture. This could make it such that cluster state could change such that the job would make progress but no evaluation was unblocked.	2018-11-07 10:08:23 -08:00
Alex Dadgar	bd420692f3	fix logging	2018-09-25 10:49:55 -07:00
Preetha Appan	86e725e84c	Added logging around nacked evals in the scheduler worker	2018-09-25 10:49:02 -07:00
Alex Dadgar	99498da6ed	Denormalize jobs in plan and ignore resources of terminal allocs Denormalize jobs in AppendAllocs: AppendAlloc was originally only ever called for inplace upgrades and new allocations. Both these code paths would remove the job from the allocation. Now we use this to also add fields such as FollowupEvalID which did not normalize the job. This is only a performance enhancement. Ignore terminal allocs: Failed allocations are annotated with the followup Eval ID when one is created to replace the failed allocation. However, in the plan applier, when we check if allocations fit, these terminal allocations were not filtered. This could result in the plan being rejected if the node would be overcommitted if the terminal allocations' resources were considered.	2018-09-24 13:53:43 -07:00
Alex Dadgar	3c19d01d7a	server	2018-09-15 16:23:13 -07:00
Alex Dadgar	405dab2253	integration test and basic fixes	2018-03-21 16:51:44 -07:00
Josh Soref	b56ebdf350	spelling: invoke	2018-03-11 18:00:32 +00:00
Josh Soref	3232f6f27e	spelling: abandoned	2018-03-11 17:34:16 +00:00
Alex Dadgar	6911bd7676	Worker waits until max ModifyIndex across EvalsByJob This PR fixes a scheduling race condition in which the plan results from one invocation of the scheduler were not being considered by the next since the Worker was not waiting for the correct index. Fixes https://github.com/hashicorp/nomad/issues/3198	2017-09-14 14:28:43 -07:00
Alex Dadgar	84d06f6abe	Sync namespace changes	2017-09-07 17:04:21 -07:00
Luke Farnell	f0ced87b95	fixed all spelling mistakes for goreport	2017-08-07 17:13:05 -04:00
Alex Dadgar	b69b357c7f	Nomad builds	2017-02-07 20:31:23 -08:00
Alex Dadgar	bbe6e3d0b6	Larger delay on mismatch	2016-10-27 11:41:43 -07:00
Alex Dadgar	a1d08c2aba	Add scheduler version enforcement	2016-10-26 14:52:48 -07:00
Ben Barnard	83f647ed84	Replace "the the" with "the" in documentation and comments	2016-10-11 15:31:40 -04:00
Diptanu Choudhury	dabb83063b	Review comments	2016-07-25 17:26:38 -07:00
Diptanu Choudhury	7bafb7c675	Updating the job summary while mutating jobs and allocation objects	2016-07-25 17:26:38 -07:00
Alex Dadgar	3a8a27bcff	refresh index eval id in log	2016-06-22 13:48:41 -07:00
Alex Dadgar	8ceb7ead20	Do not use snapshot	2016-06-22 09:33:15 -07:00
Alex Dadgar	25decca3ca	Worker waitForIndex uses StateStore index, not Raft Applied Index	2016-06-22 09:04:22 -07:00
Alex Dadgar	bfdd5846e1	Track unblock indexes and check evals on block to see if they missed an update while in the scheduler	2016-05-24 20:10:56 -07:00
Alex Dadgar	15936822a4	Worker annotates evals with their snapshot index	2016-05-24 20:10:56 -07:00
Alex Dadgar	18d9e89065	Reuse the same evaluation and reblock it until there is no more work to do	2016-05-24 20:10:56 -07:00
Alex Dadgar	88ddfbed31	Revert "Debug messages around the plan and plan response" This reverts commit 7646657e6b8a892210779eaf5708341b94b29b24.	2016-02-22 22:24:52 -08:00
Alex Dadgar	fa8e2d31ee	Revert "err logs in worker and scheduler" This reverts commit 7befc586521b70eb84013bff367310e4cfa45c27.	2016-02-22 22:23:57 -08:00
Alex Dadgar	c2242552a1	Debug messages around the plan and plan response	2016-02-22 20:36:11 -08:00
Alex Dadgar	f48eabe753	err logs in worker and scheduler	2016-02-22 14:47:59 -08:00
Alex Dadgar	18d2d9c091	Killing a driver handle is retried with an exponential backoff	2016-02-16 21:00:49 -08:00
Armon Dadgar	df16cea2a4	nomad: worker supports create eval	2015-09-07 14:23:48 -07:00
Armon Dadgar	c7773feced	nomad: improve error messages at start for dev mode	2015-09-06 20:18:47 -07:00
Armon Dadgar	7e644b7cc9	nomad: use fast and slow exponential backoff in worker	2015-08-23 17:39:49 -07:00
Armon Dadgar	8c2bc337e6	nomad: adding ability to pause a worker	2015-08-23 10:52:31 -07:00
Armon Dadgar	cae67b7f60	nomad: expose UpdateEval as a planner	2015-08-15 14:25:00 -07:00
Armon Dadgar	8dfcb99e7f	nomad: rename SystemScheduler to CoreScheduler	2015-08-15 12:38:58 -07:00
Armon Dadgar	f7007bfeb5	nomad: avoid split-brain in plan processing due to leader transition or eval retry	2015-08-12 15:44:36 -07:00
Armon Dadgar	183a238481	nomad: avoid split-brain eval handling after leader transition	2015-08-12 15:25:31 -07:00
Armon Dadgar	c09da9fa51	nomad: measure invoke time per scheduler	2015-08-06 17:45:27 -07:00
Armon Dadgar	09a8c15d7e	scheduler: adding service scheduler definition	2015-08-06 17:25:14 -07:00
Armon Dadgar	8c2f8cddd0	nomad: create system scheduler as needed	2015-08-06 17:08:40 -07:00
Armon Dadgar	5f1ebb9274	nomad: adding special 'system' scheduler	2015-08-06 17:04:35 -07:00
Armon Dadgar	db78ce9a02	nomad: make worker more resilient to transient errors	2015-08-05 16:23:37 -07:00

56 Commits