open-nomad

Author	SHA1	Message	Date
Mahmood Ali	fd54cfce6e	Revert the `requireCanary` check introduced in https://github.com/hashicorp/nomad/pull/8691/files#diff-1801138ac4d10f2064ba6f2e434ac9b4L430-R431 . The change was intended to fix a case where a canary alloc may fail to be rescheduled if all the other allocs fail as well (e.g. if all allocs happen to be placed on a node that died). However, it introduced some unintended side-effects. Reverting the change for now and will investigate further.	2020-09-10 14:59:02 -04:00
Mahmood Ali	2438b90334	Update scheduler/reconcile.go Co-authored-by: Chris Baker <1675087+cgbaker@users.noreply.github.com>	2020-08-25 17:37:19 -04:00
Mahmood Ali	38b61b97d8	simplify canary check `(alloc.DeploymentStatus == nil || !alloc.DeploymentStatus.IsCanary())` and `!alloc.DeploymentStatus.IsCanary()` are equivalent.	2020-08-25 17:37:19 -04:00
Mahmood Ali	8a342926b7	Respect alloc job version for lost/failed allocs This change fixes a bug where lost/failed allocations are replaced by allocations with the latest version, even if the version hasn't been promoted yet. Now, when generating a plan for lost/failed allocations, the scheduler first checks if the current deployment is in Canary stage, and if so, it ensures that any lost/failed allocation is replaced by one with the latest promoted version instead.	2020-08-19 09:52:48 -04:00
Tim Gross	1ca2c4ec2c	scheduler: DesiredCanaries can be set on every pass safely The reconcile loop sets `DeploymentState.DesiredCanaries` only on the first pass through the loop and if the job is not paused/pending. In MRD, deployments will make one pass through the loop while "pending", and were not ever getting `DesiredCanaries` set. We can't set it in the initial `DeploymentState` constructor because the first pass through setting up canaries expects it's not there yet. However, this value is static for a given version of a job because it's coming from the update stanza, so it's safe to re-assign the value on subsequent passes.	2020-07-20 11:25:53 -04:00
Tim Gross	d3341a2019	refactor: make it clear where we're accessing dstate The field name `Deployment.TaskGroups` contains a map of `DeploymentState`, which makes it a little harder to follow state updates when combined with inconsistent naming conventions, particularly when we also have the state store or actual `TaskGroup`s in scope. This changeset changes all uses to `dstate` so as not to be confused with actual TaskGroups.	2020-07-20 11:25:53 -04:00
Tim Gross	fe5f5e35aa	mrd: reconcile should treat pending deployments as paused (#8446) If a job update includes a task group that has no changes, those allocations have their version bumped in-place. This ends up triggering an eval from `deploymentwatcher` when it verifies their health. Although this eval is a no-op, we were only treating pending deployments the same as paused when the deployment was a new MRD. This means that any eval after the initial one will kick off the deployment, and that caused pending deployments to "jump the queue" and run ahead of schedule, breaking MRD invariants and resulting in a state with all regions blocked. This behavior can be replicated even in the case of job updates with no in-place updates by patching `deploymentwatcher` to inject a spurious no-op eval. This changeset fixes the behavior by treating pending deployments the same as paused in all cases in the reconciler.	2020-07-16 13:00:08 -04:00
Tim Gross	bd457343de	MRD: all regions should start pending (#8433) Deployments should wait until kicked off by `Job.Register` so that we can assert that all regions have a scheduled deployment before starting any region. This changeset includes the OSS fixes to support the ENT work. `IsMultiregionStarter` has no more callers in OSS, so remove it here.	2020-07-14 10:57:37 -04:00
Tim Gross	31185325c9	reconcile should not overwrite unblocking state (#8349) Pre-0.12.0 beta, a deployment was considered "complete" if it was successful. But with MRD we have "blocked" and "unblocking" states as well. We did not consider the case where a concurrent alloc health status update triggers a `Compute` call on a deployment that's moved from "blocked" to "unblocking" (it's a small window), which caused an extra pass thru the `nextRegion` logic in `deploymentwatcher` and triggered an error when later transitioning to "successful". This changeset makes sure we don't overwrite that status.	2020-07-06 11:31:33 -04:00
Tim Gross	d3ecb87984	multiregion: initial deploymentPaused must match start condition (#8215) In #8209 we fixed the `max_parallel` stanza for multiregion by introducing the `IsMultiregionStarter` check, but didn't apply it to the earlier place it's required. The result is that deployments start but don't place allocations.	2020-06-19 13:42:38 -04:00
Tim Gross	b654e1b8a4	multiregion: all regions start in running if no max_parallel (#8209) If `max_parallel` is not set, all regions should begin in a `running` state rather than a `pending` state. Otherwise the first region is set to `running` and then all the remaining regions once it enters `blocked`. That behavior is technically correct in that we have at most `max_parallel` regions running, but definitely not what a user expects.	2020-06-19 11:17:09 -04:00
Tim Gross	c14a75bfab	multiregion: use pending instead of paused The `paused` state is used as an operator safety mechanism, so that they can debug a deployment or halt one that's causing a wider failure. By using the `paused` state as the first state of a multiregion deployment, we risked resuming an intentionally operator-paused deployment because of activity in a peer region. This changeset replaces the use of the `paused` state with a `pending` state, and provides a `Deployment.Run` internal RPC to replace the use of the `Deployment.Pause` (resume) RPC we were using in `deploymentwatcher`.	2020-06-17 11:06:14 -04:00
Tim Gross	fd50b12ee2	multiregion: integrate with deploymentwatcher * `nextRegion` should take status parameter * thread Deployment/Job RPCs thru `nextRegion` * add `nextRegion` calls to `deploymentwatcher` * use a better description for paused for peer	2020-06-17 11:06:00 -04:00
Tim Gross	5c4d0a73f4	start all but first region deployment in paused state	2020-06-17 11:05:34 -04:00
Tim Gross	473a0f1d44	multiregion: unblock and cancel RPCs	2020-06-17 11:02:26 -04:00
Lang Martin	069840bef8	scheduler/reconcile: set FollowupEvalID on lost stop_after_client_disconnect (#8105) (#8138) * scheduler/reconcile: set FollowupEvalID on lost stop_after_client_disconnect * scheduler/reconcile: thread follupEvalIDs through to results.stop * scheduler/reconcile: comment typo * nomad/_test: correct arguments for plan.AppendStoppedAlloc * scheduler/reconcile: avoid nil, cleanup handleDelayed(Lost|Reschedules)	2020-06-09 17:13:53 -04:00
Lang Martin	ac7c39d3d3	Delayed evaluations for `stop_after_client_disconnect` can cause unwanted extra followup evaluations around job garbage collection (#8099) * client/heartbeatstop: reversed time condition for startup grace * scheduler/generic_sched: use `delayInstead` to avoid a loop Without protecting the loop that creates followUpEvals, a delayed eval is allowed to create an immediate subsequent delayed eval. For both `stop_after_client_disconnect` and the `reschedule` block, a delayed eval should always produce some immediate result (running or blocked) and then only after the outcome of that eval produce a second delayed eval. * scheduler/reconcile: lostLater are different than delayedReschedules Just slightly. `lostLater` allocs should be used to create batched evaluations, but `handleDelayedReschedules` assumes that the allocations are in the untainted set. When it creates the in-place updates to those allocations at the end, it causes the allocation to be treated as running over in the planner, which causes the initial `stop_after_client_disconnect` evaluation to be retried by the worker.	2020-06-03 09:48:38 -04:00
Lang Martin	d3c4700cd3	server: stop after client disconnect (#7939) * jobspec, api: add stop_after_client_disconnect * nomad/state/state_store: error message typo * structs: alloc methods to support stop_after_client_disconnect 1. a global AllocStates to track status changes with timestamps. We need this to track the time at which the alloc became lost originally. 2. ShouldClientStop() and WaitClientStop() to actually do the math * scheduler/reconcile_util: delayByStopAfterClientDisconnect * scheduler/reconcile: use delayByStopAfterClientDisconnect * scheduler/util: updateNonTerminalAllocsToLost comments This was set up to only update allocs to lost if the DesiredStatus had already been set by the scheduler. It seems like the intention was to update the status from any non-terminal state, and not all lost allocs have been marked stop or evict by now * scheduler/testing: AssertEvalStatus just use require * scheduler/generic_sched: don't create a blocked eval if delayed * scheduler/generic_sched_test: several scheduling cases	2020-05-13 16:39:04 -04:00
Jasmine Dahilig	4edebe389a	add default update stanza and max_parallel=0 disables deployments (#6191)	2019-09-02 10:30:09 -07:00
Mahmood Ali	faf643a375	Don't stop rescheduleLater allocations When an alloc is due to be rescheduleLater, it goes through the reconciler twice: once to be ignored with a follow-up eval, and once again when processing the follow-up eval, where it appears as rescheduleNow. Here, we ignore it in the first run and mark it as stopped in the second iteration, rather than stopping it twice.	2019-06-13 09:44:41 -04:00
Mahmood Ali	fd8fb8c22b	Stop allocs to be rescheduled Currently, when an alloc fails and is rescheduled, the alloc desired state remains as "run" and the nomad client may not free the resources. Here, we ensure that an alloc is marked as stopped when it's rescheduled. Notice the Desired Status and Description before and after this change: Before: ``` mars-2:nomad notnoop$ nomad alloc status 02aba49e ID = 02aba49e Eval ID = bb9ed1d2 Name = example-reschedule.nodes[0] Node ID = 5853d547 Node Name = mars-2.local Job ID = example-reschedule Job Version = 0 Client Status = failed Client Description = Failed tasks Desired Status = run Desired Description = <none> Created = 10s ago Modified = 5s ago Replacement Alloc ID = d6bf872b Task "payload" is "dead" Task Resources CPU Memory Disk Addresses 0/100 MHz 24 MiB/300 MiB 300 MiB Task Events: Started At = 2019-06-06T21:12:45Z Finished At = 2019-06-06T21:12:50Z Total Restarts = 0 Last Restart = N/A Recent Events: Time Type Description 2019-06-06T17:12:50-04:00 Not Restarting Policy allows no restarts 2019-06-06T17:12:50-04:00 Terminated Exit Code: 1 2019-06-06T17:12:45-04:00 Started Task started by client 2019-06-06T17:12:45-04:00 Task Setup Building Task Directory 2019-06-06T17:12:45-04:00 Received Task received by client ``` After: ``` ID = 5001ccd1 Eval ID = 53507a02 Name = example-reschedule.nodes[0] Node ID = a3b04364 Node Name = mars-2.local Job ID = example-reschedule Job Version = 0 Client Status = failed Client Description = Failed tasks Desired Status = stop Desired Description = alloc was rescheduled because it failed Created = 13s ago Modified = 3s ago Replacement Alloc ID = 7ba7ac20 Task "payload" is "dead" Task Resources CPU Memory Disk Addresses 21/100 MHz 24 MiB/300 MiB 300 MiB Task Events: Started At = 2019-06-06T21:22:50Z Finished At = 2019-06-06T21:22:55Z Total Restarts = 0 Last Restart = N/A Recent Events: Time Type Description 2019-06-06T17:22:55-04:00 Not Restarting Policy allows no restarts 
2019-06-06T17:22:55-04:00 Terminated Exit Code: 1 2019-06-06T17:22:50-04:00 Started Task started by client 2019-06-06T17:22:50-04:00 Task Setup Building Task Directory 2019-06-06T17:22:50-04:00 Received Task received by client ```	2019-06-06 17:27:12 -04:00
Lang Martin	34230577df	describe a pending deployment with auto_promote accurately	2019-05-22 12:32:08 -04:00
Lang Martin	d462639cc9	sched reconcile copy AutoPromote to DeploymentState	2019-05-22 12:32:08 -04:00
Preetha Appan	1574e898af	Fix bug in reconciler where terminal allocs on a job already stopped were unnecessarily updated	2018-10-08 21:03:49 -05:00
Alex Dadgar	3c19d01d7a	server	2018-09-15 16:23:13 -07:00
Alex Dadgar	3ba62efd5e	Failed/paused deployments do not block migrations This PR changes behavior of the scheduler such that a task group with a deployment that is failed or paused will not cause the scheduler to skip migrations. The reason for this change is that it causes a bad UX when draining nodes with allocations that are part of a failed/paused deployment. These operations should not be coupled in any way and this remedies that. Prior behavior was still correct, but required either jobs to transition to a healthy state or for the node to hit its drain deadline.	2018-09-10 15:28:45 -07:00
Preetha Appan	3e264dcb79	Fix reconciler bug with deployment not being created if job create index is different This fixes an issue where if a job is purged and resubmitted Nomad does not create a new deployment. Adds unit test that failed before this fix	2018-06-05 13:58:53 -05:00
Preetha Appan	cf44670d56	Make sure that task group has a deployment state before using it	2018-05-07 14:55:01 -05:00
Alex Dadgar	768fec8505	Allow healthy canary deployment to skip progress deadline	2018-05-07 14:55:01 -05:00
Alex Dadgar	8626c1b94a	Reschedule when we have canaries properly	2018-05-07 14:55:01 -05:00
Alex Dadgar	550f5e31f8	Allow canary count greater than desired	2018-05-07 14:50:01 -05:00
Preetha Appan	5329900f6d	Only use DesiredTransition.Reschedule in reconciler when it's an active deployment	2018-05-07 14:50:01 -05:00
Alex Dadgar	57969b4ee0	fix reconcile tests	2018-05-07 14:50:01 -05:00
Alex Dadgar	fcf4f582d0	small review feedback fixes	2018-05-07 14:50:01 -05:00
Alex Dadgar	1336002255	Progress deadline in deployment state	2018-05-07 14:50:01 -05:00
Alex Dadgar	ee50789c22	Initial implementation	2018-05-07 14:50:01 -05:00
Preetha Appan	a569d34f25	Add custom status description for rescheduling follow up evals, and make unit test robust	2018-04-10 15:30:15 -05:00
Preetha Appan	7e17bc231f	remove unnecessary check and other fixes from code review	2018-04-04 07:35:20 -05:00
Preetha Appan	00537c739b	Fixes edge cases around timing and task finish time being set more than once	2018-04-03 16:34:59 -05:00
Alex Dadgar	e106da84de	name and test	2018-03-26 11:06:21 -07:00
Alex Dadgar	e2a6e64fca	Don't create unnecessary deployments	2018-03-23 16:55:21 -07:00
Alex Dadgar	3b72dd94ba	Do not mark an allocation as an inplace update if specification hasn't changed	2018-03-23 14:36:05 -07:00
Michael Schurter	cb61a4bdc7	Fix linting errors	2018-03-21 16:51:45 -07:00
Alex Dadgar	92b636dd32	Fix deadline handling	2018-03-21 16:51:44 -07:00
Alex Dadgar	db4a634072	RPC, FSM, State Store for marking DesiredTransition fix build tag	2018-03-21 16:49:48 -07:00
Preetha Appan	56e60e5840	Fix linting warning	2018-03-14 16:12:22 -05:00
Preetha Appan	9fed0d2103	Get reschedule policy from the alloc directly	2018-03-14 16:10:32 -05:00
Preetha Appan	e2656ef546	Cleaner handling of batched evals	2018-03-14 16:10:32 -05:00
Preetha Appan	47e0280d96	More small review feedback	2018-03-14 16:10:32 -05:00
Preetha Appan	5373ade731	Scheduler and Reconciler changes to support delayed rescheduling	2018-03-14 16:10:32 -05:00

95 commits