open-nomad/scheduler
Tim Gross fe5f5e35aa
mrd: reconcile should treat pending deployments as paused (#8446)
If a job update includes a task group that has no changes, those allocations
have their version bumped in-place. This ends up triggering an eval from
`deploymentwatcher` when it verifies their health. Although this eval is a
no-op, we were only treating pending deployments the same as paused when
the deployment was a new MRD. This meant that any eval after the initial one
would kick off the deployment, which caused pending deployments to "jump
the queue" and run ahead of schedule, breaking MRD invariants and resulting in
a state with all regions blocked.

This behavior can be replicated even in the case of job updates with no
in-place updates by patching `deploymentwatcher` to inject a spurious no-op
eval. This changeset fixes the behavior by treating pending deployments the
same as paused in all cases in the reconciler.
2020-07-16 13:00:08 -04:00
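
The rule described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code in reconcile.go: the `Deployment` type, the `DeploymentStatus` constants, and the `treatAsPaused` helper are hypothetical names introduced here, and the real reconciler works with the scheduler's own structs. The only point is that the "pending" check applies on every eval, not just the first one for a new MRD.

```go
package main

import "fmt"

// DeploymentStatus is a hypothetical stand-in for the deployment
// status values the reconciler inspects.
type DeploymentStatus string

const (
	StatusRunning DeploymentStatus = "running"
	StatusPaused  DeploymentStatus = "paused"
	StatusPending DeploymentStatus = "pending"
)

// Deployment is a hypothetical, trimmed-down deployment record.
type Deployment struct {
	Status DeploymentStatus
}

// treatAsPaused reports whether the reconciler should hold back
// deployment progress for d. A pending deployment is handled the
// same as a paused one in all cases, so a spurious no-op eval
// cannot start it ahead of schedule.
func treatAsPaused(d *Deployment) bool {
	if d == nil {
		// No deployment exists yet, so there is nothing to pause.
		return false
	}
	return d.Status == StatusPaused || d.Status == StatusPending
}

func main() {
	statuses := []DeploymentStatus{StatusRunning, StatusPaused, StatusPending}
	for _, s := range statuses {
		d := &Deployment{Status: s}
		fmt.Printf("status=%s treatAsPaused=%t\n", s, treatAsPaused(d))
	}
}
```

Keeping the check as a pure function of the deployment status, with no dependence on whether the eval is the first one, is what makes repeated or spurious evals harmless in this sketch.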
annotate.go Show canaries on plan 2017-07-07 12:03:11 -07:00
annotate_test.go Deprecate IOPS 2018-12-06 15:09:26 -08:00
context.go scheduler: fix panic when preempting and evicting 2019-12-02 20:22:22 -08:00
context_test.go scheduler: update tests with modern error helper 2019-12-02 20:25:52 -08:00
device.go Change types of weights on spread/affinity 2019-01-30 12:20:38 -08:00
device_test.go Change types of weights on spread/affinity 2019-01-30 12:20:38 -08:00
feasible.go scheduler: do network feasibility checking for system jobs (#8256) 2020-06-24 16:01:00 -04:00
feasible_test.go multi-interface network support 2020-06-19 09:42:10 -04:00
generic_sched.go this is OSS 2020-06-22 10:28:45 -04:00
generic_sched_test.go fix test failures from rebase 2020-06-18 11:05:32 -07:00
preemption.go Only preempt for network when there is a network 2019-06-07 18:55:55 -04:00
preemption_test.go tests and some clean up 2020-05-01 13:13:30 -04:00
propertyset.go server 2018-09-15 16:23:13 -07:00
rank.go ar: support opting into binding host ports to default network IP (#8321) 2020-07-06 18:51:46 -04:00
rank_test.go multi-interface network support 2020-06-19 09:42:10 -04:00
reconcile.go mrd: reconcile should treat pending deployments as paused (#8446) 2020-07-16 13:00:08 -04:00
reconcile_test.go add default update stanza and max_parallel=0 disables deployments (#6191) 2019-09-02 10:30:09 -07:00
reconcile_util.go scheduler/reconcile: set FollowupEvalID on lost stop_after_client_disconnect (#8105) (#8138) 2020-06-09 17:13:53 -04:00
reconcile_util_test.go Fix linting 2018-03-28 12:26:28 -05:00
scheduler.go csi: use node MaxVolumes during scheduling (#7565) 2020-03-31 17:16:47 -04:00
scheduler_oss.go include pro tag in serveral oss.go files 2020-02-10 15:56:14 -05:00
select.go Implement affinity support in generic scheduler 2018-09-04 16:10:11 -05:00
select_test.go Implement affinity support in generic scheduler 2018-09-04 16:10:11 -05:00
spread.go More error->debug for logging in the bin packing iterator 2019-12-12 15:50:16 -06:00
spread_test.go Fix comment and assert score in test case 2019-05-15 12:35:57 -05:00
stack.go scheduler: do network feasibility checking for system jobs (#8256) 2020-06-24 16:01:00 -04:00
stack_not_ent.go sync 2017-10-13 14:36:02 -07:00
stack_test.go csi: fix index maintenance for CSIVolume and CSIPlugin tables (#7049) 2020-03-23 13:58:29 -04:00
system_sched.go scheduler/reconcile: set FollowupEvalID on lost stop_after_client_disconnect (#8105) (#8138) 2020-06-09 17:13:53 -04:00
system_sched_test.go scheduler: do network feasibility checking for system jobs (#8256) 2020-06-24 16:01:00 -04:00
testing.go server: stop after client disconnect (#7939) 2020-05-13 16:39:04 -04:00
util.go Task DNS Options (#7661) 2020-06-18 11:01:31 -07:00
util_test.go Detect network mode change 2020-03-21 16:51:10 -04:00