open-nomad/scheduler
Luiz Aoqui 8f91be26ab
scheduler: create placements for non-register MRD (#15325)
* scheduler: create placements for non-register MRD

For multiregion jobs, the scheduler does not create placements on
registration because the deployment must wait for the other regions.
One of these regions will then trigger the deployment to run.

Currently, this is done in the scheduler by considering any eval for a
multiregion job as "paused" since it's expected that another region will
eventually unpause it.

This becomes a problem when an eval is not triggered by a job
registration, such as on a node update. These regional changes do not
have other regions waiting to progress the deployment, so they never
resulted in placements.

The fix is to create a deployment at job registration time. This
additional piece of state allows the scheduler to differentiate between
a multiregion change, where other regions are engaged in the deployment
so no placements are required, and a regional change, where the
scheduler does need to create placements.
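
The distinction between a multiregion change and a regional change can be
sketched roughly as follows. This is a minimal illustration, not Nomad's
reconciler code: the `job` and `deployment` types, the `needsPlacements`
helper, and the idea that a missing or already-running deployment means
"regional change" are all assumptions made for the example.

```go
package main

import "fmt"

// Hypothetical, simplified types; Nomad's real structs differ.
type job struct {
	Multiregion bool
}

type deployment struct {
	Status string // assumed values: "initializing", "pending", "running"
}

// needsPlacements reports whether the scheduler should create placements
// for an eval. It is an assumption-based sketch of the rule described in
// the commit message, not the actual implementation.
func needsPlacements(j job, d *deployment) bool {
	if !j.Multiregion {
		// Single-region jobs are never held back by other regions.
		return true
	}
	if d == nil {
		// No deployment is tracking a multiregion rollout, so this is
		// treated as a regional change (for example a node update).
		return true
	}
	switch d.Status {
	case "initializing", "pending":
		// Other regions are engaged in the deployment; wait for them.
		return false
	default:
		return true
	}
}

func main() {
	mr := job{Multiregion: true}
	fmt.Println(needsPlacements(mr, &deployment{Status: "initializing"}))
	fmt.Println(needsPlacements(mr, nil))
}
```

The point of the sketch is that the decision keys off deployment state
rather than off the job being multiregion alone.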

This deployment starts in the new "initializing" status to signal to the
scheduler that it needs to compute the initial deployment state. The
multiregion deployment will wait until this deployment state is
persisted and its status is set to "pending". Without this state
transition it's possible to hit a race condition where the plan applier
and the deployment watcher may step on each other and overwrite each
other's changes.
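
The "initializing" to "pending" hand-off described above can be pictured
with a small state-machine sketch. Everything here is hypothetical: the
`status` constants, the `deployment` struct, and the `persistInitialState`
and `watcherMayAct` helpers only illustrate the ordering the commit message
describes and are not Nomad APIs.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical status values mirroring the names in the commit message.
type status string

const (
	statusInitializing status = "initializing"
	statusPending      status = "pending"
)

// deployment is a simplified stand-in for the real deployment record.
type deployment struct {
	Status         status
	StatePersisted bool
}

// persistInitialState models the step where the computed initial
// deployment state is written and the status moves to "pending".
// It refuses to run twice so a later writer cannot clobber the state.
func persistInitialState(d *deployment) error {
	if d.Status != statusInitializing {
		return errors.New("deployment is not initializing")
	}
	d.StatePersisted = true
	d.Status = statusPending
	return nil
}

// watcherMayAct models the multiregion side waiting until the initial
// state is persisted and the status has left "initializing".
func watcherMayAct(d deployment) bool {
	return d.StatePersisted && d.Status == statusPending
}

func main() {
	d := deployment{Status: statusInitializing}
	fmt.Println("watcher may act before persist:", watcherMayAct(d))
	if err := persistInitialState(&d); err != nil {
		fmt.Println("error:", err)
	}
	fmt.Println("watcher may act after persist:", watcherMayAct(d))
}
```

Gating the watcher on the explicit status change is what keeps the two
writers from touching the deployment at the same time in this sketch.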

* changelog: add entry for #15325
2022-11-25 12:45:34 -05:00
..
benchmarks
annotate.go
annotate_test.go
context.go
context_test.go
device.go
device_test.go
feasible.go
feasible_test.go
generic_sched.go scheduler: log stack in case of panic (#15303) 2022-11-17 18:59:33 -05:00
generic_sched_test.go
preemption.go
preemption_test.go
propertyset.go
rank.go
rank_test.go
reconcile.go scheduler: create placements for non-register MRD (#15325) 2022-11-25 12:45:34 -05:00
reconcile_test.go scheduler: create placements for non-register MRD (#15325) 2022-11-25 12:45:34 -05:00
reconcile_util.go Update alloc after reconnect and enforce client heartbeat order (#15068) 2022-11-04 16:25:11 -04:00
reconcile_util_test.go Update alloc after reconnect and enforce client heartbeat order (#15068) 2022-11-04 16:25:11 -04:00
scheduler.go
scheduler_oss.go
scheduler_sysbatch_test.go
scheduler_system.go scheduler: log stack in case of panic (#15303) 2022-11-17 18:59:33 -05:00
scheduler_system_test.go scheduler: set job on system stack for CSI feasibility check (#15372) 2022-11-23 16:47:35 -05:00
select.go
select_test.go
spread.go
spread_test.go
stack.go scheduler: set job on system stack for CSI feasibility check (#15372) 2022-11-23 16:47:35 -05:00
stack_oss.go
stack_test.go
testing.go
util.go Update alloc after reconnect and enforce client heartbeat order (#15068) 2022-11-04 16:25:11 -04:00
util_test.go Update alloc after reconnect and enforce client heartbeat order (#15068) 2022-11-04 16:25:11 -04:00