Commit graph

33 commits

Author SHA1 Message Date
Mahmood Ali 8a342926b7 Respect alloc job version for lost/failed allocs
This change fixes a bug where lost/failed allocations are replaced by
allocations with the latest versions, even if the version hasn't been
promoted yet.

Now, when generating a plan for lost/failed allocations, the scheduler
first checks if the current deployment is in Canary stage, and if so, it
ensures that any lost/failed allocations is replaced one with the latest
promoted version instead.
2020-08-19 09:52:48 -04:00
Lang Martin e03c328792
csi: use node MaxVolumes during scheduling (#7565)
* nomad/state/state_store: CSIVolumesByNodeID ignores namespace

* scheduler/scheduler: add CSIVolumesByNodeID to the state interface

* scheduler/feasible: check node MaxVolumes

* nomad/csi_endpoint: no namespace inn CSIVolumesByNodeID anymore

* nomad/state/state_store: avoid DenormalizeAllocationSlice

* nomad/state/iterator: clean up SliceIterator Next

* scheduler/feasible_test: block with MaxVolumes

* nomad/state/state_store_test: fix args to CSIVolumesByNodeID
2020-03-31 17:16:47 -04:00
Lang Martin d994990ef0
csi: the scheduler allows a job with a volume write claim to be updated (#7438)
* nomad/structs/csi: split CanWrite into health, in use

* scheduler/scheduler: expose AllocByID in the state interface

* nomad/state/state_store_test

* scheduler/stack: SetJobID on the matcher

* scheduler/feasible: when a volume writer is in use, check if it's us

* scheduler/feasible: remove SetJob

* nomad/state/state_store: denormalize allocs before Claim

* nomad/structs/csi: return errors on claim, with context

* nomad/csi_endpoint_test: new alloc doesn't look like an update

* nomad/state/state_store_test: change test reference to CanWrite
2020-03-23 21:21:04 -04:00
Lang Martin 3621df1dbf csi: volume ids are only unique per namespace (#7358)
* nomad/state/schema: use the namespace compound index

* scheduler/scheduler: CSIVolumeByID interface signature namespace

* scheduler/stack: SetJob on CSIVolumeChecker to capture namespace

* scheduler/feasible: pass the captured namespace to CSIVolumeByID

* nomad/state/state_store: use namespace in csi_volume index

* nomad/fsm: pass namespace to CSIVolumeDeregister & Claim

* nomad/core_sched: pass the namespace in volumeClaimReap

* nomad/node_endpoint_test: namespaces in Claim testing

* nomad/csi_endpoint: pass RequestNamespace to state.*

* nomad/csi_endpoint_test: appropriately failed test

* command/alloc_status_test: appropriately failed test

* node_endpoint_test: avoid notTheNamespace for the job

* scheduler/feasible_test: call SetJob to capture the namespace

* nomad/csi_endpoint: ACL check the req namespace, query by namespace

* nomad/state/state_store: remove deregister namespace check

* nomad/state/state_store: remove unused CSIVolumes

* scheduler/feasible: CSIVolumeChecker SetJob -> SetNamespace

* nomad/csi_endpoint: ACL check

* nomad/state/state_store_test: remove call to state.CSIVolumes

* nomad/core_sched_test: job namespace match so claim gc works
2020-03-23 13:59:25 -04:00
Lang Martin a0a6766740 CSI: Scheduler knows about CSI constraints and availability (#6995)
* structs: piggyback csi volumes on host volumes for job specs

* state_store: CSIVolumeByID always includes plugins, matches usecase

* scheduler/feasible: csi volume checker

* scheduler/stack: add csi volumes

* contributing: update rpc checklist

* scheduler: add volumes to State interface

* scheduler/feasible: introduce new checker collection tgAvailable

* scheduler/stack: taskGroupCSIVolumes checker is transient

* state_store CSIVolumeDenormalizePlugins comment clarity

* structs: remote TODO comment in TaskGroup Validate

* scheduler/feasible: CSIVolumeChecker hasPlugins improve comment

* scheduler/feasible_test: set t.Parallel

* Update nomad/state/state_store.go

Co-Authored-By: Danielle <dani@hashicorp.com>

* Update scheduler/feasible.go

Co-Authored-By: Danielle <dani@hashicorp.com>

* structs: lift ControllerRequired to each volume

* state_store: store plug.ControllerRequired, use it for volume health

* feasible: csi match fast path remove stale host volume copied logic

* scheduler/feasible: improve comments

Co-authored-by: Danielle <dani@builds.terrible.systems>
2020-03-23 13:58:29 -04:00
Alex Dadgar 4bdccab550 goimports 2019-01-22 15:44:31 -08:00
Preetha Appan 25a047267f
Use scheduler config from state store to enable/disable preemption 2018-10-30 11:06:32 -05:00
Alex Dadgar 3c19d01d7a server 2018-09-15 16:23:13 -07:00
Josh Soref ca4ceb0e5c spelling: commits 2018-03-11 17:47:45 +00:00
Alex Dadgar c1cc51dbee sync 2017-10-13 14:36:02 -07:00
Alex Dadgar 84d06f6abe Sync namespace changes 2017-09-07 17:04:21 -07:00
Alex Dadgar b3f4db0930 cancel deployments 2017-07-07 12:01:17 -07:00
Alex Dadgar b69b357c7f Nomad builds 2017-02-07 20:31:23 -08:00
Diptanu Choudhury 5191b4d33a Making the status command return the allocs of currently registered job 2016-11-24 16:31:30 +01:00
Alex Dadgar a1d08c2aba Add scheduler version enforcement 2016-10-26 14:52:48 -07:00
Alex Dadgar 18d9e89065 Reuse the same evaluation and reblock it until there is no more work to do 2016-05-24 20:10:56 -07:00
Armon Dadgar 35741fcedd scheduler: Use AllocsByNodeTerminal to avoid filtering 2016-02-20 11:29:15 -08:00
Alex Dadgar 25c2c83d6f Revert "dont hard code scheduler type name"
This reverts commit fb0e0dfc2be596ad435a00a2e131a2000146b1f1.
2015-10-23 16:32:45 -07:00
Alex Dadgar 5b0a9b2469 dont hard code scheduler type name 2015-10-23 16:31:45 -07:00
Alex Dadgar 494244ed06 System scheduler and system stack 2015-10-14 18:39:44 -07:00
Armon Dadgar e7c0bb5e1e scheduler: Adding CreateEval to Planner 2015-09-07 14:26:29 -07:00
Armon Dadgar 8bedd3769c nomad: unifying the state store API 2015-09-06 20:56:38 -07:00
Armon Dadgar cae67b7f60 nomad: expose UpdateEval as a planner 2015-08-15 14:25:00 -07:00
Armon Dadgar 3f16a21658 nomad: remove NodesByDatacenterStatus 2015-08-15 13:11:42 -07:00
Armon Dadgar ddfe28e694 scheduler: ServiceScheduler is now GenericScheduler with service and batch modes 2015-08-13 22:28:37 -07:00
Armon Dadgar a077605114 nomad: adding NodesByDatacenterStatus 2015-08-13 13:17:03 -07:00
Armon Dadgar df21ab3d10 scheduler: working on bin pack 2015-08-13 11:54:59 -07:00
Armon Dadgar fb0c230f3b scheduler: working on job updates 2015-08-11 16:41:48 -07:00
Armon Dadgar dbb631d0b2 scheduler: check node status before evicting 2015-08-11 14:04:04 -07:00
Armon Dadgar 28f9234135 scheduler: first pass at job deregister 2015-08-06 17:46:14 -07:00
Armon Dadgar 09a8c15d7e scheduler: adding service scheduler definition 2015-08-06 17:25:14 -07:00
Armon Dadgar d7e9927661 nomad: integrating worker and scheduler 2015-07-28 16:15:32 -07:00
Armon Dadgar 90e8bc9b91 scheduler: initial interface 2015-07-28 16:03:15 -07:00