Commit graph

51 commits

Author SHA1 Message Date
Tim Gross fa25e048b2
CSI: unique volume per allocation
Add a `PerAlloc` field to volume requests that directs the scheduler to test
feasibility for volumes with a source ID that includes the allocation index
suffix (ex. `[0]`), rather than the exact source ID.

Read the `PerAlloc` field when making the volume claim at the client to
determine if the allocation index suffix (ex. `[0]`) should be added to the
volume source ID.
2021-03-18 15:35:11 -04:00
Mahmood Ali 8a342926b7 Respect alloc job version for lost/failed allocs
This change fixes a bug where lost/failed allocations are replaced by
allocations with the latest versions, even if the version hasn't been
promoted yet.

Now, when generating a plan for lost/failed allocations, the scheduler
first checks if the current deployment is in Canary stage, and if so, it
ensures that any lost/failed allocations is replaced one with the latest
promoted version instead.
2020-08-19 09:52:48 -04:00
Nick Ethier 416efd83ee
scheduler: do network feasibility checking for system jobs (#8256) 2020-06-24 16:01:00 -04:00
Nick Ethier f0559a8162
multi-interface network support 2020-06-19 09:42:10 -04:00
Nick Ethier 4a44deaa5c CNI Implementation (#7518) 2020-06-18 11:05:29 -07:00
Mahmood Ali d9792777d9 Open source Preemption code
Nomad 0.12 OSS is to include preemption feature.

This commit moves the private code for managing preemption to OSS
repository.
2020-05-27 15:02:01 -04:00
Mahmood Ali b9e3cde865 tests and some clean up 2020-05-01 13:13:30 -04:00
Charlie Voiselle d8e5e02398 Wiring algorithm to scheduler calls 2020-05-01 13:13:29 -04:00
Michael Schurter 4b475db408 core: fix comment on system stack
This makes me do a double take every time I run into it, so what if we
just changed it?
2020-04-09 15:19:11 -07:00
Lang Martin d994990ef0
csi: the scheduler allows a job with a volume write claim to be updated (#7438)
* nomad/structs/csi: split CanWrite into health, in use

* scheduler/scheduler: expose AllocByID in the state interface

* nomad/state/state_store_test

* scheduler/stack: SetJobID on the matcher

* scheduler/feasible: when a volume writer is in use, check if it's us

* scheduler/feasible: remove SetJob

* nomad/state/state_store: denormalize allocs before Claim

* nomad/structs/csi: return errors on claim, with context

* nomad/csi_endpoint_test: new alloc doesn't look like an update

* nomad/state/state_store_test: change test reference to CanWrite
2020-03-23 21:21:04 -04:00
Lang Martin 3621df1dbf csi: volume ids are only unique per namespace (#7358)
* nomad/state/schema: use the namespace compound index

* scheduler/scheduler: CSIVolumeByID interface signature namespace

* scheduler/stack: SetJob on CSIVolumeChecker to capture namespace

* scheduler/feasible: pass the captured namespace to CSIVolumeByID

* nomad/state/state_store: use namespace in csi_volume index

* nomad/fsm: pass namespace to CSIVolumeDeregister & Claim

* nomad/core_sched: pass the namespace in volumeClaimReap

* nomad/node_endpoint_test: namespaces in Claim testing

* nomad/csi_endpoint: pass RequestNamespace to state.*

* nomad/csi_endpoint_test: appropriately failed test

* command/alloc_status_test: appropriately failed test

* node_endpoint_test: avoid notTheNamespace for the job

* scheduler/feasible_test: call SetJob to capture the namespace

* nomad/csi_endpoint: ACL check the req namespace, query by namespace

* nomad/state/state_store: remove deregister namespace check

* nomad/state/state_store: remove unused CSIVolumes

* scheduler/feasible: CSIVolumeChecker SetJob -> SetNamespace

* nomad/csi_endpoint: ACL check

* nomad/state/state_store_test: remove call to state.CSIVolumes

* nomad/core_sched_test: job namespace match so claim gc works
2020-03-23 13:59:25 -04:00
Lang Martin a0a6766740 CSI: Scheduler knows about CSI constraints and availability (#6995)
* structs: piggyback csi volumes on host volumes for job specs

* state_store: CSIVolumeByID always includes plugins, matches usecase

* scheduler/feasible: csi volume checker

* scheduler/stack: add csi volumes

* contributing: update rpc checklist

* scheduler: add volumes to State interface

* scheduler/feasible: introduce new checker collection tgAvailable

* scheduler/stack: taskGroupCSIVolumes checker is transient

* state_store CSIVolumeDenormalizePlugins comment clarity

* structs: remote TODO comment in TaskGroup Validate

* scheduler/feasible: CSIVolumeChecker hasPlugins improve comment

* scheduler/feasible_test: set t.Parallel

* Update nomad/state/state_store.go

Co-Authored-By: Danielle <dani@hashicorp.com>

* Update scheduler/feasible.go

Co-Authored-By: Danielle <dani@hashicorp.com>

* structs: lift ControllerRequired to each volume

* state_store: store plug.ControllerRequired, use it for volume health

* feasible: csi match fast path remove stale host volume copied logic

* scheduler/feasible: improve comments

Co-authored-by: Danielle <dani@builds.terrible.systems>
2020-03-23 13:58:29 -04:00
Danielle Lancashire 073836ec67
scheduler: Add a feasability checker for Host Vols 2019-08-12 15:39:08 +02:00
Preetha Appan 8ddc076c1d
Refactor scheduler package to enable preemption for batch/service jobs 2019-04-10 20:24:01 -05:00
Preetha Appan c1c1c230e4
Make preemption config a struct to allow for enabling based on scheduler type 2018-10-30 11:06:32 -05:00
Preetha Appan 25a047267f
Use scheduler config from state store to enable/disable preemption 2018-10-30 11:06:32 -05:00
Alex Dadgar 1089e13b14 add to stack 2018-10-13 12:27:49 -07:00
Alex Dadgar bac5cb1e8b Scheduler uses allocated resources 2018-10-02 17:08:25 -07:00
Preetha Appan db0d95b09c
Implement spread iterator that scores according to percentage of desired count in each target.
Added this as a new step in the stack and some unit tests
2018-09-04 16:10:11 -05:00
Preetha Appan d5cd2bbddb
Remove unnecessary reset 2018-09-04 16:10:11 -05:00
Preetha Appan 5eacd6ada4
Implement affinity support in generic scheduler 2018-09-04 16:10:11 -05:00
Preetha Appan dd4917c2f0
Add more clarification in comment 2018-01-31 09:58:05 -06:00
Preetha Appan 237beb49ae
Better score threshold 2018-01-31 09:58:05 -06:00
Preetha Appan a75540cec6
Limit iterator uses a score threshold and a maxSkip value to be able to skip lower scoring nodes 2018-01-31 09:58:05 -06:00
Preetha Appan fee4ccf154
Prevent side effect modification of select options when preferred nodes are set 2018-01-31 09:56:53 -06:00
Preetha Appan 21b7b79d5d
Add helper methods, use require and other code review feedback 2018-01-31 09:56:53 -06:00
Preetha Appan 031c566ada
Reschedule previous allocs and track their reschedule attempts 2018-01-31 09:56:53 -06:00
Alex Dadgar c1cc51dbee sync 2017-10-13 14:36:02 -07:00
Alex Dadgar 60c42f745a Split distinct property and host iterator and add iterator to system stack 2017-03-08 19:00:10 -08:00
Alex Dadgar 87d971a6b8 Double the anti-affinity for placing same task group on node 2017-03-06 11:52:53 -08:00
Diptanu Choudhury 52e9946da9 Implemented SetPrefferingNodes in stack 2016-08-30 16:17:50 -07:00
Diptanu Choudhury ec73c768f1 Making the scheduler use LocalDisk instead of Resources.DiskMB 2016-08-25 12:27:42 -05:00
Alex Dadgar 25cb7fc03d Fix computed class when the job has multiple task groups 2016-02-03 21:22:18 -08:00
Alex Dadgar 9dc22532e5 Respond to comments 2016-01-26 16:43:42 -08:00
Alex Dadgar 2b7d42bf9b FeasibilityWrapper uses computed node class eligibility to call feasibility checks minimally 2016-01-26 15:16:43 -08:00
Alex Dadgar 1784387e1d Rename Dynamic -> ProposedAllocConstraintIterator 2015-10-26 14:12:54 -07:00
Alex Dadgar 783b0b5aee Add dynamic constraint to generic_scheduler 2015-10-22 15:09:03 -07:00
Alex Dadgar d9b78ffdfe Remove base nodes from stack constructors 2015-10-16 17:05:23 -07:00
Alex Dadgar 1ec921a3c2 Refactor task group constraint logic in generic/system stack 2015-10-16 14:00:51 -07:00
Alex Dadgar 494244ed06 System scheduler and system stack 2015-10-14 18:39:44 -07:00
Armon Dadgar 90a82da0fd scheduler: do not skip job anti-affinity 2015-09-22 22:20:07 -07:00
Armon Dadgar c6f5a8e029 scheduler: thread through the TaskResources 2015-09-13 15:20:50 -07:00
Armon Dadgar 625308661a scheduler: binpacker makes network offers 2015-09-13 14:31:32 -07:00
Armon Dadgar 4333b7370b scheduler: recompute scan limit on SetNodes 2015-09-11 12:03:41 -07:00
Armon Dadgar 4a348cc0da scheduler: allow updating the base nodes 2015-09-07 11:30:13 -07:00
Armon Dadgar 383a6aa76d scheduler: adding job anti-affinity to the generic stack 2015-08-16 10:37:11 -07:00
Armon Dadgar 0b1d096f87 scheduler: adding minor specialization for batch 2015-08-13 22:35:48 -07:00
Armon Dadgar 64bab68680 scheduler: basic metrics integration 2015-08-13 21:46:33 -07:00
Armon Dadgar d4e3270757 scheduler: simply stack implementation 2015-08-13 18:44:27 -07:00
Armon Dadgar 6fa4822e61 scheduler: thread size through 2015-08-13 18:36:13 -07:00