Commit Graph

160 Commits

Author SHA1 Message Date
Alex Dadgar 248315a2d9 Handle destructive changes before placements
This PR updates the generic scheduler to handle destructive changes
before handling placements. This is important because the destructive
change may be due to a lowering of resources. If this is the case, the
handling of the destructive changes first may make it possible for the
placement to happen.

To reason about this imagine there is one node with CPU = 500.

If the group originally had:
* `count = 1`
* `cpu = 400`

And then the job was updated such that the group had:
* `count = 4`
* `cpu = 120`

If the original alloc isn't discounted first, nothing would be able to
place.
2017-07-20 12:24:27 -07:00
Alex Dadgar 641e178416 Stop before trying to place 2017-07-17 17:18:12 -07:00
Alex Dadgar 66a90326e1 Treat destructive updates atomically 2017-07-16 10:35:38 -07:00
Alex Dadgar f86760db3c Basic logs 2017-07-07 16:49:08 -07:00
Alex Dadgar 20005f925a Rolling node drains using max_parallel and stagger
This PR adds rolling node drains done at max_parallel and stagger of the
update spec. It brings it inline with old behavior.
2017-07-07 12:12:48 -07:00
Alex Dadgar d352d85bb9 Test scheduler's handling of canaries/inplace updates 2017-07-07 12:10:04 -07:00
Alex Dadgar e229d3650b Attach eval id 2017-07-07 12:10:04 -07:00
Alex Dadgar af1935e1e1 Mark complete 2017-07-07 12:10:04 -07:00
Alex Dadgar 8424a3b380 Change canary handling 2017-07-07 12:10:04 -07:00
Alex Dadgar 09dfa2fc10 Rename CreateDeployments and remove cancelling behavior in state_store 2017-07-07 12:10:04 -07:00
Alex Dadgar 067ed86a47 Client watches for allocation health using task state and Consul checks
This PR adds watching of allocation health at the client. The client can
watch for health based on the tasks running on time and also based on
the consul checks passing.
2017-07-07 12:10:04 -07:00
Alex Dadgar 27a6e6b6d1 update description of the alloc update factory function 2017-07-07 12:03:11 -07:00
Alex Dadgar b2573b01f9 Fix canary handling 2017-07-07 12:03:11 -07:00
Alex Dadgar d111dd5c10 Pull out in-place updating into a passed in function; reduce inputs to reconciler 2017-07-07 12:03:11 -07:00
Alex Dadgar 4c123500ee Remove old 2017-07-07 12:03:11 -07:00
Alex Dadgar cf5baba808 handle annotations 2017-07-07 12:03:11 -07:00
Alex Dadgar 07b1c3e5db Only upsert a job if the spec changes and push deployment creation into reconciler 2017-07-07 12:03:11 -07:00
Alex Dadgar 0d42b5d421 initial reconciler 2017-07-07 12:01:17 -07:00
Alex Dadgar b3f4db0930 cancel deployments 2017-07-07 12:01:17 -07:00
Alex Dadgar 5a2449d236 Respond to review comments 2017-04-19 10:54:03 -07:00
Alex Dadgar 3145086a42 non-purge deregisters 2017-04-15 17:08:05 -07:00
Alex Dadgar b69b357c7f Nomad builds 2017-02-07 20:31:23 -08:00
Diptanu Choudhury 5191b4d33a Making the status command return the allocs of currently registered job 2016-11-24 16:31:30 +01:00
Diptanu Choudhury dae7f88118 Not setting a drained node as preferred node (#1740) 2016-09-23 21:15:50 -07:00
Diptanu Choudhury 1b3c5e98c8 Renaming LocalDisk to EphemeralDisk (#1710)
Renaming LocalDisk to EphemeralDisk
2016-09-14 15:43:42 -07:00
Diptanu Choudhury 52e9946da9 Implemented SetPrefferingNodes in stack 2016-08-30 16:17:50 -07:00
Diptanu Choudhury bfee7b30a3 Introducing shared resources in alloc 2016-08-29 13:49:25 -07:00
Diptanu Choudhury 13497913f9 Ensuring resources are re-calculated properly in fsm 2016-08-26 20:13:11 -07:00
Diptanu Choudhury 3447658bba Added scheduler tests to ensure disk constraints are honored 2016-08-25 15:31:56 -05:00
Diptanu Choudhury 1de89776d7 Marking an allocation chained if we are creating this to replace an old one 2016-08-15 17:52:41 -07:00
Diptanu Choudhury 13bab5b1ad Added scheduler tests 2016-08-09 14:52:25 -07:00
Diptanu Choudhury ab94c8eed9 Marking allocations which are not terminal and are on down nodes as lost 2016-08-09 13:11:58 -07:00
Alex Dadgar ac3328e812 Make scheduler mark allocations as lost 2016-08-03 15:57:46 -07:00
Alex Dadgar e661c09898 fix filter logic 2016-07-28 15:57:56 -07:00
Alex Dadgar ddbd9261c1 Merge pull request #1471 from hashicorp/b-handle-old-batch-allocs
filterCompleteAllocs filters replaced batch allocs
2016-07-28 14:31:19 -07:00
Diptanu Choudhury 48eda99dd9 Setting the queued count as zero if there is nothing to place 2016-07-28 12:13:35 -07:00
Alex Dadgar c132952ba2 filterCompleteAllocs filters replaced batch allocs 2016-07-27 11:54:55 -07:00
Diptanu Choudhury d1a6bdb4ba Making the queued allocations bind late 2016-07-25 22:11:11 -07:00
Diptanu Choudhury 51cb201a09 Initializing the queued allocations late 2016-07-25 17:26:38 -07:00
Diptanu Choudhury 8f0d2a2775 Fixed some more tests 2016-07-25 17:26:38 -07:00
Diptanu Choudhury dabb83063b Review comments 2016-07-25 17:26:38 -07:00
Diptanu Choudhury a64785417d Fixed the logic for decrementing the count of queued based on plan result 2016-07-25 17:26:38 -07:00
Diptanu Choudhury 1cc0bc392b Setting the number of queued allocations per task group 2016-07-25 17:26:38 -07:00
Diptanu Choudhury 487c66b84d Removing the queued state of Job Summary and alloc desired status false 2016-07-13 13:20:46 -06:00
Alex Dadgar 67c0816726 Handle max plans 2016-06-20 17:43:02 -07:00
Sean Chittenden 2f036231e5 Merge pull request #1201 from hashicorp/f-dyn-server-list
Dynamic Server Lists/Client Bootstrapping via consul.
2016-06-11 18:58:25 -04:00
Alex Dadgar b064b392fc Only unblock if missed class was added after eval snapshot index 2016-06-10 15:24:06 -07:00
Sean Chittenden 95c9d1a63e
Per-comment, remove structs.Allocation's Services attribute.
Nuke PopulateServiceIDs() now that it's also no longer needed.
2016-06-10 15:54:39 -04:00
Alex Dadgar 57770de1fc Add eval-status and remove eval-monitor 2016-05-27 11:50:15 -07:00
Alex Dadgar fb8d79a908 Blocked evals don't store TG alloc metrics 2016-05-27 11:26:14 -07:00
Alex Dadgar 3fd51ecece Periodically unblock failed evaluations 2016-05-24 20:10:56 -07:00
Alex Dadgar 18d9e89065 Reuse the same evaluation and reblock it until there is no more work to do 2016-05-24 20:10:56 -07:00
Alex Dadgar 3cbb89c61e Merge pull request #1188 from hashicorp/f-no-failed-allocs
Failed Allocation Metrics stored in Evaluation
2016-05-24 20:06:28 -07:00
Alex Dadgar fcc57fbc66 rename SpawnedBlockedEval and simplify map safety check 2016-05-24 18:12:59 -07:00
Alex Dadgar b5ad18a7ea Dont restart successfully finished batch allocations 2016-05-24 17:23:18 -07:00
Alex Dadgar 1feb57b047 Evals track blocked evals they create 2016-05-19 13:09:52 -07:00
Alex Dadgar 8f5f12ae81 Scheduler no longer produces failed allocations; failed alloc metrics stored in evaluation 2016-05-18 18:11:40 -07:00
Alex Dadgar 117b926e2b inplaceUpdate returns the allocs that were updated in-place 2016-05-17 15:37:37 -07:00
Alex Dadgar ab0b57a9a1 Initial plan endpoint implementation - WIP 2016-05-05 11:21:58 -07:00
Alex Dadgar 733156c016 vendor 2016-04-19 17:12:44 -07:00
Alex Dadgar 7dc1a525cb more debug 2016-04-19 16:55:27 -07:00
Alex Dadgar 76e493dc16 base debugging 2016-04-19 16:33:25 -07:00
Alex Dadgar 1a31e5e137 Fix drained/batch allocations from continually migrating 2016-04-12 16:14:32 -07:00
Alex Dadgar f021c1a7b0 filtering failed batch allocs 2016-04-11 12:51:53 -07:00
Alex Dadgar 034bae90bb Revert "Remove client status from allocation TerminalStatus"
This reverts commit 819e1e4b3967c7029ee8221144666ff460fdd7ed.
2016-04-08 14:22:06 -07:00
Alex Dadgar 09f63fd3c0 Remove client status from allocation TerminalStatus 2016-03-25 12:53:37 -07:00
Diptanu Choudhury 3f0580f204 Added a note about backward compatibility 2016-03-23 19:08:07 -07:00
Alex Dadgar ad92e50a24 Avoid serializes Allocation.Resources 2016-03-01 14:09:25 -08:00
Alex Dadgar a2b56a5cff Generic Scheduler handles periodic eval type 2016-02-24 16:20:33 -08:00
Alex Dadgar fa8e2d31ee Revert "err logs in worker and scheduler"
This reverts commit 7befc586521b70eb84013bff367310e4cfa45c27.
2016-02-22 22:23:57 -08:00
Alex Dadgar f48eabe753 err logs in worker and scheduler 2016-02-22 14:47:59 -08:00
Armon Dadgar 2b7bdfee37 nomad: add a sanity check guard 2016-02-22 12:15:40 -08:00
Armon Dadgar 87447efa61 schedule: deduplicate the jobs 2016-02-21 11:32:56 -08:00
Armon Dadgar 0dbd4c46c9 nomad: make PopulateServiceIDs more efficient 2016-02-21 11:15:00 -08:00
Alex Dadgar 8e6544333e Only set eligibility if the eval hasn't escaped 2016-02-11 09:45:27 -08:00
Alex Dadgar a47d5260c5 Reset retry count if progress is made and fail by creating a blocked eval 2016-02-09 21:24:47 -08:00
Alex Dadgar d930d488b5 Fix node drain 2016-02-03 12:00:43 -08:00
Alex Dadgar c7821f13d7 Only replace batch allocations that have failed 2016-02-02 17:40:32 -08:00
Alex Dadgar 151fe5ed88 Make computed node class a string and add versioning 2016-01-31 18:04:45 -08:00
Alex Dadgar 9045d7e989 Schedulers create blocked eval if there are failed allocations 2016-01-31 18:04:45 -08:00
Alex Dadgar c0721e45f6 Fix bug, add tests, and cli output 2016-01-04 14:23:06 -08:00
Alex Dadgar 36752b9ed4 Store the available nodes in the alloc metric 2016-01-04 12:07:33 -08:00
Diptanu Choudhury b7f556fabc Changed some comments 2015-12-14 18:05:58 -08:00
Diptanu Choudhury 2eb03e1d23 Renamed serviceId to serviceID 2015-12-14 15:57:56 -08:00
Diptanu Choudhury 76486d71e2 Making the allocs hold service ids 2015-12-14 15:08:35 -08:00
Alex Dadgar bdf7497f1b Initialize task state in allocation sent by scheduler 2015-11-16 15:14:21 -08:00
Alex Dadgar d9b78ffdfe Remove base nodes from stack constructors 2015-10-16 17:05:23 -07:00
Alex Dadgar ab9acb9edf diffResult stores values not pointers 2015-10-16 11:43:09 -07:00
Alex Dadgar 406e135ce8 Add negative test to DriverIterator, increase system scheduler attempts, and fix evictAndPlace status message 2015-10-16 11:36:26 -07:00
Alex Dadgar 70c39bd5a4 Add diffSystemAlloc which gives richer information which node to place a system allocation 2015-10-15 13:14:44 -07:00
Alex Dadgar 65fd28d7d1 Refactor shared code between schedulers 2015-10-14 18:39:44 -07:00
Alex Dadgar 494244ed06 System scheduler and system stack 2015-10-14 18:39:44 -07:00
Ivo Verberk c6e1b13b51 Fix vet warnings 2015-10-07 12:26:58 +02:00
Armon Dadgar a23c8bf713 scheduler: Allow rolling update, assign eval first.Fixes #91 2015-09-22 21:45:25 -07:00
Armon Dadgar ee2fd15cc2 scheduler: ignore allocations in terminal state 2015-09-17 21:25:55 -07:00
Armon Dadgar cece4473b8 scheduler: in-place update should preserve network offer 2015-09-13 17:06:34 -07:00
Armon Dadgar c6f5a8e029 scheduler: thread through the TaskResources 2015-09-13 15:20:50 -07:00
Armon Dadgar ea0795995d Use a single implementation of GenerateUUID 2015-09-07 15:23:03 -07:00
Armon Dadgar b0ef497f92 scheduler: use update strategy for rolling updates 2015-09-07 15:17:39 -07:00
Armon Dadgar c8bcb292a0 scheduler: support in-place allocation updates 2015-09-07 12:27:12 -07:00