Commit graph

11400 commits

Author SHA1 Message Date
Michael Schurter a4f346abeb remove spurious TODOs and FIXMEs 2018-03-21 16:55:22 -07:00
Michael Schurter 9f3086a268 test: must initialize jobResults with new func 2018-03-21 16:51:45 -07:00
Michael Schurter e432c9af55 test: disable node drainer during tests
Node drainer would throw off the index checks
2018-03-21 16:51:45 -07:00
Michael Schurter 5c8c4bce2a test: disable drain during fsm test
drainer was unsetting drain before fsm could read written value
2018-03-21 16:51:45 -07:00
Michael Schurter 0e0b04afec test: fix by using mock.BatchJob 2018-03-21 16:51:45 -07:00
Michael Schurter 8b346c6176 test: try to prevent flakiness on travis 2018-03-21 16:51:45 -07:00
Michael Schurter 39cef16c73 test: don't call t.Fatal from within a goroutine 2018-03-21 16:51:45 -07:00
Michael Schurter 341d87aa48 tests: use mock.BatchJob to fix tests 2018-03-21 16:51:45 -07:00
Michael Schurter 8b107acc06 mock: add BatchJob() helper 2018-03-21 16:51:45 -07:00
Michael Schurter 1b7ac447e9 alloc_runner: watch health for deployed batch jobs 2018-03-21 16:51:45 -07:00
Michael Schurter 1cc012966b api: fix tests to expect default migrate strategy 2018-03-21 16:51:45 -07:00
Michael Schurter cb61a4bdc7 Fix linting errors 2018-03-21 16:51:45 -07:00
Michael Schurter 33488ee2f0 rpcapi: remove; unused 2018-03-21 16:51:45 -07:00
Alex Dadgar 640ebdaef6 fix race in drain integration tests 2018-03-21 16:51:45 -07:00
Michael Schurter c401d5a098 Refactor assertOps into a helper func 2018-03-21 16:51:45 -07:00
Michael Schurter 187b0e1a48 Remove debug prints 2018-03-21 16:51:45 -07:00
Michael Schurter f67eca48ac Deregister garbage collected jobs 2018-03-21 16:51:45 -07:00
Michael Schurter 922842546c JobNs -> NamespacedID
Also drop the New func as it's easy to swap the order of arguments since
they're both strings.
2018-03-21 16:51:45 -07:00
Michael Schurter 8dc7d9fb6a drainer: RegisterJob -> RegisterJobs
Test job watcher
2018-03-21 16:51:45 -07:00
Michael Schurter 3116897099 Fix deadline heap triggering
Chan must be buffered to avoid skipping triggering altogether

Also made timing in a test a bit more lenient
2018-03-21 16:51:45 -07:00
Michael Schurter be7c759867 Improve drain log messages
Also delay "node complete" after the node has been marked complete to
capture a few more alloc events. There are other ways to implement this
that could trade off correctness for responsiveness as technically a
node is considered drained when all of its allocs have been marked to
stop and not when they've actually stopped (which may not happen for a
long time).
2018-03-21 16:51:45 -07:00
Michael Schurter 5eebd53223 Monitor node drains until completion in CLI
allow -detach like other commands
2018-03-21 16:51:45 -07:00
Michael Schurter 2832853bfa Add DesiredTransition.ShouldMigrate to api pkg 2018-03-21 16:51:45 -07:00
Michael Schurter 3907766b6d Fix node eligibility test 2018-03-21 16:51:45 -07:00
Alex Dadgar 9d23c965da fix comment 2018-03-21 16:51:45 -07:00
Alex Dadgar fb4badf1bc sharding 2018-03-21 16:51:44 -07:00
Alex Dadgar 2d91b9dfba Batch drain update 2018-03-21 16:51:44 -07:00
Alex Dadgar 92b636dd32 Fix deadline handling 2018-03-21 16:51:44 -07:00
Michael Schurter 9898edfa90 Switch to drainerv2 impl 2018-03-21 16:51:44 -07:00
Alex Dadgar 7b2bad8c5e Toggle Drain allows resetting eligibility
This PR allows marking a node as eligible for scheduling while toggling
drain. By default the `nomad node drain -disable` command will mark it
as eligible but the drainer will maintain ineligibility.
2018-03-21 16:51:44 -07:00
Alex Dadgar ad80e655cc code review 2018-03-21 16:51:44 -07:00
Alex Dadgar 11f9fe4960 spelling fixes 2018-03-21 16:51:44 -07:00
Alex Dadgar bc7385812d Comments 2018-03-21 16:51:44 -07:00
Alex Dadgar e87c677a42 handle empty node case 2018-03-21 16:51:44 -07:00
Alex Dadgar 405dab2253 integration test and basic fixes 2018-03-21 16:51:44 -07:00
Alex Dadgar e63bcb474d Drainer 2018-03-21 16:51:44 -07:00
Alex Dadgar 4754366640 job watcher 2018-03-21 16:51:44 -07:00
Alex Dadgar 504bfabb4d Nodes being untracked or having updated deadlines updates the deadliner 2018-03-21 16:51:44 -07:00
Alex Dadgar 66eaaa6a4d node watcher 2018-03-21 16:51:44 -07:00
Alex Dadgar 527ac0b39d drain heap 2018-03-21 16:51:44 -07:00
Alex Dadgar 2d4c193a0a Initial design 2018-03-21 16:51:44 -07:00
Alex Dadgar 33ca319080 System test runs on mac 2018-03-21 16:51:44 -07:00
Alex Dadgar f8d4a3a9e6 Fix file names 2018-03-21 16:51:44 -07:00
Alex Dadgar 02019f216a Correct defaulting 2018-03-21 16:51:44 -07:00
Michael Schurter 32a7649359 refactor main drainloop into 2 more methods 2018-03-21 16:51:44 -07:00
Michael Schurter 5e52f84bb7 drainer: refactor newStopAllocs, applyMigrations 2018-03-21 16:51:44 -07:00
Michael Schurter 62960ed7bd client: don't monitor health of non-service jobs
Also fix system job draining; won't work without deadline fixes
2018-03-21 16:51:44 -07:00
Alex Dadgar a37329189a Improve DeadlineTime helper 2018-03-21 16:51:44 -07:00
Michael Schurter b7c993f0e5 drainer: convert fsm errors to go errors 2018-03-21 16:51:44 -07:00
Michael Schurter ab0de41884 drainer: factor job & node watchers out of drainer.go 2018-03-21 16:51:44 -07:00