Preetha Appan
bcb5c8c70d
remove stray new line
2019-04-12 10:32:48 -05:00
Preetha Appan
8ddc076c1d
Refactor scheduler package to enable preemption for batch/service jobs
2019-04-10 20:24:01 -05:00
James Rasell
9470507cf4
Add NodeName to the alloc/job status outputs.
...
Currently when operators need to log onto a machine where an alloc
is running they will need to perform both an alloc/job status
call and then a call to discover the node name from the node list.
This updates both the job status and alloc status output to include
the node name within the information to make operator use easier.
Closes #2359
Closes #1180
2019-04-10 10:34:10 -05:00
Nick Ethier
24cbf42798
scheduler: fix NPE when deployment is nil, but placement is a canary
2019-01-28 20:22:59 -06:00
Alex Dadgar
4bdccab550
goimports
2019-01-22 15:44:31 -08:00
Preetha Appan
0494a098ce
More style and readability fixes from review
2018-10-30 11:06:32 -05:00
Preetha Appan
cc295b90de
Implement preemption for system jobs.
...
This commit implements an allocation selection algorithm for finding
allocations to preempt. It currently handles network resource asks
separately from the other resources (cpu/memory/disk/iops).
2018-10-30 11:06:32 -05:00
Alex Dadgar
a78cefec18
use int64
2018-10-16 15:34:32 -07:00
Preetha Appan
7c0d8c646c
Change CPU/Disk/MemoryMB to int everywhere in new resource structs
2018-10-16 16:21:42 -05:00
Alex Dadgar
bac5cb1e8b
Scheduler uses allocated resources
2018-10-02 17:08:25 -07:00
Preetha Appan
a10118c461
Add failed follow up to the list of allowed eval trigger reasons
...
needs unit test
2018-09-25 10:49:55 -07:00
Alex Dadgar
6a21f9fe96
Unique TriggerBy for blocked evals
...
Give blocked evals a unique triggerby reason to make debugging a chain
of evaluations easier.
2018-09-24 14:47:49 -07:00
Alex Dadgar
3c19d01d7a
server
2018-09-15 16:23:13 -07:00
Preetha Appan
6ed527c636
Use heap to store top K scoring nodes.
...
Scoring metadata is now aggregated by scorer type to make it easier
to parse when reading it in the CLI.
2018-09-04 16:10:11 -05:00
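The "top K scoring nodes" idea in the commit above can be sketched with Go's standard container/heap package. This is a minimal illustration, not the scheduler's actual code: the scoredNode type, the minHeap type, and the topK function are hypothetical names introduced here. The sketch keeps a min-heap bounded at K entries, so the lowest-scoring candidate is evicted whenever the heap grows past K.

```go
package main

import (
	"container/heap"
	"fmt"
	"sort"
)

// scoredNode is a hypothetical pairing of a node ID with its final score.
type scoredNode struct {
	ID    string
	Score float64
}

// minHeap orders scoredNodes by ascending score, so the root is always
// the weakest of the candidates currently retained.
type minHeap []scoredNode

func (h minHeap) Len() int            { return len(h) }
func (h minHeap) Less(i, j int) bool  { return h[i].Score < h[j].Score }
func (h minHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *minHeap) Push(x interface{}) { *h = append(*h, x.(scoredNode)) }
func (h *minHeap) Pop() interface{} {
	old := *h
	n := len(old)
	item := old[n-1]
	*h = old[:n-1]
	return item
}

// topK returns the k highest-scoring nodes, best first.
func topK(nodes []scoredNode, k int) []scoredNode {
	if k <= 0 {
		return nil
	}
	h := &minHeap{}
	for _, n := range nodes {
		heap.Push(h, n)
		if h.Len() > k {
			heap.Pop(h) // evict the current lowest score
		}
	}
	out := make([]scoredNode, h.Len())
	copy(out, *h)
	// Heap order is only partial, so sort the survivors for display.
	sort.Slice(out, func(i, j int) bool { return out[i].Score > out[j].Score })
	return out
}

func main() {
	nodes := []scoredNode{{"a", 0.2}, {"b", 0.9}, {"c", 0.5}, {"d", 0.7}}
	for _, n := range topK(nodes, 2) {
		fmt.Println(n.ID, n.Score)
	}
}
```

Bounding the heap at K keeps memory at O(K) and each insertion at O(log K), regardless of how many nodes are scored.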
Alex Dadgar
e1c239daae
Merge pull request #4414 from hashicorp/b-stop-summary
...
Reset Queued allocs to zero when job stopped
2018-07-16 14:32:55 -07:00
Nick Ethier
6b6777359b
scheduler: fix missing err assignment
2018-07-11 14:27:10 -04:00
Nick Ethier
5f6def5b04
scheduler: better error handling
2018-07-05 11:00:03 -04:00
Nick Ethier
030e650e78
scheduler: fix nil pointer exception
2018-07-02 16:05:38 -04:00
Alex Dadgar
c3c79c408e
Reset Queued allocs to zero when job stopped
...
When a job is stopped but not purged, we should set the Queued count to
be zero.
2018-06-13 10:46:39 -07:00
Alex Dadgar
f95ab4ade8
Mark canaries on creation, and unmark on promotion
2018-05-07 14:50:01 -05:00
Preetha Appan
a569d34f25
Add custom status description for rescheduling follow up evals, and make unit test robust
2018-04-10 15:30:15 -05:00
Alex Dadgar
e5b5803265
Only mark allocs as part of deployment if deployment is active
2018-04-05 15:40:49 -07:00
Preetha Appan
00537c739b
Fixes edge cases around timing and task finish time being set more than once
2018-04-03 16:34:59 -05:00
Alex Dadgar
9d60e2cebf
Correct status desc on draining system allocs
2018-03-26 17:54:46 -07:00
Alex Dadgar
e106da84de
name and test
2018-03-26 11:06:21 -07:00
Alex Dadgar
e2a6e64fca
Don't create unnecessary deployments
2018-03-23 16:55:21 -07:00
Alex Dadgar
92b636dd32
Fix deadline handling
2018-03-21 16:51:44 -07:00
Michael Schurter
c0542474db
drain: initial drainv2 structs and impl
2018-03-21 16:49:48 -07:00
Preetha Appan
3e96c6c4e0
Address more code review feedback
2018-03-14 16:10:32 -05:00
Preetha Appan
9fed0d2103
Get reschedule policy from the alloc directly
2018-03-14 16:10:32 -05:00
Preetha Appan
e89bbf7289
Update comment about WaitTime
2018-03-14 16:10:32 -05:00
Preetha Appan
47e0280d96
More small review feedback
2018-03-14 16:10:32 -05:00
Preetha Appan
5373ade731
Scheduler and Reconciler changes to support delayed rescheduling
2018-03-14 16:10:32 -05:00
Preetha Appan
2ed4de7e7b
Track previous node id correctly, plus unit test
2018-01-31 09:58:05 -06:00
Preetha Appan
ea4a889e28
Address more code review feedback
2018-01-31 09:56:53 -06:00
Preetha Appan
bd89d2b39e
Make sure that reschedule trackers are not added for node drain replacements
2018-01-31 09:56:53 -06:00
Preetha Appan
21b7b79d5d
Add helper methods, use require and other code review feedback
2018-01-31 09:56:53 -06:00
Preetha Appan
d0f9d59abb
Reconcile with changes to structs for reschedule tracking
2018-01-31 09:56:53 -06:00
Preetha Appan
fbb1936dee
Fix some comments and lint warnings, remove unused method
2018-01-31 09:56:53 -06:00
Preetha Appan
031c566ada
Reschedule previous allocs and track their reschedule attempts
2018-01-31 09:56:53 -06:00
Preetha Appan
3b4d7ac2a3
Fix some typos
2017-12-14 13:29:27 -06:00
Alex Dadgar
c1cc51dbee
sync
2017-10-13 14:36:02 -07:00
Michael Schurter
a66c53d45a
Remove structs import from api
...
Goes a step further and removes structs import from api's tests as well
by moving GenerateUUID to its own package.
2017-09-29 10:36:08 -07:00
Alex Dadgar
3904bde9a3
Fix batch handling of complete allocs/node drains
...
This PR fixes:
* An issue in which a node-drain that contains a complete batch alloc
would cause a replacement
* An issue in which allocations with the same name during a scale
down/stop event wouldn't be properly stopped.
* An issue in which batch allocations from previous job versions may not
have been stopped properly.
Fixes https://github.com/hashicorp/nomad/issues/3210
2017-09-14 15:08:57 -07:00
Alex Dadgar
84d06f6abe
Sync namespace changes
2017-09-07 17:04:21 -07:00
Alex Dadgar
184bfd4836
Better comment
2017-07-20 12:31:08 -07:00
Alex Dadgar
248315a2d9
Handle destructive changes before placements
...
This PR updates the generic scheduler to handle destructive changes
before handling placements. This is important because the destructive
change may be due to a lowering of resources. If this is the case, the
handling of the destructive changes first may make it possible for the
placement to happen.
To reason about this imagine there is one node with CPU = 500.
If the group originally had:
* `count = 1`
* `cpu = 400`
And then the job was updated such that the group had:
* `count = 4`
* `cpu = 120`
If the original alloc isn't discounted first, nothing would be able to
place.
2017-07-20 12:24:27 -07:00
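The arithmetic in the commit message above can be checked with a small sketch. It is illustrative only and assumes a single node where CPU is the sole constraint; the placeable function is a hypothetical helper, not part of the scheduler.

```go
package main

import "fmt"

// placeable reports how many allocations asking for askCPU each can fit
// on a node with nodeCPU total capacity and usedCPU already consumed,
// capped at the desired count want.
func placeable(nodeCPU, usedCPU, askCPU, want int) int {
	free := nodeCPU - usedCPU
	if free < 0 || askCPU <= 0 {
		return 0
	}
	n := free / askCPU
	if n > want {
		n = want
	}
	return n
}

func main() {
	// Original alloc (cpu = 400) still counted against the node:
	// only 100 CPU is free, so no 120-CPU allocation fits.
	fmt.Println(placeable(500, 400, 120, 4)) // 0

	// Original alloc discounted first: all 500 CPU is free and
	// 4 * 120 = 480 fits.
	fmt.Println(placeable(500, 0, 120, 4)) // 4
}
```

With the old allocation still counted, free capacity is 500 - 400 = 100, which is less than one 120-CPU ask. Once it is discounted, 4 * 120 = 480 fits within 500.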
Alex Dadgar
641e178416
Stop before trying to place
2017-07-17 17:18:12 -07:00
Alex Dadgar
66a90326e1
Treat destructive updates atomically
2017-07-16 10:35:38 -07:00
Alex Dadgar
f86760db3c
Basic logs
2017-07-07 16:49:08 -07:00