Alex Dadgar
f9b056e1d1
Replace attributes map with new Attribute object
2018-10-13 14:08:58 -07:00
Alex Dadgar
04ba425dd5
validate constraints/affinities
2018-10-13 12:27:49 -07:00
Alex Dadgar
9b5aaac410
Device feasability checker
2018-10-13 12:27:49 -07:00
Alex Dadgar
bfb4caa2e7
node devices
2018-10-13 12:27:49 -07:00
Alex Dadgar
5a07f9f96e
parse affinities and constraints on devices
2018-10-11 14:05:19 -07:00
Alex Dadgar
6b08b9d6b6
Define device request structs
2018-10-08 15:38:03 -07:00
Alex Dadgar
01f8e5b95f
renames
2018-10-04 14:57:25 -07:00
Alex Dadgar
52f9cd7637
fixing tests
2018-10-04 14:26:19 -07:00
Alex Dadgar
bac5cb1e8b
Scheduler uses allocated resources
2018-10-02 17:08:25 -07:00
Alex Dadgar
147d2430a1
allocated resources structs
2018-09-29 18:47:28 -07:00
Alex Dadgar
5c8697667e
Node reserved resources
2018-09-29 18:44:55 -07:00
Alex Dadgar
3183153315
Node resources on client
2018-09-29 17:23:41 -07:00
Alex Dadgar
6a21f9fe96
Unique TriggerBy for blocked evals
...
Give blocked evals a unique triggerby reason to make debugging a chain
of evaluations easier.
2018-09-24 14:47:49 -07:00
Alex Dadgar
d7f5be9148
Better comment on snapshotindex
2018-09-24 13:53:43 -07:00
Alex Dadgar
99498da6ed
Denormalize jobs in plan and ignore resources of terminal allocs
...
Denormalize jobs in AppendAllocs:
AppendAlloc was originally only ever called for inplace upgrades and new
allocations. Both these code paths would remove the job from the
allocation. Now we use this to also add fields such as FollowupEvalID
which did not normalize the job. This is only a performance enhancement.
Ignore terminal allocs:
Failed allocations are annotated with the followup Eval ID when one is
created to replace the failed allocation. However, in the plan applier,
when we check if allocations fit, these terminal allocations were not
filtered. This could result in the plan being rejected if the node would
be overcommited if the terminal allocations resources were considered.
2018-09-24 13:53:43 -07:00
Preetha Appan
996484981c
Fix panic when reschedule policy for allocation can't be looked up
...
because its task group changed
2018-09-05 17:01:02 -05:00
Alex Dadgar
cc92cd92cd
Merge pull request #4642 from hashicorp/b-vet
...
Fix vet errors and use newer go version in travis
2018-09-04 17:04:02 -07:00
Alex Dadgar
c6576ddac1
Fix make check errors
2018-09-04 16:03:52 -07:00
Preetha Appan
26288b9522
Fix more review feedback
2018-09-04 16:10:11 -05:00
Preetha Appan
751c0eb5a5
code review feedback
2018-09-04 16:10:11 -05:00
Preetha Appan
4f8e925b54
Move topk and delay heap to separate packages under lib
2018-09-04 16:10:11 -05:00
Preetha Appan
9bc0962527
Track top k nodes by norm score rather than top k nodes per scorer
2018-09-04 16:10:11 -05:00
Preetha Appan
6ed527c636
Use heap to store top K scoring nodes.
...
Scoring metadata is now aggregated by scorer type to make it easier
to parse when reading it in the CLI.
2018-09-04 16:10:11 -05:00
Preetha Appan
dd5fe6373f
Fix scoring logic for uneven spread to incorporate current alloc count
...
Also addressed other small code review comments
2018-09-04 16:10:11 -05:00
Preetha Appan
e72c0fe527
more cleanup
2018-09-04 16:10:11 -05:00
Preetha Appan
92d37acc2a
comment and formatting cleanup
2018-09-04 16:10:11 -05:00
Preetha Appan
5812f906c8
Allow empty spread targets, and validate target percentages.
2018-09-04 16:10:11 -05:00
Preetha Appan
71bff00326
validate spread from job/task group validate methods
2018-09-04 16:10:11 -05:00
Preetha Appan
fbd0004707
Fix warnings
2018-09-04 16:10:11 -05:00
Preetha Appan
5eb82b6260
Validate method, and rename ratio field to percent
2018-09-04 16:10:11 -05:00
Preetha Appan
0037d72fa8
Structs and validation for spread
2018-09-04 16:10:11 -05:00
Preetha Appan
c407e3626f
More review comments
2018-09-04 16:10:11 -05:00
Preetha Appan
dbbb4a957a
Fail validation if system job has affinities
2018-09-04 16:10:11 -05:00
Preetha Appan
0bc030c6fb
Treat set_contains as a synonym of set_contains_all
2018-09-04 16:10:11 -05:00
Preetha Appan
f06c7ab2ad
Fix Copy method for job and task to include affinities
2018-09-04 16:10:11 -05:00
Preetha Appan
9f0caa9c3d
Affinity parsing, api and structs
2018-09-04 16:10:11 -05:00
Nick Ethier
41e010cdc2
nomad: add 'Dispatch' field to Job
...
New -bash: Dispatch: command not found field is used to denote if the Job is a child dispatched job of
a parameterized job.
2018-06-11 11:59:03 -04:00
Alex Dadgar
f2b2e0482b
code review fixes
2018-05-31 10:57:08 -07:00
Alex Dadgar
195e19827b
Deployment adds JobSpecModifyIndex
...
Deployment tracks the Job.JobModifyIndex so that PUTS against /v1/jobs
can be more easily coorelated with the created deployment.
Fixes https://github.com/hashicorp/nomad/issues/4301
2018-05-30 11:33:56 -07:00
Alex Dadgar
21c5ed850d
Register events
2018-05-22 14:06:33 -07:00
Alex Dadgar
17aac1c9de
node heartbeat missed event
2018-05-22 14:05:46 -07:00
Alex Dadgar
5f2080bc26
Emit events based on eligibility
2018-05-22 14:04:59 -07:00
Alex Dadgar
86be50fa05
Merge pull request #4284 from hashicorp/f-drain-event
...
Emit Node Events for draining
2018-05-22 21:04:18 +00:00
Preetha
159888a856
Merge pull request #4274 from hashicorp/f-force-rescheduling
...
Add CLI and API support for forcing rescheduling of failed allocs
2018-05-21 16:24:22 -07:00
Alex Dadgar
0cb31feb1f
Add node event when draining is set/removed/updated
2018-05-10 16:54:43 -07:00
Alex Dadgar
a35248d1d8
Plumb event via FSM
2018-05-10 16:30:54 -07:00
Preetha Appan
b12df3c64b
Added CLI for evaluating job given ID, and modified client API for evaluate to take a request payload
2018-05-09 15:04:27 -05:00
Chelsea Holland Komlo
d51611040f
Add driver health information to node list stub
2018-05-09 11:21:54 -04:00
Preetha Appan
c1b92c284e
Work in progress - force rescheduling of failed allocs
2018-05-08 17:26:57 -05:00
Michael Schurter
e90d051c43
consul: change hashed canary bytes
2018-05-07 14:55:01 -05:00
Alex Dadgar
8626c1b94a
Reschedule when we have canaries properly
2018-05-07 14:55:01 -05:00
Michael Schurter
50e04c976e
consul: support canary tags for services
...
Also refactor Consul ServiceClient to take a struct instead of a massive
set of arguments. Meant updating a lot of code but it should be far
easier to extend in the future as you will only need to update a single
struct instead of every single call site.
Adds an e2e test for canary tags.
2018-05-07 14:55:01 -05:00
Alex Dadgar
f4af30fbb5
Canary tags structs
2018-05-07 14:50:01 -05:00
Alex Dadgar
f95ab4ade8
Mark canaries on creation, and unmark on promotion
2018-05-07 14:50:01 -05:00
Alex Dadgar
224b3092ae
change default to 10m and docs
2018-05-07 14:50:01 -05:00
Alex Dadgar
8a81038cdb
Set Reschedule from deployment watcher
2018-05-07 14:50:01 -05:00
Alex Dadgar
fcf4f582d0
small review feedback fixes
2018-05-07 14:50:01 -05:00
Alex Dadgar
e5caaf3358
Small test fix
2018-05-07 14:50:01 -05:00
Alex Dadgar
99e00fb774
Pass through timestamp
2018-05-07 14:50:01 -05:00
Alex Dadgar
1336002255
Progress deadline in deployment state
2018-05-07 14:50:01 -05:00
Alex Dadgar
ee50789c22
Initial implementation
2018-05-07 14:50:01 -05:00
Michael Schurter
f6a4713141
consul: make grpc checks more like http checks
2018-05-04 11:08:11 -07:00
Michael Schurter
382caec1e1
consul: initial grpc implementation
...
Needs to be more like http.
2018-05-04 11:08:11 -07:00
Preetha Appan
274bed1892
Add RescheduleTracker to allocs list stub struct
2018-05-01 14:53:47 -05:00
Alex Dadgar
15ad3f94af
Fix command line
2018-04-26 15:46:22 -07:00
Alex Dadgar
d0f237086b
UX touchups
2018-04-26 15:24:27 -07:00
Chelsea Holland Komlo
fca0169dbc
handle potential panic in cron parsing
2018-04-26 16:57:45 -04:00
Alex Dadgar
4f2a7b6949
Fix copying drivers
2018-04-16 15:45:51 -07:00
Preetha Appan
9f84e17bfd
dont print reschedule policy in error message
2018-04-11 17:07:14 -05:00
Preetha Appan
a7b7b662ed
Make system jobs fail validation if they contain a reschedule stanza
2018-04-11 14:56:20 -05:00
Preetha
6254d75eee
Merge pull request #4101 from hashicorp/b-rescheduling-edge-fixes
...
Fixes edge cases around timing/ task finish time being set more than once
2018-04-04 16:18:21 -05:00
Preetha Appan
5e4525bd30
Moves setting finishedAt to the right place and adds two unit tests.
2018-04-04 14:38:15 -05:00
Michael Schurter
b1a90462a8
Merge pull request #4094 from hashicorp/b-drain-panic
...
drain: fix double-close panic on drain future
2018-04-04 10:31:14 -07:00
Alex Dadgar
4c9c6decd3
Merge pull request #4100 from hashicorp/b-vault-no-auth
...
Improve handling of Vault errors
2018-04-03 17:23:43 -07:00
Alex Dadgar
9617a13a2b
Correctly handle the upgrade path of a node being drained when applying Raft logs
2018-04-03 15:32:44 -07:00
Preetha Appan
00537c739b
Fixes edge cases around timing and task finish time being set more than once
2018-04-03 16:34:59 -05:00
Alex Dadgar
58a3ec3fb2
Improve Vault error handling
2018-04-03 14:29:22 -07:00
Michael Schurter
6840becf46
drain: refactor batch_future into its own file
...
aka What If structs.go Wasn't So Big?
2018-04-02 16:40:06 -07:00
Alex Dadgar
dc03fab29b
Canonicalize migrate
2018-03-29 17:42:58 -07:00
Michael Schurter
62e9553333
Merge pull request #4069 from hashicorp/f-hashealth
...
add HasHealth helper for nil checks
2018-03-29 17:03:20 -07:00
Alex Dadgar
301704091b
Handle upgrade where Node doesn't have eligiblity
...
This PR handles upgrading a node that has no scheduling eligiblity set.
2018-03-29 16:52:23 -07:00
Preetha
9a732c4acb
Merge pull request #4071 from hashicorp/b-handle-missing-finishedat
...
handle missing finishedAt
2018-03-29 17:11:34 -05:00
Preetha
81d48fc7cf
Merge pull request #4079 from hashicorp/b-filter-desiredstop
...
Filter desired status stop allocs correctly
2018-03-29 15:36:22 -05:00
Preetha Appan
c8317532ff
Use time from task events if task state does not have FinishedAt set
2018-03-29 14:05:56 -05:00
Alex Dadgar
b194f93f2f
Disallow Update stanza on Batch
2018-03-29 11:28:56 -07:00
Michael Schurter
91b5bb58d9
add HasHealth helper for nil checks
...
We performed the DeploymentStatus nil checks a couple different ways, so
hopefully this helper will consoldiate them and make it more clear what
the code is doing.
2018-03-29 09:29:19 -07:00
Preetha Appan
5090fefe96
Filter out allocs with DesiredState = stop, and unit tests
2018-03-29 09:28:52 -05:00
Preetha Appan
2da661595d
If FinishedAt is not set use alloc's modify time for rescheduling logic
2018-03-29 07:42:58 -05:00
Alex Dadgar
de4b3772f1
Create evals for system jobs when drain is unset
...
This PR creates evals for system jobs when:
* Drain is unset and mark eligible is true
* Eligibility is restored to the node
2018-03-27 15:53:24 -07:00
Chelsea Holland Komlo
b522a0fadc
fix up to string to use time.Time
2018-03-27 15:43:57 -04:00
Chelsea Holland Komlo
003bc209b9
use time.Time for node events for compatibility
2018-03-27 15:43:57 -04:00
Alex Dadgar
59005d1d26
Merge pull request #4049 from hashicorp/b-tunnel
...
Only track nodes if the conn is from the node
2018-03-27 12:39:34 -07:00
Alex Dadgar
5dacb057b7
Only track nodes if the conn is from the node
...
Fixes a bug in which a connection to a Nomad server was treated as a
connection to a node because the server forwarded a node specific RPC.
2018-03-27 09:59:31 -07:00
Preetha Appan
33e170c15d
s/linear/constant/g
2018-03-26 14:45:09 -05:00
Preetha Appan
7db930b3c3
Extra test case and better error message for ambiguous config
2018-03-26 13:30:09 -05:00
Preetha Appan
fbd56c35a8
Adds additional validation for ambigous settings (having both unlimited and attempts set)
2018-03-24 10:29:20 -05:00
Alex Dadgar
39987d5236
Merge branch 'master' into b-acl-name
2018-03-22 14:51:40 -07:00
Michael Schurter
a7f627e34c
eligbile -> eligible
2018-03-21 16:55:22 -07:00
Michael Schurter
a4f346abeb
remove spurious TODOs and FIXMEs
2018-03-21 16:55:22 -07:00
Michael Schurter
922842546c
JobNs -> NamespacedID
...
Also drop the New func as it's easy to swap the order of arguments since
they're both strings.
2018-03-21 16:51:45 -07:00
Michael Schurter
8dc7d9fb6a
drainer: RegisterJob -> RegisterJobs
...
Test job watcher
2018-03-21 16:51:45 -07:00
Alex Dadgar
2d91b9dfba
Batch drain update
2018-03-21 16:51:44 -07:00
Alex Dadgar
7b2bad8c5e
Toggle Drain allows resetting eligibility
...
This PR allows marking a node as eligible for scheduling while toggling
drain. By default the `nomad node drain -disable` commmand will mark it
as eligible but the drainer will maintain in-eligibility.
2018-03-21 16:51:44 -07:00
Alex Dadgar
405dab2253
integration test and basic fixes
2018-03-21 16:51:44 -07:00
Alex Dadgar
e63bcb474d
Drainer
2018-03-21 16:51:44 -07:00
Alex Dadgar
4754366640
job watcher
2018-03-21 16:51:44 -07:00
Alex Dadgar
a37329189a
Improve DeadlineTime helper
2018-03-21 16:51:44 -07:00
Alex Dadgar
0fba0101b6
RPC/FSM/State Store for Eligibility
2018-03-21 16:51:44 -07:00
Alex Dadgar
2f5309d82a
Remove update time
2018-03-21 16:51:43 -07:00
Alex Dadgar
010228577e
Drain cli, api, http
2018-03-21 16:51:43 -07:00
Alex Dadgar
e459a666ed
Node.Drain takes strategy
2018-03-21 16:49:48 -07:00
Michael Schurter
d1ec65d765
switch to new raft DesiredTransition message
2018-03-21 16:49:48 -07:00
Alex Dadgar
db4a634072
RPC, FSM, State Store for marking DesiredTransistion
...
fix build tag
2018-03-21 16:49:48 -07:00
Michael Schurter
c0542474db
drain: initial drainv2 structs and impl
2018-03-21 16:49:48 -07:00
Chelsea Komlo
6fc9231dac
Merge pull request #3856 from hashicorp/f-client-add-health-checks
...
Client driver health checks for Docker
2018-03-21 18:05:00 -04:00
Preetha
01898b2c25
Merge pull request #4007 from hashicorp/f-show-rescheduling-cli-job-status
...
Show a section on upcoming delayed evaluations when applicable
2018-03-21 14:37:38 -05:00
Chelsea Holland Komlo
3aa726baab
fix scheduler driver name; create node structs file
2018-03-21 15:15:25 -04:00
Chelsea Holland Komlo
0bde357731
add concept of health checks to fingerprinters and nodes
...
fix up feedback from code review
add driver info for all drivers to node
2018-03-21 15:15:25 -04:00
Preetha
17f2f52f08
Merge pull request #3979 from hashicorp/b_update_compat_delete
...
Delete compatibility code for job level update stanza
2018-03-21 09:17:01 -05:00
Preetha Appan
31a3c81c3b
Show a section on upcoming delayed evaluations when applicable
2018-03-19 21:42:37 -05:00
Preetha Appan
33a5a72323
Make suggested interval round to seconds, and more end to end test cases
2018-03-19 14:56:52 -05:00
Alex Dadgar
586ae36d13
Batch Deregister RPC
2018-03-16 10:53:03 -07:00
Preetha Appan
9a5e6edf1f
Rename DelayCeiling to MaxDelay
2018-03-14 16:10:32 -05:00
Preetha Appan
9fed0d2103
Get reschedule policy from the alloc directly
2018-03-14 16:10:32 -05:00
Preetha Appan
4d5e9bcb45
Extra comments, remove unnecessary if condition
2018-03-14 16:10:32 -05:00
Preetha Appan
1ab8f2b57a
Address some code review comments
2018-03-14 16:10:32 -05:00
Preetha Appan
342c3fb961
Added FollowupEvalID field and helper methods to calculate reschedule eligibility based on delay
2018-03-14 16:10:32 -05:00
Preetha Appan
87538fc87d
Fix formatting
2018-03-14 16:10:32 -05:00
Preetha Appan
5f50c3d618
Add new reschedule options to API layer and unit tests
2018-03-14 16:10:32 -05:00
Preetha Appan
10c9662222
New delayed rescheduling options, validation function and unit tests
2018-03-14 16:10:32 -05:00
Preetha Appan
a924183604
Remove compat code for upgrade stanza that copied state from job level update stanza
2018-03-14 10:21:46 -05:00
Chelsea Komlo
810eedfa2a
Merge pull request #3945 from hashicorp/f-add-node-events
...
Add node events
2018-03-14 08:42:55 -04:00
Preetha
360d6e5a92
Merge pull request #3968 from hashicorp/f-nicer-vault-error
...
Make server side error messages from vault more clearer
2018-03-13 20:49:39 -05:00
Preetha Appan
7b5955826d
Fix lint warning
2018-03-13 20:49:01 -05:00
Alex Dadgar
de6ebb6e6c
small cleanup
2018-03-13 18:08:22 -07:00
Alex Dadgar
63e14b7d63
nodeevents -> events
2018-03-13 18:08:22 -07:00
Alex Dadgar
d3c3deffad
fixes
2018-03-13 18:08:22 -07:00
Chelsea Holland Komlo
b41501e442
code review feedback
2018-03-13 18:08:21 -07:00
Chelsea Holland Komlo
1488b076d1
code review feedback
2018-03-13 18:08:21 -07:00
Chelsea Holland Komlo
a8bcbd81e6
batch submitting node events
2018-03-13 18:05:40 -07:00
Chelsea Holland Komlo
d30c269fbe
code review feedback
2018-03-13 18:05:40 -07:00
Chelsea Holland Komlo
0f306aa0dd
move all structs to structs file
2018-03-13 18:05:40 -07:00
Chelsea Holland Komlo
00d9923454
Ensure node updates don't strip node events
...
Add node events to CLI
2018-03-13 18:05:40 -07:00
Chelsea Holland Komlo
4ede27a3c8
RPC, FSM, state store for Node.EmitEvent
...
add node event when registering a node for the first time
2018-03-13 18:05:40 -07:00
Preetha Appan
e08ecb7da2
Fix incorrect comment
2018-03-13 18:25:41 -05:00
Preetha Appan
9618f52746
Remove error wrapping and make vault connection server side errors clearer.
2018-03-13 17:09:03 -05:00
Michael Schurter
7dd7fbcda2
non-Existent -> nonexistent
...
Reverting from #3963
https://www.merriam-webster.com/dictionary/existent
2018-03-12 11:59:33 -07:00
Josh Soref
173ce63fe9
spelling: transition
2018-03-11 19:06:05 +00:00
Josh Soref
c4c4645f46
spelling: summary
2018-03-11 19:00:07 +00:00
Josh Soref
3140a5dcf9
spelling: response
2018-03-11 18:48:24 +00:00