Preetha Appan
bfa0937bbb
Code review feedback
2018-05-10 14:42:24 -05:00
Preetha Appan
ca5758741b
Update serf to pick up graceful leave fix
2018-05-10 11:16:24 -05:00
Chelsea Holland Komlo
620558c107
log error if unable to create TLS configuration
2018-05-10 11:51:54 -04:00
Chelsea Holland Komlo
44f536f18e
add support for configurable TLS minimum version
2018-05-09 18:07:12 -04:00
Chelsea Holland Komlo
796bae6f1b
allow configurable cipher suites
...
disallow 3DES and RC4 ciphers
add documentation for tls_cipher_suites
2018-05-09 17:15:31 -04:00
Preetha Appan
b12df3c64b
Added CLI for evaluating job given ID, and modified client API for evaluate to take a request payload
2018-05-09 15:04:27 -05:00
Preetha Appan
ef531b0f34
Add unit tests for forced rescheduling
2018-05-09 11:30:42 -05:00
Chelsea Holland Komlo
d51611040f
Add driver health information to node list stub
2018-05-09 11:21:54 -04:00
Preetha Appan
1b8d8b2186
Fix logic inversion in force rescheduling
2018-05-08 20:00:06 -05:00
Preetha Appan
c1b92c284e
Work in progress - force rescheduling of failed allocs
2018-05-08 17:26:57 -05:00
Preetha Appan
c7edbd5f41
newlines in test
2018-05-07 14:55:01 -05:00
Preetha Appan
4e75456beb
Fix deadlock in deadline timer logic when progress deadline is passed and the deployment is updated.
2018-05-07 14:55:01 -05:00
Preetha Appan
cba13e4ec5
Fix test set up to set ModifyTime for alloc
2018-05-07 14:55:01 -05:00
Preetha Appan
19b096d203
Set modify time for allocs in unit test, and define current time in one spot
2018-05-07 14:55:01 -05:00
Preetha Appan
4c377b112e
Fix panic in deployment watcher when deployment is not in the state store due to a gc
2018-05-07 14:55:01 -05:00
Preetha
02d63432b4
Fix typo
2018-05-07 14:55:01 -05:00
Alex Dadgar
738056634e
Fix the initial progress deadline calculation when the alloc is inplace updated to be part of a new deployment
2018-05-07 14:55:01 -05:00
Michael Schurter
e90d051c43
consul: change hashed canary bytes
2018-05-07 14:55:01 -05:00
Alex Dadgar
768fec8505
Allow healthy canary deployment to skip progress deadline
2018-05-07 14:55:01 -05:00
Alex Dadgar
8626c1b94a
Reschedule when we have canaries properly
2018-05-07 14:55:01 -05:00
Michael Schurter
50e04c976e
consul: support canary tags for services
...
Also refactor Consul ServiceClient to take a struct instead of a massive
set of arguments. Meant updating a lot of code but it should be far
easier to extend in the future as you will only need to update a single
struct instead of every single call site.
Adds an e2e test for canary tags.
2018-05-07 14:55:01 -05:00
Michael Schurter
a3038cefb4
typo: transistion -> transition
2018-05-07 14:50:01 -05:00
Alex Dadgar
bd38675365
Fix tests
2018-05-07 14:50:01 -05:00
Alex Dadgar
319763a5d8
remove unnecessary merge of DeploymentStatus.Timestamp
2018-05-07 14:50:01 -05:00
Alex Dadgar
f4af30fbb5
Canary tags structs
2018-05-07 14:50:01 -05:00
Alex Dadgar
f95ab4ade8
Mark canaries on creation, and unmark on promotion
2018-05-07 14:50:01 -05:00
Preetha Appan
b2b773e696
better comments and remove commented code
2018-05-07 14:50:01 -05:00
Preetha Appan
90a2311cef
Fix deadlock in deployment watcher when deployment starts with no allocations and eventually has failed allocations
2018-05-07 14:50:01 -05:00
Alex Dadgar
224b3092ae
change default to 10m and docs
2018-05-07 14:50:01 -05:00
Alex Dadgar
c91ce5cc38
Fix not enqueuing eval
2018-05-07 14:50:01 -05:00
Alex Dadgar
8d50955054
Fix typos
2018-05-07 14:50:01 -05:00
Alex Dadgar
641ef81cbf
Test fixes
2018-05-07 14:50:01 -05:00
Alex Dadgar
8a81038cdb
Set Reschedule from deployment watcher
2018-05-07 14:50:01 -05:00
Alex Dadgar
a510774451
Use UpdateAllocDesiredTransistion instead of UpsertEval but no transitions yet
2018-05-07 14:50:01 -05:00
Alex Dadgar
fcf4f582d0
small review feedback fixes
2018-05-07 14:50:01 -05:00
Alex Dadgar
e5caaf3358
Small test fix
2018-05-07 14:50:01 -05:00
Alex Dadgar
9bff9024b3
add latest eval back
2018-05-07 14:50:01 -05:00
Alex Dadgar
99e00fb774
Pass through timestamp
2018-05-07 14:50:01 -05:00
Alex Dadgar
c49b5f9949
Handle progressed deployments and tests
2018-05-07 14:50:01 -05:00
Alex Dadgar
9e75ea0a11
Deployment watcher based on deployment having progress deadline
2018-05-07 14:50:01 -05:00
Alex Dadgar
1336002255
Progress deadline in deployment state
2018-05-07 14:50:01 -05:00
Alex Dadgar
55b483709f
Fix tests
2018-05-07 14:50:01 -05:00
Alex Dadgar
ee50789c22
Initial implementation
2018-05-07 14:50:01 -05:00
Michael Schurter
a4caf8208b
tests: fix grpc fields in task diff
2018-05-04 11:08:45 -07:00
Michael Schurter
f6a4713141
consul: make grpc checks more like http checks
2018-05-04 11:08:11 -07:00
Michael Schurter
382caec1e1
consul: initial grpc implementation
...
Needs to be more like http.
2018-05-04 11:08:11 -07:00
Preetha Appan
52b3b53181
Update ModifyIndex of alloc when setting NextAllocation value
2018-05-03 17:04:36 -05:00
Preetha Appan
274bed1892
Add RescheduleTracker to allocs list stub struct
2018-05-01 14:53:47 -05:00
Alex Dadgar
de4af37249
version bump and remove generated
2018-04-27 11:10:00 -07:00
Alex Dadgar
845a43864a
generated files
2018-04-27 10:45:40 -07:00
Alex Dadgar
d03c881802
small cleanup and logging
2018-04-27 10:36:28 -07:00
Alex Dadgar
da3a552d8d
Fix issue where node connection map wasn't being pruned
2018-04-27 10:16:03 -07:00
Alex Dadgar
35e06ddb31
Remove generated and version bump
2018-04-26 16:49:19 -07:00
Alex Dadgar
43192cefae
generated files
2018-04-26 16:28:58 -07:00
Alex Dadgar
265a6d4f8b
Merge pull request #4224 from hashicorp/b-cron-parse
...
Handle potential panic in cron parsing
2018-04-26 16:22:37 -07:00
Alex Dadgar
05eccb063f
Merge branch 'b-cron-parse' of github.com:hashicorp/nomad into b-cron-parse
2018-04-26 15:51:56 -07:00
Alex Dadgar
ea24513d38
Allow nomad to restore bad periodic job
2018-04-26 15:51:47 -07:00
Chelsea Holland Komlo
ce1c3e0c2d
add unit tests for panic cron parsing bug
...
add comments for cron parsing wrapper
2018-04-26 18:47:08 -04:00
Alex Dadgar
15ad3f94af
Fix command line
2018-04-26 15:46:22 -07:00
Alex Dadgar
dc2907c2c9
Codecgen full package
2018-04-26 15:24:53 -07:00
Alex Dadgar
d0f237086b
UX touchups
2018-04-26 15:24:27 -07:00
Chelsea Holland Komlo
fca0169dbc
handle potential panic in cron parsing
2018-04-26 16:57:45 -04:00
Alex Dadgar
ff7e2b960f
Add test
2018-04-26 13:28:24 -07:00
Alex Dadgar
4a23307baf
Track all client connections
2018-04-26 13:22:09 -07:00
Alex Dadgar
5320205853
Sort signals in implicit constraint
...
Fixes https://github.com/hashicorp/nomad/issues/4212
2018-04-26 10:12:47 -07:00
Alex Dadgar
79844f1d01
Safety guard
2018-04-25 16:00:56 -07:00
Alex Dadgar
d45f39f24e
Fix detecting drain strategy on GC'd node
2018-04-25 16:00:56 -07:00
Nick Ethier
2e6c95f511
Merge pull request #4138 from hashicorp/i-hcl-json-endpoint
...
HCL to JSON api endpoint
2018-04-19 14:18:34 -04:00
Alex Dadgar
eeb85299ff
gofmt -s nomad/structs/structs_test.go
2018-04-17 13:39:32 -07:00
Chelsea Holland Komlo
788b23e17e
add test for node copy
2018-04-17 12:58:07 -04:00
Nick Ethier
31da01856a
command/agent: add HCL mock for parse endpoint
2018-04-16 19:21:09 -04:00
Alex Dadgar
4f2a7b6949
Fix copying drivers
2018-04-16 15:45:51 -07:00
Alex Dadgar
adaf4fa7e0
Remove generated structs
2018-04-12 16:35:31 -07:00
Alex Dadgar
663c4d0433
Version bump and generated files
2018-04-12 16:21:50 -07:00
Preetha
bdc17ebf10
Merge pull request #4139 from hashicorp/b-reschedule-invalid-system-jobs
...
Make system jobs fail validation if they contain a reschedule stanza
2018-04-11 20:01:19 -05:00
Preetha Appan
9f84e17bfd
don't print reschedule policy in error message
2018-04-11 17:07:14 -05:00
Preetha Appan
fa90f036c6
Fix more tests
2018-04-11 15:51:24 -05:00
Preetha Appan
81f856e7c9
Fix one more failing test
2018-04-11 15:49:23 -05:00
Preetha
0b6fbb8e16
Merge pull request #4131 from hashicorp/b-rescheduling-fix-gc
...
Update garbage collection logic to make sure allocs with pending evals are not GCed
2018-04-11 15:44:36 -05:00
Preetha Appan
1da4d88f3d
Make test descriptions better
2018-04-11 15:12:23 -05:00
Preetha Appan
a7b7b662ed
Make system jobs fail validation if they contain a reschedule stanza
2018-04-11 14:56:20 -05:00
Preetha Appan
688fd9ee37
Update alloc GC eligibility logic to not rely on follow up evals
2018-04-11 13:58:02 -05:00
Charlie Voiselle
ba88f00ccb
Changed "til" to "until"
...
Should be "till" or "until"; chose "until" because it is unambiguous as to meaning.
2018-04-11 12:36:28 -05:00
Preetha
dec5b99478
Merge pull request #4120 from hashicorp/b-rescheduling-minimize-evals
...
Batch evals for rescheduling failed allocs correctly
2018-04-10 17:18:35 -05:00
Preetha Appan
59cce1d620
Fix unit test for core scheduler GC
2018-04-10 17:12:06 -05:00
Preetha Appan
7040884002
Simplify and update allocation gc eligibility logic
2018-04-10 16:08:37 -05:00
Preetha
c88fef4c4b
Merge pull request #4127 from hashicorp/b-autopilot-removepeer-fixes
...
Add node id persistence
2018-04-10 16:05:00 -05:00
Preetha Appan
a569d34f25
Add custom status description for rescheduling follow up evals, and make unit test robust
2018-04-10 15:30:15 -05:00
Preetha Appan
d17bfd8045
Make leader election test run on all three protocol versions
2018-04-10 14:20:02 -05:00
Preetha Appan
b3402efd0b
Adds a new custom description for update alloc triggered evals to make it easier to unit test.
2018-04-10 14:00:07 -05:00
Preetha Appan
6d0e1c9fea
Use preconfigured nodeID if there isn't a persisted node ID, and persist it if it's not persisted.
2018-04-10 08:47:33 -05:00
Preetha Appan
216c053742
Remove debug print statements
2018-04-10 08:16:50 -05:00
Alex Dadgar
d179a09b83
WIP: Not setting node id properly
2018-04-09 18:01:28 -07:00
Preetha Appan
868f4f19f4
Unit tests for rolling upgrade and killing a leader
2018-04-09 17:42:30 -05:00
Preetha Appan
24203ae2f7
Remove duplicate commit
2018-04-09 15:08:09 -05:00
Preetha Appan
d1cb5df477
Batch evals for rescheduling failed allocs correctly and group them by job ID
2018-04-09 14:05:31 -05:00
Michael Schurter
d086f17708
rpc: wrap up old version check in a helper
...
DRY it up
2018-04-09 11:09:05 -07:00
Michael Schurter
e1cbcf0b3c
rpc: give min rpc version variable a better name
2018-04-09 11:09:05 -07:00
Michael Schurter
88a9409f8e
rpc: only attempt NodeRpc for nodes>=0.8
...
Attempting NodeRpc (or streaming node rpc) for clients that do not
support it causes it to hang indefinitely because while the TCP
connection exists, the client will never respond.
2018-04-09 11:08:06 -07:00
Preetha
6254d75eee
Merge pull request #4101 from hashicorp/b-rescheduling-edge-fixes
...
Fixes edge cases around timing/ task finish time being set more than once
2018-04-04 16:18:21 -05:00
Preetha Appan
5e4525bd30
Moves setting finishedAt to the right place and adds two unit tests.
2018-04-04 14:38:15 -05:00
Michael Schurter
b1a90462a8
Merge pull request #4094 from hashicorp/b-drain-panic
...
drain: fix double-close panic on drain future
2018-04-04 10:31:14 -07:00
Alex Dadgar
4c9c6decd3
Merge pull request #4100 from hashicorp/b-vault-no-auth
...
Improve handling of Vault errors
2018-04-03 17:23:43 -07:00
Alex Dadgar
af1b185ce4
Fix flaky deadline tests
2018-04-03 16:51:57 -07:00
Michael Schurter
ba6628a1b6
drain: return on first error
...
If one error is encountered it is unlikely any further attempts will
succeed, so fail fast.
2018-04-03 16:46:35 -07:00
Alex Dadgar
2b14371db5
Fix spelling
2018-04-03 15:58:03 -07:00
Alex Dadgar
9617a13a2b
Correctly handle the upgrade path of a node being drained when applying Raft logs
2018-04-03 15:32:44 -07:00
Preetha Appan
00537c739b
Fixes edge cases around timing and task finish time being set more than once
2018-04-03 16:34:59 -05:00
Alex Dadgar
58a3ec3fb2
Improve Vault error handling
2018-04-03 14:29:22 -07:00
Michael Schurter
edc4891283
drain: improve tests and fix spelling
...
* transistion -> transition
* don't t.Fatal in goroutines
* don't mutate global state
2018-04-02 16:40:47 -07:00
Michael Schurter
6840becf46
drain: refactor batch_future into its own file
...
aka What If structs.go Wasn't So Big?
2018-04-02 16:40:06 -07:00
Michael Schurter
44a749a7cc
drain: fix double-close panic on drain future
2018-04-02 16:39:18 -07:00
Alex Dadgar
86f9044676
remove generated files
2018-03-30 16:52:49 -07:00
Alex Dadgar
af81349dbe
Generated files
2018-03-30 16:14:40 -07:00
Alex Dadgar
23ec54a372
Merge pull request #4089 from hashicorp/tls-error-fix
...
Check for nil for RPC listener; prevent double closing of listener channel
2018-03-30 16:08:13 -07:00
Alex Dadgar
7f28cfcdfe
small cleanup
2018-03-30 15:49:56 -07:00
Chelsea Holland Komlo
a77dd08dd9
prevent double close due to error in creating listener
2018-03-30 17:15:56 -04:00
Chelsea Holland Komlo
402a026c88
add further error handling for rpc connection handling
2018-03-30 17:03:36 -04:00
Alex Dadgar
e8809f40dc
Test transition from both infinite and a future deadline to force
2018-03-30 11:24:39 -07:00
Alex Dadgar
32a673a7e1
Fix force deadline notification
2018-03-30 09:58:29 -07:00
Alex Dadgar
1aa415b0d8
Integration test
2018-03-30 09:33:23 -07:00
Alex Dadgar
dc03fab29b
Canonicalize migrate
2018-03-29 17:42:58 -07:00
Alex Dadgar
e458ab9031
Merge branch 'master' into b-drain-batch
2018-03-29 17:10:34 -07:00
Michael Schurter
62e9553333
Merge pull request #4069 from hashicorp/f-hashealth
...
add HasHealth helper for nil checks
2018-03-29 17:03:20 -07:00
Alex Dadgar
301704091b
Handle upgrade where Node doesn't have eligibility
...
This PR handles upgrading a node that has no scheduling eligibility set.
2018-03-29 16:52:23 -07:00
Alex Dadgar
7d2aae2c11
test handleTaskGroup
2018-03-29 16:38:47 -07:00
Alex Dadgar
049a9213d2
Watch batch jobs
2018-03-29 16:07:51 -07:00
Preetha
9a732c4acb
Merge pull request #4071 from hashicorp/b-handle-missing-finishedat
...
handle missing finishedAt
2018-03-29 17:11:34 -05:00
Alex Dadgar
f12194328c
Integration test for batch complete case
2018-03-29 13:51:04 -07:00
Preetha
81d48fc7cf
Merge pull request #4079 from hashicorp/b-filter-desiredstop
...
Filter desired status stop allocs correctly
2018-03-29 15:36:22 -05:00
Preetha Appan
c8317532ff
Use time from task events if task state does not have FinishedAt set
2018-03-29 14:05:56 -05:00
Alex Dadgar
b194f93f2f
Disallow Update stanza on Batch
2018-03-29 11:28:56 -07:00
Michael Schurter
91b5bb58d9
add HasHealth helper for nil checks
...
We performed the DeploymentStatus nil checks a couple different ways, so
hopefully this helper will consolidate them and make it more clear what
the code is doing.
2018-03-29 09:29:19 -07:00
Chelsea Komlo
607e631714
Merge pull request #4046 from hashicorp/tls-same-file-reload
...
Check file contents when determining if agent should reload TLS confi…
2018-03-29 10:51:32 -04:00
Preetha Appan
5090fefe96
Filter out allocs with DesiredState = stop, and unit tests
2018-03-29 09:28:52 -05:00
Preetha Appan
8776f4b942
Fix failing test
2018-03-29 07:59:38 -05:00
Preetha Appan
2da661595d
If FinishedAt is not set use alloc's modify time for rescheduling logic
2018-03-29 07:42:58 -05:00
Alex Dadgar
b18f789020
Unmark drain when nodes hit their deadline and only batch/system left and add all job type integration test
2018-03-28 17:25:58 -07:00
Chelsea Holland Komlo
b33d909bf9
add test to assert invalid files return error
2018-03-28 18:31:35 -04:00
Chelsea Holland Komlo
58ada9bc42
return error when setting checksum; don't reload
2018-03-28 18:15:50 -04:00
Chelsea Holland Komlo
2d5af7ff4d
set TLS checksum when parsing config
...
Refactor checksum comparison, always set checksum if it is empty
2018-03-28 09:56:11 -04:00
Michael Schurter
65ddae86f8
Merge pull request #4054 from hashicorp/b-drainer-index-fix
...
drainer: reset index when new job registered
2018-03-27 16:28:25 -07:00
Michael Schurter
79a2781585
Merge pull request #4053 from hashicorp/b-drain-sys-jobs-2
...
drain: fix draining of system jobs
2018-03-27 16:26:45 -07:00
Alex Dadgar
de4b3772f1
Create evals for system jobs when drain is unset
...
This PR creates evals for system jobs when:
* Drain is unset and mark eligible is true
* Eligibility is restored to the node
2018-03-27 15:53:24 -07:00
Chelsea Holland Komlo
dd5f627feb
set server configuration checksum on reload
2018-03-27 18:03:52 -04:00
Michael Schurter
ec60a1d3e3
drain: improve comments
2018-03-27 14:27:09 -07:00
Michael Schurter
e5dfb7e487
drain: unittest draining node logic
2018-03-27 14:24:01 -07:00
Michael Schurter
a1ed305a24
test: add mock batch and system allocs
...
Since the BatchJob helper had a different task group than the Alloc
helper, it was difficult to create a valid batch alloc.
2018-03-27 14:24:01 -07:00
Michael Schurter
77bddc7941
drain: stop sys jobs after drain completes
...
System allocs should be drained when a node's deadline is hit or when
all other allocs on the node have stopped/migrated.
2018-03-27 14:24:01 -07:00
Michael Schurter
fae77b874b
drainer: reset index when new job registered
2018-03-27 14:12:59 -07:00
Chelsea Holland Komlo
b522a0fadc
fix up to string to use time.Time
2018-03-27 15:43:57 -04:00
Chelsea Holland Komlo
31557cc44f
move tests to use time.Time
2018-03-27 15:43:57 -04:00
Chelsea Holland Komlo
003bc209b9
use time.Time for node events for compatibility
2018-03-27 15:43:57 -04:00
Chelsea Holland Komlo
6e6d6b7e33
check file contents when determining if agent should reload TLS configuration
2018-03-27 15:42:20 -04:00
Alex Dadgar
59005d1d26
Merge pull request #4049 from hashicorp/b-tunnel
...
Only track nodes if the conn is from the node
2018-03-27 12:39:34 -07:00
Alex Dadgar
5dacb057b7
Only track nodes if the conn is from the node
...
Fixes a bug in which a connection to a Nomad server was treated as a
connection to a node because the server forwarded a node specific RPC.
2018-03-27 09:59:31 -07:00
Chelsea Komlo
57e2cd04bd
Merge pull request #4025 from hashicorp/reload-http-tls
...
Allow TLS configurations for HTTP and RPC connections to be reloaded …
2018-03-26 18:00:30 -04:00
Preetha Appan
539114124e
Fix too long token test case
2018-03-26 16:28:33 -05:00
Preetha Appan
33e170c15d
s/linear/constant/g
2018-03-26 14:45:09 -05:00
Preetha Appan
7db930b3c3
Extra test case and better error message for ambiguous config
2018-03-26 13:30:09 -05:00
Chelsea Holland Komlo
c2a95f9d7d
add test for upgrading only RPC connections
2018-03-26 10:55:27 -04:00
Preetha Appan
fbd56c35a8
Adds additional validation for ambiguous settings (having both unlimited and attempts set)
2018-03-24 10:29:20 -05:00
Alex Dadgar
39987d5236
Merge branch 'master' into b-acl-name
2018-03-22 14:51:40 -07:00
Michael Schurter
a7f627e34c
eligbile -> eligible
2018-03-21 16:55:22 -07:00
Michael Schurter
a4f346abeb
remove spurious TODOs and FIXMEs
2018-03-21 16:55:22 -07:00
Michael Schurter
9f3086a268
test: must initialize jobResults with new func
2018-03-21 16:51:45 -07:00
Michael Schurter
e432c9af55
test: disable node drainer during tests
...
Node drainer would throw off the index checks
2018-03-21 16:51:45 -07:00
Michael Schurter
5c8c4bce2a
test: disable drain during fsm test
...
drainer was unsetting drain before fsm could read written value
2018-03-21 16:51:45 -07:00
Michael Schurter
341d87aa48
tests: use mock.BatchJob to fix tests
2018-03-21 16:51:45 -07:00
Michael Schurter
8b107acc06
mock: add BatchJob() helper
2018-03-21 16:51:45 -07:00
Michael Schurter
cb61a4bdc7
Fix linting errors
2018-03-21 16:51:45 -07:00
Alex Dadgar
640ebdaef6
fix race in drain integration tests
2018-03-21 16:51:45 -07:00
Michael Schurter
c401d5a098
Refactor assertOps into a helper func
2018-03-21 16:51:45 -07:00
Michael Schurter
187b0e1a48
Remove debug prints
2018-03-21 16:51:45 -07:00
Michael Schurter
f67eca48ac
Deregister garbage collected jobs
2018-03-21 16:51:45 -07:00
Michael Schurter
922842546c
JobNs -> NamespacedID
...
Also drop the New func as it's easy to swap the order of arguments since
they're both strings.
2018-03-21 16:51:45 -07:00
Michael Schurter
8dc7d9fb6a
drainer: RegisterJob -> RegisterJobs
...
Test job watcher
2018-03-21 16:51:45 -07:00
Michael Schurter
3116897099
Fix deadline heap triggering
...
Chan must be buffered to avoid skipping triggering altogether
Also made timing in a test a bit more lenient
2018-03-21 16:51:45 -07:00
Alex Dadgar
9d23c965da
fix comment
2018-03-21 16:51:45 -07:00
Alex Dadgar
fb4badf1bc
sharding
2018-03-21 16:51:44 -07:00
Alex Dadgar
2d91b9dfba
Batch drain update
2018-03-21 16:51:44 -07:00
Alex Dadgar
92b636dd32
Fix deadline handling
2018-03-21 16:51:44 -07:00
Michael Schurter
9898edfa90
Switch to drainerv2 impl
2018-03-21 16:51:44 -07:00
Alex Dadgar
7b2bad8c5e
Toggle Drain allows resetting eligibility
...
This PR allows marking a node as eligible for scheduling while toggling
drain. By default the `nomad node drain -disable` command will mark it
as eligible but the drainer will maintain ineligibility.
2018-03-21 16:51:44 -07:00
Alex Dadgar
ad80e655cc
code review
2018-03-21 16:51:44 -07:00
Alex Dadgar
11f9fe4960
spelling fixes
2018-03-21 16:51:44 -07:00
Alex Dadgar
bc7385812d
Comments
2018-03-21 16:51:44 -07:00
Alex Dadgar
e87c677a42
handle empty node case
2018-03-21 16:51:44 -07:00
Alex Dadgar
405dab2253
integration test and basic fixes
2018-03-21 16:51:44 -07:00
Alex Dadgar
e63bcb474d
Drainer
2018-03-21 16:51:44 -07:00
Alex Dadgar
4754366640
job watcher
2018-03-21 16:51:44 -07:00
Alex Dadgar
504bfabb4d
Nodes being untracked or having updated deadlines update the deadliner
2018-03-21 16:51:44 -07:00
Alex Dadgar
66eaaa6a4d
node watcher
2018-03-21 16:51:44 -07:00
Alex Dadgar
527ac0b39d
drain heap
2018-03-21 16:51:44 -07:00
Alex Dadgar
2d4c193a0a
Initial design
2018-03-21 16:51:44 -07:00
Alex Dadgar
33ca319080
System test runs on mac
2018-03-21 16:51:44 -07:00
Alex Dadgar
f8d4a3a9e6
Fix file names
2018-03-21 16:51:44 -07:00
Michael Schurter
32a7649359
refactor main drainloop into 2 more methods
2018-03-21 16:51:44 -07:00
Michael Schurter
5e52f84bb7
drainer: refactor newStopAllocs, applyMigrations
2018-03-21 16:51:44 -07:00
Michael Schurter
62960ed7bd
client: don't monitor health of non-service jobs
...
Also fix system job draining; won't work without deadline fixes
2018-03-21 16:51:44 -07:00
Alex Dadgar
a37329189a
Improve DeadlineTime helper
2018-03-21 16:51:44 -07:00
Michael Schurter
b7c993f0e5
drainer: convert fsm errors to go errors
2018-03-21 16:51:44 -07:00
Michael Schurter
ab0de41884
drainer: factor job & node watchers out of drainer.go
2018-03-21 16:51:44 -07:00
Michael Schurter
5922aef623
Restart every time SetEnabled(true) is called
2018-03-21 16:51:44 -07:00
Michael Schurter
959d447d38
Remove unused context
2018-03-21 16:51:44 -07:00
Michael Schurter
8b41e9b2e1
drainer: drainer should shutdown with server
2018-03-21 16:51:44 -07:00
Michael Schurter
0a17076ad2
refactor drainer into a subpkg
2018-03-21 16:51:44 -07:00
Alex Dadgar
93871c18f8
Fix retaining the drain
2018-03-21 16:51:44 -07:00
Alex Dadgar
010a6b8ca5
Unblock evals once eligible
2018-03-21 16:51:44 -07:00
Alex Dadgar
8289cc3c6f
HTTP and API
2018-03-21 16:51:44 -07:00
Alex Dadgar
0fba0101b6
RPC/FSM/State Store for Eligibility
2018-03-21 16:51:44 -07:00
Alex Dadgar
b3d2346419
Upgrade path
2018-03-21 16:51:43 -07:00
Alex Dadgar
2f5309d82a
Remove update time
2018-03-21 16:51:43 -07:00
Alex Dadgar
0965c9ed28
Fix tests
2018-03-21 16:51:43 -07:00
Alex Dadgar
010228577e
Drain cli, api, http
2018-03-21 16:51:43 -07:00
Alex Dadgar
e459a666ed
Node.Drain takes strategy
2018-03-21 16:49:48 -07:00
Michael Schurter
03d0e5b8a0
improve drain fsm/statestore tests
2018-03-21 16:49:48 -07:00
Michael Schurter
d1ec65d765
switch to new raft DesiredTransition message
2018-03-21 16:49:48 -07:00
Michael Schurter
acf59ee75e
drainer: switch to job based watching
2018-03-21 16:49:48 -07:00
Alex Dadgar
db4a634072
RPC, FSM, State Store for marking DesiredTransistion
...
fix build tag
2018-03-21 16:49:48 -07:00
Michael Schurter
c0542474db
drain: initial drainv2 structs and impl
2018-03-21 16:49:48 -07:00
Chelsea Komlo
6fc9231dac
Merge pull request #3856 from hashicorp/f-client-add-health-checks
...
Client driver health checks for Docker
2018-03-21 18:05:00 -04:00
Chelsea Holland Komlo
66e44cdb73
Allow TLS configurations for HTTP and RPC connections to be reloaded separately
2018-03-21 17:51:08 -04:00
Preetha
01898b2c25
Merge pull request #4007 from hashicorp/f-show-rescheduling-cli-job-status
...
Show a section on upcoming delayed evaluations when applicable
2018-03-21 14:37:38 -05:00
Chelsea Holland Komlo
f801709a0a
fix issue when updating node events
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
86b7b3d2d9
fix up health check logic comparison; add node events to client driver checks
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
d8f68e5ef8
fix up codereview feedback
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
c7fd0bd8a1
fix up scheduler mocks
2018-03-21 15:15:25 -04:00
Chelsea Holland Komlo
c50d02ae93
go style; update comments
2018-03-21 15:15:25 -04:00
Chelsea Holland Komlo
a522da6994
fix up gofmt
2018-03-21 15:15:25 -04:00
Chelsea Holland Komlo
3aa726baab
fix scheduler driver name; create node structs file
2018-03-21 15:15:25 -04:00
Chelsea Holland Komlo
3cba95e8a7
allow nomad to schedule based on the status of a client driver health check
...
Slight updates for go style
2018-03-21 15:15:25 -04:00
Chelsea Holland Komlo
0bde357731
add concept of health checks to fingerprinters and nodes
...
fix up feedback from code review
add driver info for all drivers to node
2018-03-21 15:15:25 -04:00
Preetha
17f2f52f08
Merge pull request #3979 from hashicorp/b_update_compat_delete
...
Delete compatibility code for job level update stanza
2018-03-21 09:17:01 -05:00
Michael Schurter
70c370c6fe
Merge pull request #4003 from jrasell/f_gh_3988
...
Allow Nomad's Consul health check names to be configurable.
2018-03-20 16:44:08 -07:00
James Rasell
121c3bc997
Update Consul check params from using health-check to check.
2018-03-20 16:03:58 +01:00
Preetha Appan
31a3c81c3b
Show a section on upcoming delayed evaluations when applicable
2018-03-19 21:42:37 -05:00
Preetha Appan
33a5a72323
Make suggested interval round to seconds, and more end to end test cases
2018-03-19 14:56:52 -05:00
James Rasell
15afef9b77
Allow Nomad's Consul health checks to be configurable.
...
This change allows the client HTTP and the server HTTP, Serf and
RPC health check names within Consul to be configurable with the
defaults as previous. The configuration can be done via either a
config file or using CLI flags.
Closes #3988
2018-03-19 19:37:56 +01:00
Alex Dadgar
9e05c9a50e
Merge pull request #3997 from hashicorp/b-serf-addr
...
RPC Advertise used exclusively for Clients
2018-03-19 09:30:20 -07:00
Alex Dadgar
2baa1c38f2
clarify comment
2018-03-16 16:47:08 -07:00
Alex Dadgar
b8607ad6d6
Heartbeat uses client rpc advertise and server defaults server rpc advertise addr
2018-03-16 16:47:08 -07:00
Alex Dadgar
52b7fb5361
Separate client and server rpc advertise addresses
2018-03-16 16:47:08 -07:00
Michael Schurter
c3e8f6319c
gofmt -s (simplify) files
2018-03-16 16:31:16 -07:00
Alex Dadgar
b3ab063132
Merge pull request #3992 from hashicorp/f-vault-orphan
...
Allow and recommend Orphaned Vault tokens
2018-03-16 10:59:54 -07:00
Alex Dadgar
6a44e6092f
Pull snapshotting out of loop
2018-03-16 10:54:26 -07:00
Alex Dadgar
7545c0053e
job gc uses batch endpoint
2018-03-16 10:53:03 -07:00
Alex Dadgar
586ae36d13
Batch Deregister RPC
2018-03-16 10:53:03 -07:00
Alex Dadgar
c152774997
Allow and recommend Orphaned Vault tokens
...
This PR removes enforcement that the Vault token role disallows orphaned
tokens and recommends orphaned tokens to simplify the
bootstrapping/upgrading of Nomad clusters. The requirement that Nomad's
Vault token never expire and be shared by all instances of Nomad servers
is not operationally friendly.
2018-03-15 15:32:08 -07:00
Alex Dadgar
fc782d5942
List unblocks on summary changes
2018-03-15 10:22:03 -07:00