Commit graph

1282 commits

Author SHA1 Message Date
Preetha Appan 9bc0962527
Track top k nodes by norm score rather than top k nodes per scorer 2018-09-04 16:10:11 -05:00
Preetha Appan 6ed527c636
Use heap to store top K scoring nodes.
Scoring metadata is now aggregated by scorer type to make it easier
to parse when reading it in the CLI.
2018-09-04 16:10:11 -05:00
Preetha Appan dd5fe6373f
Fix scoring logic for uneven spread to incorporate current alloc count
Also addressed other small code review comments
2018-09-04 16:10:11 -05:00
Preetha Appan e72c0fe527
more cleanup 2018-09-04 16:10:11 -05:00
Preetha Appan 92d37acc2a
comment and formatting cleanup 2018-09-04 16:10:11 -05:00
Preetha Appan 5812f906c8
Allow empty spread targets, and validate target percentages. 2018-09-04 16:10:11 -05:00
Preetha Appan 71bff00326
validate spread from job/task group validate methods 2018-09-04 16:10:11 -05:00
Preetha Appan fbd0004707
Fix warnings 2018-09-04 16:10:11 -05:00
Preetha Appan 5eb82b6260
Validate method, and rename ratio field to percent 2018-09-04 16:10:11 -05:00
Preetha Appan 0037d72fa8
Structs and validation for spread 2018-09-04 16:10:11 -05:00
Preetha Appan c407e3626f
More review comments 2018-09-04 16:10:11 -05:00
Preetha Appan dbbb4a957a
Fail validation if system job has affinities 2018-09-04 16:10:11 -05:00
Preetha Appan 0bc030c6fb
Treat set_contains as a synonym of set_contains_all 2018-09-04 16:10:11 -05:00
Preetha Appan e85a721cfb
Include affinities in job and task diff, and more test cases 2018-09-04 16:10:11 -05:00
Preetha Appan f06c7ab2ad
Fix Copy method for job and task to include affinities 2018-09-04 16:10:11 -05:00
Preetha Appan 9f0caa9c3d
Affinity parsing, api and structs 2018-09-04 16:10:11 -05:00
Alex Dadgar bff1669ee4 Plugin config parsing 2018-08-29 17:06:01 -07:00
Chelsea Holland Komlo 31d6d00381 add simple getter for certificate 2018-08-10 12:37:21 -04:00
Alex Dadgar d0043691fb remove structs + bump version 2018-06-11 13:52:19 -07:00
Alex Dadgar af5753d2cd bump version + generated files 2018-06-11 13:39:42 -07:00
Nick Ethier a581cc9c01
nomad/structs: fix job diff test 2018-06-11 13:06:49 -04:00
Nick Ethier 41e010cdc2
nomad: add 'Dispatch' field to Job
New -bash: Dispatch: command not found field is used to denote if the Job is a child dispatched job of
a parameterized job.
2018-06-11 11:59:03 -04:00
Chelsea Holland Komlo 282f37b1ee fix for dynamically reloading vault 2018-06-07 15:34:18 -04:00
Alex Dadgar 23cd56dc78 remove generated structs 2018-06-01 16:11:28 -07:00
Alex Dadgar c0386819b3 bump version/lint/generated files 2018-06-01 15:23:10 -07:00
Alex Dadgar 7f25fcc1bd
Merge pull request #4354 from hashicorp/b-job-modify
Deployment adds JobSpecModifyIndex
2018-05-31 17:57:38 +00:00
Alex Dadgar f2b2e0482b code review fixes 2018-05-31 10:57:08 -07:00
Alex Dadgar 195e19827b Deployment adds JobSpecModifyIndex
Deployment tracks the Job.JobModifyIndex so that PUTS against /v1/jobs
can be more easily coorelated with the created deployment.

Fixes https://github.com/hashicorp/nomad/issues/4301
2018-05-30 11:33:56 -07:00
Chelsea Holland Komlo 19e4a5489b add support for tls PreferServerCipherSuites
add further tests for tls configuration
2018-05-25 13:20:00 -04:00
Chelsea Holland Komlo 38f611a7f2 refactor NewTLSConfiguration to pass in verifyIncoming/verifyOutgoing
add missing fields to TLS merge method
2018-05-23 18:35:30 -04:00
Alex Dadgar 21c5ed850d Register events 2018-05-22 14:06:33 -07:00
Alex Dadgar 17aac1c9de node heartbeat missed event 2018-05-22 14:05:46 -07:00
Alex Dadgar 5f2080bc26 Emit events based on eligibility 2018-05-22 14:04:59 -07:00
Alex Dadgar 86be50fa05
Merge pull request #4284 from hashicorp/f-drain-event
Emit Node Events for draining
2018-05-22 21:04:18 +00:00
Preetha 159888a856
Merge pull request #4274 from hashicorp/f-force-rescheduling
Add CLI and API support for forcing rescheduling of failed allocs
2018-05-21 16:24:22 -07:00
Chelsea Komlo 687c26093c
Merge pull request #4269 from hashicorp/f-tls-remove-weak-standards
Configurable TLS cipher suites and versions; disallow weak ciphers
2018-05-11 08:11:46 -04:00
Alex Dadgar 0cb31feb1f Add node event when draining is set/removed/updated 2018-05-10 16:54:43 -07:00
Alex Dadgar a35248d1d8 Plumb event via FSM 2018-05-10 16:30:54 -07:00
Chelsea Holland Komlo 44f536f18e add support for configurable TLS minimum version 2018-05-09 18:07:12 -04:00
Chelsea Holland Komlo 796bae6f1b allow configurable cipher suites
disallow 3DES and RC4 ciphers

add documentation for tls_cipher_suites
2018-05-09 17:15:31 -04:00
Preetha Appan b12df3c64b
Added CLI for evaluating job given ID, and modified client API for evaluate to take a request payload 2018-05-09 15:04:27 -05:00
Chelsea Holland Komlo d51611040f Add driver health information to node list stub 2018-05-09 11:21:54 -04:00
Preetha Appan c1b92c284e
Work in progress - force rescheduling of failed allocs 2018-05-08 17:26:57 -05:00
Michael Schurter e90d051c43
consul: change hashed canary bytes 2018-05-07 14:55:01 -05:00
Alex Dadgar 8626c1b94a
Reschedule when we have canaries properly 2018-05-07 14:55:01 -05:00
Michael Schurter 50e04c976e
consul: support canary tags for services
Also refactor Consul ServiceClient to take a struct instead of a massive
set of arguments. Meant updating a lot of code but it should be far
easier to extend in the future as you will only need to update a single
struct instead of every single call site.

Adds an e2e test for canary tags.
2018-05-07 14:55:01 -05:00
Alex Dadgar f4af30fbb5
Canary tags structs 2018-05-07 14:50:01 -05:00
Alex Dadgar f95ab4ade8
Mark canaries on creation, and unmark on promotion 2018-05-07 14:50:01 -05:00
Alex Dadgar 224b3092ae
change default to 10m and docs 2018-05-07 14:50:01 -05:00
Alex Dadgar 8a81038cdb
Set Reschedule from deployment watcher 2018-05-07 14:50:01 -05:00
Alex Dadgar fcf4f582d0
small review feedback fixes 2018-05-07 14:50:01 -05:00
Alex Dadgar e5caaf3358
Small test fix 2018-05-07 14:50:01 -05:00
Alex Dadgar 99e00fb774
Pass through timestamp 2018-05-07 14:50:01 -05:00
Alex Dadgar 1336002255
Progress deadline in deployment state 2018-05-07 14:50:01 -05:00
Alex Dadgar ee50789c22
Initial implementation 2018-05-07 14:50:01 -05:00
Michael Schurter a4caf8208b tests: fix grpc fields in task diff 2018-05-04 11:08:45 -07:00
Michael Schurter f6a4713141 consul: make grpc checks more like http checks 2018-05-04 11:08:11 -07:00
Michael Schurter 382caec1e1 consul: initial grpc implementation
Needs to be more like http.
2018-05-04 11:08:11 -07:00
Preetha Appan 274bed1892
Add RescheduleTracker to allocs list stub struct 2018-05-01 14:53:47 -05:00
Alex Dadgar de4af37249 version bump and remove generated 2018-04-27 11:10:00 -07:00
Alex Dadgar 845a43864a generated files 2018-04-27 10:45:40 -07:00
Alex Dadgar 35e06ddb31 Remove generated and version bump 2018-04-26 16:49:19 -07:00
Alex Dadgar 43192cefae generated files 2018-04-26 16:28:58 -07:00
Alex Dadgar 265a6d4f8b
Merge pull request #4224 from hashicorp/b-cron-parse
Handle potential panic in cron parsing
2018-04-26 16:22:37 -07:00
Chelsea Holland Komlo ce1c3e0c2d add unit tests for panic cron parsing bug
add comments for cron parsing wrapper
2018-04-26 18:47:08 -04:00
Alex Dadgar 15ad3f94af Fix command line 2018-04-26 15:46:22 -07:00
Alex Dadgar dc2907c2c9 Codecgen full package 2018-04-26 15:24:53 -07:00
Alex Dadgar d0f237086b UX touchups 2018-04-26 15:24:27 -07:00
Chelsea Holland Komlo fca0169dbc handle potential panic in cron parsing 2018-04-26 16:57:45 -04:00
Alex Dadgar eeb85299ff gofmt -s nomad/structs/structs_test.go 2018-04-17 13:39:32 -07:00
Chelsea Holland Komlo 788b23e17e add test for node copy 2018-04-17 12:58:07 -04:00
Alex Dadgar 4f2a7b6949 Fix copying drivers 2018-04-16 15:45:51 -07:00
Alex Dadgar adaf4fa7e0 Remove generated structs 2018-04-12 16:35:31 -07:00
Alex Dadgar 663c4d0433 Version bump and generated files 2018-04-12 16:21:50 -07:00
Preetha bdc17ebf10
Merge pull request #4139 from hashicorp/b-reschedule-invalid-system-jobs
Make system jobs fail validation if they contain a reschedule stanza
2018-04-11 20:01:19 -05:00
Preetha Appan 9f84e17bfd
dont print reschedule policy in error message 2018-04-11 17:07:14 -05:00
Preetha Appan 81f856e7c9
Fix one more failing test 2018-04-11 15:49:23 -05:00
Preetha Appan a7b7b662ed
Make system jobs fail validation if they contain a reschedule stanza 2018-04-11 14:56:20 -05:00
Michael Schurter 88a9409f8e rpc: only attempt NodeRpc for nodes>=0.8
Attempting NodeRpc (or streaming node rpc) for clients that do not
support it causes it to hang indefinitely because while the TCP
connection exists, the client will never respond.
2018-04-09 11:08:06 -07:00
Preetha 6254d75eee
Merge pull request #4101 from hashicorp/b-rescheduling-edge-fixes
Fixes edge cases around timing/ task finish time being set more than once
2018-04-04 16:18:21 -05:00
Preetha Appan 5e4525bd30
Moves setting finishedAt to the right place and adds two unit tests. 2018-04-04 14:38:15 -05:00
Michael Schurter b1a90462a8
Merge pull request #4094 from hashicorp/b-drain-panic
drain: fix double-close panic on drain future
2018-04-04 10:31:14 -07:00
Alex Dadgar 4c9c6decd3
Merge pull request #4100 from hashicorp/b-vault-no-auth
Improve handling of Vault errors
2018-04-03 17:23:43 -07:00
Alex Dadgar 9617a13a2b Correctly handle the upgrade path of a node being drained when applying Raft logs 2018-04-03 15:32:44 -07:00
Preetha Appan 00537c739b
Fixes edge cases around timing and task finish time being set more than once 2018-04-03 16:34:59 -05:00
Alex Dadgar 58a3ec3fb2 Improve Vault error handling 2018-04-03 14:29:22 -07:00
Michael Schurter 6840becf46 drain: refactor batch_future into its own file
aka What If structs.go Wasn't So Big?
2018-04-02 16:40:06 -07:00
Alex Dadgar 86f9044676 remove generated files 2018-03-30 16:52:49 -07:00
Alex Dadgar af81349dbe Generated files 2018-03-30 16:14:40 -07:00
Alex Dadgar dc03fab29b Canonicalize migrate 2018-03-29 17:42:58 -07:00
Michael Schurter 62e9553333
Merge pull request #4069 from hashicorp/f-hashealth
add HasHealth helper for nil checks
2018-03-29 17:03:20 -07:00
Alex Dadgar 301704091b Handle upgrade where Node doesn't have eligiblity
This PR handles upgrading a node that has no scheduling eligiblity set.
2018-03-29 16:52:23 -07:00
Preetha 9a732c4acb
Merge pull request #4071 from hashicorp/b-handle-missing-finishedat
handle missing finishedAt
2018-03-29 17:11:34 -05:00
Preetha 81d48fc7cf
Merge pull request #4079 from hashicorp/b-filter-desiredstop
Filter desired status stop allocs correctly
2018-03-29 15:36:22 -05:00
Preetha Appan c8317532ff
Use time from task events if task state does not have FinishedAt set 2018-03-29 14:05:56 -05:00
Alex Dadgar b194f93f2f Disallow Update stanza on Batch 2018-03-29 11:28:56 -07:00
Michael Schurter 91b5bb58d9 add HasHealth helper for nil checks
We performed the DeploymentStatus nil checks a couple different ways, so
hopefully this helper will consoldiate them and make it more clear what
the code is doing.
2018-03-29 09:29:19 -07:00
Chelsea Komlo 607e631714
Merge pull request #4046 from hashicorp/tls-same-file-reload
Check file contents when determining if agent should reload TLS confi…
2018-03-29 10:51:32 -04:00
Preetha Appan 5090fefe96
Filter out allocs with DesiredState = stop, and unit tests 2018-03-29 09:28:52 -05:00
Preetha Appan 8776f4b942
Fix failing test 2018-03-29 07:59:38 -05:00
Preetha Appan 2da661595d
If FinishedAt is not set use alloc's modify time for rescheduling logic 2018-03-29 07:42:58 -05:00
Chelsea Holland Komlo b33d909bf9 add test to assert invalid files return error 2018-03-28 18:31:35 -04:00
Chelsea Holland Komlo 58ada9bc42 return error when setting checksum; don't reload 2018-03-28 18:15:50 -04:00
Chelsea Holland Komlo 2d5af7ff4d set TLS checksum when parsing config
Refactor checksum comparison, always set checksum if it is empty
2018-03-28 09:56:11 -04:00
Alex Dadgar de4b3772f1 Create evals for system jobs when drain is unset
This PR creates evals for system jobs when:

* Drain is unset and mark eligible is true
* Eligibility is restored to the node
2018-03-27 15:53:24 -07:00
Chelsea Holland Komlo b522a0fadc fix up to string to use time.Time 2018-03-27 15:43:57 -04:00
Chelsea Holland Komlo 003bc209b9 use time.Time for node events for compatibility 2018-03-27 15:43:57 -04:00
Chelsea Holland Komlo 6e6d6b7e33 check file contents when determining if agent should reload TLS configuration 2018-03-27 15:42:20 -04:00
Alex Dadgar 59005d1d26
Merge pull request #4049 from hashicorp/b-tunnel
Only track nodes if the conn is from the node
2018-03-27 12:39:34 -07:00
Alex Dadgar 5dacb057b7 Only track nodes if the conn is from the node
Fixes a bug in which a connection to a Nomad server was treated as a
connection to a node because the server forwarded a node specific RPC.
2018-03-27 09:59:31 -07:00
Chelsea Komlo 57e2cd04bd
Merge pull request #4025 from hashicorp/reload-http-tls
Allow TLS configurations for HTTP and RPC connections to be reloaded …
2018-03-26 18:00:30 -04:00
Preetha Appan 539114124e
Fix too long token test case 2018-03-26 16:28:33 -05:00
Preetha Appan 33e170c15d
s/linear/constant/g 2018-03-26 14:45:09 -05:00
Preetha Appan 7db930b3c3
Extra test case and better error message for ambiguous config 2018-03-26 13:30:09 -05:00
Preetha Appan fbd56c35a8
Adds additional validation for ambigous settings (having both unlimited and attempts set) 2018-03-24 10:29:20 -05:00
Alex Dadgar 39987d5236
Merge branch 'master' into b-acl-name 2018-03-22 14:51:40 -07:00
Michael Schurter a7f627e34c eligbile -> eligible 2018-03-21 16:55:22 -07:00
Michael Schurter a4f346abeb remove spurious TODOs and FIXMEs 2018-03-21 16:55:22 -07:00
Michael Schurter 922842546c JobNs -> NamespacedID
Also drop the New func as it's easy to swap the order of arguments since
they're both strings.
2018-03-21 16:51:45 -07:00
Michael Schurter 8dc7d9fb6a drainer: RegisterJob -> RegisterJobs
Test job watcher
2018-03-21 16:51:45 -07:00
Alex Dadgar 2d91b9dfba Batch drain update 2018-03-21 16:51:44 -07:00
Alex Dadgar 7b2bad8c5e Toggle Drain allows resetting eligibility
This PR allows marking a node as eligible for scheduling while toggling
drain. By default the `nomad node drain -disable` commmand will mark it
as eligible but the drainer will maintain in-eligibility.
2018-03-21 16:51:44 -07:00
Alex Dadgar 405dab2253 integration test and basic fixes 2018-03-21 16:51:44 -07:00
Alex Dadgar e63bcb474d Drainer 2018-03-21 16:51:44 -07:00
Alex Dadgar 4754366640 job watcher 2018-03-21 16:51:44 -07:00
Alex Dadgar a37329189a Improve DeadlineTime helper 2018-03-21 16:51:44 -07:00
Alex Dadgar 0fba0101b6 RPC/FSM/State Store for Eligibility 2018-03-21 16:51:44 -07:00
Alex Dadgar 2f5309d82a Remove update time 2018-03-21 16:51:43 -07:00
Alex Dadgar 010228577e Drain cli, api, http 2018-03-21 16:51:43 -07:00
Alex Dadgar e459a666ed Node.Drain takes strategy 2018-03-21 16:49:48 -07:00
Michael Schurter d1ec65d765 switch to new raft DesiredTransition message 2018-03-21 16:49:48 -07:00
Alex Dadgar db4a634072 RPC, FSM, State Store for marking DesiredTransistion
fix build tag
2018-03-21 16:49:48 -07:00
Michael Schurter c0542474db drain: initial drainv2 structs and impl 2018-03-21 16:49:48 -07:00
Chelsea Komlo 6fc9231dac
Merge pull request #3856 from hashicorp/f-client-add-health-checks
Client driver health checks for Docker
2018-03-21 18:05:00 -04:00
Chelsea Holland Komlo 66e44cdb73 Allow TLS configurations for HTTP and RPC connections to be reloaded separately 2018-03-21 17:51:08 -04:00
Preetha 01898b2c25
Merge pull request #4007 from hashicorp/f-show-rescheduling-cli-job-status
Show a section on upcoming delayed evaluations when applicable
2018-03-21 14:37:38 -05:00
Chelsea Holland Komlo f801709a0a fix issue when updating node events 2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo 86b7b3d2d9 fix up health check logic comparison; add node events to client driver checks 2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo d8f68e5ef8 fix up codereview feedback 2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo c50d02ae93 go style; update comments 2018-03-21 15:15:25 -04:00
Chelsea Holland Komlo a522da6994 fix up gofmt 2018-03-21 15:15:25 -04:00
Chelsea Holland Komlo 3aa726baab fix scheduler driver name; create node structs file 2018-03-21 15:15:25 -04:00
Chelsea Holland Komlo 0bde357731 add concept of health checks to fingerprinters and nodes
fix up feedback from code review

add driver info for all drivers to node
2018-03-21 15:15:25 -04:00
Preetha 17f2f52f08
Merge pull request #3979 from hashicorp/b_update_compat_delete
Delete compatibility code for job level update stanza
2018-03-21 09:17:01 -05:00
Michael Schurter 70c370c6fe
Merge pull request #4003 from jrasell/f_gh_3988
Allow Nomads Consul health check names to be configurable.
2018-03-20 16:44:08 -07:00
James Rasell 121c3bc997 Update Consul check params from using health-check to check. 2018-03-20 16:03:58 +01:00
Preetha Appan 31a3c81c3b
Show a section on upcoming delayed evaluations when applicable 2018-03-19 21:42:37 -05:00
Preetha Appan 33a5a72323
Make suggested interval round to seconds, and more end to end test cases 2018-03-19 14:56:52 -05:00
James Rasell 15afef9b77 Allow Nomads Consul health checks to be configurable.
This change allows the client HTTP and the server HTTP, Serf and
RPC health check names within Consul to be configurable with the
defaults as previous. The configuration can be done via either a
config file or using CLI flags.

Closes #3988
2018-03-19 19:37:56 +01:00
Alex Dadgar 586ae36d13 Batch Deregister RPC 2018-03-16 10:53:03 -07:00