Commit Graph

184 Commits

Author SHA1 Message Date
Alex Dadgar 08dc2ea702
Merge pull request #4867 from hashicorp/b-deployment-progress-deadline
Blocked evaluation fixes
2018-11-13 10:29:03 -08:00
Alex Dadgar 6d8bb3a7bd Duplicate blocked evals cancelling improved
The old logic for cancelling duplicate blocked evaluations by job id had
the issue where the newer evaluation could have additional node classes
that it is (in)eligible for that we would not capture. This could make
it such that cluster state could change such that the job would make
progress but no evaluation was unblocked.
2018-11-07 10:08:23 -08:00
Preetha Appan 32cc764072
Add fsm layer tests 2018-10-30 11:06:32 -05:00
Preetha Appan 7b8156fc47
Restore/Snapshot plus unit tests for scheduler configuration 2018-10-30 11:06:32 -05:00
Preetha Appan c1c1c230e4
Make preemption config a struct to allow for enabling based on scheduler type 2018-10-30 11:06:32 -05:00
Preetha Appan bd34cbb1f7
Support for new scheduler config API, first use case is to disable preemption 2018-10-30 11:06:32 -05:00
Alex Dadgar 52f9cd7637 fixing tests 2018-10-04 14:26:19 -07:00
Alex Dadgar 3c19d01d7a server 2018-09-15 16:23:13 -07:00
Alex Dadgar 300b1a7a15 Tests only use testlog package logger 2018-06-13 15:40:56 -07:00
Preetha Appan 4134fcd2c7
Fix test setup for FSMSnapshotRestore_Deployments to use a valid job that exists 2018-05-31 14:39:39 -05:00
Alex Dadgar 352f2e03b5 Clean up leaked deployments on restoration
This PR cancels deployments that are active but do not have a job
associated with them. This is a broken invariant that causes issues in
the deployment watcher since it will not track them. Thus they are
objects that can't be operated on or cleaned up.

Fixes https://github.com/hashicorp/nomad/issues/4286
2018-05-23 16:44:21 -07:00
Alex Dadgar 17aac1c9de node heartbeat missed event 2018-05-22 14:05:46 -07:00
Alex Dadgar 5f2080bc26 Emit events based on eligibility 2018-05-22 14:04:59 -07:00
Alex Dadgar a35248d1d8 Plumb event via FSM 2018-05-10 16:30:54 -07:00
Alex Dadgar 9617a13a2b Correctly handle the upgrade path of a node being drained when applying Raft logs 2018-04-03 15:32:44 -07:00
Alex Dadgar 301704091b Handle upgrade where Node doesn't have eligiblity
This PR handles upgrading a node that has no scheduling eligiblity set.
2018-03-29 16:52:23 -07:00
Chelsea Holland Komlo 31557cc44f move tests to use time.Time 2018-03-27 15:43:57 -04:00
Michael Schurter cb61a4bdc7 Fix linting errors 2018-03-21 16:51:45 -07:00
Alex Dadgar 9d23c965da fix comment 2018-03-21 16:51:45 -07:00
Alex Dadgar 2d91b9dfba Batch drain update 2018-03-21 16:51:44 -07:00
Alex Dadgar 010a6b8ca5 Unblock evals once eligible 2018-03-21 16:51:44 -07:00
Alex Dadgar 0fba0101b6 RPC/FSM/State Store for Eligibility 2018-03-21 16:51:44 -07:00
Alex Dadgar 2f5309d82a Remove update time 2018-03-21 16:51:43 -07:00
Alex Dadgar 0965c9ed28 Fix tests 2018-03-21 16:51:43 -07:00
Alex Dadgar e459a666ed Node.Drain takes strategy 2018-03-21 16:49:48 -07:00
Michael Schurter 03d0e5b8a0 improve drain fsm/statestore tests 2018-03-21 16:49:48 -07:00
Michael Schurter d1ec65d765 switch to new raft DesiredTransition message 2018-03-21 16:49:48 -07:00
Alex Dadgar db4a634072 RPC, FSM, State Store for marking DesiredTransistion
fix build tag
2018-03-21 16:49:48 -07:00
Michael Schurter c3e8f6319c gofmt -s (simplify) files 2018-03-16 16:31:16 -07:00
Alex Dadgar 586ae36d13 Batch Deregister RPC 2018-03-16 10:53:03 -07:00
Alex Dadgar 63e14b7d63 nodeevents -> events 2018-03-13 18:08:22 -07:00
Alex Dadgar d3c3deffad fixes 2018-03-13 18:08:22 -07:00
Chelsea Holland Komlo 1488b076d1 code review feedback 2018-03-13 18:08:21 -07:00
Chelsea Holland Komlo a8bcbd81e6 batch submitting node events 2018-03-13 18:05:40 -07:00
Chelsea Holland Komlo 311cb857ac add node id to tests; check for node events in state store 2018-03-13 18:05:40 -07:00
Chelsea Holland Komlo 00d9923454 Ensure node updates don't strip node events
Add node events to CLI
2018-03-13 18:05:40 -07:00
Chelsea Holland Komlo ae7a2d6eb1 code review feedback 2018-03-13 18:05:40 -07:00
Chelsea Holland Komlo 4ede27a3c8 RPC, FSM, state store for Node.EmitEvent
add node event when registering a node for the first time
2018-03-13 18:05:40 -07:00
Michael Schurter 7dd7fbcda2 non-Existent -> nonexistent
Reverting from #3963

https://www.merriam-webster.com/dictionary/existent
2018-03-12 11:59:33 -07:00
Josh Soref 7f6e4012a0 spelling: existent 2018-03-11 18:30:37 +00:00
Josh Soref d9ce1f7882 spelling: deregister 2018-03-11 17:53:22 +00:00
Preetha Appan 9c600c6d6c
Clean up comments, and one more unit test 2018-01-31 09:56:53 -06:00
Kyle Havlovitz 2ccf565bf6 Refactor redundancy_zone/upgrade_version out of client meta 2018-01-29 20:03:38 -08:00
Kyle Havlovitz 12ff22ea70 Merge branch 'master' into autopilot 2018-01-18 13:29:25 -08:00
Michael Schurter 5d65eba2e6 Strip mocked dynamic port for fsm test 2017-12-19 16:41:35 -08:00
Kyle Havlovitz 1c07066064 Add autopilot functionality based on Consul's autopilot 2017-12-18 14:29:41 -08:00
Preetha Appan 3c36abfe14
Update eval modify index as part of plan apply. 2017-12-18 10:03:55 -06:00
Alex Dadgar 86608124ca Fix followers not creating periodic launch
Fix an issue in which periodic launches wouldn't be made on followers.
2017-12-11 13:55:17 -08:00
Alex Dadgar c1cc51dbee sync 2017-10-13 14:36:02 -07:00
Alex Dadgar 4173834231 Enable more linters 2017-09-26 15:26:33 -07:00
Alex Dadgar e5ec915ac3 sync 2017-09-19 10:08:23 -05:00
Armon Dadgar 3d5ecaafff Address @dadgar feedback 2017-09-11 10:30:59 -07:00
Armon Dadgar 20a8e590a0 nomad: support ACL bootstrap reset 2017-09-10 16:03:30 -07:00
Alex Dadgar 84d06f6abe Sync namespace changes 2017-09-07 17:04:21 -07:00
Armon Dadgar e24a4abf2c nomad: adding ACL bootstrap endpoints 2017-09-04 13:05:53 -07:00
Armon Dadgar f91d2608cb nomad: renambe PublicID to AccessorID for consistency 2017-09-04 13:04:45 -07:00
Armon Dadgar e5c69f162c nomad: implement ACL token endpoints 2017-09-04 13:04:45 -07:00
Armon Dadgar e9bad0bf37 nomad: Add ACL Token snapshot/restore to FSM 2017-09-04 13:04:45 -07:00
Armon Dadgar afdde24799 nomad: adding upsert policy endpoint 2017-09-04 13:03:15 -07:00
Armon Dadgar e3e243f433 nomad: implement policy delete endpoint 2017-09-04 13:03:15 -07:00
Armon Dadgar 8a4dda8577 nomad: update method signature for tests 2017-09-04 13:03:15 -07:00
Armon Dadgar 10b583ea38 nomad: adding FSM snapshot/restore of ACL policies 2017-09-04 13:03:14 -07:00
Alex Dadgar ad87c6fba5 Include google compare library 2017-09-01 16:42:09 -07:00
Alex Dadgar 06eddf243c parallel nomad tests 2017-07-25 17:39:36 -07:00
Alex Dadgar 5457bb7962 Job stability 2017-07-07 12:10:04 -07:00
Alex Dadgar 454083ba1b Remove canary 2017-07-07 12:10:04 -07:00
Alex Dadgar c10d7ab871 Remove promoted bit from allocation 2017-07-07 12:10:04 -07:00
Alex Dadgar 09dfa2fc10 Rename CreateDeployments and remove cancelling behavior in state_store 2017-07-07 12:10:04 -07:00
Alex Dadgar b64185a3f1 Deployment GC
This PR implements the garbage collector for deployments. Deployments
will by default be garbage collected after 1 hour.
2017-07-07 12:05:57 -07:00
Alex Dadgar 8f4b22c1e1 Fix tests 2017-07-07 12:03:11 -07:00
Alex Dadgar 6688a3f76c FSM Tests 2017-07-07 12:03:11 -07:00
Alex Dadgar 7078d563cb Create Deployments through plan application 2017-05-05 15:33:19 -07:00
Alex Dadgar 343ff03f02 Deployment struct, state store, fsm persist/restore 2017-05-04 13:37:18 -07:00
Alex Dadgar 3145086a42 non-purge deregisters 2017-04-15 17:08:05 -07:00
Alex Dadgar 5d293c0f1e Add abandon tests and use snapshot for blocking queries 2017-02-08 11:18:03 -08:00
Alex Dadgar 04862ca10e Tests compile 2017-02-07 21:30:57 -08:00
Alex Dadgar 86980e08f0 Cancel blocked evals upon successful one for job
This PR causes blocked evaluations to be cancelled if there is a
subsequent successful evaluation for the job. This fixes UX problems
showing failed placements when there are not any in reality and makes GC
possible for these jobs in certain cases.

Fixes https://github.com/hashicorp/nomad/issues/2124
2017-01-04 16:16:04 -08:00
Diptanu Choudhury 1b3c5e98c8 Renaming LocalDisk to EphemeralDisk (#1710)
Renaming LocalDisk to EphemeralDisk
2016-09-14 15:43:42 -07:00
Alex Dadgar 3c9936ae4a Merge pull request #1659 from hashicorp/f-revoke-accessors
Token revocation and keeping only a single Vault client active among servers
2016-08-31 14:10:46 -07:00
Diptanu Choudhury bfee7b30a3 Introducing shared resources in alloc 2016-08-29 13:49:25 -07:00
Alex Dadgar 48696ba0cc Use tomb to shutdown
Token revocation

Remove from the statestore

Revoke tokens

Don't error when Vault is disabled as this could cause issue if the operator ever goes from enabled to disabled

update server interface to allow enable/disable and config loading

test the new functions

Leader revoke

Use active
2016-08-28 14:06:25 -07:00
Diptanu Choudhury 13497913f9 Ensuring resources are re-calculated properly in fsm 2016-08-26 20:13:11 -07:00
Alex Dadgar 909f552d9e tests 2016-08-19 16:40:37 -07:00
Alex Dadgar 901000f789 Raft message, fsm and state store table 2016-08-19 16:40:37 -07:00
Diptanu Choudhury 1518f23d0a Making servers reconcile job summaries when they acquire leadership 2016-08-05 16:47:36 -07:00
Diptanu Choudhury 6dc5b1972c Setting job's create index as summary create index during reconciliation 2016-08-04 15:14:01 -07:00
Diptanu Choudhury 88d383c47f Updated tests and comments 2016-08-04 11:29:36 -07:00
Diptanu Choudhury 74caed0c7a Added an endpoint for users to reconcile job summaries 2016-08-03 16:12:47 -07:00
Diptanu Choudhury 1b60e0823a Added a test for restoring the summaries in fsm 2016-08-03 11:58:36 -07:00
Alex Dadgar 2332a58944 Do not update the job of allocations that are being stopped 2016-08-02 17:53:31 -07:00
Diptanu Choudhury 9943053239 Fixed a test 2016-07-25 22:22:55 -07:00
Diptanu Choudhury 3089833397 Reconciling the queued allocations during restore 2016-07-25 17:31:40 -07:00
Diptanu Choudhury 6193529040 Fixed more tests 2016-07-25 17:31:40 -07:00
Diptanu Choudhury 50842b88c7 Fixed some bugs 2016-07-25 17:26:38 -07:00
Diptanu Choudhury 6937c0f7f3 Added test for job summary restore 2016-07-12 11:41:13 -06:00
Diptanu Choudhury fd3c05a2be Making driver fingerprinters run periodically and unblocking evals when node registration happens 2016-03-31 18:11:27 -07:00
Alex Dadgar 94522e7bed Successful allocations are marked as complete instead of dead 2016-03-23 18:08:19 -07:00
Alex Dadgar ad92e50a24 Avoid serializes Allocation.Resources 2016-03-01 14:09:25 -08:00
Armon Dadgar 6e8d6e61fc nomad: fixing test name 2016-02-21 18:00:33 -08:00
Armon Dadgar 0e1f9a3def nomad: FSM supports multiple client updates 2016-02-21 17:58:56 -08:00