Preetha Appan
32cc764072
Add fsm layer tests
2018-10-30 11:06:32 -05:00
Preetha Appan
7b8156fc47
Restore/Snapshot plus unit tests for scheduler configuration
2018-10-30 11:06:32 -05:00
Preetha Appan
bd34cbb1f7
Support for new scheduler config API, first use case is to disable preemption
2018-10-30 11:06:32 -05:00
Preetha Appan
cc295b90de
Implement preemption for system jobs.
...
This commit implements an allocation selection algorithm for finding
allocations to preempt. It currently special cases network resource asks
from others (cpu/memory/disk/iops).
2018-10-30 11:06:32 -05:00
Alex Dadgar
52f9cd7637
fixing tests
2018-10-04 14:26:19 -07:00
Alex Dadgar
ca28afa3b2
small fixes
2018-09-15 16:42:38 -07:00
Alex Dadgar
3c19d01d7a
server
2018-09-15 16:23:13 -07:00
Alex Dadgar
300b1a7a15
Tests only use testlog package logger
2018-06-13 15:40:56 -07:00
Alex Dadgar
352f2e03b5
Clean up leaked deployments on restoration
...
This PR cancels deployments that are active but do not have a job
associated with them. This is a broken invariant that causes issues in
the deployment watcher since it will not track them. Thus they are
objects that can't be operated on or cleaned up.
Fixes https://github.com/hashicorp/nomad/issues/4286
2018-05-23 16:44:21 -07:00
Alex Dadgar
17aac1c9de
node heartbeat missed event
2018-05-22 14:05:46 -07:00
Alex Dadgar
5f2080bc26
Emit events based on eligibility
2018-05-22 14:04:59 -07:00
Alex Dadgar
a35248d1d8
Plumb event via FSM
2018-05-10 16:30:54 -07:00
Alex Dadgar
c91ce5cc38
Fix not enqueuing eval
2018-05-07 14:50:01 -05:00
Alex Dadgar
641ef81cbf
Test fixes
2018-05-07 14:50:01 -05:00
Alex Dadgar
d0f237086b
UX touchups
2018-04-26 15:24:27 -07:00
Alex Dadgar
2b14371db5
Fix spelling
2018-04-03 15:58:03 -07:00
Alex Dadgar
9617a13a2b
Correctly handle the upgrade path of a node being drained when applying Raft logs
2018-04-03 15:32:44 -07:00
Alex Dadgar
301704091b
Handle upgrade where Node doesn't have eligiblity
...
This PR handles upgrading a node that has no scheduling eligiblity set.
2018-03-29 16:52:23 -07:00
Alex Dadgar
2d91b9dfba
Batch drain update
2018-03-21 16:51:44 -07:00
Alex Dadgar
7b2bad8c5e
Toggle Drain allows resetting eligibility
...
This PR allows marking a node as eligible for scheduling while toggling
drain. By default the `nomad node drain -disable` commmand will mark it
as eligible but the drainer will maintain in-eligibility.
2018-03-21 16:51:44 -07:00
Alex Dadgar
010a6b8ca5
Unblock evals once eligible
2018-03-21 16:51:44 -07:00
Alex Dadgar
0fba0101b6
RPC/FSM/State Store for Eligibility
2018-03-21 16:51:44 -07:00
Alex Dadgar
2f5309d82a
Remove update time
2018-03-21 16:51:43 -07:00
Alex Dadgar
e459a666ed
Node.Drain takes strategy
2018-03-21 16:49:48 -07:00
Michael Schurter
d1ec65d765
switch to new raft DesiredTransition message
2018-03-21 16:49:48 -07:00
Alex Dadgar
db4a634072
RPC, FSM, State Store for marking DesiredTransistion
...
fix build tag
2018-03-21 16:49:48 -07:00
Alex Dadgar
586ae36d13
Batch Deregister RPC
2018-03-16 10:53:03 -07:00
Alex Dadgar
de6ebb6e6c
small cleanup
2018-03-13 18:08:22 -07:00
Alex Dadgar
d3c3deffad
fixes
2018-03-13 18:08:22 -07:00
Chelsea Holland Komlo
1488b076d1
code review feedback
2018-03-13 18:08:21 -07:00
Chelsea Holland Komlo
a8bcbd81e6
batch submitting node events
2018-03-13 18:05:40 -07:00
Chelsea Holland Komlo
d30c269fbe
code review feedback
2018-03-13 18:05:40 -07:00
Chelsea Holland Komlo
4ede27a3c8
RPC, FSM, state store for Node.EmitEvent
...
add node event when registering a node for the first time
2018-03-13 18:05:40 -07:00
Josh Soref
f28efbbc79
spelling: sanitize
2018-03-11 18:52:59 +00:00
Josh Soref
7ab998803b
spelling: periodic
2018-03-11 18:37:05 +00:00
Josh Soref
d300623abe
spelling: evaluation
2018-03-11 18:01:35 +00:00
Josh Soref
d9ce1f7882
spelling: deregister
2018-03-11 17:53:22 +00:00
Preetha Appan
52a665836e
Remove extra fields set in client allocations during update
2018-01-31 09:58:05 -06:00
Preetha Appan
2567b51c58
Edge trigger evaluation when allocations client status is failed
2018-01-31 09:56:53 -06:00
Kyle Havlovitz
12ff22ea70
Merge branch 'master' into autopilot
2018-01-18 13:29:25 -08:00
Kyle Havlovitz
1c07066064
Add autopilot functionality based on Consul's autopilot
2017-12-18 14:29:41 -08:00
Preetha Appan
51bd0b59c7
Return an error if evaluation doesn't exist in state store at plan apply time.
2017-12-18 14:55:36 -06:00
Alex Dadgar
86608124ca
Fix followers not creating periodic launch
...
Fix an issue in which periodic launches wouldn't be made on followers.
2017-12-11 13:55:17 -08:00
Alex Dadgar
c1cc51dbee
sync
2017-10-13 14:36:02 -07:00
Michael Schurter
a66c53d45a
Remove `structs` import from `api`
...
Goes a step further and removes structs import from api's tests as well
by moving GenerateUUID to its own package.
2017-09-29 10:36:08 -07:00
Alex Dadgar
54e04b5c0e
Merge pull request #3201 from hashicorp/b-periodic-restore
...
Fix restoration of stopped periodic jobs
2017-09-13 11:42:29 -07:00
Alex Dadgar
a2363e7583
sync acls
2017-09-13 11:38:29 -07:00
Alex Dadgar
e3dbcdcb44
Fix restoration of stopped periodic jobs
...
This PR fixes an issue in which we would add a stopped periodic job to
the periodic launcher.
2017-09-12 14:25:40 -07:00
Armon Dadgar
20a8e590a0
nomad: support ACL bootstrap reset
2017-09-10 16:03:30 -07:00
Alex Dadgar
84d06f6abe
Sync namespace changes
2017-09-07 17:04:21 -07:00
Armon Dadgar
e24a4abf2c
nomad: adding ACL bootstrap endpoints
2017-09-04 13:05:53 -07:00
Armon Dadgar
e5c69f162c
nomad: implement ACL token endpoints
2017-09-04 13:04:45 -07:00
Armon Dadgar
e9bad0bf37
nomad: Add ACL Token snapshot/restore to FSM
2017-09-04 13:04:45 -07:00
Armon Dadgar
afdde24799
nomad: adding upsert policy endpoint
2017-09-04 13:03:15 -07:00
Armon Dadgar
e3e243f433
nomad: implement policy delete endpoint
2017-09-04 13:03:15 -07:00
Armon Dadgar
f147f25fe5
Addressing @dadgar review feedback
2017-09-04 13:03:15 -07:00
Armon Dadgar
10b583ea38
nomad: adding FSM snapshot/restore of ACL policies
2017-09-04 13:03:14 -07:00
Alex Dadgar
5457bb7962
Job stability
2017-07-07 12:10:04 -07:00
Alex Dadgar
b64185a3f1
Deployment GC
...
This PR implements the garbage collector for deployments. Deployments
will by default be garbage collected after 1 hour.
2017-07-07 12:05:57 -07:00
Alex Dadgar
dad9e69822
more comment fixes
2017-07-07 12:03:11 -07:00
Alex Dadgar
6688a3f76c
FSM Tests
2017-07-07 12:03:11 -07:00
Alex Dadgar
d04877d23c
initial impl
2017-07-07 12:03:11 -07:00
Alex Dadgar
919c50ba5c
Merge branch 'master' into f-update-block
2017-05-11 13:08:31 -07:00
Alex Dadgar
50eec3ef35
handle upgrading old update block syntax
2017-05-10 13:48:53 -07:00
Alex Dadgar
7078d563cb
Create Deployments through plan application
2017-05-05 15:33:19 -07:00
Alex Dadgar
343ff03f02
Deployment struct, state store, fsm persist/restore
2017-05-04 13:37:18 -07:00
Alex Dadgar
b67c40f717
Proper denormalization in optimistic state store
2017-05-01 14:49:57 -07:00
Alex Dadgar
3145086a42
non-purge deregisters
2017-04-15 17:08:05 -07:00
Alex Dadgar
103e8d21fb
Fix dispatch of periodic job
...
This PR fixes an issue in which when a periodic and parameterized job
was dispatched, an allocation would be immediately created.
Fixes https://github.com/hashicorp/nomad/issues/2470
2017-03-27 16:55:17 -07:00
Alex Dadgar
eaf285b208
Fix missing summary restoration
2017-02-08 11:51:48 -08:00
Alex Dadgar
5d293c0f1e
Add abandon tests and use snapshot for blocking queries
2017-02-08 11:18:03 -08:00
Alex Dadgar
b69b357c7f
Nomad builds
2017-02-07 20:31:23 -08:00
Alex Dadgar
570efcaebd
Update state store and blocking query helper
2017-02-05 12:03:11 -08:00
Alex Dadgar
a616e9d970
Merge pull request #2163 from hashicorp/b-summary
...
Job Summary: Fix queued accounting and remove in-place state store updates
2017-01-11 13:36:36 -08:00
Alex Dadgar
efbb2894c7
Review fixes
2017-01-11 13:18:36 -08:00
Alex Dadgar
d0f918495b
Store pointer of JobSummary in state store and remove in-place modifications of the object and replace with Copy-Update-Insert operations
2017-01-08 13:55:03 -08:00
Alex Dadgar
86980e08f0
Cancel blocked evals upon successful one for job
...
This PR causes blocked evaluations to be cancelled if there is a
subsequent successful evaluation for the job. This fixes UX problems
showing failed placements when there are not any in reality and makes GC
possible for these jobs in certain cases.
Fixes https://github.com/hashicorp/nomad/issues/2124
2017-01-04 16:16:04 -08:00
Diptanu Choudhury
56ed1d3cd8
Fixing the upgrade path for ephemeral disk
2016-11-08 15:24:51 -08:00
Alex Dadgar
3c9936ae4a
Merge pull request #1659 from hashicorp/f-revoke-accessors
...
Token revocation and keeping only a single Vault client active among servers
2016-08-31 14:10:46 -07:00
Alex Dadgar
6047414fb9
address comments
2016-08-31 14:10:33 -07:00
Diptanu Choudhury
bfee7b30a3
Introducing shared resources in alloc
2016-08-29 13:49:25 -07:00
Alex Dadgar
48696ba0cc
Use tomb to shutdown
...
Token revocation
Remove from the statestore
Revoke tokens
Don't error when Vault is disabled as this could cause issue if the operator ever goes from enabled to disabled
update server interface to allow enable/disable and config loading
test the new functions
Leader revoke
Use active
2016-08-28 14:06:25 -07:00
Diptanu Choudhury
13497913f9
Ensuring resources are re-calculated properly in fsm
2016-08-26 20:13:11 -07:00
Alex Dadgar
76d324a8f0
fix comment
2016-08-22 11:41:47 -07:00
Alex Dadgar
901000f789
Raft message, fsm and state store table
2016-08-19 16:40:37 -07:00
Diptanu Choudhury
1518f23d0a
Making servers reconcile job summaries when they acquire leadership
2016-08-05 16:47:36 -07:00
Diptanu Choudhury
88d383c47f
Updated tests and comments
2016-08-04 11:29:36 -07:00
Diptanu Choudhury
74caed0c7a
Added an endpoint for users to reconcile job summaries
2016-08-03 16:12:47 -07:00
Diptanu Choudhury
1b60e0823a
Added a test for restoring the summaries in fsm
2016-08-03 11:58:36 -07:00
Alex Dadgar
2332a58944
Do not update the job of allocations that are being stopped
2016-08-02 17:53:31 -07:00
Diptanu Choudhury
10a5c06a5a
Running the tests in verbose mode
2016-07-26 14:02:47 -07:00
Diptanu Choudhury
d1a6bdb4ba
Making the queued allocations bind late
2016-07-25 22:11:11 -07:00
Diptanu Choudhury
3089833397
Reconciling the queued allocations during restore
2016-07-25 17:31:40 -07:00
Diptanu Choudhury
f1c9427c37
Added code to create missing job summaries
2016-07-25 17:26:38 -07:00
Diptanu Choudhury
50842b88c7
Fixed some bugs
2016-07-25 17:26:38 -07:00
Diptanu Choudhury
7bafb7c675
Updating the job summary while mutating jobs and allocation objects
2016-07-25 17:26:38 -07:00
Diptanu Choudhury
109b05cb29
Applying changes to job updates via FSM
2016-07-25 17:26:38 -07:00
Alex Dadgar
e0114fee05
InitFields to Canonicalize
2016-07-20 16:08:52 -07:00
Alex Dadgar
0634eeb3e0
Sanatize incoming slices/maps
2016-07-20 16:00:02 -07:00
Diptanu Choudhury
5e6f9ef69e
Added methods to save and restore job summary snapshots
2016-07-12 11:41:13 -06:00
Alex Dadgar
bfdd5846e1
Track unblock indexes and check evals on block to see if they missed an update while in the scheduler
2016-05-24 20:10:56 -07:00
Alex Dadgar
045f7807e0
eval_broker.Enqueue no longer returns an error
2016-05-18 11:35:15 -07:00
Diptanu Choudhury
778167e26f
Using the node in the req to unblock evals
2016-03-31 18:19:01 -07:00
Diptanu Choudhury
fd3c05a2be
Making driver fingerprinters run periodically and unblocking evals when node registration happens
2016-03-31 18:11:27 -07:00
Alex Dadgar
94522e7bed
Successful allocations are marked as complete instead of dead
2016-03-23 18:08:19 -07:00
Alex Dadgar
ad92e50a24
Avoid serializes Allocation.Resources
2016-03-01 14:09:25 -08:00
Armon Dadgar
0e1f9a3def
nomad: FSM supports multiple client updates
2016-02-21 17:58:56 -08:00
Armon Dadgar
0e24df554e
nomad: allow for job normalization
2016-02-21 11:42:54 -08:00
Alex Dadgar
2a19e179bc
Switch to using go/codec and use code generation
2016-02-20 18:05:17 -08:00
Armon Dadgar
623b473ee6
Revert "Lzw compress raft entries"
...
This reverts commit a687ee6df316ffadac5305f4be9cf9a5642cd6af.
The LZW compression reduces the file size by ~60%, but increases
our encode and decode costs by 2x in a CPU bottlenecked path.
2016-02-20 17:36:39 -08:00
Alex Dadgar
6e8a57ba1a
Lzw compress raft entries
2016-02-20 13:21:38 -08:00
Alex Dadgar
74135f02a4
Blocked Eval tracker
2016-01-31 18:04:45 -08:00
Alex Dadgar
9244037bba
Only add periodic job insertion time once
2015-12-23 19:02:31 -08:00
Alex Dadgar
bff2768060
Use desired launch time in periodic launch table
2015-12-23 18:26:39 -08:00
Alex Dadgar
bf2aa9f733
Always remove periodic jobs in fsm
2015-12-23 18:26:39 -08:00
Alex Dadgar
e3231171b8
Fix deadlock and test
2015-12-23 18:26:39 -08:00
Alex Dadgar
3a400432dc
Always add jobs to periodic tracker
2015-12-23 18:26:39 -08:00
Alex Dadgar
b3e87b6719
Remove the periodicRunner interface and pass the server as an interface to the periodicDispatcher
2015-12-23 18:26:39 -08:00
Alex Dadgar
49dd0dc461
fixes from review
2015-12-23 18:26:39 -08:00
Alex Dadgar
37c1f56921
Snapshot and restore
2015-12-23 18:26:39 -08:00
Alex Dadgar
ea799b88cb
merge
2015-12-23 18:26:39 -08:00
Alex Dadgar
a892d61ae7
FSM integration
2015-12-23 18:26:39 -08:00
Alex Dadgar
4d817f5cbc
Revert "Revert "Make drivers take arguments as a list and not as a string""
2015-11-18 15:16:42 -08:00
Alex Dadgar
0e51375285
Revert "Make drivers take arguments as a list and not as a string"
2015-11-18 13:46:43 -08:00
Alex Dadgar
07a5ceed9c
Use one msgpack codec and have it decode []string in nil interfaces
2015-11-18 13:15:13 -08:00
Armon Dadgar
ad681be59c
nomad: adding node GC
2015-09-07 11:01:29 -07:00
Armon Dadgar
1a5579384a
nomad: cleanup API descrepencies
2015-09-06 20:47:42 -07:00
Armon Dadgar
21000d41ec
nomad: adding FSM support for node drain update
2015-09-06 19:55:38 -07:00
Armon Dadgar
a2010b9f2a
nomad: FSM support for AllocClientUpdateRequestType
2015-08-25 18:00:14 -07:00
Armon Dadgar
2ea99f211a
nomad: updating for new alloc representation
2015-08-25 17:36:52 -07:00
Armon Dadgar
3b97632a99
nomad: integrate time table with FSM
2015-08-15 17:38:13 -07:00
Armon Dadgar
a80249204f
nomad: DeleteEval request can take a list of evals and allocations
2015-08-15 15:39:29 -07:00
Armon Dadgar
343b1b9c89
nomad: move state and mocks into shared packages
2015-08-11 14:27:14 -07:00
Armon Dadgar
a2feb20903
nomad: UpsertEvals can take a list of Evals
2015-08-06 14:51:15 -07:00
Armon Dadgar
b5100eeefb
nomad: FSM enqueues evaluations
2015-08-06 11:32:42 -07:00
Armon Dadgar
a8a01f8828
nomad: remove eval broker from state store
2015-08-06 11:09:59 -07:00
Armon Dadgar
6adf1d2af0
nomad: adding the FSM CRUD
2015-08-04 14:04:26 -07:00
Armon Dadgar
487acdd3c7
nomad: threading eval broker through fsm and state store
2015-07-23 22:30:08 -07:00
Armon Dadgar
9b921d9f92
nomad: adding FSM support for evaluations
2015-07-23 15:52:38 -07:00
Armon Dadgar
d39d3585a2
nomad: reduce mixed terminology by adding ID to Job
2015-07-23 15:15:48 -07:00
Armon Dadgar
bc0cea6d26
nomad: add FSM support for Jobs
2015-07-07 10:55:47 -06:00
Armon Dadgar
1947a19885
nomad: make struct names more specific to avoid confusion
2015-07-07 10:51:42 -06:00
Armon Dadgar
8ba9e09596
nomad: support FSM snap/restore of indexes table
2015-07-06 15:51:01 -06:00
Armon Dadgar
f16fc93d85
nomad: testing fsm snapshot and restore
2015-07-06 14:01:10 -06:00
Armon Dadgar
185e3cb3c7
nomad: FSM support for node updates
2015-07-03 18:41:36 -07:00
Armon Dadgar
4b48b2c1b6
nomad: change to use MemDB instead of raw iradix
2015-07-03 14:46:30 -07:00
Armon Dadgar
06d0e5f9b1
nomad: adding immutable radix, snapshot support
2015-06-03 11:21:59 +02:00
Armon Dadgar
4c554f14bb
nomad: support DevMode configuration
2015-06-01 21:11:40 +02:00
Armon Dadgar
1e7f84f3e6
nomad: adding basic structure for raft
2015-06-01 17:49:10 +02:00