Preetha Appan
be36fee48e
Use IsParameterized/IsPeriodic methods
2019-01-17 12:15:42 -06:00
Preetha Appan
81a8f18cac
Fix bug in reconcile summaries that affects periodic/parameterized jobs
...
This fixes incorrect parent job summaries by recomputing them in the
ReconcileJobSummaries method in the state store
2019-01-17 12:01:01 -06:00
Mahmood Ali
a4a9347501
fix comment typos
2018-11-14 08:36:14 -05:00
Alex Dadgar
08dc2ea702
Merge pull request #4867 from hashicorp/b-deployment-progress-deadline
...
Blocked evaluation fixes
2018-11-13 10:29:03 -08:00
Mahmood Ali
8513b3cccb
Comment public functions and batch write txn
2018-11-12 16:09:39 -05:00
Mahmood Ali
9c0a15f3ce
Run job deregistering in a single transaction
...
Fixes https://github.com/hashicorp/nomad/issues/4299
Upon investigating this case further, we determined the issue to be a race between applying `JobBatchDeregisterRequest` fsm operation and processing job-deregister evals.
Processing job-deregister evals should wait until the FSM log message finishes applying, by using the snapshot index. However, with `JobBatchDeregister`, applying any single job deregistration accidentally incremented the snapshot index, which triggered processing of job-deregister evals. When a Nomad server receives an eval for a job in the batch that is yet to be deleted, we accidentally re-run it depending on the state of its allocations.
This change ensures that we deregister all of the jobs and insert all evals in a single transaction, thus blocking processing of related evals until deregistration completes.
2018-11-09 22:35:26 -05:00
Alex Dadgar
b1c5d52817
Track jobs by namespace
2018-11-07 10:22:08 -08:00
Preetha Appan
32cc764072
Add fsm layer tests
2018-10-30 11:06:32 -05:00
Preetha Appan
7b8156fc47
Restore/Snapshot plus unit tests for scheduler configuration
2018-10-30 11:06:32 -05:00
Preetha Appan
bd34cbb1f7
Support for new scheduler config API, first use case is to disable preemption
2018-10-30 11:06:32 -05:00
Preetha Appan
cc295b90de
Implement preemption for system jobs.
...
This commit implements an allocation selection algorithm for finding
allocations to preempt. It currently special-cases network resource asks,
handling them separately from the others (cpu/memory/disk/iops).
2018-10-30 11:06:32 -05:00
Alex Dadgar
52f9cd7637
fixing tests
2018-10-04 14:26:19 -07:00
Alex Dadgar
ca28afa3b2
small fixes
2018-09-15 16:42:38 -07:00
Alex Dadgar
3c19d01d7a
server
2018-09-15 16:23:13 -07:00
Alex Dadgar
300b1a7a15
Tests only use testlog package logger
2018-06-13 15:40:56 -07:00
Alex Dadgar
352f2e03b5
Clean up leaked deployments on restoration
...
This PR cancels deployments that are active but do not have a job
associated with them. This is a broken invariant that causes issues in
the deployment watcher since it will not track them. Thus they are
objects that can't be operated on or cleaned up.
Fixes https://github.com/hashicorp/nomad/issues/4286
2018-05-23 16:44:21 -07:00
Alex Dadgar
17aac1c9de
node heartbeat missed event
2018-05-22 14:05:46 -07:00
Alex Dadgar
5f2080bc26
Emit events based on eligibility
2018-05-22 14:04:59 -07:00
Alex Dadgar
a35248d1d8
Plumb event via FSM
2018-05-10 16:30:54 -07:00
Alex Dadgar
c91ce5cc38
Fix not enqueuing eval
2018-05-07 14:50:01 -05:00
Alex Dadgar
641ef81cbf
Test fixes
2018-05-07 14:50:01 -05:00
Alex Dadgar
d0f237086b
UX touchups
2018-04-26 15:24:27 -07:00
Alex Dadgar
2b14371db5
Fix spelling
2018-04-03 15:58:03 -07:00
Alex Dadgar
9617a13a2b
Correctly handle the upgrade path of a node being drained when applying Raft logs
2018-04-03 15:32:44 -07:00
Alex Dadgar
301704091b
Handle upgrade where Node doesn't have eligibility
...
This PR handles upgrading a node that has no scheduling eligibility set.
2018-03-29 16:52:23 -07:00
Alex Dadgar
2d91b9dfba
Batch drain update
2018-03-21 16:51:44 -07:00
Alex Dadgar
7b2bad8c5e
Toggle Drain allows resetting eligibility
...
This PR allows marking a node as eligible for scheduling while toggling
drain. By default the `nomad node drain -disable` command will mark it
as eligible, but the drainer will maintain ineligibility.
2018-03-21 16:51:44 -07:00
Alex Dadgar
010a6b8ca5
Unblock evals once eligible
2018-03-21 16:51:44 -07:00
Alex Dadgar
0fba0101b6
RPC/FSM/State Store for Eligibility
2018-03-21 16:51:44 -07:00
Alex Dadgar
2f5309d82a
Remove update time
2018-03-21 16:51:43 -07:00
Alex Dadgar
e459a666ed
Node.Drain takes strategy
2018-03-21 16:49:48 -07:00
Michael Schurter
d1ec65d765
switch to new raft DesiredTransition message
2018-03-21 16:49:48 -07:00
Alex Dadgar
db4a634072
RPC, FSM, State Store for marking DesiredTransition
...
fix build tag
2018-03-21 16:49:48 -07:00
Alex Dadgar
586ae36d13
Batch Deregister RPC
2018-03-16 10:53:03 -07:00
Alex Dadgar
de6ebb6e6c
small cleanup
2018-03-13 18:08:22 -07:00
Alex Dadgar
d3c3deffad
fixes
2018-03-13 18:08:22 -07:00
Chelsea Holland Komlo
1488b076d1
code review feedback
2018-03-13 18:08:21 -07:00
Chelsea Holland Komlo
a8bcbd81e6
batch submitting node events
2018-03-13 18:05:40 -07:00
Chelsea Holland Komlo
d30c269fbe
code review feedback
2018-03-13 18:05:40 -07:00
Chelsea Holland Komlo
4ede27a3c8
RPC, FSM, state store for Node.EmitEvent
...
add node event when registering a node for the first time
2018-03-13 18:05:40 -07:00
Josh Soref
f28efbbc79
spelling: sanitize
2018-03-11 18:52:59 +00:00
Josh Soref
7ab998803b
spelling: periodic
2018-03-11 18:37:05 +00:00
Josh Soref
d300623abe
spelling: evaluation
2018-03-11 18:01:35 +00:00
Josh Soref
d9ce1f7882
spelling: deregister
2018-03-11 17:53:22 +00:00
Preetha Appan
52a665836e
Remove extra fields set in client allocations during update
2018-01-31 09:58:05 -06:00
Preetha Appan
2567b51c58
Edge trigger evaluation when an allocation's client status is failed
2018-01-31 09:56:53 -06:00
Kyle Havlovitz
12ff22ea70
Merge branch 'master' into autopilot
2018-01-18 13:29:25 -08:00
Kyle Havlovitz
1c07066064
Add autopilot functionality based on Consul's autopilot
2017-12-18 14:29:41 -08:00
Preetha Appan
51bd0b59c7
Return an error if evaluation doesn't exist in state store at plan apply time.
2017-12-18 14:55:36 -06:00
Alex Dadgar
86608124ca
Fix followers not creating periodic launch
...
Fix an issue in which periodic launches wouldn't be made on followers.
2017-12-11 13:55:17 -08:00