open-nomad

Author	SHA1	Message	Date
Mahmood Ali	b93643cd96	Fix a panic related to batch GC `deleteJobVersions` does concurrent modifications to iterated items while iterating, by deleting job versions while it's iterating on them,	2018-11-19 20:59:45 -05:00
Mahmood Ali	a4a9347501	fix comment typos	2018-11-14 08:36:14 -05:00
Mahmood Ali	1403ad21b9	Changelog job re-run fix	2018-11-13 07:52:51 -05:00
Mahmood Ali	e2d668f21c	Merge pull request #4861 from hashicorp/b-batch-deregister-transaction Run job deregistering in a single transaction	2018-11-12 20:59:44 -05:00
Mahmood Ali	8513b3cccb	Comment public functions and batch write txn	2018-11-12 16:09:39 -05:00
Preetha Appan	7ef126a027	Smaller methods, and added tests for RPC layer	2018-11-10 17:37:33 -06:00
Mahmood Ali	9c0a15f3ce	Run job deregistering in a single transaction Fixes https://github.com/hashicorp/nomad/issues/4299 Upon investigating this case further, we determined the issue to be a race between applying the `JobBatchDeregisterRequest` FSM operation and processing job-deregister evals. Processing job-deregister evals should wait until the FSM log message finishes applying, by using the snapshot index. However, with `JobBatchDeregister`, each individual job deregistration accidentally incremented the snapshot index as it was applied, which resulted in processing job-deregister evals. When a Nomad server receives an eval for a job in the batch that is yet to be deleted, we accidentally re-run it depending on the state of the allocation. This change ensures that we deregister all of the jobs and insert all evals in a single transaction, thus blocking processing of related evals until deregistering completes.	2018-11-09 22:35:26 -05:00
Alex Dadgar	98398a8a44	Merge pull request #4842 from hashicorp/b-deployment-progress-deadline Fix multiple bugs with progress deadline handling	2018-11-08 13:31:54 -08:00
Alex Dadgar	261aae32b1	more robust merging of the deployment status when getting updates from the client	2018-11-05 16:39:09 -08:00
Alex Dadgar	1c31970464	Fix multiple tgs with progress deadline handling Fix an issue in which the deployment watcher would fail the deployment based on the earliest progress deadline of the deployment regardless of if the task group has finished. Further fix an issue where the blocked eval optimization would make it so no evals were created to progress the deployment. To reproduce this issue, prior to this commit, you can create a job with two task groups. The first group has count 1 and resources such that it can not be placed. The second group has count 3, max_parallel=1, and can be placed. Run this first and then update the second group to do a deployment. It will place the first of three, but never progress since there exists a blocked eval. However, that doesn't capture the fact that there are two groups being deployed.	2018-11-05 16:06:17 -08:00
Preetha Appan	57fe5050f0	more minor review feedback	2018-11-01 17:05:17 -05:00
Preetha Appan	1415032c13	More review comments	2018-10-30 11:06:32 -05:00
Preetha Appan	7b8156fc47	Restore/Snapshot plus unit tests for scheduler configuration	2018-10-30 11:06:32 -05:00
Preetha Appan	bd34cbb1f7	Support for new scheduler config API, first use case is to disable preemption	2018-10-30 11:06:32 -05:00
Preetha Appan	eb38488d08	Fix logic bug, unit test for plan apply method in state store	2018-10-30 11:06:32 -05:00
Preetha Appan	cc295b90de	Implement preemption for system jobs. This commit implements an allocation selection algorithm for finding allocations to preempt. It currently special cases network resource asks from others (cpu/memory/disk/iops).	2018-10-30 11:06:32 -05:00
Alex Dadgar	52f9cd7637	fixing tests	2018-10-04 14:26:19 -07:00
Alex Dadgar	b2449ae1ce	Fix deployment watcher index usage Fixes three issues: 1. Retrieving the latest evaluation index was not properly selecting the greatest index. This would undermine checks we had to reduce the number of evaluations created when the latest eval index was greater than any alloc change. 2. Fix an issue where the blocking query code was using the incorrect index such that the index was higher than necessary. 3. Special case handling of blocked evaluations, since the create/snapshot index is not particularly useful because they can be reblocked.	2018-09-21 13:59:11 -07:00
Alex Dadgar	3c19d01d7a	server	2018-09-15 16:23:13 -07:00
Alex Dadgar	300b1a7a15	Tests only use testlog package logger	2018-06-13 15:40:56 -07:00
Alex Dadgar	21c5ed850d	Register events	2018-05-22 14:06:33 -07:00
Alex Dadgar	17aac1c9de	node heartbeat missed event	2018-05-22 14:05:46 -07:00
Alex Dadgar	5f2080bc26	Emit events based on eligibility	2018-05-22 14:04:59 -07:00
Alex Dadgar	a35248d1d8	Plumb event via FSM	2018-05-10 16:30:54 -07:00
Preetha	02d63432b4	Fix typo	2018-05-07 14:55:01 -05:00
Alex Dadgar	738056634e	Fix the initial progress deadline calculation when the alloc is inplace updated to be part of a new deployment	2018-05-07 14:55:01 -05:00
Alex Dadgar	319763a5d8	remove unnecessary merge of DeploymentStatus.Timestamp	2018-05-07 14:50:01 -05:00
Alex Dadgar	f95ab4ade8	Mark canaries on creation, and unmark on promotion	2018-05-07 14:50:01 -05:00
Alex Dadgar	641ef81cbf	Test fixes	2018-05-07 14:50:01 -05:00
Alex Dadgar	99e00fb774	Pass through timestamp	2018-05-07 14:50:01 -05:00
Alex Dadgar	1336002255	Progress deadline in deployment state	2018-05-07 14:50:01 -05:00
Preetha Appan	52b3b53181	Update ModifyIndex of alloc when setting NextAllocation value	2018-05-03 17:04:36 -05:00
Michael Schurter	91b5bb58d9	add HasHealth helper for nil checks We performed the DeploymentStatus nil checks a couple of different ways, so hopefully this helper will consolidate them and make it clearer what the code is doing.	2018-03-29 09:29:19 -07:00
Chelsea Holland Komlo	003bc209b9	use time.Time for node events for compatibility	2018-03-27 15:43:57 -04:00
Alex Dadgar	2d91b9dfba	Batch drain update	2018-03-21 16:51:44 -07:00
Alex Dadgar	7b2bad8c5e	Toggle Drain allows resetting eligibility This PR allows marking a node as eligible for scheduling while toggling drain. By default the `nomad node drain -disable` command will mark it as eligible but the drainer will maintain ineligibility.	2018-03-21 16:51:44 -07:00
Alex Dadgar	93871c18f8	Fix retaining the drain	2018-03-21 16:51:44 -07:00
Alex Dadgar	0fba0101b6	RPC/FSM/State Store for Eligibility	2018-03-21 16:51:44 -07:00
Alex Dadgar	2f5309d82a	Remove update time	2018-03-21 16:51:43 -07:00
Alex Dadgar	e459a666ed	Node.Drain takes strategy	2018-03-21 16:49:48 -07:00
Michael Schurter	d1ec65d765	switch to new raft DesiredTransition message	2018-03-21 16:49:48 -07:00
Alex Dadgar	db4a634072	RPC, FSM, State Store for marking DesiredTransition fix build tag	2018-03-21 16:49:48 -07:00
Michael Schurter	c0542474db	drain: initial drainv2 structs and impl	2018-03-21 16:49:48 -07:00
Alex Dadgar	de6ebb6e6c	small cleanup	2018-03-13 18:08:22 -07:00
Alex Dadgar	63e14b7d63	nodeevents -> events	2018-03-13 18:08:22 -07:00
Alex Dadgar	d3c3deffad	fixes	2018-03-13 18:08:22 -07:00
Chelsea Holland Komlo	b41501e442	code review feedback	2018-03-13 18:08:21 -07:00
Chelsea Holland Komlo	1488b076d1	code review feedback	2018-03-13 18:08:21 -07:00
Chelsea Holland Komlo	19ef872769	keep state store functions in one file	2018-03-13 18:08:21 -07:00
Chelsea Holland Komlo	d30c269fbe	code review feedback	2018-03-13 18:05:40 -07:00


256 commits