open-nomad

Author	SHA1	Message	Date
Preetha Appan	5373ade731	Scheduler and Reconciler changes to support delayed rescheduling	2018-03-14 16:10:32 -05:00
Josh Soref	e0f6a33fe5	spelling: system	2018-03-11 19:01:19 +00:00
Josh Soref	a89e1b8395	spelling: strategy	2018-03-11 18:58:19 +00:00
Josh Soref	f8eb766fb5	spelling: reschedulable	2018-03-11 18:48:12 +00:00
Josh Soref	ed8db9992e	spelling: feasibility	2018-03-11 18:07:09 +00:00
Josh Soref	bf9283c606	spelling: corresponding	2018-03-11 17:51:41 +00:00
Josh Soref	ca4ceb0e5c	spelling: commits	2018-03-11 17:47:45 +00:00
Preetha Appan	7b6ba7a1f4	Fixes bug in reconciler where previously rescheduled allocs are rescheduled again. Simplified logic and added test case to catch this.	2018-02-20 12:07:56 -06:00
Preetha Appan	7c57303dd2	Clarify comment	2018-02-05 16:37:07 -06:00
Preetha Appan	d48c411692	Reconciler should consider failed allocs when marking deployment as failed.	2018-02-02 19:40:25 -06:00
Preetha Appan	a1237d627a	code review feedback	2018-01-31 09:58:05 -06:00
Preetha Appan	5ad892026a	Add a field to track the next allocation during a replacement	2018-01-31 09:58:05 -06:00
Preetha Appan	2ed4de7e7b	Track previous node id correctly, plus unit test	2018-01-31 09:58:05 -06:00
Preetha Appan	dd4917c2f0	Add more clarification in comment	2018-01-31 09:58:05 -06:00
Preetha Appan	09bef7d1ce	Preallocate slice for skipped nodes	2018-01-31 09:58:05 -06:00
Preetha Appan	237beb49ae	Better score threshold	2018-01-31 09:58:05 -06:00
Preetha Appan	fa18c0def4	Add one more unit test	2018-01-31 09:58:05 -06:00
Preetha Appan	a75540cec6	Limit iterator uses a score threshold and a maxSkip value to be able to skip lower scoring nodes	2018-01-31 09:58:05 -06:00
Preetha Appan	b6268a5fab	Beef up unit test for rescheduling batch jobs	2018-01-31 09:56:53 -06:00
Preetha Appan	ea4a889e28	Address more code review feedback	2018-01-31 09:56:53 -06:00
Preetha Appan	bd89d2b39e	Make sure that reschedule trackers are not added for node drain replacements	2018-01-31 09:56:53 -06:00
Preetha Appan	a662b38801	Improve reconciler unit tests	2018-01-31 09:56:53 -06:00
Preetha Appan	fee4ccf154	Prevent side effect modification of select options when preferred nodes are set	2018-01-31 09:56:53 -06:00
Preetha Appan	21b7b79d5d	Add helper methods, use require and other code review feedback	2018-01-31 09:56:53 -06:00
Preetha Appan	d0f9d59abb	Reconcile with changes to structs for reschedule tracking	2018-01-31 09:56:53 -06:00
Preetha Appan	fbb1936dee	Fix some comments and lint warnings, remove unused method	2018-01-31 09:56:53 -06:00
Preetha Appan	031c566ada	Reschedule previous allocs and track their reschedule attempts	2018-01-31 09:56:53 -06:00
Preetha Appan	fd2fbefa4c	Add a field to track the next allocation during a replacement	2018-01-24 17:55:05 -06:00
Alex Dadgar	6dda0ebaed	gofmt	2018-01-04 14:45:15 -08:00
Alex Dadgar	2f561609b7	Fix detection of successful batch allocations This PR restores older behavior of detecting successful batch allocations (04d86ffd1006fde9dfb2ca8c1237fe60b995b0e3). This has the side effect that we correctly filter desired status stop but not successful batch allocations and create their replacements.	2018-01-04 14:20:32 -08:00
Preetha	1712b03705	Merge branch 'master' into 0.8	2018-01-03 16:06:38 -06:00
Preetha Appan	51bd0b59c7	Return an error if evaluation doesn't exist in state store at plan apply time.	2017-12-18 14:55:36 -06:00
Preetha Appan	3c36abfe14	Update eval modify index as part of plan apply.	2017-12-18 10:03:55 -06:00
Preetha Appan	3b4d7ac2a3	Fix some typos	2017-12-14 13:29:27 -06:00
Michael Schurter	45494f7304	Fix port labels on mock Alloc/Job/Node	2017-12-08 14:50:06 -08:00
Alex Dadgar	44240ce440	Merge pull request #3375 from hashicorp/b-batch Allow batch jobs to be rerun if purged	2017-10-13 17:11:45 -07:00
Alex Dadgar	c1cc51dbee	sync	2017-10-13 14:36:02 -07:00
Alex Dadgar	746cd7403f	Allow batch jobs to be rerun if purged This PR allows batch jobs to be rerun if they have been purged.	2017-10-13 12:40:37 -07:00
Michael Schurter	a66c53d45a	Remove `structs` import from `api` Goes a step further and removes structs import from api's tests as well by moving GenerateUUID to its own package.	2017-09-29 10:36:08 -07:00
Alex Dadgar	4173834231	Enable more linters	2017-09-26 15:26:33 -07:00
Alex Dadgar	3904bde9a3	Fix batch handling of complete allocs/node drains This PR fixes: * An issue in which a node-drain that contains a complete batch alloc would cause a replacement * An issue in which allocations with the same name during a scale down/stop event wouldn't be properly stopped. * An issue in which batch allocations from previous job versions may not have been stopped properly. Fixes https://github.com/hashicorp/nomad/issues/3210	2017-09-14 15:08:57 -07:00
Alex Dadgar	84d06f6abe	Sync namespace changes	2017-09-07 17:04:21 -07:00
Alex Dadgar	0aef02a4f9	fix test	2017-08-21 14:07:54 -07:00
Alex Dadgar	27256ebcc6	Placing allocs counts towards placement limit This PR makes placing new allocations count towards the limit. We do not restrict how many new placements are made by the limit but we still count towards the limit. This has the nice effect that if you have a group with count = 5 and max_parallel = 1 but only 3 allocs exist for it and a change is made, you will create 2 more at the new version but not destroy one, taking you down to two running as you would have previously. Fixes https://github.com/hashicorp/nomad/issues/3053	2017-08-21 12:41:19 -07:00
Alex Dadgar	2453f13fc5	fixes	2017-08-15 12:27:05 -07:00
Alex Dadgar	0570e09feb	Fix panic occurring from improper bitmap size This PR fixes an alignment calculation when determining the bitmap size. Fixes https://github.com/hashicorp/nomad/issues/3008	2017-08-12 15:37:02 -07:00
Luke Farnell	f0ced87b95	fixed all spelling mistakes for goreport	2017-08-07 17:13:05 -04:00
Alex Dadgar	7b13c0d702	Lost allocs replaced even if deployment failed This PR allows the scheduler to replace lost allocations even if the job has a failed or paused deployment. The prior behavior was confusing to users. Fixes https://github.com/hashicorp/nomad/issues/2958	2017-08-03 17:42:14 -07:00
Alex Dadgar	7d2b84ab01	Review fixes	2017-08-01 14:18:52 -07:00
Alex Dadgar	2650bb1d12	Distinct Property supports arbitrary limit This PR enhances the distinct_property constraint such that a limit can be specified in the RTarget/value parameter. This allows constraints such as: ``` constraint { distinct_property = "${meta.rack}" value = "2" } ``` This restricts any given rack from running more than 2 allocations from the task group. Fixes https://github.com/hashicorp/nomad/issues/1146	2017-07-31 16:52:13 -07:00
Alex Dadgar	4f69355a66	Fix incorrect destructive update with distinct_property constraint This PR fixes an issue in which an update to a task group with a distinct property constraint would result in an incorrect destructive update.	2017-07-31 11:17:35 -07:00
Michael Schurter	5f1f91a46c	Use go-testing-interface instead of testing This drops the testing stdlib pkg from our dependencies. Saves a whopping 46kb on our binary (was really hoping for more of a win there), but also avoids potential ugliness with how testing sets flags.	2017-07-25 15:35:19 -07:00
Alex Dadgar	492239d3ee	Improve multiple group handling in a deployment This PR resolves a bug in which a job with multiple task groups would create new deployment objects each, thus clearing out all other task groups deployment state.	2017-07-25 11:27:47 -07:00
Alex Dadgar	184bfd4836	Better comment	2017-07-20 12:31:08 -07:00
Alex Dadgar	248315a2d9	Handle destructive changes before placements This PR updates the generic scheduler to handle destructive changes before handling placements. This is important because the destructive change may be due to a lowering of resources. If this is the case, the handling of the destructive changes first may make it possible for the placement to happen. To reason about this imagine there is one node with CPU = 500. If the group originally had: * `count = 1` * `cpu = 400` And then the job was updated such that the group had: * `count = 4` * `cpu = 120` If the original alloc isn't discounted first, nothing would be able to place.	2017-07-20 12:24:27 -07:00
Alex Dadgar	ce265e0aff	Update full node test to test more advanced case	2017-07-20 12:23:40 -07:00
Alex Dadgar	a9ec1d6ca7	Fix update limit calculation to avoid panic This PR fixes the rolling update limit calculation to avoid a panic when there are more allocations for a deployment that haven't determined their health than the max_parallel count of the task group. Fixes https://github.com/hashicorp/nomad/issues/2820	2017-07-19 11:11:47 -07:00
Alex Dadgar	22e84d00ab	Fix deep copy of driver config	2017-07-17 17:53:21 -07:00
Alex Dadgar	641e178416	Stop before trying to place	2017-07-17 17:18:12 -07:00
Alex Dadgar	66a90326e1	Treat destructive updates atomically	2017-07-16 10:35:38 -07:00
Alex Dadgar	f86760db3c	Basic logs	2017-07-07 16:49:08 -07:00
Alex Dadgar	20005f925a	Rolling node drains using max_parallel and stagger This PR adds rolling node drains done at max_parallel and stagger of the update spec. It brings it in line with old behavior.	2017-07-07 12:12:48 -07:00
Alex Dadgar	3a29b38108	Status description shows requiring promotion	2017-07-07 12:12:48 -07:00
Alex Dadgar	9f016606aa	Fix some tests, eval monitor shows deployment id and deployment cancels based on version	2017-07-07 12:12:48 -07:00
Alex Dadgar	9aa1f2fea2	Respond to comments	2017-07-07 12:10:04 -07:00
Alex Dadgar	454083ba1b	Remove canary	2017-07-07 12:10:04 -07:00
Alex Dadgar	d352d85bb9	Test scheduler's handling of canaries/inplace updates	2017-07-07 12:10:04 -07:00
Alex Dadgar	83c60483f2	Test marking as complete	2017-07-07 12:10:04 -07:00
Alex Dadgar	477c713df5	Plan apply handles canaries and success is set via update	2017-07-07 12:10:04 -07:00
Alex Dadgar	1e8b5e75a5	Fix handling of failed job	2017-07-07 12:10:04 -07:00
Alex Dadgar	e229d3650b	Attach eval id	2017-07-07 12:10:04 -07:00
Alex Dadgar	af1935e1e1	Mark complete	2017-07-07 12:10:04 -07:00
Alex Dadgar	8424a3b380	Change canary handling	2017-07-07 12:10:04 -07:00
Alex Dadgar	c10d7ab871	Remove promoted bit from allocation	2017-07-07 12:10:04 -07:00
Alex Dadgar	09dfa2fc10	Rename CreateDeployments and remove cancelling behavior in state_store	2017-07-07 12:10:04 -07:00
Alex Dadgar	067ed86a47	Client watches for allocation health using task state and Consul checks This PR adds watching of allocation health at the client. The client can watch for health based on the tasks running on time and also based on the consul checks passing.	2017-07-07 12:10:04 -07:00
Alex Dadgar	e7034691ea	deployment status	2017-07-07 12:07:07 -07:00
Alex Dadgar	d04877d23c	initial impl	2017-07-07 12:03:11 -07:00
Alex Dadgar	27a6e6b6d1	update description of the alloc update factory function	2017-07-07 12:03:11 -07:00
Alex Dadgar	ce2319be9b	cleanup limit detection	2017-07-07 12:03:11 -07:00
Alex Dadgar	b2573b01f9	Fix canary handling	2017-07-07 12:03:11 -07:00
Alex Dadgar	7952240d69	Deployment tests	2017-07-07 12:03:11 -07:00
Alex Dadgar	ce55559f12	Non-Canary/Deployment Tests	2017-07-07 12:03:11 -07:00
Alex Dadgar	d111dd5c10	Pull out in-place updating into a passed in function; reduce inputs to reconciler	2017-07-07 12:03:11 -07:00
Alex Dadgar	c77944ed29	assign names	2017-07-07 12:03:11 -07:00
Alex Dadgar	ecacd44888	handle batch filtering	2017-07-07 12:03:11 -07:00
Alex Dadgar	4c123500ee	Remove old	2017-07-07 12:03:11 -07:00
Alex Dadgar	270e26c600	Populate desired state per tg	2017-07-07 12:03:11 -07:00
Alex Dadgar	23dcd175ef	Show canaries on plan	2017-07-07 12:03:11 -07:00
Alex Dadgar	cf5baba808	handle annotations	2017-07-07 12:03:11 -07:00
Alex Dadgar	a46f7c3eb8	Todos	2017-07-07 12:03:11 -07:00
Alex Dadgar	00d962b8b5	Some comments and cleanup	2017-07-07 12:03:11 -07:00
Alex Dadgar	994ad285b7	Split reconcile file	2017-07-07 12:03:11 -07:00
Alex Dadgar	07b1c3e5db	Only upsert a job if the spec changes and push deployment creation into reconciler	2017-07-07 12:03:11 -07:00
Alex Dadgar	0d42b5d421	initial reconciler	2017-07-07 12:01:17 -07:00
Alex Dadgar	b3f4db0930	cancel deployments	2017-07-07 12:01:17 -07:00
Alex Dadgar	8169590d76	Fix tests	2017-05-01 13:54:26 -07:00
Alex Dadgar	5a2449d236	Respond to review comments	2017-04-19 10:54:03 -07:00
Alex Dadgar	3145086a42	non-purge deregisters	2017-04-15 17:08:05 -07:00
Alex Dadgar	2c31d4036b	Skip inplace update on terminal batch allocation This PR skips adding an inplace update to a successfully terminal batch job to the plan. This avoids extra data in the plan and avoids triggering updates on all clients that have the terminal allocation. This is matching behavior of the service scheduler. /cc @armon for review	2017-03-11 17:19:22 -08:00

451 commits