open-nomad

Author	SHA1	Message	Date
Neil Mock	f749de8543	Fix multi-interface networking in the system scheduler (#8822)	2020-09-22 12:54:34 -04:00
Mahmood Ali	def768728e	Have Plan.AppendAlloc accept the job	2020-08-25 17:22:09 -04:00
Mahmood Ali	8a342926b7	Respect alloc job version for lost/failed allocs This change fixes a bug where lost/failed allocations are replaced by allocations with the latest versions, even if the version hasn't been promoted yet. Now, when generating a plan for lost/failed allocations, the scheduler first checks if the current deployment is in Canary stage, and if so, it ensures that any lost/failed allocations is replaced one with the latest promoted version instead.	2020-08-19 09:52:48 -04:00
Lang Martin	069840bef8	scheduler/reconcile: set FollowupEvalID on lost stop_after_client_disconnect (#8105) (#8138) * scheduler/reconcile: set FollowupEvalID on lost stop_after_client_disconnect * scheduler/reconcile: thread follupEvalIDs through to results.stop * scheduler/reconcile: comment typo * nomad/_test: correct arguments for plan.AppendStoppedAlloc * scheduler/reconcile: avoid nil, cleanup handleDelayed(Lost|Reschedules)	2020-06-09 17:13:53 -04:00
Chris Baker	179ab68258	wip: added job.scale rpc endpoint, needs explicit test (tested via http now)	2020-03-24 13:57:09 +00:00
Mahmood Ali	b880607bad	update scheduler to account for hooks	2020-03-21 17:52:45 -04:00
Drew Bailey	e613a258da	ignore computed diffs if node is ineligible test flakey, add temp sleeps for debugging fix computed class	2020-02-03 09:02:08 -05:00
Drew Bailey	63ddda71e1	Return FailedTGAlloc metric instead of no node err If an existing system allocation is running and the node its running on is marked as ineligible, subsequent plan/applys return an RPC error instead of a more helpful plan result. This change logs the error, and appends a failedTGAlloc for the placement.	2020-01-22 10:07:15 -05:00
Preetha Appan	d21c708c4a	Fix inplace updates bug with group level networks During inplace updates, we should be using network information from the previous allocation being updated.	2019-09-05 18:37:24 -05:00
Preetha Appan	99eca85206	Scheduler changes to support network at task group level Also includes unit tests for binpacker and preemption. The tests verify that network resources specified at the task group level are properly accounted for	2019-07-31 01:04:08 -04:00
Lang Martin	8157a7b6f8	system_sched submits failed evals as blocked	2019-07-18 10:32:12 -04:00
Lang Martin	29ea112586	system_sched & test cleanup comments	2019-05-01 12:25:26 -04:00
Lang Martin	c43bcbd35e	system_sched when a node is filtered, don't mark failure	2019-05-01 12:25:26 -04:00
Arshneet Singh	4cf4324b8f	Remove allowPlanOptimization from schedulers	2019-04-23 09:18:02 -07:00
Arshneet Singh	b977748a4b	Add code for plan normalization	2019-04-23 09:18:01 -07:00
Danielle Lancashire	832f607433	allocs: Add nomad alloc stop This adds a `nomad alloc stop` command that can be used to stop and force migrate an allocation to a different node. This is built on top of the AllocUpdateDesiredTransitionRequest and explicitly limits the scope of access to that transition to expose it under the alloc-lifecycle ACL. The API returns the follow up eval that can be used as part of monitoring in the CLI or parsed and used in an external tool.	2019-04-23 12:50:23 +02:00
James Rasell	9470507cf4	Add NodeName to the alloc/job status outputs. Currently when operators need to log onto a machine where an alloc is running they will need to perform both an alloc/job status call and then a call to discover the node name from the node list. This updates both the job status and alloc status output to include the node name within the information to make operator use easier. Closes #2359 Cloess #1180	2019-04-10 10:34:10 -05:00
Preetha Appan	0494a098ce	More style and readablity fixes from review	2018-10-30 11:06:32 -05:00
Preetha Appan	cc295b90de	Implement preemption for system jobs. This commit implements an allocation selection algorithm for finding allocations to preempt. It currently special cases network resource asks from others (cpu/memory/disk/iops).	2018-10-30 11:06:32 -05:00
Alex Dadgar	a78cefec18	use int64	2018-10-16 15:34:32 -07:00
Preetha Appan	7c0d8c646c	Change CPU/Disk/MemoryMB to int everywhere in new resource structs	2018-10-16 16:21:42 -05:00
Alex Dadgar	bac5cb1e8b	Scheduler uses allocated resources	2018-10-02 17:08:25 -07:00
Preetha Appan	a10118c461	Add failed follow up to the list of allowed eval trigger reasons needs unit test	2018-09-25 10:49:55 -07:00
Alex Dadgar	3c19d01d7a	server	2018-09-15 16:23:13 -07:00
Preetha Appan	751c0eb5a5	code review feedback	2018-09-04 16:10:11 -05:00
Alex Dadgar	9d60e2cebf	Correct status desc on draining system allocs	2018-03-26 17:54:46 -07:00
Alex Dadgar	db4a634072	RPC, FSM, State Store for marking DesiredTransistion fix build tag	2018-03-21 16:49:48 -07:00
Preetha Appan	031c566ada	Reschedule previous allocs and track their reschedule attempts	2018-01-31 09:56:53 -06:00
Preetha Appan	3b4d7ac2a3	Fix some typos	2017-12-14 13:29:27 -06:00
Michael Schurter	a66c53d45a	Remove `structs` import from `api` Goes a step further and removes structs import from api's tests as well by moving GenerateUUID to its own package.	2017-09-29 10:36:08 -07:00
Alex Dadgar	84d06f6abe	Sync namespace changes	2017-09-07 17:04:21 -07:00
Alex Dadgar	20005f925a	Rolling node drains using max_parallel and stagger This PR adds rolling node drains done at max_parallel and stagger of the update spec. It brings it inline with old behavior.	2017-07-07 12:12:48 -07:00
Alex Dadgar	e229d3650b	Attach eval id	2017-07-07 12:10:04 -07:00
Alex Dadgar	067ed86a47	Client watches for allocation health using task state and Consul checks This PR adds watching of allocation health at the client. The client can watch for health based on the tasks running on time and also based on the consul checks passing.	2017-07-07 12:10:04 -07:00
Alex Dadgar	5a2449d236	Respond to review comments	2017-04-19 10:54:03 -07:00
Alex Dadgar	3145086a42	non-purge deregisters	2017-04-15 17:08:05 -07:00
Alex Dadgar	b69b357c7f	Nomad builds	2017-02-07 20:31:23 -08:00
Diptanu Choudhury	5191b4d33a	Making the status command return the allocs of currently registered job	2016-11-24 16:31:30 +01:00
Diptanu Choudhury	36edabb487	Fixed the logic of calculating queued allocation in sys sched (#1724)	2016-09-20 12:05:19 -07:00
Diptanu Choudhury	1b3c5e98c8	Renaming LocalDisk to EphemeralDisk (#1710) Renaming LocalDisk to EphemeralDisk	2016-09-14 15:43:42 -07:00
Diptanu Choudhury	52e9946da9	Implemented SetPrefferingNodes in stack	2016-08-30 16:17:50 -07:00
Diptanu Choudhury	bfee7b30a3	Introducing shared resources in alloc	2016-08-29 13:49:25 -07:00
Diptanu Choudhury	13497913f9	Ensuring resources are re-calculated properly in fsm	2016-08-26 20:13:11 -07:00
Diptanu Choudhury	ffaf6c6299	Fixed some tests	2016-08-25 13:56:39 -05:00
Diptanu Choudhury	c1a455983d	Added the chained alloc for system scheduler	2016-08-16 10:49:45 -07:00
Alex Dadgar	64f7eff612	Plan on system scheduler doesn't count nodes who don't meet constraints	2016-08-11 15:26:25 -07:00
Diptanu Choudhury	23fcb9f5c9	Ensuring system sched doesn't increment queued count when nodes are filtered	2016-08-10 14:33:13 -07:00
Diptanu Choudhury	13bab5b1ad	Added scheduler tests	2016-08-09 14:52:25 -07:00
Diptanu Choudhury	ab94c8eed9	Marking allocations which are not terminal and are on down nodes as lost	2016-08-09 13:11:58 -07:00
Alex Dadgar	ac3328e812	Make scheduler mark allocations as lost	2016-08-03 15:57:46 -07:00

82 commits