open-nomad

Commit Graph

Author	SHA1	Message	Date
Drew Bailey	009b8d5363	Persist shared allocated ports for inplace update (#9830) * Persist shared allocated ports for inplace update Ports were not copied over when performing inplace updates in the generic scheduler * changelog * drop spew	2021-01-15 12:45:12 -05:00
Drew Bailey	c87adfac62	persist shared ports during inplace updates (#9736) AllocatedSharedResources were not being copied over to the new allocation struct the scheduler makes during inplace updates. This caused downstream issues after the plan was applied, namely the shared ports were dropped causing issues with service registration/deregistration. test that shared ports are preserved change log, also carry over shared network copy networks	2021-01-08 09:00:41 -05:00
Seth Hoenig	f44a4f68ee	consul/connect: trigger update as necessary on connect changes This PR fixes a long standing bug where submitting jobs with changes to connect services would not trigger updates as expected. Previously, service blocks were not considered as sources of destructive updates since they could be synced with consul non-destructively. With Connect, task group services that have changes to their connect block or to the service port should be destructive, since the network plumbing of the alloc is going to need updating. Fixes #8596 #7991 Non-destructive half in #7192	2020-10-05 14:53:00 -05:00
Mahmood Ali	def768728e	Have Plan.AppendAlloc accept the job	2020-08-25 17:22:09 -04:00
Mahmood Ali	8a342926b7	Respect alloc job version for lost/failed allocs This change fixes a bug where lost/failed allocations are replaced by allocations with the latest versions, even if the version hasn't been promoted yet. Now, when generating a plan for lost/failed allocations, the scheduler first checks if the current deployment is in Canary stage, and if so, it ensures that any lost/failed allocations is replaced one with the latest promoted version instead.	2020-08-19 09:52:48 -04:00
Nick Ethier	0bc0403cc3	Task DNS Options (#7661) Co-Authored-By: Tim Gross <tgross@hashicorp.com> Co-Authored-By: Seth Hoenig <shoenig@hashicorp.com>	2020-06-18 11:01:31 -07:00
Lang Martin	069840bef8	scheduler/reconcile: set FollowupEvalID on lost stop_after_client_disconnect (#8105) (#8138) * scheduler/reconcile: set FollowupEvalID on lost stop_after_client_disconnect * scheduler/reconcile: thread follupEvalIDs through to results.stop * scheduler/reconcile: comment typo * nomad/_test: correct arguments for plan.AppendStoppedAlloc * scheduler/reconcile: avoid nil, cleanup handleDelayed(Lost|Reschedules)	2020-06-09 17:13:53 -04:00
Lang Martin	d3c4700cd3	server: stop after client disconnect (#7939) * jobspec, api: add stop_after_client_disconnect * nomad/state/state_store: error message typo * structs: alloc methods to support stop_after_client_disconnect 1. a global AllocStates to track status changes with timestamps. We need this to track the time at which the alloc became lost originally. 2. ShouldClientStop() and WaitClientStop() to actually do the math * scheduler/reconcile_util: delayByStopAfterClientDisconnect * scheduler/reconcile: use delayByStopAfterClientDisconnect * scheduler/util: updateNonTerminalAllocsToLost comments This was setup to only update allocs to lost if the DesiredStatus had already been set by the scheduler. It seems like the intention was to update the status from any non-terminal state, and not all lost allocs have been marked stop or evict by now * scheduler/testing: AssertEvalStatus just use require * scheduler/generic_sched: don't create a blocked eval if delayed * scheduler/generic_sched_test: several scheduling cases	2020-05-13 16:39:04 -04:00
Mahmood Ali	9f005201e2	Ensure that alloc updates preserve device offers When an alloc is updated in-place, ensure that the allocated device are preserved and carried over to new alloc.	2020-04-21 08:57:15 -04:00
Mahmood Ali	6ddf3d1742	Merge pull request #7414 from hashicorp/b-network-mode-change Detect network mode change	2020-03-24 09:46:40 -04:00
Mahmood Ali	b880607bad	update scheduler to account for hooks	2020-03-21 17:52:45 -04:00
Mahmood Ali	9568553d7e	Detect network mode change Mark job as updated if network mode changed.	2020-03-21 16:51:10 -04:00
Drew Bailey	1c046a74d8	comment for filtering reason	2020-02-03 09:02:09 -05:00
Drew Bailey	6b492630dd	make diffSystemAllocsForNode aware of eligibility diffSystemAllocs -> diffSystemAllocsForNode, this function is only used for diffing system allocations, but lacked awareness of eligible nodes and the node ID that the allocation was going to be placed. This change now ignores a change if its existing allocation is on an ineligible node. For a new allocation, it also checks tainted and ineligible nodes in the same function instead of nil-ing out the diff after computation in diffSystemAllocs	2020-02-03 09:02:08 -05:00
Drew Bailey	e613a258da	ignore computed diffs if node is ineligible test flakey, add temp sleeps for debugging fix computed class	2020-02-03 09:02:08 -05:00
Drew Bailey	ef175c0b31	Update Evicted allocations to lost when lost If an alloc is being preempted and marked as evict, but the underlying node is lost before the migration takes place, the allocation currently stays as desired evict, status running forever, or until the node comes back online. This commit updates updateNonTerminalAllocsToLost to check for a destired status of Evict as well as Stop when updating allocations on tainted nodes. switch to table test for lost node cases	2020-01-07 13:34:18 -05:00
Drew Bailey	876618b5d2	Removes checking constraints for inplace update	2019-11-19 13:34:41 -05:00
Drew Bailey	e44a66d7fc	DOCS: Spread stanza does not exist on task Fixes documentation inaccuracy for spread stanza placement. Spreads can only exist on the top level job struct or within a group. comment about nil assumption	2019-11-19 08:26:36 -05:00
Drew Bailey	07e3164bf9	Check for changes to affinity and constraints Adds checks for affinity and constraint changes when determining if we should update inplace. refactor to check all levels at once check for spread changes when checking inplace update	2019-11-19 08:26:34 -05:00
Chris Baker	95ae01a9f4	the scheduler checks whether task changes require a restart, this needed to be updated to consider devices	2019-11-07 17:51:15 +00:00
Michael Schurter	c6bbe85f42	core: fix panic when AllocatedResources is nil Fix for #6540	2019-10-28 14:38:21 -07:00
Preetha Appan	9accf60805	update comment	2019-09-05 18:43:30 -05:00
Preetha Appan	d21c708c4a	Fix inplace updates bug with group level networks During inplace updates, we should be using network information from the previous allocation being updated.	2019-09-05 18:37:24 -05:00
Nick Ethier	7c9520b404	scheduler: fix disk constraints	2019-07-31 01:04:08 -04:00
Nick Ethier	09a4cfd8d7	fix failing tests	2019-07-31 01:04:07 -04:00
Nick Ethier	af66a35924	networking: Add new bridge networking mode implementation	2019-07-31 01:04:06 -04:00
Nick Ethier	15989bba8e	ar: cleanup lint errors	2019-07-31 01:03:18 -04:00
Nick Ethier	66c514a388	Add network lifecycle management Adds a new Prerun and Postrun hooks to manage set up of network namespaces on linux. Work still needs to be done to make the code platform agnostic and support Docker style network initalization.	2019-07-31 01:03:17 -04:00
Arshneet Singh	b977748a4b	Add code for plan normalization	2019-04-23 09:18:01 -07:00
Preetha Appan	da1ce9bcea	Fix bug where scoring metadata would be overridden during an inplace upgrade.	2019-03-12 23:36:46 -05:00
Alex Dadgar	1e3c3cb287	Deprecate IOPS IOPS have been modelled as a resource since Nomad 0.1 but has never actually been detected and there is no plan in the short term to add detection. This is because IOPS is a bit simplistic of a unit to define the performance requirements from the underlying storage system. In its current state it adds unnecessary confusion and can be removed without impacting any users. This PR leaves IOPS defined at the jobspec parsing level and in the api/ resources since these are the two public uses of the field. These should be considered deprecated and only exist to allow users to stop using them during the Nomad 0.9.x release. In the future, there should be no expectation that the field will exist.	2018-12-06 15:09:26 -08:00
Alex Dadgar	a78cefec18	use int64	2018-10-16 15:34:32 -07:00
Preetha Appan	7c0d8c646c	Change CPU/Disk/MemoryMB to int everywhere in new resource structs	2018-10-16 16:21:42 -05:00
Alex Dadgar	52f9cd7637	fixing tests	2018-10-04 14:26:19 -07:00
Alex Dadgar	bac5cb1e8b	Scheduler uses allocated resources	2018-10-02 17:08:25 -07:00
Alex Dadgar	3c19d01d7a	server	2018-09-15 16:23:13 -07:00
Alex Dadgar	3aa4ee9d75	Fix lost handling of not actually down nodes	2018-03-30 14:17:41 -07:00
Alex Dadgar	b18f789020	Unmark drain when nodes hit their deadline and only batch/system left and add all job type integration test	2018-03-28 17:25:58 -07:00
Alex Dadgar	9d60e2cebf	Correct status desc on draining system allocs	2018-03-26 17:54:46 -07:00
Michael Schurter	d1ec65d765	switch to new raft DesiredTransition message	2018-03-21 16:49:48 -07:00
Alex Dadgar	db4a634072	RPC, FSM, State Store for marking DesiredTransistion fix build tag	2018-03-21 16:49:48 -07:00
Michael Schurter	c0542474db	drain: initial drainv2 structs and impl	2018-03-21 16:49:48 -07:00
Josh Soref	ed8db9992e	spelling: feasibility	2018-03-11 18:07:09 +00:00
Preetha Appan	fbb1936dee	Fix some comments and lint warnings, remove unused method	2018-01-31 09:56:53 -06:00
Preetha Appan	031c566ada	Reschedule previous allocs and track their reschedule attempts	2018-01-31 09:56:53 -06:00
Alex Dadgar	4173834231	Enable more linters	2017-09-26 15:26:33 -07:00
Alex Dadgar	641e178416	Stop before trying to place	2017-07-17 17:18:12 -07:00
Alex Dadgar	454083ba1b	Remove canary	2017-07-07 12:10:04 -07:00
Alex Dadgar	477c713df5	Plan apply handles canaries and success is set via update	2017-07-07 12:10:04 -07:00
Alex Dadgar	e229d3650b	Attach eval id	2017-07-07 12:10:04 -07:00

127 Commits