open-nomad

Author	SHA1	Message	Date
Danielle Lancashire	9d4307a3ef	csi_endpoint: Provide AllocID in req, and return Volume Currently, the client has to ship an entire allocation to the server as part of performing a VolumeClaim. This has a few problems: Firstly, it means the client is sending significantly more data than is required (an allocation contains the entire contents of a Nomad job, alongside other irrelevant state) which has a non-zero (de)serialization cost. Secondly, because the allocation was never re-fetched from the state store, it means that we were potentially open to issues caused by stale state on a misbehaving or malicious client. The change removes both of those issues at the cost of a couple more state store lookups, but they should be relatively cheap. We also now provide the CSIVolume in the response for a claim, so the client can perform a Claim without first going ahead and fetching all of the volumes.	2020-03-23 13:58:30 -04:00
Lang Martin	5b31b140c3	csi: do not use namespace specific identifiers	2020-03-23 13:58:29 -04:00
Lang Martin	857cd37ab5	fsm: dispatch CSIVolume register, deregister, claim	2020-03-23 13:58:29 -04:00
Seth Hoenig	9df33f622f	nomad: proxy requests for Service Identity tokens between Clients and Consul Nomad jobs may be configured with a TaskGroup which contains a Service definition that is Consul Connect enabled. These service definitions end up establishing a Consul Connect Proxy Task (e.g. envoy, by default). In the case where Consul ACLs are enabled, a Service Identity token is required for these tasks to run & connect, etc. This changeset enables the Nomad Server to receive RPC requests for the derivation of SI tokens on behalf of instances of Consul Connect using Tasks. Those tokens are then relayed back to the requesting Client, which then injects the tokens in the secrets directory of the Task.	2020-01-31 19:03:53 -06:00
Seth Hoenig	2b66ce93bb	nomad: ensure a unique ClusterID exists when leader (gh-6702) Enable any Server to lookup the unique ClusterID. If one has not been generated, and this node is the leader, generate a UUID and attempt to apply it through raft. The value is not yet used anywhere in this changeset, but is a prerequisite for gh-6701.	2020-01-31 19:03:26 -06:00
Mahmood Ali	d740d347ce	Migrate old alloc structs on read This commit ensures that Alloc.AllocatedResources is properly populated when read from persistence stores (namely Raft and client state store). The alloc struct may have been written previously by an arbitrary old version that may only populate Alloc.TaskResources.	2020-01-09 08:46:50 -05:00
Lang Martin	ea275d5ce7	fsm attach UnblockNode on node updates	2019-07-18 10:32:12 -04:00
Lang Martin	a95225d754	NodeDeregisterBatch -> NodeBatchDeregister match JobBatch pattern	2019-07-10 13:56:20 -04:00
Lang Martin	44cbca9b98	fsm new NodeDeregisterBatchRequestType sorted at the end of the case	2019-07-10 13:56:20 -04:00
Lang Martin	1cc6b4062c	fsm label batch_deregister_node metrics explicitly Co-Authored-By: Mahmood Ali <mahmood@notnoop.com>	2019-07-10 13:56:20 -04:00
Lang Martin	ce0f03651a	fsm support new NodeDeregisterBatchRequest	2019-07-10 13:56:20 -04:00
Lang Martin	6dbf5d7d13	fsm return an error on both NodeDeregisterRequest fields set	2019-07-10 13:56:19 -04:00
Lang Martin	fbc78ba96c	fsm variable names for consistency	2019-07-10 13:56:19 -04:00
Lang Martin	3bf41211fb	fsm honor new and old style NodeDeregisterRequests	2019-07-10 13:56:19 -04:00
Lang Martin	a97407e030	fsm NodeDeregisterRequest is now a batch	2019-07-10 13:56:19 -04:00
Preetha Appan	10e7d6df6d	Remove compat code associated with many previous versions of nomad This removes compat code for namespaces (0.7), Drain (0.8) and other older features from releases older than Nomad 0.7	2019-06-25 19:05:25 -05:00
Mahmood Ali	6bdbeed319	set node.StatusUpdatedAt in raft Fix a case where `node.StatusUpdatedAt` was manipulated directly in memory. This ensures that StatusUpdatedAt is set in raft layer, and ensures that the field is updated when node drain/eligibility is updated too.	2019-05-21 16:13:32 -04:00
Alex Dadgar	4bdccab550	goimports	2019-01-22 15:44:31 -08:00
Preetha Appan	be36fee48e	Use IsParameterized/isPeriodic methods	2019-01-17 12:15:42 -06:00
Preetha Appan	81a8f18cac	Fix bug in reconcile summaries that affects periodic/parameterized jobs This fixes incorrect parent job summaries by recomputing them in the ReconcileJobSummaries method in the state store	2019-01-17 12:01:01 -06:00
Mahmood Ali	a4a9347501	fix comment typos	2018-11-14 08:36:14 -05:00
Alex Dadgar	08dc2ea702	Merge pull request #4867 from hashicorp/b-deployment-progress-deadline Blocked evaluation fixes	2018-11-13 10:29:03 -08:00
Mahmood Ali	8513b3cccb	Comment public functions and batch write txn	2018-11-12 16:09:39 -05:00
Mahmood Ali	9c0a15f3ce	Run job deregistering in a single transaction Fixes https://github.com/hashicorp/nomad/issues/4299 Upon investigating this case further, we determined the issue to be a race between applying the `JobBatchDeregisterRequest` fsm operation and processing job-deregister evals. Processing job-deregister evals should wait until the FSM log message finishes applying, by using the snapshot index. However, with `JobBatchDeregister`, each individual job deregistration that was applied accidentally incremented the snapshot index, which resulted in processing job-deregister evals. When a Nomad server receives an eval for a job in the batch that is yet to be deleted, we accidentally re-run it depending on the state of the allocation. This change ensures that we deregister all of the jobs and insert all evals in a single transaction, thus blocking the processing of related evals until deregistering completes.	2018-11-09 22:35:26 -05:00
Alex Dadgar	b1c5d52817	Track jobs by namespace	2018-11-07 10:22:08 -08:00
Preetha Appan	32cc764072	Add fsm layer tests	2018-10-30 11:06:32 -05:00
Preetha Appan	7b8156fc47	Restore/Snapshot plus unit tests for scheduler configuration	2018-10-30 11:06:32 -05:00
Preetha Appan	bd34cbb1f7	Support for new scheduler config API, first use case is to disable preemption	2018-10-30 11:06:32 -05:00
Preetha Appan	cc295b90de	Implement preemption for system jobs. This commit implements an allocation selection algorithm for finding allocations to preempt. It currently special cases network resource asks from others (cpu/memory/disk/iops).	2018-10-30 11:06:32 -05:00
Alex Dadgar	52f9cd7637	fixing tests	2018-10-04 14:26:19 -07:00
Alex Dadgar	ca28afa3b2	small fixes	2018-09-15 16:42:38 -07:00
Alex Dadgar	3c19d01d7a	server	2018-09-15 16:23:13 -07:00
Alex Dadgar	300b1a7a15	Tests only use testlog package logger	2018-06-13 15:40:56 -07:00
Alex Dadgar	352f2e03b5	Clean up leaked deployments on restoration This PR cancels deployments that are active but do not have a job associated with them. This is a broken invariant that causes issues in the deployment watcher since it will not track them. Thus they are objects that can't be operated on or cleaned up. Fixes https://github.com/hashicorp/nomad/issues/4286	2018-05-23 16:44:21 -07:00
Alex Dadgar	17aac1c9de	node heartbeat missed event	2018-05-22 14:05:46 -07:00
Alex Dadgar	5f2080bc26	Emit events based on eligibility	2018-05-22 14:04:59 -07:00
Alex Dadgar	a35248d1d8	Plumb event via FSM	2018-05-10 16:30:54 -07:00
Alex Dadgar	c91ce5cc38	Fix not enqueuing eval	2018-05-07 14:50:01 -05:00
Alex Dadgar	641ef81cbf	Test fixes	2018-05-07 14:50:01 -05:00
Alex Dadgar	d0f237086b	UX touchups	2018-04-26 15:24:27 -07:00
Alex Dadgar	2b14371db5	Fix spelling	2018-04-03 15:58:03 -07:00
Alex Dadgar	9617a13a2b	Correctly handle the upgrade path of a node being drained when applying Raft logs	2018-04-03 15:32:44 -07:00
Alex Dadgar	301704091b	Handle upgrade where Node doesn't have eligibility This PR handles upgrading a node that has no scheduling eligibility set.	2018-03-29 16:52:23 -07:00
Alex Dadgar	2d91b9dfba	Batch drain update	2018-03-21 16:51:44 -07:00
Alex Dadgar	7b2bad8c5e	Toggle Drain allows resetting eligibility This PR allows marking a node as eligible for scheduling while toggling drain. By default the `nomad node drain -disable` command will mark it as eligible but the drainer will maintain ineligibility.	2018-03-21 16:51:44 -07:00
Alex Dadgar	010a6b8ca5	Unblock evals once eligible	2018-03-21 16:51:44 -07:00
Alex Dadgar	0fba0101b6	RPC/FSM/State Store for Eligibility	2018-03-21 16:51:44 -07:00
Alex Dadgar	2f5309d82a	Remove update time	2018-03-21 16:51:43 -07:00
Alex Dadgar	e459a666ed	Node.Drain takes strategy	2018-03-21 16:49:48 -07:00
Michael Schurter	d1ec65d765	switch to new raft DesiredTransition message	2018-03-21 16:49:48 -07:00

174 commits