Cluster operators want to have better control over memory
oversubscription and may want to enable/disable it based on their
experience.
This PR adds a scheduler configuration field to control memory
oversubscription. It's an additional field that can be set in the [API via Scheduler Config](https://www.nomadproject.io/api-docs/operator/scheduler), or [the agent server config](https://www.nomadproject.io/docs/configuration/server#configuring-scheduler-config).
I opted to make memory oversubscription opt-in, but I'm happy to change it. To enable it, operators should call the API with:
```json
{
"MemoryOversubscriptionEnabled": true
}
```
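For programmatic control, the same field can be flipped through the Go `api` client. A minimal sketch, assuming a Nomad 1.1+ `api` package in which `SchedulerConfiguration` carries the new field:

```go
package main

import (
	"fmt"
	"log"

	"github.com/hashicorp/nomad/api"
)

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	// Read the current scheduler configuration first so unrelated
	// fields (preemption, scheduler algorithm) are not clobbered.
	resp, _, err := client.Operator().SchedulerGetConfiguration(nil)
	if err != nil {
		log.Fatal(err)
	}

	conf := resp.SchedulerConfig
	conf.MemoryOversubscriptionEnabled = true

	if _, _, err := client.Operator().SchedulerSetConfiguration(conf, nil); err != nil {
		log.Fatal(err)
	}
	fmt.Println("memory oversubscription enabled")
}
```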
If memory oversubscription is disabled, jobs that specify `memory_max` will be
accepted with a "Memory oversubscription is not enabled" warning, but their
tasks will not be allowed to use the additional memory.
The warning looks like this:
```
$ nomad job run /tmp/j
Job Warnings:
1 warning(s):
* Memory oversubscription is not enabled; Task cache.redis memory_max value will be ignored
==> Monitoring evaluation "7c444157"
Evaluation triggered by job "example"
==> Monitoring evaluation "7c444157"
Evaluation within deployment: "9d826f13"
Allocation "aa5c3cad" created: node "9272088e", group "cache"
Evaluation status changed: "pending" -> "complete"
==> Evaluation "7c444157" finished with status "complete"
# then you can examine the Alloc AllocatedResources to validate whether the task is allowed to exceed memory:
$ nomad alloc status -json aa5c3cad | jq '.AllocatedResources.Tasks["redis"].Memory'
{
"MemoryMB": 256,
"MemoryMaxMB": 0
}
```
Add a new driver capability: RemoteTasks.
When a task is run by a driver with RemoteTasks set, its TaskHandle will
be propagated to the server in its allocation's TaskState. If the task
is replaced due to a down node or draining, its TaskHandle will be
propagated to its replacement allocation.
This allows tasks to be scheduled in remote systems whose lifecycles are
disconnected from the Nomad node's lifecycle.
See https://github.com/hashicorp/nomad-driver-ecs for an example ECS
remote task driver.
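For driver authors, advertising the capability is a matter of setting the new flag. A minimal sketch modeled on the `plugins/drivers` package (the `Driver` type here is a hypothetical skeleton, not the ECS driver's actual code):

```go
package ecsdriver

import "github.com/hashicorp/nomad/plugins/drivers"

// Driver is a hypothetical skeleton for a remote task driver.
type Driver struct{}

// Capabilities advertises RemoteTasks so that the TaskHandle is
// propagated to the servers via TaskState and carried over to any
// replacement allocation if the node goes down or is drained.
func (d *Driver) Capabilities() (*drivers.Capabilities, error) {
	return &drivers.Capabilities{
		SendSignals: false,
		Exec:        false,
		FSIsolation: drivers.FSIsolationNone,
		RemoteTasks: true,
	}, nil
}
```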
This PR adds the common OSS changes for adding support for Consul Namespaces,
which is going to be a Nomad Enterprise feature. There is no new functionality
provided by this changeset and hopefully no new bugs.
Start tracking a new MemoryMaxMB field that represents the maximum memory a task
may use on the client. This allows tasks to specify a memory reservation (used
by the scheduler when placing the task) while using excess memory available on
the client, if there is any.
This commit adds the server-side tracking for the value, and ensures that
allocation AllocatedResources fields include the value.
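For illustration, the shape of the new field, mirroring `structs.AllocatedMemoryResources` (simplified stand-in type, not the real package):

```go
package main

import "fmt"

// AllocatedMemoryResources mirrors the shape of the field being
// tracked (simplified stand-in, not the real structs package).
type AllocatedMemoryResources struct {
	MemoryMB    int64 // reservation the scheduler bin-packs against
	MemoryMaxMB int64 // ceiling the client may let the task grow to
}

func main() {
	mem := AllocatedMemoryResources{MemoryMB: 256, MemoryMaxMB: 1024}
	// Placement only accounts for the 256 MB reservation; the client
	// may let the task use up to 1024 MB if it has memory to spare.
	fmt.Printf("reserve=%dMB max=%dMB\n", mem.MemoryMB, mem.MemoryMaxMB)
}
```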
* node drain: use msgtype on txn so that events are emitted
* wip: encoding extension to add Node.Drain field back to API responses
* new approach for hiding Node.SecretID in the API, using the `json` tag (see the sketch below); documented this approach in the contributing guide
* refactored the JSON handlers with extensions
* modified event stream encoding to use the go-msgpack encoders with the extensions
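The `json` tag approach is plain Go struct tagging. An illustrative sketch (field set reduced; not the full `Node` struct):

```go
package structs

// Node sketches the tagging approach (field set reduced for
// illustration). The `json:"-"` tag keeps SecretID out of encoded
// API responses once the JSON handlers and the event stream route
// through the go-msgpack encoders that honor these tags.
type Node struct {
	ID string

	// SecretID authenticates the client and must never leak through
	// the API, so it is excluded from JSON encoding.
	SecretID string `json:"-"`
}
```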
Add a `PerAlloc` field to volume requests that directs the scheduler to test
feasibility for volumes with a source ID that includes the allocation index
suffix (ex. `[0]`), rather than the exact source ID.
Read the `PerAlloc` field when making the volume claim at the client to
determine if the allocation index suffix (ex. `[0]`) should be added to the
volume source ID.
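The naming scheme is simple enough to sketch; `perAllocSource` below is a hypothetical helper, not the actual client code:

```go
package main

import "fmt"

// perAllocSource is a hypothetical helper showing the naming scheme:
// with PerAlloc set, the allocation index is appended in brackets,
// so alloc index 0 requesting source "csi-vol" claims "csi-vol[0]".
func perAllocSource(source string, perAlloc bool, allocIndex uint) string {
	if !perAlloc {
		return source
	}
	return fmt.Sprintf("%s[%d]", source, allocIndex)
}

func main() {
	fmt.Println(perAllocSource("csi-vol", true, 0))  // csi-vol[0]
	fmt.Println(perAllocSource("csi-vol", false, 0)) // csi-vol
}
```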
Callers of `CSIVolumeByID` generally assume they should receive a single
volume. This potentially results in feasibility checking being performed
against the wrong volume if a volume's ID is a prefix substring of another
volume's ID (for example: "test" and "testing").
Removing the incorrect prefix matching from `CSIVolumeByID` breaks prefix
matching in the command line client. Add the required elements for prefix
matching to the commands and API.
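A sketch of the corrected lookup, simplified from the state store (function name hypothetical; the real method hangs off the state store and its transaction wrapper):

```go
package state

import (
	memdb "github.com/hashicorp/go-memdb"

	"github.com/hashicorp/nomad/nomad/structs"
)

// csiVolumeByIDExact sketches the fix: memdb's "id_prefix" index can
// return a volume whose ID merely starts with the queried ID, so the
// result must be checked for an exact match before it is used for
// feasibility checking ("test" must not match "testing").
func csiVolumeByIDExact(txn *memdb.Txn, namespace, id string) (*structs.CSIVolume, error) {
	raw, err := txn.First("csi_volumes", "id_prefix", namespace, id)
	if err != nil {
		return nil, err
	}
	if raw == nil {
		return nil, nil
	}
	vol := raw.(*structs.CSIVolume)
	if vol.ID != id {
		return nil, nil // prefix-only match; not the requested volume
	}
	return vol, nil
}
```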
When multiple CSI volumes are requested, the feasibility check could return
early for read/write volumes with free claims, even if a later volume in the
request was not feasible for any other reason (including not existing at
all). This could cause feasibility checking to pass when it should have
failed, depending on how the map of volumes happened to be ordered at runtime.
Remove the early return from the feasibility check. Add a test to verify that
missing volumes in the map will cause a failure; because of the random map
ordering the test will not catch a regression on every run, but any regression
will be caught over the course of several CI runs.
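A sketch of the corrected loop shape (types and names hypothetical, not the scheduler's actual feasibility checker):

```go
package feasibility

// VolumeRequest is a stand-in for the scheduler's volume request type.
type VolumeRequest struct {
	Source   string
	ReadOnly bool
}

// volumesFeasible sketches the corrected loop shape (names are
// hypothetical). The bug was an early `return true` once a read/write
// volume with free claims was found, which skipped the remaining
// volumes entirely; every requested volume must pass its check.
func volumesFeasible(requested map[string]*VolumeRequest, check func(*VolumeRequest) bool) bool {
	for _, req := range requested {
		if req == nil || !check(req) {
			return false // one infeasible (or missing) volume fails the node
		}
		// no early success return; keep checking the rest of the map
	}
	return true
}
```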
This PR fixes a bug where tasks with Connect services could be
triggered to destructively update (i.e. placed in a new alloc)
when no update should be necessary.
Fixes #10077
* Persist shared allocated ports for inplace update
Ports were not copied over when performing inplace updates in the
generic scheduler
* changelog
* drop spew
AllocatedSharedResources were not being copied over to the new
allocation struct the scheduler makes during inplace updates. This
caused downstream issues after the plan was applied, namely that the
shared ports were dropped, causing issues with service
registration/deregistration (see the sketch below).
* test that shared ports are preserved
* changelog; also carry over shared network
* copy networks
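A sketch of the fix's essence (helper name hypothetical; the real change lives in the generic scheduler's inplace-update path):

```go
package scheduler

import "github.com/hashicorp/nomad/nomad/structs"

// carryOverShared sketches the fix (helper name hypothetical): when
// the generic scheduler builds the allocation for an inplace update,
// the group-level shared resources (ports, networks) must be copied
// from the existing allocation, or the plan applier downstream loses
// the allocated ports.
func carryOverShared(existing, updated *structs.Allocation) {
	if existing.AllocatedResources == nil || updated.AllocatedResources == nil {
		return
	}
	updated.AllocatedResources.Shared = existing.AllocatedResources.Shared
}
```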
This PR enables users of Nomad < 0.12 to upgrade to Nomad 0.12
and beyond. Nomad 0.12 introduced a network fingerprinter for
bridge networks, which is checked as a constraint when a bridge
network is used. If users upgrade servers first, as is
recommended, suddenly no clients running older versions of Nomad
will satisfy the bridge network resource constraint. Instead,
this change only enforces the constraint if the Nomad client
version is also >= 0.12.
Closes #8423
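A sketch of the version gate, assuming `hashicorp/go-version` for the comparison (helper name hypothetical):

```go
package scheduler

import version "github.com/hashicorp/go-version"

var minBridgeVersion = version.Must(version.NewVersion("0.12.0"))

// bridgeConstraintApplies sketches the gate (helper name hypothetical):
// the bridge-network fingerprint constraint is only enforced against
// clients new enough to have the fingerprinter at all.
func bridgeConstraintApplies(clientVersion string) bool {
	v, err := version.NewVersion(clientVersion)
	if err != nil {
		return true // unparseable version: keep the constraint
	}
	return v.GreaterThanOrEqual(minBridgeVersion)
}
```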
* use msgtype in upsert node
adds message type to signature for upsert node, update tests, remove placeholder method
* UpsertAllocs msg type test setup
* use upsertallocs with msg type in signature
update test usage of delete node
delete placeholder msgtype method
* add msgtype to upsert evals signature, update test call sites with test setup msg type
handle snapshot upsert eval outside of FSM and ignore eval event
remove placeholder upsertevalsmsgtype
handle job plan rpc and prevent event creation for plan
msgtype cleanup upsertnodeevents
updatenodedrain msgtype
msg type 0 is a node registration event, so set the default to the ignore type
* fix named import
* fix signature ordering on upsertnode to match
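The shape of the signature change described in this commit list, with stub types standing in for Nomad's real ones (illustrative only):

```go
package state

// Stub types standing in for Nomad's real ones (illustrative only).
type MessageType uint8
type Node struct{ ID string }
type Event struct {
	Type  MessageType
	Index uint64
}
type StateStore struct{ events chan Event }

// UpsertNode sketches the new signature: the raft message type leads
// the parameter list, so committing the write can emit an event of
// the matching type (or skip emission for ignored types).
func (s *StateStore) UpsertNode(msgType MessageType, index uint64, node *Node) error {
	// ... persist the node at this raft index ...
	s.events <- Event{Type: msgType, Index: index}
	return nil
}
```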
Fixes #9017
The ?resources=true query parameter includes resources in the object
stub listings. Specifically:
- For `/v1/nodes?resources=true` both the `NodeResources` and
`ReservedResources` field are included.
- For `/v1/allocations?resources=true` the `AllocatedResources` field is
included.
The ?task_states=false query parameter removes TaskStates from
/v1/allocations responses. (By default TaskStates are included.)
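From the Go `api` client, the parameter can be passed via `QueryOptions.Params`. A minimal sketch, assuming the `api` package's node list stubs expose the new resource fields:

```go
package main

import (
	"fmt"
	"log"

	"github.com/hashicorp/nomad/api"
)

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	// Ask the node list endpoint to include resources in the stubs.
	q := &api.QueryOptions{Params: map[string]string{"resources": "true"}}
	nodes, _, err := client.Nodes().List(q)
	if err != nil {
		log.Fatal(err)
	}
	for _, n := range nodes {
		fmt.Println(n.ID, n.NodeResources)
	}
}
```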
* Node Drain events and Node Events (#8980)
Deployment status updates
handle deployment status updates (paused, failed, resume)
deployment alloc health
generate events from apply plan result
txn err check, slim down deployment event
one ndjson line per index
* consolidate down to node event + type
* fix UpdateDeploymentAllocHealth test invocations
* fix test
Fixes a bug where CSI volumes with the `MULTI_NODE_MULTI_WRITER` access mode
were using the same logic as `MULTI_NODE_SINGLE_WRITER` to determine whether
the volume had writer claims available for scheduling.
Extends the CSI claim endpoint test to exercise multi-reader access, and makes
sure `WriteFreeClaims` is exercised for multi-writer in the feasibility test.
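A sketch of the corrected claim check (simplified standalone version; the real logic lives on the CSI volume struct with typed access-mode constants):

```go
package main

import "fmt"

// writeFreeClaims sketches the corrected check (simplified; the real
// logic lives on the CSI volume struct with typed access-mode
// constants): multi-node-multi-writer volumes accept any number of
// writers, so they always have a free write claim. The bug routed
// them through the single-writer branch.
func writeFreeClaims(accessMode string, writeAllocs int) bool {
	switch accessMode {
	case "single-node-writer", "multi-node-single-writer":
		return writeAllocs == 0
	case "multi-node-multi-writer":
		return true // writers are unlimited for this mode
	default:
		return false
	}
}

func main() {
	fmt.Println(writeFreeClaims("multi-node-multi-writer", 3))  // true
	fmt.Println(writeFreeClaims("multi-node-single-writer", 1)) // false
}
```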
This PR fixes a long-standing bug where submitting jobs with changes
to connect services would not trigger updates as expected. Previously,
service blocks were not considered as sources of destructive updates
since they could be synced with consul non-destructively. With Connect,
task group services that have changes to their connect block or to
the service port should be destructive, since the network plumbing of
the alloc is going to need updating.
Fixes #8596 #7991
Non-destructive half in #7192
The change was intended to fix a case where a canary alloc may fail to
be rescheduled if all the other allocs fail as well (e.g. if all allocs
happen to be placed on a node that died). However, it introduced some
unintended side-effects.
Reverting the change for now and will investigate further.
This handles the case where a job went from no-deployment to a deployment
with canaries.
Consider a case where a `max_parallel=0` job is submitted as version 0,
then an update is submitted with `max_parallel=1, canary=1` as version 1.
In this case, we will have 1 canary alloc, and all remaining allocs will
be version 0. Until the deployment is promoted, we ought to replace the
canaries with the version 0 job (which isn't associated with a deployment).
This change fixes a bug where lost/failed allocations are replaced by
allocations with the latest versions, even if the version hasn't been
promoted yet.
Now, when generating a plan for lost/failed allocations, the scheduler
first checks if the current deployment is in the canary stage, and if so, it
ensures that any lost/failed allocation is replaced with one of the latest
promoted version instead.
The reconcile loop sets `DeploymentState.DesiredCanaries` only on the first
pass through the loop, and only if the job is not paused/pending. In MRD,
deployments will make one pass through the loop while "pending", and were
never getting `DesiredCanaries` set. We can't set it in the initial
`DeploymentState` constructor because the first pass through setting up
canaries expects it to not be set yet. However, this value is static for a
given version of a job because it comes from the update stanza, so it's safe
to re-assign the value on subsequent passes.