open-nomad

Author	SHA1	Message	Date
Piotr Kazmierczak	b63944b5c1	cleanup: replace TypeToPtr helper methods with pointer.Of (#14151) Bumping compile time requirement to go 1.18 allows us to simplify our pointer helper methods.	2022-08-17 18:26:34 +02:00
Seth Hoenig	3aaaedf52e	cli: forward request for job validation to nomad leader This PR changes the behavior of 'nomad job validate' to forward the request to the nomad leader, rather than responding from any server. This is because we need the leader when validating Vault tokens, since the leader is the only server with an active vault client.	2022-08-10 14:34:04 -05:00
Luiz Aoqui	a9a66ad018	api: apply new ACL check for wildcard namespace (#13608) In #13606 the ACL check was refactored to better support the all namespaces wildcard (`*`). This commit applies the changes to the jobs and alloc list endpoints.	2022-07-06 16:17:16 -04:00
Luiz Aoqui	ab7eb5de6e	Support Vault entity aliases (#12449) Move some common Vault API data struct decoding out of the Vault client so it can be reused in other situations. Make Vault job validation its own function so it's easier to expand it. Rename the `Job.VaultPolicies` method to just `Job.Vault` since it returns the full Vault block, not just their policies. Set `ChangeMode` on `Vault.Canonicalize`. Add some missing tests. Allows specifying an entity alias that will be used by Nomad when deriving the task Vault token. An entity alias assigns an identity to a token, allowing better control and management of Vault clients since all tokens with the same identity alias will now be considered the same client. This helps track Nomad activity in Vault's audit logs and gives better control over Vault billing. Add support for a new Nomad server configuration to define a default entity alias to be used when deriving Vault tokens. This default value will be used if the task doesn't have an entity alias defined.	2022-04-05 14:18:10 -04:00
Seth Hoenig	9670adb6c6	cleanup: purge github.com/pkg/errors	2022-04-01 19:24:02 -05:00
James Rasell	a646333263	Merge branch 'main' into f-1.3-boogie-nights	2022-03-23 09:41:25 +01:00
Tim Gross	066a820209	job summary query in `Job.List` RPC should use job's namespace (#12249) The `Job.List` RPC attaches a `JobSummary` to each job stub. We're using the request namespace and not the job namespace for that query, which results in a nil `JobSummary` whenever we pass the wildcard namespace. This is incorrect and causes panics in downstream consumers like the CLI, which assume the `JobSummary` is non-nil as an unstated invariant.	2022-03-09 10:47:19 -05:00
Luiz Aoqui	ab8ce87bba	Add pagination, filtering and sort to more API endpoints (#12186)	2022-03-08 20:54:17 -05:00
James Rasell	ca6ba2e047	rpc: add job service registration list RPC endpoint.	2022-03-03 11:26:14 +01:00
Florian Apolloner	3bced8f558	namespaces: allow enabling/disabling allowed drivers per namespace	2022-02-24 09:27:32 -05:00
Luiz Aoqui	c4cff5359f	Verify TLS certificate on endpoints that are used between agents only (#11956)	2022-02-02 15:03:18 -05:00
Luiz Aoqui	b1753d0568	scheduler: detect and log unexpected scheduling collisions (#11793)	2022-01-14 20:09:14 -05:00
Tim Gross	03e697a69d	scheduler: config option to reject job registration (#11610) During incident response, operators may find that automated processes elsewhere in the organization are generating new workloads on Nomad clusters that are unable to handle the workload. This changeset adds a field to the `SchedulerConfiguration` API that causes all job registration calls to be rejected unless the request has a management ACL token.	2021-12-06 15:20:34 -05:00
Luiz Aoqui	9d6842dd4d	Don't emit scaling event error when a deployment is underway (#11556)	2021-11-23 10:20:18 -05:00
James Rasell	751c8217d1	core: allow setting and propagation of eval priority on job de/registration (#11532) This change modifies the Nomad job register and deregister RPCs to accept an updated option set which includes eval priority. This param is optional and overrides the use of the job priority to set the eval priority. In order to ensure all evaluations as a result of the request use the same eval priority, the priority is shared to the allocReconciler and deploymentWatcher. This creates a new distinction between eval priority and job priority. The Nomad agent HTTP API has been modified to allow setting the eval priority on job update and delete. To keep consistency with the current v1 API, job update accepts this as a payload param; job delete accepts this as a query param. Any user supplied value is validated within the agent HTTP handler, removing the need to pass invalid requests to the server. The register and deregister opts functions now allow for setting the eval priority on requests. The change includes a small change to the DeregisterOpts function which handles nil opts. This brings the function in line with the RegisterOpts.	2021-11-23 09:23:31 +01:00
James Rasell	99955eb80f	Merge pull request #11426 from hashicorp/b-set-dereg-eval-priority-correctly rpc: set the deregistration eval priority to the job priority.	2021-11-05 15:53:10 +01:00
James Rasell	ac9268a429	rpc: set the job scale eval priority to the job priority.	2021-11-02 12:57:53 +01:00
James Rasell	afb6913428	rpc: set the deregistration eval priority to the job priority. Previously when creating an eval for job deregistration, the eval priority was set to the default value regardless of the job priority. In situations where an operator would want to deregister a high priority job so they could re-register; the evaluation may get blocked for some time on a busy cluster because of the deregister priority. If a job had a lower than default priority and was deregistered, the deregister eval would get a priority higher than that of the job. If we attempted to register another job with a higher priority than this, but still below the default, the deregister would be actioned before the register. Both situations described above seem incorrect and unexpected from a user perspective. This fix modifies the behaviour to set the deregister eval priority to that of the job, if available. Otherwise the default value is still used.	2021-11-02 09:11:44 +01:00
Alex Munda	9e5061ef87	Update idempotency comment to reflect all jobs Co-authored-by: Mahmood Ali <mahmood@hashicorp.com>	2021-07-07 15:54:56 -05:00
Alex Munda	557a227de1	Match idempotency key on all child jobs and return existing job when idempotency keys match.	2021-07-02 14:08:46 -05:00
Alex Munda	34c63b086b	Move idempotency check closer to validate. Log error.	2021-07-02 10:58:42 -05:00
Alex Munda	baae6d5546	Update comment about idempotency check	2021-06-30 12:30:44 -05:00
Alex Munda	01bcd9c41c	Make idempotency error user friendly Co-authored-by: Tim Gross <tgross@hashicorp.com>	2021-06-30 12:26:33 -05:00
Alex Munda	ca86c7ba0c	Add idempotency token to dispatch request instead of special meta key	2021-06-29 15:59:23 -05:00
Alex Munda	122136b657	Always allow idempotency key meta. Tests for idempotent dispatch	2021-06-29 10:30:04 -05:00
Alex Munda	561cd9fc7f	Enforce idempotency of dispatched jobs using special meta key	2021-06-23 17:10:31 -05:00
Mahmood Ali	52d881f567	Allow configuring memory oversubscription (#10466) Cluster operators want to have better control over memory oversubscription and may want to enable/disable it based on their experience. This PR adds a scheduler configuration field to control memory oversubscription. It's an additional field that can be set in the [API via Scheduler Config](https://www.nomadproject.io/api-docs/operator/scheduler), or [the agent server config](https://www.nomadproject.io/docs/configuration/server#configuring-scheduler-config). I opted to have the memory oversubscription be an opt-in, but happy to change it. To enable it, operators should call the API with: ```json { "MemoryOversubscriptionEnabled": true } ``` If memory oversubscription is disabled, submitting jobs specifying `memory_max` will get a "Memory oversubscription is not enabled" warning, but the jobs will be accepted without them accessing the additional memory. The warning message is like: ``` $ nomad job run /tmp/j Job Warnings: 1 warning(s): * Memory oversubscription is not enabled; Task cache.redis memory_max value will be ignored ==> Monitoring evaluation "7c444157" Evaluation triggered by job "example" ==> Monitoring evaluation "7c444157" Evaluation within deployment: "9d826f13" Allocation "aa5c3cad" created: node "9272088e", group "cache" Evaluation status changed: "pending" -> "complete" ==> Evaluation "7c444157" finished with status "complete" # then you can examine the Alloc AllocatedResources to validate whether the task is allowed to exceed memory: $ nomad alloc status -json aa5c3cad | jq '.AllocatedResources.Tasks["redis"].Memory' { "MemoryMB": 256, "MemoryMaxMB": 0 } ```	2021-04-29 22:09:56 -04:00
Mahmood Ali	cf24a9eaaf	api: /v1/jobs always include namespaces (#10434) Add Namespace as a top-level field in `/v1/jobs` stub. The `/v1/jobs` endpoint already includes the namespace under `JobSummary`, though the API is odd, as typically the job ID and Namespace are in the same level, and the oddity complicates the UI frontend development. The downside of adding it is a redundant field that makes the response body a bit bigger, especially for clusters with large jobs. Though, it should compress nicely and I expect the overhead to be small relative to the overall response size. The benefit of a cleaner and more consistent API seems worth it. Fixes #10431	2021-04-23 16:36:54 -04:00
Seth Hoenig	509490e5d2	e2e: consul namespace tests from nomad ent (cherry-picked from ent without _ent things) This is part 2/4 of e2e tests for Consul Namespaces. Took a first pass at what the parameterized tests can look like, but only on the ENT side for this PR. Will continue to refactor in the next PRs. Also fixes 2 bugs: - Config Entries registered by Nomad Server on job registration were not getting Namespace set - Group level script checks were not getting Namespace set Those changes will need to be copied back to Nomad OSS. Nomad OSS + no ACLs (previously, needs refactor) Nomad ENT + no ACLs (this) Nomad OSS + ACLs (todo) Nomad ENT + ACLs (todo)	2021-04-19 15:35:31 -06:00
Charlie Voiselle	8afb9eb05d	Fix parameterized <-> non-parameterized job error (#10357) The error messages are reversed from tests performed above them. The test uses the `validateJobUpdate()` function, but ignores the text of the error message itself.	2021-04-12 09:27:04 -04:00
Seth Hoenig	f17ba33f61	consul: plumbing for specifying consul namespace in job/group This PR adds the common OSS changes for adding support for Consul Namespaces, which is going to be a Nomad Enterprise feature. There is no new functionality provided by this changeset and hopefully no new bugs.	2021-04-05 10:03:19 -06:00
Mahmood Ali	5d75705edd	dispatched parameterized job should clear status too	2021-03-25 15:14:21 -04:00
Seth Hoenig	1ad219c441	consul/connect: remove debug line	2021-01-25 10:36:04 -06:00
Seth Hoenig	8b05efcf88	consul/connect: Add support for Connect terminating gateways This PR implements Nomad built-in support for running Consul Connect terminating gateways. Such a gateway can be used by services running inside the service mesh to access "legacy" services running outside the service mesh while still making use of Consul's service identity based networking and ACL policies. https://www.consul.io/docs/connect/gateways/terminating-gateway These gateways are declared as part of a task group level service definition within the connect stanza. service { connect { gateway { proxy { // envoy proxy configuration } terminating { // terminating-gateway configuration entry } } } } Currently Envoy is the only supported gateway implementation in Consul. The gateway task can be customized by configuring the connect.sidecar_task block. When the gateway.terminating field is set, Nomad will write/update the Configuration Entry into Consul on job submission. Because CEs are global in scope and there may be more than one Nomad cluster communicating with Consul, there is an assumption that any terminating gateway defined in Nomad for a particular service will be the same among Nomad clusters. Gateways require Consul 1.8.0+, checked by a node constraint. Closes #9445	2021-01-25 10:36:04 -06:00
Kris Hicks	abb8f2ebc0	Refactor Job.Scale() (#9771)	2021-01-14 12:40:42 -08:00
Chris Baker	3546469205	nicer error message	2021-01-08 21:13:29 +00:00
Chris Baker	d43e0d10c0	appease the linter and fix an incorrect test	2021-01-08 19:38:25 +00:00
Chris Baker	49effd5840	in Job.Scale, ensure that new count is within [min,max] configured in scaling policy resolves #9758	2021-01-08 19:24:36 +00:00
Chris Baker	719077a26d	added new policy capabilities for recommendations API state store: call-out to generic update of job recommendations from job update method recommendations API work, and http endpoint errors for OSS support for scaling policies in task block of job spec add query filters for ScalingPolicy list endpoint command: nomad scaling policy list: added -job and -type	2020-10-28 14:32:16 +00:00
Drew Bailey	6c788fdccd	Events/msgtype cleanup (#9117) * use msgtype in upsert node adds message type to signature for upsert node, update tests, remove placeholder method * UpsertAllocs msg type test setup * use upsertallocs with msg type in signature update test usage of delete node delete placeholder msgtype method * add msgtype to upsert evals signature, update test call sites with test setup msg type handle snapshot upsert eval outside of FSM and ignore eval event remove placeholder upsertevalsmsgtype handle job plan rpc and prevent event creation for plan msgtype cleanup upsertnodeevents updatenodedrain msgtype msg type 0 is a node registration event, so set the default to the ignore type * fix named import * fix signature ordering on upsertnode to match	2020-10-19 09:30:15 -04:00
Michael Schurter	8ccbd92cb6	api: add field filters to /v1/{allocations,nodes} Fixes #9017 The ?resources=true query parameter includes resources in the object stub listings. Specifically: - For `/v1/nodes?resources=true` both the `NodeResources` and `ReservedResources` field are included. - For `/v1/allocations?resources=true` the `AllocatedResources` field is included. The ?task_states=false query parameter removes TaskStates from /v1/allocations responses. (By default TaskStates are included.)	2020-10-14 10:35:22 -07:00
Tim Gross	4335d847a4	Allow job Version to start at non-zero value (#9071) Stop coercing version of new job to 0 in the state_store, so that we can add regions to a multi-region deployment. Send new version, rather than existing version, to MRD to accommodate version-choosing logic changes in ENT. Co-authored-by: Chris Baker <1675087+cgbaker@users.noreply.github.com>	2020-10-12 13:59:48 -04:00
Chris Baker	d0cc0a768b	Update nomad/job_endpoint.go	2020-09-10 17:18:23 -05:00
Chris Baker	eff726609d	move variable out of oss-only build into shared file, fixes ent compile error introduced by #8834	2020-09-10 22:08:25 +00:00
Tim Gross	b77fe023b5	MRD: move 'job stop -global' handling into RPC (#8776) The initial implementation of global job stop for MRD looped over all the regions in the CLI for expedience. This changeset includes the OSS parts of moving this into the RPC layer so that API consumers don't have to implement this logic themselves.	2020-08-28 14:28:13 -04:00
Seth Hoenig	c4fd1c97aa	Merge pull request #8761 from hashicorp/b-consul-op-token-check consul/connect: make use of task kind to determine service name in consul token checks	2020-08-27 14:08:33 -05:00
Tim Gross	606df14e78	MRD: deregister regions that are dropped on update (#8763) This changeset is the OSS hooks for what will be implemented in ENT.	2020-08-27 14:54:45 -04:00
Seth Hoenig	84176c9a41	consul/connect: make use of task kind to determine service name in consul token checks When consul.allow_unauthenticated is set to false, the job_endpoint hook validates that a `-consul-token` is provided and validates the token against the privileges inherent to a Consul Service Identity policy for all the Connect enabled services defined in the job. Before, the check was assuming the service was of type sidecar-proxy. This fixes the check to use the type of the task so we can distinguish between the different connect types.	2020-08-27 12:14:40 -05:00
Seth Hoenig	9f1f2a5673	Merge branch 'master' into f-cc-ingress	2020-08-26 15:31:05 -05:00
Seth Hoenig	5d670c6d01	consul/connect: use context cancel more safely	2020-08-26 14:23:31 -05:00
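
The eval-priority behaviour described in commits 751c8217d1 and afb6913428 above (an optional eval priority overrides the job priority; otherwise the job's priority is used when a job is available; otherwise a default applies) can be illustrated with a minimal Go sketch. This is not the Nomad implementation: `Job`, `evalPriority`, and `defaultPriority` are hypothetical names introduced here, and the value 50 is an assumed default.

```go
package main

import "fmt"

// defaultPriority is a hypothetical stand-in for the default eval
// priority; the value 50 is an assumption made for this sketch.
const defaultPriority = 50

// Job is a hypothetical, trimmed-down job type holding only the
// fields this sketch needs.
type Job struct {
	ID       string
	Priority int
}

// evalPriority chooses the priority for an evaluation created by a
// register or deregister request:
//  1. a caller-supplied override (> 0) wins,
//  2. otherwise the job's own priority is used when the job is known,
//  3. otherwise the default is used.
func evalPriority(job *Job, override int) int {
	if override > 0 {
		return override
	}
	if job != nil {
		return job.Priority
	}
	return defaultPriority
}

func main() {
	// Low-priority job, no override: the eval inherits the job priority.
	fmt.Println(evalPriority(&Job{ID: "batch", Priority: 30}, 0)) // 30

	// Job not found (for example, already purged): fall back to the default.
	fmt.Println(evalPriority(nil, 0)) // 50

	// Operator-supplied eval priority overrides the job priority.
	fmt.Println(evalPriority(&Job{ID: "web", Priority: 80}, 95)) // 95
}
```

With this ordering, a deregistration of a priority-30 job no longer jumps ahead of a priority-40 registration, which is the situation the earlier commit message describes as unexpected.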


278 commits