open-nomad

Commit Graph

Author	SHA1	Message	Date
Seth Hoenig	acfdf0f479	compliance: add headers with fixed copywrite tool (#17353 ) Closes #17117	2023-05-30 09:20:32 -05:00
Piotr Kazmierczak	cea48b24ee	fix: job canonicalization should set job priority to 50, not 0. (#17314 ) Nomad API will reject jobs with priority set to 0.	2023-05-30 09:05:32 +02:00
Charlie Voiselle	fc313b7f8f	[api] Return a shapely error for unexpected response (#16743 ) * Add UnexpectedResultError to nomad/api This allows users to perform additional status-based behavior by rehydrating the error using `errors.As` inside of consumers.	2023-05-22 11:45:31 -04:00
dependabot[bot]	31a38d750b	build(deps): bump github.com/shoenig/test from 0.6.4 to 0.6.6 in /api (#17178 ) * build(deps): bump github.com/shoenig/test from 0.6.4 to 0.6.5 in /api * deps: update shoenig/test to v0.6.5 * deps: update again to v0.6.6 --------- Co-authored-by: Tim Gross <tgross@hashicorp.com> Co-authored-by: Seth Hoenig <shoenig@duck.com>	2023-05-22 07:53:12 -05:00
Phil Renaud	0b729e4bb9	Fixes to scheduling-filtering-in-ui (#17244 )	2023-05-18 17:38:34 -04:00
Phil Renaud	7e56ca62d1	[ui] Adds a "Scheduling" filter to the job.allocations page (#17227 ) * Basic filter concept * Make sure NextAllocation gets sent up with allocation stub	2023-05-18 16:24:41 -04:00
James Rasell	c60c5ace60	api: update Go mod go version to 1.20 to match main mod. (#17137 )	2023-05-10 16:29:06 +01:00
Tim Gross	17bd930ca9	logs: fix missing allocation logs after update to Nomad 1.5.4 (#17087 ) When the server restarts for the upgrade, it loads the `structs.Job` from the Raft snapshot/logs. The jobspec has long since been parsed, so none of the guards around the default value are in play. The empty field value for `Enabled` is the zero value, which is false. This doesn't impact any running allocation because we don't replace running allocations when either the client or server restart. But as soon as any allocation gets rescheduled (ex. you drain all your clients during upgrades), it'll be using the `structs.Job` that the server has, which has `Enabled = false`, and logs will not be collected. This changeset fixes the bug by adding a new field `Disabled` which defaults to false (so that the zero value works), and deprecates the old field. Fixes #17076	2023-05-04 16:01:18 -04:00
dependabot[bot]	1633cab363	build(deps): bump github.com/shoenig/test from 0.6.3 to 0.6.4 in /api (#16895 ) Bumps [github.com/shoenig/test](https://github.com/shoenig/test) from 0.6.3 to 0.6.4. - [Release notes](https://github.com/shoenig/test/releases) - [Commits](https://github.com/shoenig/test/compare/v0.6.3...v0.6.4) --- updated-dependencies: - dependency-name: github.com/shoenig/test dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-04-24 11:39:37 -05:00
Tim Gross	72cbe53f19	logs: allow disabling log collection in jobspec (#16962 ) Some Nomad users ship application logs out-of-band via syslog. For these users having `logmon` (and `docker_logger`) running is unnecessary overhead. Allow disabling the logmon and pointing the task's stdout/stderr to /dev/null. This changeset is the first of several incremental improvements to log collection short of full-on logging plugins. The next step will likely be to extend the internal-only task driver configuration so that cluster administrators can turn off log collection for the entire driver. --- Fixes: #11175 Co-authored-by: Thomas Weber <towe75@googlemail.com>	2023-04-24 10:00:27 -04:00
Seth Hoenig	ba728f8f97	api: enable support for setting original job source (#16763 ) * api: enable support for setting original source alongside job This PR adds support for setting job source material along with the registration of a job. This includes a new HTTP endpoint and a new RPC endpoint for making queries for the original source of a job. The HTTP endpoint is /v1/job/<id>/submission?version=<version> and the RPC method is Job.GetJobSubmission. The job source (if submitted, and doing so is always optional), is stored in the job_submission memdb table, separately from the actual job. This way we do not incur overhead of reading the large string field throughout normal job operations. The server config now includes job_max_source_size for configuring the maximum size the job source may be, before the server simply drops the source material. This should help prevent Bad Things from happening when huge jobs are submitted. If the value is set to 0, all job source material will be dropped. * api: avoid writing var content to disk for parsing * api: move submission validation into RPC layer * api: return an error if updating a job submission without namespace or job id * api: be exact about the job index we associate a submission with (modify) * api: reword api docs scheduling * api: prune all but the last 6 job submissions * api: protect against nil job submission in job validation * api: set max job source size in test server * api: fixups from pr	2023-04-11 08:45:08 -05:00
hashicorp-copywrite[bot]	005636afa0	[COMPLIANCE] Add Copyright and License Headers	2023-04-10 15:36:59 +00:00
Charlie Voiselle	9dfe4aa7c0	Set RequireRoot to be a test helper. (#16641 )	2023-04-06 14:34:36 -04:00
James Rasell	cb6ba80f0f	cli: stream both stdout and stderr when following an alloc. (#16556 ) This update changes the behaviour when following logs from an allocation, so that both stdout and stderr files are streamed when the operator supplies the follow flag. The previous behaviour is held when all other flags and situations are provided. Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>	2023-04-04 10:42:27 +01:00
Horacio Monsalvo	20372b1721	connect: add meta on ConsulSidecarService (#16705 ) Co-authored-by: Sol-Stiep <sol.stiep@southworks.com>	2023-03-30 16:09:28 -04:00
Piotr Kazmierczak	2b353902a1	acl: HTTP endpoints for JWT auth (#16519 )	2023-03-30 09:39:56 +02:00
Seth Hoenig	87f4b71df0	client/fingerprint: correctly fingerprint E/P cores of Apple Silicon chips (#16672 ) * client/fingerprint: correctly fingerprint E/P cores of Apple Silicon chips This PR adds detection of asymmetric core types (Power & Efficiency) (P/E) when running on M1/M2 Apple Silicon CPUs. This functionality is provided by shoenig/go-m1cpu which makes use of the Apple IOKit framework to read undocumented registers containing CPU performance data. Currently working on getting that functionality merged upstream into gopsutil, but gopsutil would still not support detecting P vs E cores like this PR does. Also refactors the CPUFingerprinter code to handle the mixed core types, now setting power vs efficiency cpu attributes. For now the scheduler is still unaware of mixed core types - on Apple platforms tasks cannot reserve cores anyway so it doesn't matter, but at least now the total CPU shares available will be correct. Future work should include adding support for detecting P/E cores on the latest and upcoming Intel chips, where computation of total cpu shares is currently incorrect. For that, we should also include updating the scheduler to be core-type aware, so that tasks of resources.cores on Linux platforms can be assigned the correct number of CPU shares for the core type(s) they have been assigned. node attributes before cpu.arch = arm64 cpu.modelname = Apple M2 Pro cpu.numcores = 12 cpu.reservablecores = 0 cpu.totalcompute = 1000 node attributes after cpu.arch = arm64 cpu.frequency.efficiency = 2424 cpu.frequency.power = 3504 cpu.modelname = Apple M2 Pro cpu.numcores.efficiency = 4 cpu.numcores.power = 8 cpu.reservablecores = 0 cpu.totalcompute = 37728 * fingerprint/cpu: follow up cr items	2023-03-28 08:27:58 -05:00
Luiz Aoqui	e5d31bca61	cli: job restart command (#16278 ) Implement the new `nomad job restart` command that allows operators to restart allocation tasks or reschedule the entire allocation. Restarts can be batched to target multiple allocations in parallel. Between each batch the command can stop and hold for a predefined time or until the user confirms that the process should proceed. This implements the "Stateless Restarts" alternative from the original RFC (https://gist.github.com/schmichael/e0b8b2ec1eb146301175fd87ddd46180). The original concept is still worth implementing, as it allows this functionality to be exposed over an API that can be consumed by the Nomad UI and other clients. But the implementation turned out to be more complex than we initially expected so we thought it would be better to release a stateless CLI-based implementation first to gather feedback and validate the restart behaviour. Co-authored-by: Shishir Mahajan <smahajan@roblox.com>	2023-03-23 18:28:26 -04:00
Michael Schurter	f8884d8b52	client/metadata: fix crasher caused by AllowStale = false (#16549 ) Fixes #16517 Given a 3 Server cluster with at least 1 Client connected to Follower 1: If a NodeMeta.{Apply,Read} for the Client request is received by Follower 1 with `AllowStale = false` the Follower will forward the request to the Leader. The Leader, not being connected to the target Client, will forward the RPC to Follower 1. Follower 1, seeing AllowStale=false, will forward the request to the Leader. The Leader, not being connected to... well hopefully you get the picture: an infinite loop occurs.	2023-03-20 16:32:32 -07:00
Juana De La Cuesta	47be374bbd	Add `-json` flag to `quota inspect` command (#16478 ) * Added and flag to command * cli[style]: small refactor to avoid confusion with tmpl variable * Update inspect.mdx * cli: add changelog entry * Update .changelog/16478.txt Co-authored-by: James Rasell <jrasell@users.noreply.github.com> * Update command/quota_inspect.go Co-authored-by: James Rasell <jrasell@users.noreply.github.com> --------- Co-authored-by: James Rasell <jrasell@users.noreply.github.com>	2023-03-20 10:40:51 +01:00
Lance Haig	ae256e28d8	Update ioutil library references to os and io respectively for API and Plugins package (#16330 ) No user facing changes so I assume no change log is required	2023-03-08 10:25:09 -06:00
Seth Hoenig	1c8b408a81	deps: update test to 0.6.2 for new functions (#16326 )	2023-03-06 09:24:45 -06:00
Luiz Aoqui	3f1ea9da4b	api: set last index and request time on alloc stop (#16319 ) Some of the methods in `Allocations()` incorrectly use the `putQuery` in API calls where `put` is more appropriate since they are not reading information back. These methods are also not returning request metadata such as `LastIndex` back to callers, which can be useful to have in some scenarios. They also provide poor developer experience as they take an `api.Allocation` struct when only the allocation ID is necessary. This can lead consumers to make unnecessary API calls to fetch the full allocation. Fixing these problems require updating the methods' signatures so they take `WriteOptions` instead of `QueryOptions` and return `WriteMeta`, but this is a breaking change that requires advanced notice to consumers. This commit adds a future breaking change notice and also fixes the `Stop` method so it properly returns request metadata in a backwards compatible way.	2023-03-03 15:52:41 -05:00
Dao Thanh Tung	62a69552c1	api: add new test case for force-leave (#16260 ) Signed-off-by: dttung2905 <ttdao.2015@accountancy.smu.edu.sg>	2023-03-03 10:38:40 -05:00
Tim Gross	0e1b554299	handle `FSM.Apply` errors in `raftApply` (#16287 ) The signature of the `raftApply` function requires that the caller unwrap the first returned value (the response from `FSM.Apply`) to see if it's an error. This puts the burden on the caller to remember to check two different places for errors, and we've done so inconsistently. Update `raftApply` to do the unwrapping for us and return any `FSM.Apply` error as the error value. Similar work was done in Consul in https://github.com/hashicorp/consul/pull/9991. This eliminates some boilerplate and surfaces a few minor bugs in the process: * job deregistrations of already-GC'd jobs were still emitting evals * reconcile job summaries does not return scheduler errors * node updates did not report errors associated with inconsistent service discovery or CSI plugin states Note that although _most_ of the `FSM.Apply` functions return only errors (which makes it tempting to remove the first return value entirely), there are few that return `bool` for some reason and Variables relies on the response value for proper CAS checking.	2023-03-02 13:51:09 -05:00
Seth Hoenig	c9ffd1274b	api: fix a panic and tweak some exported types (#16237 ) This PR - fixes a panic in GetItems when looking up a variable that does not exist. - deprecates GetItems in favor of GetVariableItems which avoids returning a pointer to a map - deprecates ErrVariableNotFound in favor of ErrVariablePathNotFound which is an actual error type - does some minor code cleanup to make linters happier	2023-02-22 08:17:22 -06:00
James Rasell	8295d0e516	acl: add validation to binding rule selector on upsert. (#16210 ) * acl: add validation to binding rule selector on upsert. * docs: add more information on binding rule selector escaping.	2023-02-17 15:38:55 +01:00
Alessio Perugini	4e9ec24b22	Allow configurable range of Job priorities (#16084 )	2023-02-17 09:23:13 -05:00
Pierre Cauchois	74cf372e20	api: fix missing Node Status "disconnected" in API (#16166 )	2023-02-14 09:43:23 -05:00
Michael Schurter	35d65c7c7e	Dynamic Node Metadata (#15844 ) Fixes #14617 Dynamic Node Metadata allows Nomad users, and their jobs, to update Node metadata through an API. Currently Node metadata is only reloaded when a Client agent is restarted. Includes new UI for editing metadata as well. --------- Co-authored-by: Phil Renaud <phil.renaud@hashicorp.com>	2023-02-07 14:42:25 -08:00
Charlie Voiselle	cc6f4719f1	Add option to expose workload token to task (#15755 ) Add `identity` jobspec block to expose workload identity tokens to tasks. --------- Co-authored-by: Anders <mail@anars.dk> Co-authored-by: Tim Gross <tgross@hashicorp.com> Co-authored-by: Michael Schurter <mschurter@hashicorp.com>	2023-02-02 10:59:14 -08:00
Tristan Pemble	5440965260	fix(#13844 ): canonicalize job to avoid nil pointer deference (#13845 )	2023-02-01 16:01:28 -05:00
James Rasell	9e8325d63c	acl: fix a bug in token creation when parsing expiration TTLs. (#15999 ) The ACL token decoding was not correctly handling time duration syntax such as "1h" which forced people to use the nanosecond representation via the HTTP API. The change adds an unmarshal function which allows this syntax to be used, along with other styles correctly.	2023-02-01 17:43:41 +01:00
Jorge Marey	d1c9aad762	Rename fields on proxyConfig (#15541 ) * Change api Fields for expose and paths * Add changelog entry * changelog: add deprecation notes about connect fields * api: minor style tweaks --------- Co-authored-by: Seth Hoenig <shoenig@duck.com>	2023-01-30 09:31:16 -06:00
Piotr Kazmierczak	14b53df3b6	renamed stanza to block for consistency with other projects (#15941 )	2023-01-30 15:48:43 +01:00
dependabot[bot]	52a86b9d32	build(deps): bump github.com/shoenig/test from 0.6.0 to 0.6.1 in /api (#15939 ) * build(deps): bump github.com/shoenig/test from 0.6.0 to 0.6.1 in /api Bumps [github.com/shoenig/test](https://github.com/shoenig/test) from 0.6.0 to 0.6.1. - [Release notes](https://github.com/shoenig/test/releases) - [Commits](https://github.com/shoenig/test/compare/v0.6.0...v0.6.1) --- updated-dependencies: - dependency-name: github.com/shoenig/test dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> * deps: update test --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Seth Hoenig <shoenig@duck.com>	2023-01-29 14:03:56 -06:00
James Rasell	5d33891910	sso: allow binding rules to create management ACL tokens. (#15860 ) * sso: allow binding rules to create management ACL tokens. * docs: update binding rule docs to detail management type addition.	2023-01-26 09:57:44 +01:00
James Rasell	fad9b40e53	Merge branch 'main' into sso/gh-13120-oidc-login	2023-01-18 10:05:31 +00:00
Benjamin Buzbee	13cc30ebeb	Return buffered text from log endpoint if decoding fails (#15558 ) To see why I think this is a good change let's look at why I am making it My disk was full, which means GC was happening aggressively. So by the time I called the logging endpoint from the SDK, the logs were GC'd The error I was getting before was: ``` invalid character 'i' in literal false (expecting 'l') ``` Now the error I get is: ``` failed to decode log endpoint response as JSON: "failed to list entries: open /tmp/nomad.data.4219353875/alloc/f11fee50-2b66-a7a2-d3ec-8442cb3d557a/alloc/logs: no such file or directory" ``` Still not super descriptive but much more debuggable	2023-01-16 10:39:56 +01:00
James Rasell	b3a6cfecc4	api: add OIDC HTTP API endpoints and SDK.	2023-01-13 13:15:58 +00:00
Seth Hoenig	232a9e743b	consul/connect: use block not optional for opaque map (#15765 )	2023-01-12 10:39:10 -06:00
Seth Hoenig	fe7795ce16	consul/connect: support for proxy upstreams opaque config (#15761 ) This PR adds support for configuring `proxy.upstreams[].config` for Consul Connect upstreams. This is an opaque config value to Nomad - the data is passed directly to Consul and is unknown to Nomad.	2023-01-12 08:20:54 -06:00
Seth Hoenig	84cb5fb03d	deps: update shoenig/test to v0.6.0 (#15715 ) Adds support for custom cmp.Options; need to fix one minor thing causing api breakage.	2023-01-09 09:37:08 -06:00
Seth Hoenig	c40dea83a4	test: wait for node in api tests that register job (#15677 )	2023-01-03 16:11:35 -06:00
Seth Hoenig	9dc82864a9	tests: fix assertion for slice length (#15672 ) This assertion got borked during the refactoring; should be at least one element, not exactly one element.	2023-01-03 15:40:38 -06:00
Seth Hoenig	cd75858f4a	api: purge testify and pretty dependencies (#15627 ) * api: swap testify for test (acl) * api: swap testify for test (agent) Please enter the commit message for your changes. Lines starting * api: swap testify for test (allocations) * api: swap testify for test (api) * api: swap testify for test (compose) * api: swap testify for test (constraint) * api: swap testify for test (consul) * api: swap testify for test (csi) * api: swap testify for test (evaluations) * api: swap testify for test (event stream) * api: swap testify for test (fs) * api: swap testify for test (ioutil) * api: swap testify for test (jobs) * api: swap testify for test (keyring) * api: swap testify for test (operator_ent) * api: swap testify for test (operator_metrics) * api: swap testify for test (operator) * api: swap testify for test (quota) * api: swap testify for test (resources) * api: swap testify for test (fix operator_metrics) * api: swap testify for test (scaling) * api: swap testify for test (search) * api: swap testify for test (sentinel) * api: swap testify for test (services) * api: swap testify for test (status) * api: swap testify for test (system) * api: swap testify for test (tasks) * api: swap testify for test (utils) * api: swap testify for test (variables) * api: remove dependencies on testify and pretty	2023-01-01 12:57:26 -06:00
Seth Hoenig	429293a4be	api: cleanup use of deprecated waiter functions (#15608 )	2022-12-22 08:21:00 -06:00
Seth Hoenig	26512b4e38	deps: update shoenig/test to 0.5.2 and fixup breaking changes (#15574 )	2022-12-20 07:52:10 -06:00
Seth Hoenig	336d730b9c	api: make api tests fast and more concurrency safe (#15543 ) This PR tries to make API tests run fast, as an experiment to later apply to all packages. Key changes include - Swapping freeport for test/portal for port allocations - Swapping some uses of WaitForResult with test/wait - Turning on parallelism in api/testutil/slow.go - Switching to custom public runner (32 vcpu) There's also a chunk of cleanup brought in for the ride	2022-12-16 12:25:28 -06:00
James Rasell	95c9ffa505	ACL: add ACL binding rule RPC and HTTP API handlers. (#15529 ) This change adds the RPC ACL binding rule handlers. These handlers are responsible for the creation, updating, reading, and deletion of binding rules. The write handlers are feature gated so that they can only be used when all federated servers are running the required version. The HTTP API handlers and API SDK have also been added where required. This allows the endpoints to be called from the API by users and clients.	2022-12-15 09:18:55 +01:00
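
The zero-value problem described in commit 17bd930ca9 (an `Enabled` field that reads as false when a stored job predates the field, fixed by adding a `Disabled` field whose zero value is safe) can be illustrated with a minimal Go sketch. The types `oldLogConfig` and `newLogConfig` and the helpers `collectsOld` and `collectsNew` are hypothetical stand-ins invented for this illustration; they are not Nomad's actual structs, and JSON decoding is used here only as a convenient substitute for restoring state from a Raft snapshot.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// oldLogConfig is a hypothetical stand-in for the original shape: logs are
// collected only when Enabled is true, so a missing field turns them off.
type oldLogConfig struct {
	Enabled bool
}

// newLogConfig is a hypothetical stand-in for the fixed shape: the zero value
// of Disabled (false) means "collect logs", so a missing field is harmless.
type newLogConfig struct {
	Disabled bool
}

// collectsOld decodes stored data into the old shape and reports whether
// log collection would be on.
func collectsOld(stored []byte) (bool, error) {
	var c oldLogConfig
	if err := json.Unmarshal(stored, &c); err != nil {
		return false, err
	}
	return c.Enabled, nil
}

// collectsNew decodes the same stored data into the new shape and reports
// whether log collection would be on.
func collectsNew(stored []byte) (bool, error) {
	var c newLogConfig
	if err := json.Unmarshal(stored, &c); err != nil {
		return false, err
	}
	return !c.Disabled, nil
}

func main() {
	// Data persisted before either field existed: nothing is set.
	stored := []byte(`{}`)

	old, err := collectsOld(stored)
	if err != nil {
		panic(err)
	}
	cur, err := collectsNew(stored)
	if err != nil {
		panic(err)
	}

	fmt.Println("old shape collects logs:", old)
	fmt.Println("new shape collects logs:", cur)
}
```

With the empty stored object, the old shape reports that logs are not collected and the new shape reports that they are, which is the design point of the fix: pick the field polarity so that the zero value matches the desired default.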


1166 Commits