open-nomad

Commit Graph

Author	SHA1	Message	Date
Luiz Aoqui	4ccd999304	ci: send notification when prepare is complete (#16627 )	2023-03-23 17:34:45 -04:00
Tim Gross	977c88dcea	drainer: test refactoring to clarify behavior around delete/down nodes (#16612 ) This changeset refactors the tests of the draining node watcher so that we don't mock the node watcher's `Remove` and `Update` methods for its own tests. Instead we'll mock the node watcher's dependencies (the job watcher and deadline notifier) and now unit tests can cover the real code. This allows us to remove a bunch of TODOs in `watch_nodes.go` around testing and clarify some important behaviors: * Nodes that are down or disconnected will still be watched until the scheduler decides what to do with their allocations. This will drive the job watcher but not the node watcher, and that lets the node watcher gracefully handle cases where a heartbeat fails but the node heartbeats again before its allocs can be evicted. * Stop watching nodes that have been deleted. The blocking query for nodes set the maximum index to the highest index of a node it found, rather than the index of the nodes table. This misses updates to the index from deleting nodes. This was done as a performance optimization to avoid excessive unblocking, but because the query is over all nodes anyways there's no optimization to be had here. Remove the optimization so we can detect deleted nodes without having to wait for an update to an unrelated node.	2023-03-23 14:07:09 -04:00
Michael Schurter	5e6799164f	Post 1.5.2 release (#16614 ) * Generate files for 1.5.2 release * Prepare for next release * add 1.4.7 and 1.3.12 to the changelog --------- Co-authored-by: hc-github-team-nomad-core <github-team-nomad-core@hashicorp.com>	2023-03-22 14:23:38 -07:00
Phil Renaud	11de45d17b	[ui] Copyable server and client attribute values (#16548 ) * Copyable server and client attribute values * Changelog	2023-03-22 15:05:01 -04:00
Juana De La Cuesta	5892839c83	Fix broken test for quotas CLI (#16610 ) * fix: fix broken test * fix: fix broken test for quota status	2023-03-22 19:07:37 +01:00
James Rasell	7dd1484757	docs: detail support for Nomad checks in service block. (#16598 )	2023-03-22 09:27:58 +01:00
Michael Schurter	d2aa8fcdc7	taskapi: use HasSuffix to detect errors from rpcs (#16594 ) Matches the "normal" HTTP error detection logic in the same file.	2023-03-21 14:38:07 -07:00
Michael Schurter	4678dc7b4d	e2e: sleep to ensure logs are picked up (#16596 ) :(	2023-03-21 14:10:50 -07:00
Tim Gross	ad774ccfa1	E2E: fix events tests (#16595 ) In #12916 we updated the events test as part of a larger set of changes around mapstructure serialization fixes. But the changes to the jobs we're deploying in the tests had invalid task configs so they never result in good deployments and the test will always fail. Make the before/after jobs identical (except for the version bump) and make them valid. Also wait for allocations for the 2nd job run to appear before checking the deployment list, so that we don't race with the scheduler.	2023-03-21 14:01:40 -07:00
Michael Schurter	15fe2ade18	Windows fixes for e2e tests (#16592 ) * e2e: skip task api test when windows too old * e2e: don't run proxy on windows	2023-03-21 13:55:32 -07:00
Suselz	b3d2ec7634	Update csi_plugin.mdx (#16584 ) Co-authored-by: James Rasell <jrasell@users.noreply.github.com>	2023-03-21 16:16:18 +01:00
Tim Gross	1763622dfd	contrib: architecture guide to the drainer (#16569 ) The drainer component is fairly complex. As part of upcoming work to fix some of the drainer's rough edges, document the drainer's architecture from a Nomad developer perspective.	2023-03-21 09:17:24 -04:00
Luiz Aoqui	518fd610b3	changelog: update #16427 to improvement (#16565 ) The security fix in Go 1.20.2 does not apply to Nomad.	2023-03-20 21:24:53 -04:00
Michael Schurter	f8884d8b52	client/metadata: fix crasher caused by AllowStale = false (#16549 ) Fixes #16517 Given a 3 Server cluster with at least 1 Client connected to Follower 1: If a NodeMeta.{Apply,Read} for the Client request is received by Follower 1 with `AllowStale = false` the Follower will forward the request to the Leader. The Leader, not being connected to the target Client, will forward the RPC to Follower 1. Follower 1, seeing AllowStale=false, will forward the request to the Leader. The Leader, not being connected to... well hopefully you get the picture: an infinite loop occurs.	2023-03-20 16:32:32 -07:00
Tim Gross	d1b35c6bd0	contrib: mock driver (#16573 )	2023-03-20 16:35:32 -04:00
James Rasell	2f4680680f	dev: remove use of cfssl and use Nomad CLI for TLS certs. (#16145 )	2023-03-20 17:06:15 +01:00
James Rasell	4825b40e9a	docs: remove Java and Scala SDKs from supported list. (#16555 )	2023-03-20 15:35:02 +01:00
Phil Renaud	ccce4b68f2	[ui] Perform common job tasks with keyboard shortcuts (#16378 ) * Throw your mouse into traffic * Add node metadata with a shortcut * Re-labelled * Adds a toast notification to job start/stop on keyboard shortcut * Typo fix	2023-03-20 09:24:39 -04:00
Juana De La Cuesta	47be374bbd	Add `-json` flag to `quota inspect` command (#16478 ) * Added and flag to command * cli[style]: small refactor to avoid confusion with tmpl variable * Update inspect.mdx * cli: add changelog entry * Update .changelog/16478.txt Co-authored-by: James Rasell <jrasell@users.noreply.github.com> * Update command/quota_inspect.go Co-authored-by: James Rasell <jrasell@users.noreply.github.com> --------- Co-authored-by: James Rasell <jrasell@users.noreply.github.com>	2023-03-20 10:40:51 +01:00
Juana De La Cuesta	ed44f50091	cli: add `-json` and `-t` flags to `quota status` command (#16485 ) * cli: add json and t flags to quota status command * cli: add entry to changelog * Update command/quota_status.go Co-authored-by: James Rasell <jrasell@users.noreply.github.com> --------- Co-authored-by: James Rasell <jrasell@users.noreply.github.com>	2023-03-20 10:39:56 +01:00
Juana De La Cuesta	eeb3766575	cli: Add `-json` and `-t` flags to `server members` command (#16444 ) * cli: Add and flags to server members * Update website/content/docs/commands/server/members.mdx Co-authored-by: James Rasell <jrasell@users.noreply.github.com> * Update website/content/docs/commands/server/members.mdx Co-authored-by: James Rasell <jrasell@users.noreply.github.com> * cli: update the server members tests to use must * cli: add flags addition to changelog --------- Co-authored-by: James Rasell <jrasell@users.noreply.github.com>	2023-03-20 10:39:24 +01:00
Adam Pugh	e4e53872be	Spelling update (#16553 ) updated propogating to propagating	2023-03-20 09:24:41 +01:00
Seth Hoenig	d6dcc53c0a	tls enforcement flaky tests (#16543 ) * tests: add WaitForLeaders helpers using must/wait timings * tests: start servers for mtls tests together Fixes #16253 (hopefully)	2023-03-17 14:11:13 -05:00
Piotr Kazmierczak	0a2b425eb5	cli: nomad login command should not require a -type flag and should respect default auth method (#16504 ) nomad login command does not need to know ACL Auth Method's type, since all method names are unique. Co-authored-by: James Rasell <jrasell@users.noreply.github.com>	2023-03-17 19:14:28 +01:00
Seth Hoenig	07543f8bdf	nsd: always set deregister flag after deregistration of group (#16289 ) * services: always set deregister flag after deregistration of group This PR fixes a bug where the group service hook's deregister flag was not set in some cases, causing the hook to attempt deregistrations twice during job updates (alloc replacement). In the tests ... we used to assert on the wrong behavior (remove twice) which has now been corrected to assert we remove only once. This bug was "silent" in the Consul provider world because the error logs for double deregistration only show up in Consul logs; with the Nomad provider the error logs are in the Nomad agent logs. * services: cleanup group service hook tests	2023-03-17 09:44:21 -05:00
Piotr Kazmierczak	14927e93bc	acl: fix canonicalization of OIDC auth method mock (#16534 )	2023-03-17 15:37:54 +01:00
James Rasell	4a5d7d3793	docs: add binding-rule selector escape example on Windows PS (#16273 )	2023-03-17 15:13:35 +01:00
Michael Schurter	a875bad6e5	Enable ACLs on E2E test clients (#16530 ) * e2e: uniformly enable acls across all agents * docs: clarify that acls should be set everywhere	2023-03-16 14:22:41 -07:00
Tim Gross	ec47b245d0	client: don't use `Status` RPC for Consul discovery (#16490 ) In #16217 we switched clients using Consul discovery to the `Status.Members` endpoint for getting the list of servers so that we're using the correct address. This endpoint has an authorization gate, so this fails if the anonymous policy doesn't have `node:read`. We also can't check the `AuthToken` for the request for the client secret, because the client hasn't yet registered so the server doesn't have anything to compare against. Instead of hitting the `Status.Peers` or `Status.Members` RPC endpoint, use the Consul response directly. Update the `registerNode` method to handle the list of servers we get back in the response; if we get a "no servers" or "no path to region" response we'll kick off discovery again and retry immediately rather than waiting 15s.	2023-03-16 15:38:33 -04:00
Seth Hoenig	5b1970468e	artifact: git needs more files for private repositories (#16508 ) * landlock: git needs more files for private repositories This PR fixes artifact downloading so that git may work when cloning from private repositories. It needs - file read on /etc/passwd - dir read on /root/.ssh - file write on /root/.ssh/known_hosts Add these rules to the landlock rules for the artifact sandbox. * cr: use nonexistent instead of devnull Co-authored-by: Michael Schurter <mschurter@hashicorp.com> * cr: use go-homedir for looking up home directory * pr: pull go-homedir into explicit require * cr: fixup homedir tests in homeless root cases * cl: fix root test for real --------- Co-authored-by: Michael Schurter <mschurter@hashicorp.com>	2023-03-16 12:22:25 -05:00
Michael Schurter	81b8c52472	docs: dispatch_payload and jobs api docs had some weirdness (#16514 ) * docs: dispatch_payload docs had some weirdness Docs said "Examples" when there was only 1 example. Not sure what the floating "to" in the description was for. * docs: missing a heading level on jobs api docs	2023-03-16 09:42:46 -07:00
Seth Hoenig	d2e8fb626a	artifact: do not set process attributes on darwin (#16511 ) This PR fixes the non-root macOS use case where artifact downloads stopped working. It seems setting a Credential on a SysProcAttr used by the exec package will always cause fork/exec to fail - even if the credential contains our own UID/GID or nil UID/GID. Technically we do not need to set this as the child process will inherit the parent UID/GID anyway... and not setting it makes things work again ... /shrug	2023-03-16 11:31:18 -05:00
Seth Hoenig	25944cbb7d	artifact: use specific version link for zipbomb artifact (#16513 ) Fix the e2e case where we download the go-getter bomb.zip test file, which is being removed on main. We can still get it from the version tag - yay git!	2023-03-16 10:18:46 -05:00
James Rasell	184733a126	build: fix `test-nomad` make target when running locally. (#16506 )	2023-03-16 09:32:14 +01:00
Daniel Bennett	0331dd71ca	test: set BuildDate in default TestAgent config (#16499 ) so enterprise tests don't fail due to the default zero time	2023-03-15 11:47:15 -05:00
James Rasell	b0a3964e6b	cli: fix login help output formatting. (#16502 )	2023-03-15 13:23:26 +01:00
Seth Hoenig	ed7177de76	scheduler: annotate tasksUpdated with reason and purge DeepEquals (#16421 ) * scheduler: annotate tasksUpdated with reason and purge DeepEquals * cr: move opaque into helper * cr: swap affinity/spread hashing for slice equal * contributing: update checklist-jobspec with notes about struct methods * cr: add more cases to wait config equal method * cr: use reflect when comparing envoy config blocks * cl: add cl	2023-03-14 09:46:00 -05:00
Anthony	6a7e22d546	Merge pull request #16484 from hashicorp/tunzor-patch-1 Update for enterprise trial wording and link	2023-03-14 10:19:29 -04:00
Anthony	9a3d2924e4	Updated trial license link and wording	2023-03-14 09:31:06 -04:00
Juana De La Cuesta	c235bafa3f	cli: Add `-json` and `-t` flags to `namespace status` command (#16442 ) * cli: Add and flag to namespace status command * Update command/namespace_status.go Co-authored-by: James Rasell <jrasell@users.noreply.github.com> * cli: update tests for namespace status command to use must --------- Co-authored-by: James Rasell <jrasell@users.noreply.github.com>	2023-03-14 14:23:04 +01:00
Tim Gross	16b731e456	docs: clarify migration behavior under `nomad alloc stop` (#16468 )	2023-03-14 09:00:29 -04:00
Tim Gross	8579d1e479	agent: trim space when parsing X-Nomad-Token header (#16469 ) Our auth token parsing code trims space around the `Authorization` header but not around `X-Nomad-Token`. When using the UI, it's easy to accidentally introduce a leading or trailing space, which results in spurious authentication errors. Trim the space at the HTTP server.	2023-03-14 08:57:53 -04:00
Seth Hoenig	a25d3ea792	cgv1: do not disable cpuset manager if reserved interface already exists (#16467 ) * cgv1: do not disable cpuset manager if reserved interface already exists This PR fixes a bug where restarting a Nomad Client on a machine using cgroups v1 (e.g. Ubuntu 20.04) would cause the cpuset cgroups manager to disable itself. This is being caused by incorrectly interpreting a "file exists" error as problematic when ensuring the reserved cpuset exists. If we get a "file exists" error, that just means the Client was likely restarted. Note that a machine reboot would fix the issue - the groups interfaces are ephemeral. * cl: add cl	2023-03-13 17:00:17 -05:00
Luiz Aoqui	adf147cb36	acl: update job eval requirement to `submit-job` (#16463 ) The job evaluate endpoint creates a new evaluation for the job which is a write operation. This change modifies the necessary capability from `read-job` to `submit-job` to better reflect this.	2023-03-13 17:13:54 -04:00
Luiz Aoqui	c29a87b875	plugin: add missing fields to `TaskConfig` (#16434 )	2023-03-13 15:58:16 -04:00
Dao Thanh Tung	ca9a43eced	doc: Update `nomad fmt` doc to run against non-deprecated HCL2 jobspec only (#16435 ) Signed-off-by: dttung2905 <ttdao.2015@accountancy.smu.edu.sg>	2023-03-13 15:26:27 -04:00
Michael Schurter	8da636c6d5	build: update from go1.20.1 to go1.20.2 (#16427 ) * build: update from go1.20.1 to go1.20.2 Note that the CVE fixed in go1.20.2 does not impact Nomad. https://github.com/golang/go/issues/58647	2023-03-13 09:47:07 -07:00
dependabot[bot]	5b9bbd12ea	build(deps): bump go.uber.org/goleak from 1.2.0 to 1.2.1 (#16439 ) Bumps [go.uber.org/goleak](https://github.com/uber-go/goleak) from 1.2.0 to 1.2.1. - [Release notes](https://github.com/uber-go/goleak/releases) - [Changelog](https://github.com/uber-go/goleak/blob/master/CHANGELOG.md) - [Commits](https://github.com/uber-go/goleak/compare/v1.2.0...v1.2.1) --- updated-dependencies: - dependency-name: go.uber.org/goleak dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-03-13 11:23:56 -05:00
Tim Gross	9dfb51579c	scheduler: refactor system util tests (#16416 ) The tests for the system allocs reconciling code path (`diffSystemAllocs`) include many impossible test environments, such as passing allocs for the wrong node into the function. This makes the test assertions nonsensible for use in walking yourself through the correct behavior. I've pulled this changeset out of PR #16097 so that we can merge these improvements and revisit the right approach to fix the problem in #16097 with less urgency now that the PFNR bug fix has been merged. This changeset breaks up a couple of tests, expands test coverage, and makes test assertions more clear. It also corrects one bit of production code that behaves fine in production because of canonicalization, but forces us to remember to set values in tests to compensate.	2023-03-13 11:59:31 -04:00
Seth Hoenig	630bd8eb68	scheduler: add simple benchmark for tasksUpdated (#16422 ) In preparation for some refactoring to tasksUpdated, add a benchmark to the old code so it's easy to compare with the changes, making sure nothing goes off the rails for performance.	2023-03-13 10:44:14 -05:00

24440 Commits, All Branches