open-nomad

Commit Graph

Author	SHA1	Message	Date
Charlie Voiselle	d57e333534	Update architecture-state-store.md (#15049 )	2022-10-27 14:03:43 -04:00
Tim Gross	8ac41c167f	Merge pull request #15062 from hashicorp/post-1.4.2-release Post 1.4.2 release	2022-10-27 13:38:36 -04:00
Tim Gross	2ce1728fa6	Merge release 1.4.2 files Changelog updates for 1.4.2 and backports.	2022-10-27 13:31:29 -04:00
hc-github-team-nomad-core	38b1c8a22a	Prepare for next release	2022-10-27 13:08:05 -04:00
hc-github-team-nomad-core	fbef8881cd	Generate files for 1.4.2 release	2022-10-27 13:08:05 -04:00
Tim Gross	9d906d4632	variables: fix filter on List RPC The List RPC correctly authorized against the prefix argument. But when filtering results underneath the prefix, it only checked authorization for standard ACL tokens and not Workload Identity. This results in WI tokens being able to read List results (metadata only: variable paths and timestamps) for variables under the `nomad/` prefix that belong to other jobs in the same namespace. Fixes the filtering and splits the `handleMixedAuthEndpoint` function into separate authentication and authorization steps so that we don't need to re-verify the claim token on each filtered object. Also includes: * update semgrep rule for mixed auth endpoints * variables: List returns empty set when all results are filtered	2022-10-27 13:08:05 -04:00
James Rasell	da5069bded	event stream: ensure token expiry is correctly checked for subs. This change ensures that a token's expiry is checked before every event is sent to the caller. Previously, a token could still be used to listen for events after it had expired, as long as the subscription was made while it was unexpired. This would last until the token was garbage collected from state. The check occurs within the RPC as there is currently no state update when a token expires.	2022-10-27 13:08:05 -04:00
dependabot[bot]	81ac5d93f1	build(deps): bump github.com/kr/pretty from 0.3.0 to 0.3.1 in /api (#14859 ) * build(deps): bump github.com/kr/pretty from 0.3.0 to 0.3.1 in /api Bumps [github.com/kr/pretty](https://github.com/kr/pretty) from 0.3.0 to 0.3.1. - [Release notes](https://github.com/kr/pretty/releases) - [Commits](https://github.com/kr/pretty/compare/v0.3.0...v0.3.1) --- updated-dependencies: - dependency-name: github.com/kr/pretty dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> * deps: update in root as well Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Seth Hoenig <shoenig@duck.com>	2022-10-27 11:58:00 -05:00
dependabot[bot]	7324d90ba7	build(deps): bump github.com/ryanuber/columnize (#14858 ) Bumps [github.com/ryanuber/columnize](https://github.com/ryanuber/columnize) from 2.1.1-0.20170703205827-abc90934186a+incompatible to 2.1.2+incompatible. - [Release notes](https://github.com/ryanuber/columnize/releases) - [Commits](https://github.com/ryanuber/columnize/commits/v2.1.2) --- updated-dependencies: - dependency-name: github.com/ryanuber/columnize dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-10-27 11:34:27 -05:00
dependabot[bot]	2f86f92d87	build(deps): bump github.com/shirou/gopsutil/v3 from 3.22.8 to 3.22.9 (#14857 ) Bumps [github.com/shirou/gopsutil/v3](https://github.com/shirou/gopsutil) from 3.22.8 to 3.22.9. - [Release notes](https://github.com/shirou/gopsutil/releases) - [Commits](https://github.com/shirou/gopsutil/compare/v3.22.8...v3.22.9) --- updated-dependencies: - dependency-name: github.com/shirou/gopsutil/v3 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-10-27 11:33:50 -05:00
dependabot[bot]	07796965b1	build(deps): bump google.golang.org/grpc from 1.48.0 to 1.50.1 (#14897 ) * build(deps): bump google.golang.org/grpc from 1.48.0 to 1.50.1 Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go) from 1.48.0 to 1.50.1. - [Release notes](https://github.com/grpc/grpc-go/releases) - [Commits](https://github.com/grpc/grpc-go/compare/v1.48.0...v1.50.1) --- updated-dependencies: - dependency-name: google.golang.org/grpc dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> * cl: add changelog entry for grpc Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Seth Hoenig <shoenig@duck.com>	2022-10-27 11:32:48 -05:00
dependabot[bot]	eb210f2af7	build(deps): bump github.com/fsouza/go-dockerclient from 1.8.2 to 1.9.0 (#14898 ) * build(deps): bump github.com/fsouza/go-dockerclient from 1.8.2 to 1.9.0 Bumps [github.com/fsouza/go-dockerclient](https://github.com/fsouza/go-dockerclient) from 1.8.2 to 1.9.0. - [Release notes](https://github.com/fsouza/go-dockerclient/releases) - [Changelog](https://github.com/fsouza/go-dockerclient/blob/main/container_changes_test.go) - [Commits](https://github.com/fsouza/go-dockerclient/compare/v1.8.2...v1.9.0) --- updated-dependencies: - dependency-name: github.com/fsouza/go-dockerclient dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> * cl: add changelog entry Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Seth Hoenig <shoenig@duck.com>	2022-10-27 11:05:45 -05:00
Seth Hoenig	4f3a1e6f7d	ci: use groups of tests in gha (#15018 ) * [no ci] use json for grouping packages for testing * [no ci] able to get packages in group * [no ci] able to run groups of tests * [no ci] more * [no ci] try disable circle unit tests * ci: use actions/checkout@v3 * ci: rename to quick * ci: need make dev in mods cache step * ci: make compile step depend on checks step * ci: bump consul and vault versions * ci: need make dev for group tests * ci: update ci unit testing docs * docs: spell plumbing correctly Co-authored-by: Tim Gross <tgross@hashicorp.com> Co-authored-by: Tim Gross <tgross@hashicorp.com>	2022-10-27 09:02:58 -05:00
Charlie Voiselle	28cd831085	Update consul-template dep (#15045 )	2022-10-26 11:51:45 -04:00
Tim Gross	f29c781fa7	docs: improved documentation on hardening and required capabilities (#15036 ) The existing docs on required capabilities are a little sparse and have been the subject of a lot of questions. Expand on this information and provide a pointer to the ongoing design discussion around rootless Nomad.	2022-10-26 09:46:13 -04:00
Tim Gross	aca95c0bc6	keyring: remove root key GC (#15034 )	2022-10-25 17:06:18 -04:00
Seth Hoenig	d69556fb35	client: ensure minimal cgroup controllers enabled (#15027 ) * client: ensure minimal cgroup controllers enabled This PR fixes a bug where Nomad could not operate properly on operating systems that set the root cgroup.subtree_control to a set of controllers that do not include the minimal set of controllers needed by Nomad. Nomad needs these controllers enabled to operate: - cpuset - cpu - io - memory - pids Now, Nomad will ensure these controllers are enabled during Client initialization, adding them to cgroup.subtree_control as necessary. This should be particularly helpful on the RHEL/CentOS/Fedora family of systems. Ubuntu systems should be unaffected as they enable all controllers by default. Fixes: https://github.com/hashicorp/nomad/issues/14494 * docs: cleanup doc string * client: cleanup controller writes, enhance log messages	2022-10-24 16:08:54 -05:00
Tim Gross	c45d9a9ea8	keyring: refactor to hold locks for less time (#15026 ) Follow-up from https://github.com/hashicorp/nomad/pull/14987/files#r1003611644 We don't need to hold the lock when querying the state store, so move the read-lock to the interior of the `activeKeySet` function.	2022-10-24 16:23:44 -04:00
Zach Shilton	4dd0bd916b	docs: add details to redirects file (#15020 )	2022-10-24 13:16:07 -04:00
Seth Hoenig	32744a3548	deps: update hashicorp/raft to v1.3.11 (#15021 ) * deps: update hashicorp/raft to v1.3.11 Includes part of the fix for https://github.com/hashicorp/raft/issues/524 * cl: add changelog entry	2022-10-24 12:10:24 -05:00
Seth Hoenig	dd2999d6af	ci: add -core suffix to mods action (#15015 ) Forgot to add this line to the new mods action; without it, it creates a cache different from the one used by the other jobs.	2022-10-24 08:49:01 -05:00
Jai	f4138a88e0	refact: preserve promise.then behavior for acceptance tests (#15003 )	2022-10-24 09:04:39 -04:00
Tim Gross	b9922631bd	keyring: fix missing GC config, don't rotate on manual GC (#15009 ) The configuration knobs for root keyring garbage collection are present in the consumer and present in the user-facing config, but we missed the spot where we copy from one to the other. Fix this so that users can set their own thresholds. The root key is automatically rotated every ~30d, but the function that does both rotation and key GC was wired up such that `nomad system gc` caused an unexpected key rotation. Split this into two functions so that `nomad system gc` cleans up old keys without forcing a rotation, which will be done periodically or by the `nomad operator root keyring rotate` command.	2022-10-24 08:43:42 -04:00
Seth Hoenig	91d29e6449	ci: use the same go mod cache across test-core jobs (#15006 ) * ci: use the same go mod cache for test-core jobs * ci: precache go modules * ci: add a mods precache job	2022-10-21 17:38:45 -05:00
Tim Gross	3a811ac5e7	keyring: fixes for keyring replication on cluster join (#14987 ) * keyring: don't unblock early if rate limit burst exceeded The rate limiter returns an error and unblocks early if its burst limit is exceeded (unless the burst limit is Inf). Ensure we're not unblocking early, otherwise we'll only slow down the cases where we're already pausing to make external RPC requests. * keyring: set MinQueryIndex on stale queries When keyring replication makes a stale query to non-leader peers to find a key the leader doesn't have, we need to make sure the peer we're querying has had a chance to catch up to the most current index for that key. Otherwise it's possible for newly-added servers to query another newly-added server and get a non-error nil response for that key ID. Ensure that we're setting the correct reply index in the blocking query. Note that the "not found" case does not return an error, just an empty key. So as a belt-and-suspenders, update the handling of empty responses so that we don't break the loop early if we hit a server that doesn't have the key. * test for adding new servers to keyring * leader: initialize keyring after we have consistent reads Wait until we're sure the FSM is current before we try to initialize the keyring. Also, if a key is rotated immediately following a leader election, plans that are in-flight may get signed before the new leader has the key. Allow for a short timeout-and-retry to avoid rejecting plans	2022-10-21 12:33:16 -04:00
Michael Schurter	9cac60dbed	test: use port collision instead of cpu exhaustion (#14994 ) Originally this test relied on Job 1 blocking Job 2 until Job 1 had a terminal ClientStatus. Job 2 ensured it would get blocked using 2 mechanisms: 1. A constraint requiring it is placed on the same node as Job 1. 2. Job 2 would require all unreserved CPU on the node to ensure it would be blocked until Job 1's resources were free. That 2nd assertion breaks if any previous job is still running on the target node! That seems very likely to happen in the flaky world of our e2e tests. In fact there may be some jobs we intentionally want running throughout; in hindsight it was never safe to assume my test would be the only thing scheduled when it ran. Ports to the rescue! Reserving a static port means that Job 2 will now block on Job 1 being terminal. It will only conflict with other tests if those tests use that port on every node. I ensured no existing tests were using the port I chose. Other changes: - Gave job a bit more breathing room resource-wise. - Tightened timings a bit since previous failure ran into the `go test` time limit. - Cleaned up the DumpEvals output. It's quite nice and handy now!	2022-10-21 07:53:26 -07:00
Luiz Aoqui	8b8d85bce7	docs: use of `node_class` when autoscaling (#14950 ) Document how the value of `node_class` is used during cluster scaling. https://github.com/hashicorp/nomad-autoscaler/issues/255	2022-10-21 10:35:45 -04:00
Seth Hoenig	1f1b662e73	ci: use gotestsum for CI tests (#14995 ) Use gotestsum in both GHA and Circle with retries enabled.	2022-10-21 08:45:24 -05:00
James Rasell	206fb04dc1	acl: allow tokens to read policies linked via roles to the token. (#14982 ) ACL tokens are granted permissions either by direct policy links or via ACL role links. Callers should therefore be able to read policies directly assigned to the caller token or indirectly by ACL role links.	2022-10-21 09:05:17 +02:00
Luiz Aoqui	593e48e826	cli: prevent panic on `operator debug` (#14992 ) If the API returns an error during debug bundle collection the CLI was expanding the wrong error object, resulting in a panic since `err` is `nil`.	2022-10-20 15:53:58 -04:00
Jai	08fde3a4ff	refact: upgrade Promise.then to async/await (#14798 ) * refact: upgrade Promise.then to async/await * naive solution (#14800) * refact: use id instead of model * chore: add changelog entry * refact: add conditional safety around alloc	2022-10-20 14:25:41 -04:00
Luiz Aoqui	0fddb4d7e8	Post 1.4.1 release (#14988 ) * Generate files for 1.4.1 release * Prepare for next release Co-authored-by: hc-github-team-nomad-core <github-team-nomad-core@hashicorp.com>	2022-10-20 13:09:41 -04:00
Seth Hoenig	6e9c8a9955	deps: update go-memdb for goroutine leak fix (#14983 ) * deps: update go-memdb for goroutine leak fix * cl: update for goroutine leak go-memdb	2022-10-20 10:34:52 -05:00
Seth Hoenig	756b71b7d2	deps: bump shoenig for str func bugfixes (#14974 ) And fix the one place we use them.	2022-10-20 08:11:43 -05:00
James Rasell	215b4e7e36	acl: add ACL roles to event stream topic and resolve policies. (#14923 ) This change adds ACL role creation and deletion to the event stream. It is exposed as a single topic with two types; the filter is primarily the role ID but also includes the role name. While conducting this work it was also discovered that the events stream has its own ACL resolution logic. This did not account for ACL tokens which included role links, or tokens with expiry times. ACL role links are now resolved to their policies and tokens are checked for expiry correctly.	2022-10-20 09:43:35 +02:00
James Rasell	d7b311ce55	acl: correctly resolve ACL roles within client cache. (#14922 ) The client ACL cache was not accounting for tokens which included ACL role links. This change modifies the behaviour to resolve role links to policies. It will also now store ACL roles within the cache for quick lookup. The cache TTL is configurable in the same manner as policies or tokens. Another small fix is included that takes into account the ACL token expiry time. This was not included, which meant tokens with expiry could be used past the expiry time, until they were GC'd.	2022-10-20 09:37:32 +02:00
Luiz Aoqui	75830a7161	docs: expand Autoscaling documentation (#14937 ) Rename `Internals` section to `Concepts` to match core docs structure and expand on how policies are evaluated. Also include missing documentation for check grouping and fix examples to use the new feature.	2022-10-19 17:57:08 -04:00
Phil Renaud	54eeb6ebe8	Adds searching and filtering for nodes on topology view (#14913 ) * Adds searching and filtering for nodes on topology view * Lintfix and changelog * Acceptance tests for topology search and filter * Search terms also apply to class and dc on topo page * Initialize queryparam values so as to not break history state	2022-10-19 15:00:35 -04:00
Luiz Aoqui	bb00f3d713	docs: add autoscaling debug (#14941 )	2022-10-19 14:17:41 -04:00
Luiz Aoqui	9f51e7ee40	docs: move autoscaling `source` agent config (#14947 ) Move the Autoscaler agent configuration `source` to the `policy` page since they are very closely related. Also update all headers in this section so they follow the proper `h1 > h2 > h3 > ...` hierarchy.	2022-10-19 14:17:09 -04:00
Luiz Aoqui	150b69daaf	docs: explain autoscaler target-value strategy (#14951 ) Provide more technical details about how the `target-value` strategy calculates new scaling actions.	2022-10-19 14:16:17 -04:00
Zach Shilton	fedeb84500	website: fix broken links (#14946 ) * fix: nomad license put link * fix: redirected URL * fix: avoid auto-formatting changes	2022-10-19 14:07:48 -04:00
Seth Hoenig	57375566d4	consul: register checks along with service on initial registration (#14944 ) * consul: register checks along with service on initial registration This PR updates Nomad's Consul service client to include checks in an initial service registration, so that the checks associated with the service are registered "atomically" with the service. Before, we would only register the checks after the service registration, which causes problems where the service is deemed healthy, even if one or more checks are unhealthy - especially problematic in the case where SuccessBeforePassing is configured. Fixes #3935 * cr: followup to fix cause of extra consul logging * cr: fix another bug * cr: fixup changelog	2022-10-19 12:40:56 -05:00
Michael Schurter	611abdf2cc	build: add ability to specify release targets (#14957 ) My make knowledge is very very limited, so if there's a better way to do this please let me know! This seems to work and lets me cut one-off builds easily.	2022-10-19 10:27:47 -07:00
James Rasell	d95f27501b	deps: update consul-template to `61e288a` (#14955 )	2022-10-19 16:27:14 +02:00
Anthony	eb3515c8f5	Updated datacenter block description (#14953 ) * Updated datacenter block description * Replacing accidentally removed title * docs: add closing period Co-authored-by: Seth Hoenig <shoenig@duck.com>	2022-10-19 08:44:52 -05:00
Seth Hoenig	e66c9ede24	e2e: convert flaky exec download in chroot unit test into e2e test (#14949 ) Similar to https://github.com/hashicorp/nomad/pull/14710, convert flaky test into e2e test.	2022-10-19 08:22:32 -05:00
James Rasell	2db8c67a6d	api: add convenience string func to Topic type. (#14843 )	2022-10-19 14:12:23 +02:00
HashiBot	976e4870ec	chore: Update Digital Team Files (#14945 ) * Update generated scripts (website-start.sh) * Update generated scripts (should-build.sh) * Update generated scripts (website-build.sh) * Update generated website Makefile	2022-10-18 17:43:31 -04:00
Michael Schurter	01d90d18f6	test: expand timing and debugging for overlap test (#14920 ) attempt #9000	2022-10-18 13:02:18 -07:00
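
Commit d69556fb35 above describes checking cgroup.subtree_control for the controllers Nomad needs (cpuset, cpu, io, memory, pids) and adding any that are missing. The Go sketch below illustrates only that comparison step. It is not Nomad's code: the function names `missingControllers` and `enableLine` are hypothetical, and the input string stands in for the contents of a real cgroup.subtree_control file rather than being read from disk.

```go
package main

import (
	"fmt"
	"strings"
)

// required lists the controllers named in the commit message.
var required = []string{"cpuset", "cpu", "io", "memory", "pids"}

// missingControllers (hypothetical helper) returns the required controllers
// that are absent from the space-separated contents of cgroup.subtree_control.
func missingControllers(subtreeControl string) []string {
	enabled := make(map[string]bool)
	for _, name := range strings.Fields(subtreeControl) {
		enabled[name] = true
	}
	var missing []string
	for _, name := range required {
		if !enabled[name] {
			missing = append(missing, name)
		}
	}
	return missing
}

// enableLine (hypothetical helper) formats the "+name" tokens that cgroup v2
// accepts when written to cgroup.subtree_control to enable controllers.
func enableLine(controllers []string) string {
	parts := make([]string, len(controllers))
	for i, c := range controllers {
		parts[i] = "+" + c
	}
	return strings.Join(parts, " ")
}

func main() {
	// Example contents for a host that enables only three controllers.
	current := "cpu memory pids"
	fmt.Println(enableLine(missingControllers(current))) // prints "+cpuset +io"
}
```

A real client would read the file, write the resulting line back, and handle write errors; none of that is shown here.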

23913 Commits (All Branches)
