Commit graph

24939 commits

Author SHA1 Message Date
Tim Gross 0a19fe3b60 fix multiple overflow errors in exponential backoff (#18200)
We use capped exponential backoff in several places in the code when handling
failures. The code we've copy-and-pasted all over has a check to see if the
backoff is greater than the limit, but this check happens after the bitshift and
we always increment the number of attempts. This causes an overflow with a
fairly small number of failures (ex. at one place I tested it occurs after only
24 iterations), resulting in a negative backoff which then never recovers. The
backoff becomes a tight loop consuming resources and/or DoS'ing a Nomad RPC
handler or an external API such as Vault. Note this doesn't occur in places
where we cap the number of iterations so the loop breaks (usually to return an
error), so long as the number of iterations is reasonable.

Introduce a helper with a check on the cap before the bitshift to avoid overflow in all 
places this can occur.
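
A minimal sketch of the idea described above, not the helper from this changeset: the name `cappedBackoff` and its signature are assumptions made for illustration. The cap is checked before the shift, so the shift itself can never overflow or go negative.

```go
package main

import (
	"fmt"
	"time"
)

// cappedBackoff is a hypothetical helper (name and signature are assumptions,
// not Nomad's API). It returns base * 2^attempt capped at limit, and it checks
// the cap before shifting so the shift cannot overflow.
func cappedBackoff(base, limit time.Duration, attempt uint64) time.Duration {
	if base <= 0 || limit <= 0 {
		return 0
	}
	// limit>>attempt < base means base<<attempt would exceed limit
	// (or overflow int64), so return the cap without shifting.
	if attempt >= 63 || limit>>attempt < base {
		return limit
	}
	return base << attempt
}

func main() {
	// The attempt counter can keep growing; the result stays within the cap.
	for _, attempt := range []uint64{0, 3, 24, 1000} {
		fmt.Println(attempt, cappedBackoff(time.Second, 5*time.Minute, attempt))
	}
}
```

Comparing `limit>>attempt` against `base` is one way to test the cap without computing the possibly overflowing product first.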

Fixes: #18199
Co-authored-by: stswidwinski <stan.swidwinski@gmail.com>
2023-08-15 14:39:09 -04:00
Piotr Kazmierczak 04484a10a0
Merge pull request #18205 from hashicorp/manual-backport-of-d9341f066480cc924c8cf00f981e604e02c62140-to-1.6
admin: manual backport of #18184 to 1.6
2023-08-15 15:09:48 +02:00
Seth Hoenig a45b689d8e update go1.21 (#18184)
* build: update to go1.21

* go: eliminate helpers in favor of min/max

* build: run go mod tidy

* build: swap depguard for semgrep

* command: fixup broken tls error check on go1.21
2023-08-15 14:40:33 +02:00
Tim Gross a3a86a849a test: deflake node drain integration test (#18171)
The `TestDrainer_AllTypes_NoDeadline` test has been flaky. It looks like this
might be because the final update of batch allocations to complete is improperly
updating the state store directly rather than by RPC. If the service jobs have
restarted in the meantime, the `allocClientStateSimulator` will have updated the
index on the allocations table and that will prevent the drainer from
unblocking (and being marked complete) when the batch jobs are written with an
earlier index.

This changeset attempts to fix that by making the update via RPC (as it normally
would be in real code).
2023-08-14 16:19:00 -04:00
Tim Gross 577d96034d test: deflake job endpoint registration test (#18170)
We've seen test flakiness in the `TestJobEndpoint_Register_NonOverlapping` test,
which asserts that we don't try to place allocations for blocked evals until
resources have been actually freed by setting the client status of the previous
alloc to complete.

The flaky assertion includes sorting the two allocations by CreateIndex and this
appears to be a non-stable sort in the context of the test run, which results in
failures that shouldn't exist. There's no reason to sort the allocations instead
of just examining them by ID, so this changeset looks them up by ID.
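
A rough sketch of the "examine by ID" approach, using a simplified stand-in type rather than Nomad's real allocation struct. The `alloc` type and the `byID` helper are hypothetical. The sketch assumes the ordering problem comes from equal sort keys, which Go's `sort.Slice` does not promise to keep in their original order.

```go
package main

import "fmt"

// alloc is a simplified, hypothetical stand-in for an allocation.
type alloc struct {
	ID           string
	CreateIndex  uint64
	ClientStatus string
}

// byID indexes allocations by ID so that assertions do not depend on
// the order in which the allocations were returned or sorted.
func byID(allocs []alloc) map[string]alloc {
	out := make(map[string]alloc, len(allocs))
	for _, a := range allocs {
		out[a.ID] = a
	}
	return out
}

func main() {
	// Both allocations share a CreateIndex, so sorting on that field
	// alone would not give a dependable order.
	allocs := []alloc{
		{ID: "b", CreateIndex: 10, ClientStatus: "pending"},
		{ID: "a", CreateIndex: 10, ClientStatus: "complete"},
	}
	m := byID(allocs)
	fmt.Println(m["a"].ClientStatus, m["b"].ClientStatus)
}
```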
2023-08-14 16:18:53 -04:00
Shantanu Gadgil 04a3628cc4 docs: ampersand and bash backgrounding problem (#18175)
the `&` symbol messes up the command when copy pasting into a shell
2023-08-14 15:13:19 -04:00
Esteban Barrios 9f19d7c373 config: add configurable content security policy (#18085) 2023-08-14 14:25:21 -04:00
hc-github-team-nomad-core c25c04816d
Backport of e2e: modernize vaultcompat testing into release/1.6.x (#18182)
This pull request was automerged via backport-assistant
2023-08-09 09:25:32 -05:00
hc-github-team-nomad-core 8bf5067f86
backport of commit 7bfd268c8ca9caa1c07a2ad4cf197f8b3f97dca1 (#18177)
This pull request was automerged via backport-assistant
2023-08-08 07:34:11 -05:00
hc-github-team-nomad-core f812bccb4e
Backport of Tuning job versions retention. #17635 into release/1.6.x (#18169)
This pull request was automerged via backport-assistant
2023-08-07 13:48:09 -05:00
hc-github-team-nomad-core ebcdd4d82d
backport of commit 65501ff97aa2ec6fa3c4f53d3f8c6c80c6a0e8a3 (#18166)
This pull request was automerged via backport-assistant
2023-08-07 10:17:34 -05:00
hc-github-team-nomad-core d3529d7be6
Backport of CLI: make snapshot name required in creating volume snapshots into release/1.6.x (#18152)
This pull request was automerged via backport-assistant
2023-08-04 04:36:50 -05:00
hc-github-team-nomad-core e959895ee7
backport of commit 53c53f1a47e872f7c25673a40c76818dcf9ef0c7 (#18145)
This pull request was automerged via backport-assistant
2023-08-03 15:06:24 -05:00
dependabot[bot] 7120cbefb1 build(deps): bump github.com/hashicorp/go-kms-wrapping/v2 (#17957)
Bumps [github.com/hashicorp/go-kms-wrapping/v2](https://github.com/hashicorp/go-kms-wrapping) from 2.0.8 to 2.0.12.
- [Commits](https://github.com/hashicorp/go-kms-wrapping/compare/v2.0.8...v2.0.12)

---
updated-dependencies:
- dependency-name: github.com/hashicorp/go-kms-wrapping/v2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-08-03 15:43:42 -04:00
dependabot[bot] 5f078adade build(deps): bump github.com/opencontainers/runc from 1.1.5 to 1.1.8 (#18037)
Bumps [github.com/opencontainers/runc](https://github.com/opencontainers/runc) from 1.1.5 to 1.1.8.
- [Release notes](https://github.com/opencontainers/runc/releases)
- [Changelog](https://github.com/opencontainers/runc/blob/v1.1.8/CHANGELOG.md)
- [Commits](https://github.com/opencontainers/runc/compare/v1.1.5...v1.1.8)

---
updated-dependencies:
- dependency-name: github.com/opencontainers/runc
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-08-03 15:37:32 -04:00
Charlie Voiselle bac4d112d1 [dep] bump golang.org/x/exp (#18102)
There are some refactorings that have to be made in the getter and state
where the api changed in `slices`

* Bump golang.org/x/exp
* Bump golang.org/x/exp in api
* Update job_endpoint_test
* [feedback] unexport sort function
2023-08-03 15:14:39 -04:00
dependabot[bot] 9fda7305b7 build(deps): bump github.com/hashicorp/consul/api from 1.18.0 to 1.23.0 (#18038)
Bumps [github.com/hashicorp/consul/api](https://github.com/hashicorp/consul) from 1.18.0 to 1.23.0.
- [Release notes](https://github.com/hashicorp/consul/releases)
- [Changelog](https://github.com/hashicorp/consul/blob/main/CHANGELOG.md)
- [Commits](https://github.com/hashicorp/consul/compare/api/v1.18.0...api/v1.23.0)

---
updated-dependencies:
- dependency-name: github.com/hashicorp/consul/api
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-08-03 15:03:16 -04:00
Tim Gross 87101b131a scheduler: filter device instance IDs by constraints (#18141)
When the scheduler assigns a device instance, it iterates over the feasible
devices and then picks the first instance with availability. If the jobspec uses
a constraint on device ID, this can lead to buggy/surprising behavior where the
node's device matches the constraint but then the individual device instance
does not.

Add a second filter based on the `${device.ids}` constraint after selecting a
node's device to ensure the device instance ID falls within the constraint as
well.
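
As an illustration only, the second filter can be pictured as a set-membership check over the instance IDs of the already selected device. The function `filterInstanceIDs` and the ID values are hypothetical, and the sketch assumes the constraint reduces to a set of allowed IDs, which may not match how the scheduler actually evaluates `${device.ids}`.

```go
package main

import "fmt"

// filterInstanceIDs keeps only the instance IDs present in the allowed set,
// preserving the input order. Hypothetical helper, not the scheduler's code.
func filterInstanceIDs(instances []string, allowed map[string]struct{}) []string {
	var out []string
	for _, id := range instances {
		if _, ok := allowed[id]; ok {
			out = append(out, id)
		}
	}
	return out
}

func main() {
	// The device as a whole matches the constraint, but only one of its
	// instances does; picking the first available instance would be wrong.
	instances := []string{"gpu-0", "gpu-1", "gpu-2"}
	allowed := map[string]struct{}{"gpu-2": {}}
	fmt.Println(filterInstanceIDs(instances, allowed))
}
```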

Fixes: #18112
2023-08-03 14:58:51 -04:00
hc-github-team-nomad-core c4f223249d
backport of commit b3601e00578b26d879f8f2678dc8b03e17a2db18 (#18140)
This pull request was automerged via backport-assistant
2023-08-03 09:44:19 -05:00
Karuppiah Natarajan fe4dd2f5bd docs: fix link for stopping an agent (#18130) 2023-08-02 11:52:02 -04:00
hc-github-team-nomad-core 9301daa8e8
backport of commit a3a637ee8efe5e1251f60f781369bd9052c4d4a2 (#18132)
This pull request was automerged via backport-assistant
2023-08-02 08:47:19 -05:00
hc-github-team-nomad-core e9b6be87e2
[ui] Job Variables page (#17964) (#18106)
* Bones of a component that has job variable awareness

* Got vars listed woo

* Variables as its own subnav and some pathLinkedVariable perf fixes

* Automatic Access to Variables alerter

* Helper and component to conditionally render the right link

* A bit of cleanup post-template stuff

* testfix for looping right-arrow keynav bc we have a new subnav section

* A very roundabout way of ensuring that, if a job exists when saving a variable with a pathLinkedEntity of that job, it's saved right through to the job itself

* hacky but an async version of pathLinkedVariable

* model-driven and async fetcher driven with cleanup

* Only run the update-job func if jobname is detected in var path

* Test cases begun

* Management token for variables to appear in tests

* It's a management token so it gets to see the clients tab under system jobs

* Pre-review cleanup

* More tests

* Number of requests test and small fix to groups-by-way-or-resource-arrays elsewhere

* Variable intro text tests

* Variable name re-use

* Simplifying our wording a bit

* parse json vs plainId

* Addressed PR feedback, including de-waterfalling

Co-authored-by: Phil Renaud <phil.renaud@hashicorp.com>
2023-08-01 09:59:39 -04:00
hc-github-team-nomad-core 3b076edf11
Backport of cli: search all namespaces for node volumes into release/1.6.x (#18119)
This pull request was automerged via backport-assistant
2023-08-01 08:56:34 -05:00
hc-github-team-nomad-core b75f552246
fingerprint: fix 'default' alias not added to interface specified by network_interface (#18096) (#18116)
Co-authored-by: Kevin Schoonover <github@kschoon.me>
2023-08-01 08:38:03 -04:00
Tim Gross 9fe88ebefe cli: support wildcard namespace in alloc subcommands (#18095)
The alloc exec and filesystem/logs commands allow passing the `-job` flag to
select a random allocation. If the namespace for the command is set to `*`, the
RPC handler doesn't handle this correctly as it's expecting to query for a
specific job. Most commands handle this ambiguity by first verifying that only a
single object of the type in question exists (ex. a single node or job).

Update these commands so that when the `-job` flag is set we first verify
there's a single job that matches. This also allows us to extend the
functionality to allow for the `-job` flag to support prefix matching.
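
The "verify there's a single match" step can be sketched as follows. The `jobRef` type and `resolveJob` function are hypothetical, and the sketch assumes plain prefix matching across all namespaces with no special preference for exact matches, which the real commands may handle differently.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// jobRef is a hypothetical (namespace, job ID) pair.
type jobRef struct {
	Namespace string
	ID        string
}

// resolveJob returns the single job whose ID starts with prefix, searching
// every namespace. It fails when no job or more than one job matches.
func resolveJob(jobs []jobRef, prefix string) (jobRef, error) {
	var matches []jobRef
	for _, j := range jobs {
		if strings.HasPrefix(j.ID, prefix) {
			matches = append(matches, j)
		}
	}
	switch len(matches) {
	case 0:
		return jobRef{}, errors.New("no job matches prefix")
	case 1:
		return matches[0], nil
	default:
		return jobRef{}, fmt.Errorf("prefix %q matches %d jobs", prefix, len(matches))
	}
}

func main() {
	jobs := []jobRef{{"default", "web"}, {"prod", "web-api"}, {"prod", "cache"}}
	fmt.Println(resolveJob(jobs, "ca"))  // unambiguous prefix
	fmt.Println(resolveJob(jobs, "web")) // ambiguous across namespaces
}
```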

Fixes: #12097
2023-07-31 13:15:49 -04:00
hc-github-team-nomad-core f2c70e9d51
Backport of [ui] Search results are overloading filter with sorted results into release/1.6.x (#18105)
This pull request was automerged via backport-assistant
2023-07-31 12:07:52 -05:00
Tim Gross 824a38c1f4 scheduler: fix panic in render_templates destructive update check (#18100)
In #18054 we introduced a new field `render_templates` in the `restart`
block. Previously changes to the `restart` block were always non-destructive in
the scheduler but we now need to check the new field so that we can update the
template runner. The check assumed that the block was always non-nil, which
causes panics in our scheduler tests.
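
A nil-safe version of such a check might look like the sketch below. The `restartPolicy` type and both functions are hypothetical stand-ins, and treating a nil block as "field unset" is an assumption; the actual fix may handle nil differently.

```go
package main

import "fmt"

// restartPolicy is a hypothetical stand-in for the restart block.
type restartPolicy struct {
	RenderTemplates bool
}

// renderTemplates reads the field without dereferencing a nil block.
func renderTemplates(p *restartPolicy) bool {
	return p != nil && p.RenderTemplates
}

// renderTemplatesChanged reports whether the field differs between two
// blocks, either of which may be nil.
func renderTemplatesChanged(a, b *restartPolicy) bool {
	return renderTemplates(a) != renderTemplates(b)
}

func main() {
	fmt.Println(renderTemplatesChanged(nil, nil))
	fmt.Println(renderTemplatesChanged(nil, &restartPolicy{RenderTemplates: true}))
}
```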
2023-07-31 11:53:30 -04:00
Gunnar 7fb7f77b45 docs: added accessor info to Tuples in template.mdx (#18101) 2023-07-31 11:08:09 -04:00
hc-github-team-nomad-core 2ed92e0c6c
Backport of feature: Add new field render_templates on restart block into release/1.6.x (#18094)
This pull request was automerged via backport-assistant
2023-07-28 13:54:00 -05:00
Tim Gross bebed09677 GHA pinning updates (#18093)
Trusted Supply Chain Component Registry (TSCCR) enforcement starts Monday and an
internal report shows our semgrep action is pinned to a version that's not
currently permitted. Update all the action versions to whatever's the new
hotness to maximize the time-to-live on these until we have automated pinning
set up.

Also version bumps our chromedriver action, which randomly broke upstream today.
2023-07-28 11:52:42 -04:00
hc-github-team-nomad-core 34ac0e5aad
cli: add help message for -consul-namespace (#18081) (#18091)
Add missing help entry for the `-consul-namespace` flag in `nomad job
run`.
2023-07-28 10:34:44 -04:00
hc-github-team-nomad-core 77d1f188c6
backport of commit 1e73a8a6991214fb57afa37de425dd7b0e765623 (#18089)
This pull request was automerged via backport-assistant
2023-07-28 02:49:50 -05:00
James Rasell b8cb1e79a3
chore(lint): use Go stdlib variables for HTTP methods and status codes (#17968) (#18074)
Co-authored-by: Ville Vesilehto <ville@vesilehto.fi>
2023-07-26 16:38:39 +01:00
hc-github-team-nomad-core cf18df8eb4
backport of commit 14102979762cc48183cd70dc91e26c08f630ab9d (#18067)
This pull request was automerged via backport-assistant
2023-07-26 08:30:35 -05:00
hc-github-team-nomad-core b4c4dcb818
backport of commit b7d14f133c69a64e39c40417705d29b6f2b96f60 (#18065)
This pull request was automerged via backport-assistant
2023-07-26 08:23:49 -05:00
hc-github-team-nomad-core 1604894653
backport of commit 855dc2a41947e8843a8ee6bcae6b9b1533919126 (#18059)
This pull request was automerged via backport-assistant
2023-07-25 07:04:48 -05:00
hc-github-team-nomad-core 51b8993180
backport of commit 67e5b084a1ea525cc5a2b93c2f0ed890d1af0ce3 (#18052)
This pull request was automerged via backport-assistant
2023-07-24 15:00:43 -05:00
hc-github-team-nomad-core 04a95ef5c7
backport of commit 6219ebc93dbf575015fa2921eee13daaff2b1d5e (#18049)
This pull request was automerged via backport-assistant
2023-07-24 10:33:18 -05:00
hc-github-team-nomad-core 24282f5598
backport of commit 22984929ce232de50c3f7d0754e4a515fb53e01b (#18047)
This pull request was automerged via backport-assistant
2023-07-24 10:13:52 -05:00
hc-github-team-nomad-core 0400ec6df4
backport of commit 8cabda5ec5c1d0dd3290ece4789cd8fea1ca3f5c (#18043)
This pull request was automerged via backport-assistant
2023-07-24 04:44:15 -05:00
James Rasell 40549e1132
check if stderrFrame is nil before logging stderrFrame.Data (#17815) (#18041)
Co-authored-by: Kevin Mulvey <kmulvey@linux.com>
2023-07-24 10:32:10 +01:00
hc-github-team-nomad-core 4f087674f4
backport of commit 7fe432042eaa0a97c0aaa40d302055eb18e8a9b0 (#18040)
This pull request was automerged via backport-assistant
2023-07-24 02:28:28 -05:00
hc-github-team-nomad-core 02c2f1a50f
Backport of Retain task states for post stop tasks at the time of node GC into release/1.6.x (#18033)
This pull request was automerged via backport-assistant
2023-07-21 12:55:29 -05:00
Tim Gross 95e71a82a4 Merge branch 'release/1.6.1' into release/1.6.x 2023-07-21 11:12:06 -04:00
hc-github-team-nomad-core b807b5d967 Prepare for next release 2023-07-21 13:49:46 +00:00
hc-github-team-nomad-core 88ea0c3cc2 Generate files for 1.6.1 release 2023-07-21 13:49:42 +00:00
hc-github-team-nomad-core 7eb6d42cb5
backport of commit 80244bb7914876278597b205324f990cbe9ee15d (#18027)
This pull request was automerged via backport-assistant
2023-07-21 08:21:21 -05:00
Luiz Aoqui 21cf30178a ui: handle errors from unimplemented services (#18020)
When a request is made to an RPC service that doesn't exist (for
example, a cross-region request from a newer version of Nomad to an
older version that doesn't implement the endpoint) the application
should return an empty list as well.
2023-07-21 09:20:00 -04:00
hc-github-team-nomad-core 63316363c9
Backport of ui: handle node pool requests to older regions into release/1.6.x (#18024)
This pull request was automerged via backport-assistant
2023-07-21 08:17:08 -05:00
hc-github-team-nomad-core 30260f06e8
Backport of state: canonicalize namespace on restore into release/1.6.x (#18018)
This pull request was automerged via backport-assistant
2023-07-20 15:05:16 -05:00