24873 Commits

Author SHA1 Message Date
Daniel Bennett f7e316e9cd
tests: enable newer windows (#17401)
* "allow" (don't try to drop) linux capabilities
  in the docker test driver harness (see #15181)
* refactor to allow different busybox images
  since windows containers need to be the same
  version as the underlying OS, and we're
  moving from 2016 to 2019
* one docker test was flaky from apparently
  being a bit slower on windows, so add Wait()
2023-06-02 11:38:38 -05:00
Luiz Aoqui 3a962d07f8
np: fix node pool search permission check (#17400)
When checking if a token is allowed to query the search endpoints we
need to return an error if the search context includes `node_pool` and
the token doesn't have access to _any_ pool. This prevents returning an
empty list instead of a permission denied error.
2023-06-02 12:22:47 -04:00
Phil Renaud 03dc959c2e
UI GHA test changes re-implemented (#17399) 2023-06-02 11:59:08 -04:00
Samantha b92a782b6e
check: Add support for Consul field tls_server_name (#17334) 2023-06-02 10:19:12 -04:00
Tim Gross 56e9b944e8
node pools: validate pool exists on job registration (#17386)
Add a new job admission hook for node pools that enforces the pool exists on
registration. Also provide the skeleton function we need for Enterprise
enforcement functions we'll implement later.
2023-06-02 09:32:07 -04:00
Luiz Aoqui f755b9469f
core: refactor task validation (#17344)
Move all validations related to task fields to Task.Validate(). Prior to
this, some task validations were being done inside TaskGroup.Validate()
because they required access to some group values.

But similarly to how TaskGroup.Validate() takes the job as a parameter,
it's fair to expect the task to receive its group.
2023-06-01 19:26:42 -04:00
Luiz Aoqui 4be8d7c049
core: fix kill_timeout validation when progress_deadline is 0 (#17342) 2023-06-01 19:01:32 -04:00
Luiz Aoqui 9bb57c08e3
node pool: add search support (#17385) 2023-06-01 17:48:14 -04:00
Tim Gross 4f14fa0518
node pools: add `node_pool` field to job spec (#17379)
This changeset only adds the `node_pool` field to the jobspec, and ensures that
it gets picked up correctly as a change. Without the rest of the implementation
landed yet, the field will be ignored.
2023-06-01 16:08:55 -04:00
Luiz Aoqui c61e75f302
node pools: add CRUD API (#17384) 2023-06-01 15:55:49 -04:00
Bryce Kalow 9e4bb87536
Delete check-legacy-links-format.yml (#17380) 2023-06-01 13:41:17 -04:00
Luiz Aoqui 45b0391378
np: implement ACL for node pools (#17365) 2023-06-01 13:03:20 -04:00
Seth Hoenig 53ca8b9552
repo: block pushing to release branches in git hook (#17377) 2023-06-01 09:36:20 -05:00
Daniel Kimsey b703acf671
Merge pull request #17376 from hashicorp/revert-accidental-17103-commit-on-main
Revert "fixup: address review changes"
2023-06-01 09:22:30 -05:00
Daniel Kimsey ec59fd1d54
Revert "fixup: address review changes"
This reverts commit ba736e4521bb4bec14d21862895eabec14a9c2f9.

This was accidentally added by fat-fingered Admin push...
2023-06-01 08:58:07 -05:00
Daniel Kimsey 5b886165da
fixup: address review changes 2023-06-01 08:48:42 -05:00
Tim Gross 06972fae0c
prioritized client updates (#17354)
The allocrunner sends several updates to the server during the early lifecycle
of an allocation and its tasks. Clients batch-up allocation updates every 200ms,
but experiments like the C2M challenge have shown that even with this batching,
servers can be overwhelmed with client updates during high volume
deployments. Benchmarking done in #9451 has shown that client updates can easily
represent ~70% of all Nomad Raft traffic.

Each allocation sends many updates during its lifetime, but only those that
change the `ClientStatus` field are critical for progressing a deployment or
kicking off a reschedule to recover from failures.

Add a priority to the client allocation sync and update the `syncTicker`
receiver so that we only send an update if there's a high priority update
waiting, or on every 5th tick. This means when there are no high priority
updates, the client will send updates at most every 1s instead of
200ms. Benchmarks have shown this can reduce overall Raft traffic by 10%, as
well as reduce client-to-server RPC traffic.

This changeset also switches from a channel-based collection of updates to a
shared buffer, so as to split batching from sending and prevent backpressure
onto the allocrunner when the RPC is slow. This doesn't have a major performance
benefit in the benchmarks but makes the implementation of the prioritized update
simpler.

Fixes: #9451
2023-05-31 15:34:16 -04:00
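The tick logic described in #17354 (send when a high priority update is waiting, otherwise only on every 5th tick) could look something like the sketch below. The `updateBuffer` type and its methods are hypothetical and are not taken from the Nomad source; a real shared buffer would also need a mutex, which is omitted here for brevity.

```go
package main

import "fmt"

// lowPriorityEvery is the tick interval at which low-priority updates are
// flushed (every 5th tick, per the commit message).
const lowPriorityEvery = 5

// updateBuffer is a hypothetical shared buffer that collects the latest
// update per allocation and decides on each tick whether to send.
type updateBuffer struct {
	tick    int
	urgent  bool              // true if any pending update is high priority
	pending map[string]string // alloc ID -> latest update (illustrative)
}

func newUpdateBuffer() *updateBuffer {
	return &updateBuffer{pending: make(map[string]string)}
}

// add records an update; a later update for the same alloc replaces the
// earlier one, so batching is decoupled from sending.
func (b *updateBuffer) add(allocID, update string, urgent bool) {
	b.pending[allocID] = update
	b.urgent = b.urgent || urgent
}

// onTick returns the batch to send, or nil if nothing should be sent yet.
func (b *updateBuffer) onTick() map[string]string {
	b.tick++
	if len(b.pending) == 0 {
		return nil
	}
	if !b.urgent && b.tick%lowPriorityEvery != 0 {
		return nil
	}
	batch := b.pending
	b.pending = make(map[string]string)
	b.urgent = false
	return batch
}

func main() {
	b := newUpdateBuffer()
	b.add("alloc-1", "task event", false)
	for i := 1; i <= 5; i++ {
		fmt.Printf("tick %d: sent %d update(s)\n", i, len(b.onTick()))
	}
	b.add("alloc-1", "client status changed", true)
	fmt.Printf("tick 6: sent %d update(s)\n", len(b.onTick()))
}
```

With a 200ms ticker, the low-priority path in this sketch flushes at most once per second, which matches the interval given in the commit message.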
dependabot[bot] bdd4a55738
build(deps): bump github.com/elazarl/go-bindata-assetfs (#17339)
Bumps [github.com/elazarl/go-bindata-assetfs](https://github.com/elazarl/go-bindata-assetfs) from 1.0.1-0.20200509193318-234c15e7648f to 1.0.1.
- [Release notes](https://github.com/elazarl/go-bindata-assetfs/releases)
- [Changelog](https://github.com/elazarl/go-bindata-assetfs/blob/master/.goreleaser.yml)
- [Commits](https://github.com/elazarl/go-bindata-assetfs/commits/v1.0.1)

---
updated-dependencies:
- dependency-name: github.com/elazarl/go-bindata-assetfs
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-05-31 10:19:59 -04:00
Phil Renaud 52772ab0c0
Text type to password type input on profile sign-in page (#17345) 2023-05-30 16:58:34 -04:00
Phil Renaud 038d53c58f
Observe newlines when displaying variables (#17343) 2023-05-30 16:58:16 -04:00
Luiz Aoqui 6236cb8f82
cli: output errors when monitoring deployment (#17348) 2023-05-30 11:12:12 -04:00
Luiz Aoqui e236d6dedd
cli: fix panic on job restart (#17346)
When monitoring the replacement allocation, if the
`Allocations().Info()` request fails, the `alloc` variable is `nil`, so
it should not be read.
2023-05-30 11:08:49 -04:00
Luiz Aoqui bb2395031b
client: fix Consul version fingerprint (#17349)
Consul v1.13.8 was released with a breaking change in which the version
returned by the /v1/agent/self endpoint included a line break.

This caused the Nomad fingerprint to fail because `NewVersion` errors on
parse.

This commit removes any extra space from the Consul version returned by
the API.
2023-05-30 11:07:57 -04:00
Seth Hoenig acfdf0f479
compliance: add headers with fixed copywrite tool (#17353)
Closes #17117
2023-05-30 09:20:32 -05:00
Piotr Kazmierczak cea48b24ee
fix: job canonicalization should set job priority to 50, not 0. (#17314)
Nomad API will reject jobs with priority set to 0.
2023-05-30 09:05:32 +02:00
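A minimal sketch of the defaulting behavior described in #17314, assuming a job struct whose priority is an optional pointer field. The `job` type and the `defaultJobPriority` constant are hypothetical names for this example.

```go
package main

import "fmt"

// defaultJobPriority is the default named in the commit message.
const defaultJobPriority = 50

// job is a hypothetical, pared-down job with an optional priority.
type job struct {
	Priority *int
}

// canonicalize fills in the default priority only when none was set, so
// an unset priority becomes 50 rather than the zero value, which the API
// would reject.
func (j *job) canonicalize() {
	if j.Priority == nil {
		p := defaultJobPriority
		j.Priority = &p
	}
}

func main() {
	unset := &job{}
	unset.canonicalize()
	fmt.Println(*unset.Priority)

	explicit := 80
	set := &job{Priority: &explicit}
	set.canonicalize()
	fmt.Println(*set.Priority)
}
```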
Phil Renaud 294aa4bfe7
[ui] A few variables-ui-related bugfixes (#17319)
* A few variable-adding bugfixes

* Disable Delete button if only one KV is left, and remove entity warnings on Add More
2023-05-25 17:11:13 -04:00
Charlie Voiselle 86e04a4c6c
[core] nil check and error handling for client status in heartbeat responses (#17316)
Add a nil check to constructNodeServerInfoResponse to manage an apparent race between deregister and client heartbeats. Fixes #17310
2023-05-25 16:04:54 -04:00
Phil Renaud 98aa88c739
[ui, deployments] Show a "Latest Deployment Status" cell within the Job Status panel on steady service jobs (#17246)
* Failed or lost cell condensed

* Latest Deployment cell

* Stylistic changes and deploying state fixup

* Rewritten tooltip message and updated lost/failed tests

* failed-or-lost cell updates to job status panel acceptance tests
2023-05-25 14:20:48 -04:00
Seth Hoenig e04d8cf77b
docs: fixup example of readiness check (#17296)
A "readiness" check implies a failing healthcheck will not cause the
deployment of a service to stop - i.e. it is only used as a liveness
probe in the context of service discoverability.

Fix our docs example to reflect that a readiness check is created by
setting on_update to "ignore" (as opposed to "ignore_warnings").
2023-05-23 15:29:10 -05:00
Tim Gross b85a28b851
changelog entry for Vault SDK update (#17281) 2023-05-23 09:21:29 -04:00
dependabot[bot] deaf4c4240
build(deps): bump github.com/grpc-ecosystem/go-grpc-middleware (#16551)
Bumps [github.com/grpc-ecosystem/go-grpc-middleware](https://github.com/grpc-ecosystem/go-grpc-middleware) from 1.3.0 to 1.4.0.
- [Release notes](https://github.com/grpc-ecosystem/go-grpc-middleware/releases)
- [Commits](https://github.com/grpc-ecosystem/go-grpc-middleware/compare/v1.3.0...v1.4.0)

---
updated-dependencies:
- dependency-name: github.com/grpc-ecosystem/go-grpc-middleware
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-05-23 09:01:26 -04:00
dependabot[bot] 9ad9505ba0
build(deps): bump github.com/hashicorp/vault/sdk from 0.7.0 to 0.9.0 (#17031)
Bumps [github.com/hashicorp/vault/sdk](https://github.com/hashicorp/vault) from 0.7.0 to 0.9.0.
- [Release notes](https://github.com/hashicorp/vault/releases)
- [Changelog](https://github.com/hashicorp/vault/blob/main/CHANGELOG.md)
- [Commits](https://github.com/hashicorp/vault/compare/v0.7.0...v0.9.0)

---
updated-dependencies:
- dependency-name: github.com/hashicorp/vault/sdk
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-05-23 08:40:32 -04:00
dependabot[bot] 74b55ba1ed
build(deps): bump github.com/docker/distribution (#17162)
Bumps [github.com/docker/distribution](https://github.com/docker/distribution) from 2.8.1+incompatible to 2.8.2+incompatible.
- [Release notes](https://github.com/docker/distribution/releases)
- [Commits](https://github.com/docker/distribution/compare/v2.8.1...v2.8.2)

---
updated-dependencies:
- dependency-name: github.com/docker/distribution
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-05-23 08:40:18 -04:00
Seth Hoenig 9e43d197f8
post 1.5.6 release (#17276)
* Generate files for 1.5.6 release

* Prepare for next release

* Merge release 1.5.6 files

* manual revert bindata_assetfs because the one on main is better

---------

Co-authored-by: hc-github-team-nomad-core <github-team-nomad-core@hashicorp.com>
2023-05-22 14:03:51 -05:00
Tim Gross b9ca3bc9b1
build: remove 386 builds for Nomad 1.6.0 (#17239)
The 32-bit Intel builds (aka "386") are not tested and likely have bugs
involving platform-sized integers when operated at any non-trivial scale. Remove
these builds from the upcoming Nomad 1.6.0 and provide recommendations in the
upgrade notes for those users who might have hobbyist boards running 32-bit
ARM (this will primarily be the RaspberryPi Zero or older spins of the RaspPi).

DO NOT BACKPORT TO 1.5.x OR EARLIER!
2023-05-22 13:27:17 -04:00
Charlie Voiselle fc313b7f8f
[api] Return a shapely error for unexpected response (#16743)
* Add UnexpectedResultError to nomad/api

This allows users to perform additional status-based behavior by rehydrating the error using `errors.As` inside of consumers.
2023-05-22 11:45:31 -04:00
Lance Haig 568da5918b
cli: tls certs not created with correct SANs (#16959)
The `nomad tls cert` command did not create certificates with the correct SANs for
them to work with non default domain and region names. This changset updates the
code to support non default domains and regions in the certificates.
2023-05-22 09:31:56 -04:00
Roberto Hidalgo 2f702a9f11
allow periodic jobs to use workload identity ACL policies (#17018)
When resolving ACL policies, we were not using the parent ID for the policy
lookup for dispatch/periodic jobs, even though the claims were signed for that
parent ID. This caused all calls to the Task API (and other WI-authenticated
API calls) from a periodically-dispatched job to fail with 403.

Fix this by using the parent job ID whenever it's available.
2023-05-22 09:19:16 -04:00
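The lookup rule from #17018 (use the parent job ID whenever it is available) could be expressed along these lines. The `identityClaims` struct and `policyJobID` function are hypothetical names for this sketch and do not reflect Nomad's actual types.

```go
package main

import "fmt"

// identityClaims is a hypothetical, pared-down view of the workload
// identity claims relevant to ACL policy lookup.
type identityClaims struct {
	JobID       string
	ParentJobID string // set for dispatched and periodic child jobs
}

// policyJobID returns the job ID to use for the policy lookup: the
// parent's ID when one exists, otherwise the job's own ID.
func policyJobID(c identityClaims) string {
	if c.ParentJobID != "" {
		return c.ParentJobID
	}
	return c.JobID
}

func main() {
	child := identityClaims{JobID: "backup/periodic-1685000000", ParentJobID: "backup"}
	plain := identityClaims{JobID: "web"}

	fmt.Println(policyJobID(child))
	fmt.Println(policyJobID(plain))
}
```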
Tim Gross 9838349c23
document which fields can be updated by `volume register` (#17249)
The `volume register` command can update a small subset of the volume's fields
in-place, with some restrictions depending on whether the volume is currently in
use. Document these in the `volume register` command docs and the volume
specification docs.

Fixes: #17247
2023-05-22 09:15:25 -04:00
dependabot[bot] 31a38d750b
build(deps): bump github.com/shoenig/test from 0.6.4 to 0.6.6 in /api (#17178)
* build(deps): bump github.com/shoenig/test from 0.6.4 to 0.6.5 in /api

* deps: update shoenig/test to v0.6.5

* deps: update again to v0.6.6

---------

Co-authored-by: Tim Gross <tgross@hashicorp.com>
Co-authored-by: Seth Hoenig <shoenig@duck.com>
2023-05-22 07:53:12 -05:00
Phil Renaud 48f5af77ce
Updates static JS/UI assets for upcoming 1.6 release (#17263) 2023-05-19 19:03:37 -04:00
Phil Renaud 46239e1e80
[ui, deployments] Add status panel to child jobs (#17217)
* Treated same-route as sub-route and didn't cancel watchers

* Adds panel to child jobs and sub-sorts

* removed the safety check in module-for-job tests

* [ui] Adds status panel to Sysbatch jobs (#17243)

* In working out periodic/param child jobs, realized the intersection with sysbatch is high enough that it ought to be worked on now

* Further removal of jobclientstatussummary

* Explicitly making mocked jobs in no-deployment mode

* remove last remnants of job-client-status-summary component

* Screwed up my sorting order a few commits ago; this corrects it

* noActiveDeployment gonna be the death of me
2023-05-19 15:51:35 -04:00
Tim Gross 4881f2451a
docs: describe the default Workload Identity ACL policy (#17245)
Workload Identities have an implicit default policy. This policy can't currently
be described via HCL because it includes task interpolation for Variables and
access to the Services API (which doesn't exist as its own ACL
capability). Describe this in our WI documentation.

Fixes: #16277
2023-05-19 11:38:05 -04:00
Tim Gross 15fa0277cf
build: pin semgrep action (#17248)
The file path in the TSCCR repo for the `returntocorp/semgrep` action was
incorrect, so the pinning tool was not able to find the correct entry and it was
not pinned in #17238.

The repository is fixed in https://github.com/hashicorp/security-tsccr/pull/431
2023-05-19 10:27:51 -04:00
Tim Gross dbfcfbd623
build: move GitHub actions to versions allowed by prodsec (#17238)
The `backspace/ember-asset-size` action we're using is unmaintained and has a
bunch of vulns in it, so it won't pass security screening (this is a NodeJS
action so it has piles of dependencies, 99% of which won't be in use but fails
automated screening anyways). Move this to the upstream version.

The `machine-learning-apps/pr-comment` action also presents a problem for the
ProdSec security screening because it's archived and also runs an external
Docker image. Move this to a likely-ok maintained action for now, until we can
spare some time to replace this with something more reasonable that isn't a
GitHub action.
2023-05-19 09:07:02 -04:00
Phil Renaud 0b729e4bb9
Fixes to scheduling-filtering-in-ui (#17244) 2023-05-18 17:38:34 -04:00
Yethal 4073987de3
cli: show leader status in json output of server members (#17138) 2023-05-18 16:43:57 -04:00
Phil Renaud 7e56ca62d1
[ui] Adds a "Scheduling" filter to the job.allocations page (#17227)
* Basic filter concept

* Make sure NextAllocation gets sent up with allocation stub
2023-05-18 16:24:41 -04:00
Luiz Aoqui beba92ab36
Post 1.5.5 release (#17241)
* Generate files for 1.5.5 release

* Prepare for next release

---------

Co-authored-by: hc-github-team-nomad-core <github-team-nomad-core@hashicorp.com>
2023-05-18 14:06:56 -04:00
Jai b0b0b22f0f
ui: add option to filter for jobs that are packs (#17226)
* refact:  update job model

* refact: update view layer

* refact: update test
2023-05-18 12:47:11 -04:00