24673 commits

Author SHA1 Message Date
Tim Gross 17bd930ca9
logs: fix missing allocation logs after update to Nomad 1.5.4 (#17087)
When the server restarts for the upgrade, it loads the `structs.Job` from the
Raft snapshot/logs. The jobspec has long since been parsed, so none of the
guards around the default value are in play. The empty field value for `Enabled`
is the zero value, which is false.

This doesn't impact any running allocation because we don't replace running
allocations when either the client or server restart. But as soon as any
allocation gets rescheduled (ex. you drain all your clients during upgrades),
it'll be using the `structs.Job` that the server has, which has `Enabled =
false`, and logs will not be collected.

This changeset fixes the bug by adding a new field `Disabled` which defaults to
false (so that the zero value works), and deprecates the old field.

Fixes #17076
2023-05-04 16:01:18 -04:00
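The zero-value problem described in #17087 can be illustrated with a minimal sketch. The type and method names here (`OldLogConfig`, `NewLogConfig`, `CollectsLogs`) are hypothetical stand-ins, not Nomad's actual structs; only the field names `Enabled` and `Disabled` come from the commit message.

```go
package main

import "fmt"

// OldLogConfig is a hypothetical stand-in for a config whose flag is meant
// to default to true. Decoding stored data that lacks the field leaves it
// at Go's zero value, which is false.
type OldLogConfig struct {
	Enabled bool
}

// NewLogConfig inverts the flag so that the zero value means "collect logs".
type NewLogConfig struct {
	Disabled bool
}

// CollectsLogs reports whether logs would be collected under the old field.
func (c OldLogConfig) CollectsLogs() bool { return c.Enabled }

// CollectsLogs reports whether logs would be collected under the new field.
func (c NewLogConfig) CollectsLogs() bool { return !c.Disabled }

func main() {
	var old OldLogConfig   // simulates a struct restored without the field set
	var fixed NewLogConfig // same situation with the inverted field

	fmt.Println("old zero value collects logs:", old.CollectsLogs())   // false
	fmt.Println("new zero value collects logs:", fixed.CollectsLogs()) // true
}
```

Inverting the flag means no default-setting step is needed on any code path that constructs or restores the struct.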
Seth Hoenig b4c9f3bbc2
client: fix job_max_source_size client config name (#17067)
Intended to be job_max_source_size, rather than max_job_source_size.
This way it fits better with existing client config options related
to jobs.
2023-05-04 13:54:51 -05:00
Seth Hoenig 4347c1d705
docs: move CNI reference plugins installation to CNI overview page (#17068)
* docs: move CNI reference plugins installation to CNI overview page

This PR moves the instruction steps for installing the CNI reference plugins
from the Consul Mesh integration page to the general Networking CNI page.

These plugins are required for bridge networking, not just Consul Mesh,
so it makes sense to have them on the general CNI page.

Closes #17038

* docs: fix a link to post install steps
2023-05-04 11:32:06 -05:00
James Rasell 50414bba12
docs: update artifact jobspec sshkey example path. (#17077) 2023-05-04 14:29:36 +01:00
Michael Schurter 3b3b02b741
dep: update from jwt/v4 to jwt/v5 (#17062)
Their release notes are here: https://github.com/golang-jwt/jwt/releases

Seemed wise to upgrade before we do even more with JWTs. For example
this upgrade *would* have mattered if we already implemented common JWT
claims such as expiration. Since we didn't rely on any claim
verification this upgrade is a noop...

...except for 1 test that called `Claims.Valid()`! Removing that
assertion *seems* scary, but it didn't actually do anything because we
didn't implement any of the standard claims it validated:

https://github.com/golang-jwt/jwt/blob/v4.5.0/map_claims.go#L120-L151

So functionally this major upgrade is a noop.
2023-05-03 11:17:38 -07:00
Charlie Voiselle 8f6fa14e9e
[deps] Update consul-template to v0.31.0 (#16908)
* Update consul-template to v0.31.0
* Add changelog
2023-05-03 09:15:17 -04:00
Michael Schurter f8f9e91b8a
build: upgrade from go 1.20.3 to 1.20.4 (#17056)
Includes CVE fixes that do *not* impact Nomad:

https://groups.google.com/g/golang-announce/c/MEb0UyuSMsU
2023-05-02 13:09:11 -07:00
Charlie Voiselle 61f997d806
Add WriterUI (#17051)
This special purpose UI provides commands that can benefit from direct
access to the io.Reader and io.Writers of the base cli.Ui. It can
traverse a chain of ColoredUis to find the base. Currently, it can
retrieve writers from a cli.BasicUi (or cli.MockUi for testing).

Renames ui.go and ui_test.go to log_ui.go and log_ui_test.go
2023-05-02 13:40:44 -04:00
Seth Hoenig e9fec4ebc8
connect: remove unusable path for fallback envoy image names (#17044)
This PR does some cleanup of an old code path for versions of Consul that
did not support reporting the supported versions of Envoy in its API. Those
versions have not been supported for years at this point, and the fallback
version of envoy hasn't been supported by any version of Consul for almost
as long. Remove this code path that is no longer useful.
2023-05-02 09:48:44 -05:00
Seth Hoenig e8d53ea30b
connect: use explicit docker.io prefix in default envoy image names (#17045)
This PR modifies references to the envoyproxy/envoy docker image to
explicitly include the docker.io prefix. This does not affect existing
users, but makes things easier for Podman users, who otherwise need to
specify the full name because Podman does not default to docker.io
2023-05-02 09:27:48 -05:00
Luiz Aoqui 7b5a8f1fb0
Revert "hashicorp/go-msgpack v2 (#16810)" (#17047)
This reverts commit 8a98520d56eed3848096734487d8bd3eb9162a65.
2023-05-01 17:18:34 -04:00
Seth Hoenig 86f6a38867
connect: do not restrict auto envoy version to docker task driver (#17041)
This PR updates the envoy_bootstrap_hook to no longer disable itself if
the task driver in use is not docker. In other words, make it work for
podman and other image based task drivers. The hook now only checks that

1. the task is a connect sidecar
2. the task.config block contains an "image" field
2023-05-01 15:07:35 -05:00
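The two checks listed in #17041 can be sketched roughly as follows. This is an illustration under assumptions, not the hook's real code: the `Task` type, the function names, the `connect-proxy:` kind prefix, and the image string are all placeholders chosen for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// Task is a hypothetical, trimmed-down stand-in for a task definition.
type Task struct {
	Kind   string                 // assumed form: "connect-proxy:<service>"
	Driver string                 // deliberately not consulted below
	Config map[string]interface{} // driver-specific task config
}

// isConnectSidecar assumes sidecar tasks are marked by a kind prefix.
func isConnectSidecar(t Task) bool {
	return strings.HasPrefix(t.Kind, "connect-proxy:")
}

// wantsEnvoyVersionLookup applies the two checks from the commit message:
// the task is a connect sidecar, and its config contains an "image" field.
// The driver name plays no part in the decision.
func wantsEnvoyVersionLookup(t Task) bool {
	if !isConnectSidecar(t) {
		return false
	}
	_, hasImage := t.Config["image"]
	return hasImage
}

func main() {
	podman := Task{
		Kind:   "connect-proxy:web",
		Driver: "podman",
		Config: map[string]interface{}{"image": "example/envoy:tag"},
	}
	noImage := Task{Kind: "connect-proxy:web", Driver: "exec"}

	fmt.Println(wantsEnvoyVersionLookup(podman))  // true
	fmt.Println(wantsEnvoyVersionLookup(noImage)) // false
}
```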
Phil Renaud 922c593203
[ui, deployments] Restarted and Rescheduled panel cells (#16972)
* Status panel shows failed and lost, but probably don't have the condition quite right

* Rescheduled and Replaced cells instead of a general failed/lost one

* Tests moving to acceptance

* Fixed desiredTotal and added acceptance test for restarted

* moved integration test into acceptance test generally

* Now that we represent Lost in the graph, have to make our unplaced testcase as Unknown

* No need to declare new vars for immediately returned getters

* Literal restart and resched add to the tallies, rather than 'would have but ran out of attempts' like before

* Testfixes now that we've redefined what restarts and reschedules are indicated by
2023-05-01 15:24:21 -04:00
Tim Gross 5503eb97f5
add copywrite headers commit to ignore-revs config file (#17037) 2023-05-01 10:57:43 -04:00
Phil Renaud a637354ae0
[ui, deployments] Don't separate allocation groups based on their deployment health unless they're "running" (#17016)
* Group up non-running allocs regardless of deploymenthealth

* Better asynchrony in test
2023-04-28 14:52:42 -04:00
Phil Renaud 0805271f8f
percy-specific css to hide table cells in the job status panel acceptance test (#17021) 2023-04-28 14:51:53 -04:00
Phil Renaud 5ca59aef56
Move the token JWT console log out of an iterator (#17010) 2023-04-28 13:46:10 -04:00
Seth Hoenig 5744b2cd4f
docs: add more notes about artifact breaking changes in 1.5.0 (#17005)
* changelog: note artifact breaking changes for 1.5.0

* docs: add note about environment variables to artifact job spec docs

* Update website/content/docs/job-specification/artifact.mdx

Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>

---------

Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
2023-04-27 11:41:18 -05:00
Michael Schurter d3b0bbc088
deps: update go-bexpr from 0.1.11 to 0.1.12 (#16991)
Pulls in https://github.com/hashicorp/go-bexpr/pull/38

Fixes #16758
2023-04-27 09:01:42 -07:00
Tim Gross 87f416943c
testing: improve fidelity of mock driver task restore (#16990)
While working on client status update improvements, I encountered problems
getting tests with the mock driver to correctly restore.

Unlike typical drivers the mock driver doesn't have an external source of truth
for whether the task is running (ex. making API calls to `dockerd` or looking
for a running PID), and so in order to make up that information, it re-parses
the original task config. But the taskrunner doesn't call the encoding step for
`RecoverTask`, only `StartTask`, so the task config the mock driver gets is
missing data.

Update the mock driver to stash the "external" state in the task state that
we'll get from the task runner, so that we don't have to try to recover from the
original `TaskConfig` anymore. This should bring the mock driver closer to the
behavior of the other drivers.
2023-04-27 11:54:10 -04:00
James Rasell fddef4c6e1
docs: use appropriate file extension for autoscaler agent config. (#16993) 2023-04-27 15:00:28 +01:00
Phil Renaud 7f7f764c5a
[ui] Fixed: Evaluations sidebar/response not scrollable (#16960)
* Sets up a CSS grid for Evaluations sidebar

* Flex seems more sensible for this actually

* Tighten up the header margin

* Percy found a diff; the expand button wasn't showing for view logs sidebar
2023-04-27 09:49:18 -04:00
James Rasell ac98c2ed40
vars: ensure struct receiver names are consistent. (#16995) 2023-04-27 13:51:11 +01:00
James Rasell 4d2c1403c2
scale: do not allow scaling of jobs with type system. (#16969) 2023-04-25 15:47:44 +01:00
Seth Hoenig f221e99572
tools: update dependencies and use tree set (#16974)
* tools: bump go mod deps for tools module

* tools: use treeset in tools/missing
2023-04-25 07:47:19 -05:00
Phil Renaud 7dbebe9a93
[ui, feature] Job Page Redesign (#16932)
* [ui] Service job status panel (#16134)

* it begins

* Hacky demo enabled

* Still very hacky but seems deece

* Floor of at least 3 must be shown

* Width from on-high

* Other statuses considered

* More sensible allocTypes listing

* Beginnings of a legend

* Total number of allocs running now maps over job.groups

* Lintfix

* base the number of slots to hold open on actual tallies, which should never exceed totalAllocs

* Versions get yer versions here

* Versions lookin like versions

* Mirage fixup

* Adds Remaining as an alloc chart status and adds historical status option

* Get tests passing again by making job status static for a sec

* Historical status panel click actions moved into their own component class

* job detail tests plz chill

* Testing if percy is fickle

* Hyper-specific on summary distribution bar identifier

* Perhaps the 2nd allocSummary item no longer exists with the more accurate afterCreate data

* UI Test eschewing the page pattern

* Bones of a new acceptance test

* Track width changes explicitly with window-resize

* testlintfix

* Alloc counting tests

* Alloc grouping test

* Alloc grouping with complex resizing

* Refined the list of showable statuses

* PR feedback addressed

* renamed allocation-row to allocation-status-row

* [ui, job status] Make panel status mode a queryParam (#16345)

* queryParam changing

* Test for QP in panel

* Adding @tracked to legacy controller

* Move the job of switching to Historical out to larger context

* integration test mock passed func

* [ui] Service job deployment status panel (#16383)

* A very fast and loose deployment panel

* Removing Unknown status from the panel

* Set up oldAllocs list in constructor, rather than as a getter/tracked var

* Small amount of template cleanup

* Refactored latest-deployment new logic back into panel.js

* Revert now-unused latest-deployment component

* margin bottom when ungrouped also

* Basic integration tests for job deployment status panel

* Updates complete alloc colour to green for new visualizations only (#16618)

* Updates complete alloc colour to green for new visualizations only

* Pale green instead of dark green for viz in general

* [ui] Job Deployment Status: History and Update Props (#16518)

* Deployment history wooooooo

* Styled deployment history

* Update Params

* lintfix

* Types and groups for updateParams

* Live-updating history

* Harden with types, error states, and pending states

* Refactor updateParams to use trigger component

* [ui] Deployment History search (#16608)

* Functioning searchbox

* Some nice animations for history items

* History search test

* Fixing up some old mirage conventions

* some a11y rule override to account for scss keyframes

* Split panel into deploying and steady components

* HandleError passed from job index

* gridified panel elements

* TotalAllocs added to deploying.js

* Width perc to px

* [ui] Splitting deployment allocs by status, health, and canary status (#16766)

* Initial attempt with lots of scratchpad work

* Style mods per UI discussion

* Fix canary overflow bug

* Don't show canary or health for steady/prev-alloc blocks

* Steady state

* Thanks Julie

* Fixes steady-state versions

* Legen, wait for it...

* Test fixes now that we have a minimum block size

* PR prep

* Shimmer effect on pending and unplaced allocs (#16801)

* Shimmer effect on pending and unplaced

* Don't show animation in the legend

* [ui, deployments] Linking allocblocks and legends to allocation / allocations index routes (#16821)

* Conditional link-to component and basic linking to allocations and allocation routes

* Job versions filter added to allocations index page

* Steady state legends link

* Legend links

* Badge count links for versions

* Fix: faded class on steady-state legend items

* version link now won't show completed ones

* Fix a11y violations with link labels

* Combining some template conditional logic

* [ui, deployments] Conversions on long nanosecond update params (#16882)

* Conversions on long nanosecond nums

* Early return in updateParamGroups comp prop

* [ui, deployments] Mirage Actively Deploying Job and Deployment Integration Tests (#16888)

* Start of deployment alloc test scaffolding

* Bit of test cleanup and canary for ungrouped allocs

* Flakey but more robust integrations for deployment panel

* De-flake acceptance tests and add an actively deploying job to mirage

* Jitter-less alloc status distribution removes my bad math

* bugfix caused by summary.desiredTotal non-null

* More interesting mirage active deployment alloc breakdown

* Further tests for previous-allocs row

* Previous alloc legend tests

* Percy snapshots added to integration test

* changelog
2023-04-24 22:45:39 -04:00
Daniel Bennett 2c63d34296
Demo: NFS CSI Plugins (#16875)
Demo (and easily reproduce, locally) a CSI setup
with separate controller and node plugins.

This runs NFS in a container backed by a host volume
and CSI controller and node plugins from rocketduck:
  gitlab.com/rocketduck/csi-plugin-nfs

Co-authored-by: Florian Apolloner <florian@apolloner.eu>
Co-authored-by: Tim Gross <tgross@hashicorp.com>
2023-04-24 15:08:48 -05:00
Seth Hoenig 753c17c9de
services: un-mark group services as deregistered if restart hook runs (#16905)
* services: un-mark group services as deregistered if restart hook runs

This PR may fix a bug where group services will never be deregistered if the
group undergoes a task restart.

* e2e: add test case for restart and deregister group service

* cl: add cl

* e2e: add wait for service list call
2023-04-24 14:24:51 -05:00
dependabot[bot] 1633cab363
build(deps): bump github.com/shoenig/test from 0.6.3 to 0.6.4 in /api (#16895)
Bumps [github.com/shoenig/test](https://github.com/shoenig/test) from 0.6.3 to 0.6.4.
- [Release notes](https://github.com/shoenig/test/releases)
- [Commits](https://github.com/shoenig/test/compare/v0.6.3...v0.6.4)

---
updated-dependencies:
- dependency-name: github.com/shoenig/test
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-04-24 11:39:37 -05:00
Seth Hoenig e27f95448c
copywrite: exempt example assets from copywrite headers (#16971) 2023-04-24 10:36:11 -05:00
dependabot[bot] 801e4f961c
build(deps): bump github.com/hashicorp/vault/api from 1.8.3 to 1.9.1 (#16966)
Bumps [github.com/hashicorp/vault/api](https://github.com/hashicorp/vault) from 1.8.3 to 1.9.1.
- [Release notes](https://github.com/hashicorp/vault/releases)
- [Changelog](https://github.com/hashicorp/vault/blob/main/CHANGELOG.md)
- [Commits](https://github.com/hashicorp/vault/compare/v1.8.3...v1.9.1)

---
updated-dependencies:
- dependency-name: github.com/hashicorp/vault/api
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-04-24 09:00:54 -05:00
Tim Gross 72cbe53f19
logs: allow disabling log collection in jobspec (#16962)
Some Nomad users ship application logs out-of-band via syslog. For these users
having `logmon` (and `docker_logger`) running is unnecessary overhead. Allow
disabling the logmon and pointing the task's stdout/stderr to /dev/null.

This changeset is the first of several incremental improvements to log
collection short of full-on logging plugins. The next step will likely be to
extend the internal-only task driver configuration so that cluster
administrators can turn off log collection for the entire driver.

---

Fixes: #11175

Co-authored-by: Thomas Weber <towe75@googlemail.com>
2023-04-24 10:00:27 -04:00
valodzka 379497a484
fix host port handling for ipv6 (#16723) 2023-04-20 19:53:20 -07:00
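The commit above has no body, so the exact change is not described here. As general background, a common source of IPv6 host/port bugs in Go is building addresses with plain string formatting instead of `net.JoinHostPort`, which brackets IPv6 literals. The helper names below are hypothetical and the sketch is not taken from the commit.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// naiveAddr joins host and port with a bare colon. This is ambiguous for
// IPv6 literals, which already contain colons.
func naiveAddr(host string, port int) string {
	return fmt.Sprintf("%s:%d", host, port)
}

// joinAddr uses the standard library helper, which wraps IPv6 literals
// in square brackets.
func joinAddr(host string, port int) string {
	return net.JoinHostPort(host, strconv.Itoa(port))
}

func main() {
	for _, host := range []string{"127.0.0.1", "::1"} {
		naive := naiveAddr(host, 8080)
		_, _, err := net.SplitHostPort(naive)
		fmt.Printf("naive=%q splits cleanly=%v\n", naive, err == nil)
		fmt.Printf("joined=%q\n", joinAddr(host, 8080))
	}
}
```

For `::1` the naive form is `::1:8080`, which `net.SplitHostPort` rejects, while the joined form is `[::1]:8080`.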
Etienne Bruines 1e3531b978
cni: fix plugin fingerprinting versions (#16776)
CNI plugins v1.2.0 and above output a second line, containing supported protocol versions.
2023-04-20 18:44:39 -07:00
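One way to make version parsing robust to the extra line described in #16776 is to look only at the first line of the plugin's output. The sketch below assumes the version is the last whitespace-separated token on that line; the sample output strings and the function name `pluginVersion` are illustrative assumptions, and the commit does not state that this is how the fix is implemented.

```go
package main

import (
	"fmt"
	"strings"
)

// pluginVersion extracts the version from plugin version output by taking
// the last whitespace-separated token of the first line only, so that any
// additional lines are ignored. The output shape is an assumption.
func pluginVersion(output string) (string, bool) {
	first, _, _ := strings.Cut(strings.TrimSpace(output), "\n")
	fields := strings.Fields(first)
	if len(fields) == 0 {
		return "", false
	}
	return fields[len(fields)-1], true
}

func main() {
	// Assumed sample outputs: one line for older plugins, two for newer ones.
	older := "CNI bridge plugin v1.1.1\n"
	newer := "CNI bridge plugin v1.2.0\nCNI protocol versions supported: 0.1.0, 0.2.0\n"

	for _, out := range []string{older, newer} {
		v, ok := pluginVersion(out)
		fmt.Println(v, ok)
	}
}
```

A parser that took the last token of the whole output would report `0.2.0` for the second sample rather than `v1.2.0`.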
Luiz Aoqui a1ba068e1f
cli: fix panic on job plan when -diff=false (#16944)
PR #14492 introduced a new check to return 0 when the `nomad job plan`
command returns a diff of type `None`.

But the `-diff` CLI flag was also being used to control whether the plan
request should return the diff or not instead of just controlling if the
diff was printed.

This means that when `-diff=false` is set the response does not include
any diff information, and so the new check panics.

This commit fixes the problem by always requesting a diff and using the
`-diff` only for controlling output, as it's currently documented.
2023-04-20 17:33:29 -07:00
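The shape of the fix in #16944 can be sketched as: always ask for the diff, and let the flag decide only whether it is printed. Everything below (`Diff`, `PlanResponse`, `fakePlan`, `runPlan`, and the exit codes) is a hypothetical simplification for illustration, not the CLI's real code.

```go
package main

import "fmt"

// Diff and PlanResponse are hypothetical, minimal stand-ins for the plan API.
type Diff struct{ Type string }

type PlanResponse struct{ Diff *Diff }

// fakePlan simulates a plan request. Like the behavior described in the
// commit message, it omits the diff entirely unless one is requested.
func fakePlan(includeDiff bool) PlanResponse {
	if !includeDiff {
		return PlanResponse{} // Diff is nil; dereferencing it would panic
	}
	return PlanResponse{Diff: &Diff{Type: "None"}}
}

// runPlan always requests the diff, so the exit-code check can rely on it,
// and uses showDiff only to decide whether anything is printed.
func runPlan(showDiff bool) (exitCode int, output string) {
	resp := fakePlan(true)
	if showDiff {
		output = "diff type: " + resp.Diff.Type
	}
	if resp.Diff.Type == "None" {
		return 0, output
	}
	return 1, output
}

func main() {
	code, out := runPlan(false)
	fmt.Printf("exit=%d output=%q\n", code, out)
	code, out = runPlan(true)
	fmt.Printf("exit=%d output=%q\n", code, out)
}
```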
Tim Gross b5a54b3b5f
docs: fix keyring path in install docs (#16946) 2023-04-20 16:20:39 -04:00
astudentofblake 42c4c8d5ea
fix: added landlock access to /usr/libexec for getter (#16900) 2023-04-20 11:16:04 -05:00
Luiz Aoqui b0fe69fded
docs: add missing field Capabilities to Namespace API (#16931) 2023-04-19 08:14:36 -07:00
claire labry d2beea3435
changelog: add changelog update for vendor label for linux packaging (#16071) 2023-04-19 08:14:14 -07:00
Luiz Aoqui c7387dbd3a
docs: add missing API field JobACL and fix workload identity headers (#16930) 2023-04-19 08:12:58 -07:00
Chris van Meer d2f1766f3a
Updates to the UI block (#16328)
1. On the Consul address, following the recommendation for the HTTPS
   API on port 8501.
2. Add the hint to use HEX values for the colors.
2023-04-18 18:28:17 -07:00
Luiz Aoqui fb588fcbb8
allocrunner: prevent panic on network manager (#16921) 2023-04-18 13:39:13 -07:00
Süleyman Vurucu a5967b1bad
Update metric names (#16894)
Dashboard uses old metric names
2023-04-18 13:25:42 -07:00
James Rasell 367cfa6d93
rpc: use "+" concatenation in hot path RPC rate limit metrics. (#16923) 2023-04-18 13:41:34 +01:00
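As a rough illustration of the trade-off in #16923, the two hypothetical helpers below build the same string. `fmt.Sprintf` has to parse a format string and pass its arguments as interface values, while `+` concatenation avoids both, which is typically cheaper on a frequently executed path. The function names and the dotted key format are assumptions for the example; actual gains should be confirmed with a benchmark.

```go
package main

import "fmt"

// keyWithSprintf builds a label using format parsing.
func keyWithSprintf(endpoint, method string) string {
	return fmt.Sprintf("%s.%s", endpoint, method)
}

// keyWithConcat builds the same label with plain concatenation.
func keyWithConcat(endpoint, method string) string {
	return endpoint + "." + method
}

func main() {
	a := keyWithSprintf("Job", "Register")
	b := keyWithConcat("Job", "Register")
	fmt.Println(a, b, a == b)
}
```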
Luiz Aoqui 37ddd2dd86
ui: fix notification service in token controller (#16918)
Remove unneeded service injection. This service is not being used in
this controller and currently only exists in `main`, causing
`release/1.5.x` to break.
2023-04-17 20:33:50 -04:00
Luiz Aoqui 8285be09e6
test: fix quota command autocomplete (#16917) 2023-04-17 20:08:55 -04:00
Charlie Voiselle 9e8f2a937c
[scheduler] Honor false for distinct hosts constraint (#16907)
* Honor value for distinct_hosts constraint
* Add test for feasibility checking for `false`
---------
Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
2023-04-17 17:43:56 -04:00
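A minimal sketch of honoring an explicit `false` for `distinct_hosts`, as described in #16907. The `Constraint` type and its field names are a simplified stand-in. Treating an empty or unparsable value as "enabled" is an assumption made for this example, not something the commit message states.

```go
package main

import (
	"fmt"
	"strconv"
)

// Constraint is a hypothetical, simplified constraint representation.
type Constraint struct {
	Operand string
	RTarget string // optional value, e.g. "true" or "false"
}

// distinctHostsEnabled reports whether a distinct_hosts constraint should
// be enforced. An explicit "false" disables it. An empty or unparsable
// value is treated as enabled (an assumption for this sketch).
func distinctHostsEnabled(c Constraint) bool {
	if c.Operand != "distinct_hosts" {
		return false
	}
	if c.RTarget == "" {
		return true
	}
	enabled, err := strconv.ParseBool(c.RTarget)
	if err != nil {
		return true
	}
	return enabled
}

func main() {
	fmt.Println(distinctHostsEnabled(Constraint{Operand: "distinct_hosts"}))                   // true
	fmt.Println(distinctHostsEnabled(Constraint{Operand: "distinct_hosts", RTarget: "false"})) // false
}
```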
Ian Fijolek 619f49afcf
hashicorp/go-msgpack v2 (#16810)
* Upgrade from hashicorp/go-msgpack v1.1.5 to v2.1.0

Fixes #16808

* Update hashicorp/net-rpc-msgpackrpc to v2 to match go-msgpack

* deps: use go-msgpack v2.0.0

go-msgpack v2.1.0 includes some code changes that we will need to
investigate further to assess its impact on Nomad, so keeping this
dependency on v2.0.0 for now since it's no-op.

---------

Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
2023-04-17 17:02:05 -04:00
Seth Hoenig 92d4a05534
users: eliminate nobody user memoization (#16904)
This PR eliminates code specific to looking up and caching the uid/gid/user.User
object associated with the nobody user in an init block. This code existed before
adding the generic users cache and was meant to optimize the one search path we
knew would happen often. Now that we have the cache, seems reasonable to eliminate
this init block and use the cache instead like for any other user.

Also fixes a constraint on the podman (and other) drivers, where building without
CGO became problematic on some OSes like Fedora IoT where the nobody user cannot
be found with the pure-Go standard library.

Fixes github.com/hashicorp/nomad-driver-podman/issues/228
2023-04-17 12:30:30 -05:00
Tim Gross ee071531de
docs: disable secret scanning for documentation content (#16903)
Examples in the documentation frequently include tokens, including Vault tokens
which end up triggering GitHub's secret scanner. Remove these from consideration
so that we don't get false positive reports.
2023-04-17 10:03:52 -04:00