Commit Graph

24287 Commits

Author SHA1 Message Date
Seth Hoenig e7c534f9d0
git: ignore .fleet directory (#16144)
New IDE from jetbrains gets its own config directory.
2023-02-13 07:39:30 -06:00
Charlie Voiselle 37a078ebce
Add information about template to interpolation page (#10807)
* Add information about templating using `env` function to refer to environment variables.
2023-02-10 16:12:11 -05:00
Charlie Voiselle 65ce3ec8de
[core] Do not start the plugin loader on non-clients (#16111)
The plugin loader loads task and device driver plugins which are not
used on server nodes.
2023-02-10 15:33:16 -05:00
Phil Renaud 6955d66891
slashed comments and datacenters note (#16116) 2023-02-10 14:10:58 -05:00
Charlie Voiselle 0b9ba1ae92
chore: Convert assets from bindatafs to go embeds (#16066)
* Convert assets from bindatafs to go embeds
* Add command/asset to "uninteresting" list for missing test check
* Remove generate-examples target
* Update paths in tests
2023-02-10 12:02:29 -05:00
Tim Gross 65c7e149d3
eval broker: use write lock when reaping cancelable evals (#16112)
The eval broker's `Cancelable` method used by the cancelable eval reaper mutates
the slice of cancelable evals by removing a batch at a time from the slice. But
this method unsafely uses a read lock despite this mutation. Under normal
workloads this is likely to be safe but when the eval broker is under the heavy
load this feature is intended to fix, we're likely to have a race
condition. Switch this to a write lock, like the other locks that mutate the
eval broker state.

This changeset also adjusts the timeout to allow poorly-sized Actions runners
more time to schedule the appropriate goroutines. The test has also been updated
to use `shoenig/test/wait` so we can have sensible reporting of the results
rather than just a timeout error when things go wrong.
2023-02-10 10:40:41 -05:00
Tim Gross c2bd829fe2
tests: don't mutate global structs in core scheduler tests (#16120)
Some of the core scheduler tests need the maximum batch size for writes to be
smaller than the usual `structs.MaxUUIDsPerWriteRequest`. But they do so by
unsafely modifying the global struct, which creates test flakes in other tests.

Modify the functions under test to take a batch size parameter. Production code
will pass the global while the tests can inject smaller values. Turn the
`structs.MaxUUIDsPerWriteRequest` into a constant, and add a semgrep rule for
avoiding this kind of thing in the future.
2023-02-10 09:26:00 -05:00
Michael Schurter 312dc5d859
Workload Identity, Task API, and Dynamic Node Metadata Docs (#16102)
* docs: add dynamic node metadata api docs

Also update all paths in the client API docs to explicitly state the
`/v1/` prefix. We're inconsistent about that, but I think it's better to
display the full path than to only show the fragment. If we ever do a
`/v2/` whether or not we explicitly state `/v1/` in our docs won't be
our greatest concern.

* docs: add task-api docs
2023-02-09 16:03:43 -08:00
Luiz Aoqui e7546e9e16
ci: reduce number of ember-test-audit iterations (#16118) 2023-02-09 15:28:30 -05:00
Charlie Voiselle 7cfb938723
[chore] Move TestUtil_loadVersionControlGlobalConfigs into build flagged file (#16114) 2023-02-09 14:25:26 -05:00
Luiz Aoqui 85df736049
ui: hide custom label if not text is provided (#16115) 2023-02-09 14:24:27 -05:00
Seth Hoenig bf2f3885a3
tests: reduce timeout on test-api (#16107)
Sometimes GHA gets wedged, and there is no need to waste 30 minutes
on a job that usually takes < 3.

e.g. stuck loading module cache, which is out of our control
https://github.com/hashicorp/nomad/actions/runs/4135098558/jobs/7147128419
2023-02-09 10:26:10 -06:00
Tim Gross 69a2040e82
acl: never return auth errors for `ACL.Bootstrap` RPC (#16108)
In #15901 we introduced pre-forwarding authentication for RPCs so that we can
grab the identity for rate metrics. The `ACL.Bootstrap` RPC is an
unauthenticated endpoint, so any error message from authentication is not
particularly useful. This would be harmless, but if you try to bootstrap with
your `NOMAD_TOKEN` already set (perhaps because you were talking to another
cluster previously from the same shell session), you'll get an authentication
error instead of just having the token be ignored. This is a regression from the
existing behavior, so have this endpoint ignore auth errors the same way we do
for every other unauthenticated endpoint (ex `Status.Peers`)
2023-02-09 10:02:56 -05:00
Tim Gross be3cd7236f
docs: fix links in 1.5.0 upgrade guide (#16106) 2023-02-09 09:39:49 -05:00
Seth Hoenig af28ac1610
users: create cache for user lookups (#16100)
* users: create cache for user lookups

This PR introduces a global cache for OS user lookups. This should
relieve pressure on the OS domain/directory lookups, which would be
queried more now that Task API exists.

Hits are cached for 1 hour, and misses are cached for 1 minute. These
values are fairly arbitrary - we can tweak them if there is any reason to.

Closes #16010

* users: delete expired negative entry from cache
2023-02-09 08:37:50 -06:00
James Rasell 0d37892024
cli: fix use of the sanitized method type for the login command. (#16105)
When an auth method was not supplied and the OIDC type was given
in lower case, the CLI was not matching the default method due to
casing and responded with a confusing user message.

This change fixes the above problem, along with making use of the
santized type easier.
2023-02-09 15:23:54 +01:00
Seth Hoenig 00d08da504
circle: hardcode go1.20 string because yaml (#16104) 2023-02-09 07:41:43 -06:00
James Rasell ac90826bc2
read: fix incorrect link to ref. arch. (#16103) 2023-02-09 11:52:31 +01:00
Seth Hoenig 0e7bf87ee1
deps: upgrade to hashicorp/golang-lru/v2 (#16085) 2023-02-08 15:20:33 -06:00
Seth Hoenig a4519c739d
users: eliminate LookupGroupId and its one use case (#16093)
This PR deletes the user.LookupGroupId function as it was only being used
in a single test case, and its value was not important to the test.
2023-02-08 14:57:09 -06:00
Michael Schurter aa9d035c58
docs: cleanup 1.5 changelog entries (#16094) 2023-02-08 12:10:08 -08:00
Michael Schurter 79aa7e59e2
Merge pull request #16092 from hashicorp/release/1.5.0-beta.1
Release/1.5.0 beta.1
2023-02-08 10:56:13 -08:00
hc-github-team-nomad-core be51807dc0 Prepare for next release 2023-02-08 08:54:40 +00:00
hc-github-team-nomad-core 6dedb795cf Generate files for 1.5.0-beta.1 release 2023-02-08 08:54:36 +00:00
James Rasell 579db85ffd
Prepare release 1.5.0-beta.1 2023-02-08 08:45:05 +00:00
James Rasell 53e0f424e9
rcp: bump SSO feature gate version. (#16080) 2023-02-07 15:45:07 -08:00
Michael Schurter 35d65c7c7e
Dynamic Node Metadata (#15844)
Fixes #14617
Dynamic Node Metadata allows Nomad users, and their jobs, to update Node metadata through an API. Currently Node metadata is only reloaded when a Client agent is restarted.

Includes new UI for editing metadata as well.

---------

Co-authored-by: Phil Renaud <phil.renaud@hashicorp.com>
2023-02-07 14:42:25 -08:00
Tim Gross bdfe11b396
docs: update example `license_path` (#16082)
In #13374 we updated the commented-out `license_path` in the packaged example
configuration file to match the existing documentation. Although this config
value was commented-out, it was reported that changing the value was
confusing. Update the commented-out line to the previous value and update the
documented examples to match that. This matches most of the examples for
Consul/Vault licensing as well. I've double-checked the tutorials and it looks
like it'd been left on the previous value there, so no additional work to be
done.
2023-02-07 16:28:51 -05:00
Charlie Voiselle 31a289891d
Add sprig for command templates (#9053)
Adds the sprig functions to the template funcmap prepended with `sprig_` to match the behavior in consul-template
2023-02-07 14:07:20 -05:00
James Rasell 8cc212167b
agent: fix agent HTTP server audit event implementation access. (#16076) 2023-02-07 17:20:11 +01:00
Dao Thanh Tung 54dc2f629a
doc: specifiy the default output is of json format for `nomad quota inspect` command (#15984)
Signed-off-by: dttung2905 <ttdao.2015@accountancy.smu.edu.sg>
2023-02-07 16:34:05 +01:00
Seth Hoenig 590ae08752
main: remove deprecated uses of rand.Seed (#16074)
* main: remove deprecated uses of rand.Seed

go1.20 deprecates rand.Seed, and seeds the rand package
automatically. Remove cases where we seed the random package,
and cleanup the one case where we intentionally create a
known random source.

* cl: update cl

* mod: update go mod
2023-02-07 09:19:38 -06:00
Tim Gross 8a7d6b0cde
cli: remove deprecated `keyring` and `keygen` commands (#16068)
These command were marked as deprecated in 1.4.0 with intent to remove in
1.5.0. Remove them and clean up the docs.
2023-02-07 09:49:52 -05:00
Luiz Aoqui d6bb417795
docs: update default Nomad bridge config (#16072) 2023-02-07 09:47:41 -05:00
Luiz Aoqui 2cf2b5831a
docs: update CLI contrib checklist (#16073) 2023-02-07 09:46:27 -05:00
Seth Hoenig c923bc59b1
e2e: mark framework package as deprecated (#16075)
Nothing more motivating than lots of deprecation warnings
to get some code refactored.
2023-02-07 08:10:40 -06:00
Dao Thanh Tung ae720fe28d
Add `-json` and `-t` flag for `nomad acl token create` command (#16055)
Signed-off-by: dttung2905 <ttdao.2015@accountancy.smu.edu.sg>
2023-02-07 12:05:41 +01:00
Bryce Kalow 45cf63d88f
website: add content-check command (#16064) 2023-02-06 16:37:50 -06:00
Seth Hoenig 68894bdc62
docker: disable driver when running as non-root on cgroups v2 hosts (#16063)
* docker: disable driver when running as non-root on cgroups v2 hosts

This PR modifies the docker driver to behave like exec when being run
as a non-root user on a host machine with cgroups v2 enabled. Because
of how cpu resources are managed by the Nomad client, the nomad agent
must be run as root to manage docker-created cgroups.

* cl: update cl
2023-02-06 14:09:19 -06:00
Michael Schurter 0a496c845e
Task API via Unix Domain Socket (#15864)
This change introduces the Task API: a portable way for tasks to access Nomad's HTTP API. This particular implementation uses a Unix Domain Socket and, unlike the agent's HTTP API, always requires authentication even if ACLs are disabled.

This PR contains the core feature and tests but followup work is required for the following TODO items:

- Docs - might do in a followup since dynamic node metadata / task api / workload id all need to interlink
- Unit tests for auth middleware
- Caching for auth middleware
- Rate limiting on negative lookups for auth middleware

---------

Co-authored-by: Seth Hoenig <shoenig@duck.com>
2023-02-06 11:31:22 -08:00
Seth Hoenig 21895fb6f0
build: update hc-install and golangci-lint tools (#16067)
Update golangci-lint again, and also hc-install which now has a real version.
2023-02-06 12:54:07 -06:00
Seth Hoenig ad28708511
build: update hclogvet (#16065) 2023-02-06 11:04:38 -06:00
Jai f23613d234
ui: conditionally render buttons on job model (#16062) 2023-02-06 11:17:12 -05:00
Tim Gross 811fe333da
scheduler: move utils into files specific to their scheduler type (#16051)
Many of the functions in the `utils.go` file are specific to a particular
scheduler, and very few of them have guards (or even names) that help avoid
misuse with features specific to a given scheduler type. Move these
functions (and their tests) into files specific to their scheduler type without
any functionality changes to make it clear which bits go with what.
2023-02-03 12:29:39 -05:00
Seth Hoenig 911700ffea
build: update to go1.20 (#16029)
* build: update to go1.20

* build: use stringy go1.20 in circle yaml

* tests: handle new x509 certificate error structure in go1.20

* cl: add cl entry
2023-02-03 08:14:53 -06:00
Bryce Kalow d2a9fbf03d
docs: fix outstanding content conformance errors (#16040) 2023-02-02 15:40:07 -06:00
Phil Renaud 9208b8ad46
Footer styles re-applied to job template footer (#16037) 2023-02-02 16:31:08 -05:00
Phil Renaud d3c351d2d2
Label for the Web UI (#16006)
* Demoable state

* Demo mirage color

* Label as a block with foreground and background colours

* Test mock updates

* Go test updated

* Documentation update for label support
2023-02-02 16:29:04 -05:00
Tim Gross 19a2c065f4
System and sysbatch jobs always have zero index (#16030)
Service jobs should have unique allocation Names, derived from the
Job.ID. System jobs do not have unique allocation Names because the index is
intended to indicated the instance out of a desired count size. Because system
jobs do not have an explicit count but the results are based on the targeted
nodes, the index is less informative and this was intentionally omitted from the
original design.

Update docs to make it clear that NOMAD_ALLOC_INDEX is always zero for 
system/sysbatch jobs

Validate that `volume.per_alloc` is incompatible with system/sysbatch jobs.
System and sysbatch jobs always have a `NOMAD_ALLOC_INDEX` of 0. So
interpolation via `per_alloc` will not work as soon as there's more than one
allocation placed. Validate against this on job submission.
2023-02-02 16:18:01 -05:00
Daniel Bennett 335f0a5371
docs: how to troubleshoot consul connect envoy (#15908)
* largely a doc-ification of this commit message:
  d47678074bf8ae9ff2da3c91d0729bf03aee8446
  this doesn't spell out all the possible failure modes,
  but should be a good starting point for folks.

* connect: add doc link to envoy bootstrap error

* add Unwrap() to RecoverableError
  mainly for easier testing
2023-02-02 14:20:26 -06:00