23759 commits

Author SHA1 Message Date
Seth Hoenig 14a7872150
Merge pull request #14385 from hashicorp/f-cg-use-kill
cgroups: refactor v2 kill path to use cgroups.kill interface file
2022-08-30 09:02:02 -05:00
Seth Hoenig 52de2dc09d
Merge pull request #14290 from hashicorp/cleanup-more-helper-cleanup
cleanup: tidy up helper package some more
2022-08-30 08:19:48 -05:00
quoing ce7a3745d5
docs: template change script example correction (#14368)
"path" parameter doesn't work, should be command
2022-08-30 12:09:55 +02:00
James Rasell 1ed17ada46
Merge pull request #14320 from hashicorp/f-gh-13120-sso-umbrella-merged-main
acl: add token expiration and ACL role functionality
2022-08-30 10:42:20 +02:00
James Rasell 755b4745ed
Merge branch 'main' into f-gh-13120-sso-umbrella-merged-main 2022-08-30 08:59:13 +01:00
Phil Renaud 08ad4d0b42
Merge pull request #14383 from hashicorp/variables-rename-ui
[ui] Variables Rename
2022-08-29 16:03:38 -04:00
Seth Hoenig 5d5c8af930 cgroups: refactor v2 kill path to use cgroups.kill interface file
This PR refactors the cgroups v2 group kill code path to use the
cgroups.kill interface file for destroying the cgroup. Previously
we copied the freeze + sigkill + unfreeze pattern from the v1 code,
but v2 provides a more efficient and more race-free way to handle
this.

Closes #14371
2022-08-29 14:55:13 -05:00
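As a rough illustration of the approach described in the commit above, the sketch below writes `1` to the kill interface file of a cgroup v2 directory. The kernel documentation names this file `cgroup.kill` (Linux 5.14 and later). The function names and the example path are assumptions for illustration, not Nomad's actual code.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// killFile returns the path of the cgroup v2 kill interface file
// inside the given cgroup directory.
func killFile(cgroupDir string) string {
	return filepath.Join(cgroupDir, "cgroup.kill")
}

// killCgroup asks the kernel to SIGKILL every process in the cgroup by
// writing "1" to its kill file. The file is opened without O_CREATE so
// that a missing interface file surfaces as an error.
func killCgroup(cgroupDir string) error {
	f, err := os.OpenFile(killFile(cgroupDir), os.O_WRONLY, 0)
	if err != nil {
		return fmt.Errorf("open kill file: %w", err)
	}
	defer f.Close()
	if _, err := f.WriteString("1"); err != nil {
		return fmt.Errorf("write kill file: %w", err)
	}
	return nil
}

func main() {
	// Hypothetical cgroup path, used only for illustration.
	if err := killCgroup("/sys/fs/cgroup/example.scope"); err != nil {
		fmt.Println("kill failed:", err)
	}
}
```

Compared with the freeze, SIGKILL, unfreeze sequence, a single write leaves no window in which a process could fork between steps, which is presumably the race the commit message refers to.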
Luiz Aoqui cacfb8e380
Revert "refact: conditional checks for token secret before fetch (#14134)" (#14381)
This reverts commit b9fec224bac4c0d6b3a2d699367913526fe1aa31.
2022-08-29 15:20:26 -04:00
Seth Hoenig 3e1e2001b9
Merge pull request #14143 from hashicorp/cleanup-slice-sets-3
cleanup: more cleanup of slices that are really sets
2022-08-29 13:52:59 -05:00
Phil Renaud eedee3e53d Mostly user-facing renames and a few test renames 2022-08-29 14:45:49 -04:00
Phil Renaud 2b4996e070 Policy block name changed 2022-08-29 14:41:58 -04:00
Phil Renaud e9a185d634 Variable form and css classes 2022-08-29 14:35:43 -04:00
Phil Renaud 4f457fcc14 ability rename 2022-08-29 14:31:16 -04:00
Tim Gross 7d1eb2efd5
keyring: split structs to its own file (#14378) 2022-08-29 14:18:35 -04:00
Seth Hoenig 9d0e274f27 cleanup: cleanup more slice-set comparisons 2022-08-29 12:04:21 -05:00
Tim Gross d7652fdd3a
docs: rename Secure Variables to Variables (#14352) 2022-08-29 11:37:08 -04:00
Tim Gross 62a968f443
Merge pull request #14351 from hashicorp/variables-rename
Variables rename
2022-08-29 11:36:50 -04:00
Piotr Kazmierczak 5f353503e5
bugfix: fixed template validation panic in case of incorrect ChangeScript configuration (#14374)
Fixes #14367
2022-08-29 17:11:15 +02:00
Michael Schurter dbffe22465
consul: allow stale namespace results (#12953)
Nomad reconciles services it expects to be registered in Consul with
what is actually registered in the local Consul agent. This is necessary
to prevent leaking service registrations if Nomad crashes at certain
points (or if there are bugs).

When Consul has namespaces enabled, we must iterate over each available
namespace to be sure no services were leaked into non-default
namespaces.

Since this reconciliation happens often, there's no need to require
results from the Consul leader server. In large clusters this creates
far more load than the "freshness" of the response is worth.

Therefore this patch switches the request to AllowStale=true
2022-08-26 16:05:12 -07:00
Tim Gross 1dc053b917 rename SecureVariables to Variables throughout 2022-08-26 16:06:24 -04:00
Tim Gross dcfd31296b file rename 2022-08-26 16:06:24 -04:00
Jai e427291d65
service-health-bar (#14295)
* ui: add service-status-bar

* test: service-status-bar
2022-08-26 12:04:59 -04:00
Vladimir Sokolov b646810401
cli: force periodic job if its id equals search prefix 2022-08-26 10:54:37 -04:00
Seth Hoenig b87689d2d1
Merge pull request #14318 from hashicorp/cleanup-create-pointer-compare
cleanup: create pointer.Compare helper function
2022-08-26 09:15:41 -05:00
Luiz Aoqui ad84b22a72
Post 1.3.4 release (#14329)
* Generate files for 1.3.4 release

* Prepare for next release

* Update CHANGELOG.md

Co-authored-by: hc-github-team-nomad-core <github-team-nomad-core@hashicorp.com>
2022-08-26 10:09:13 -04:00
dependabot[bot] 451194397f
build(deps): bump github.com/hashicorp/go-memdb from 1.3.2 to 1.3.3 (#14206)
Bumps [github.com/hashicorp/go-memdb](https://github.com/hashicorp/go-memdb) from 1.3.2 to 1.3.3.
- [Release notes](https://github.com/hashicorp/go-memdb/releases)
- [Changelog](https://github.com/hashicorp/go-memdb/blob/main/changes.go)
- [Commits](https://github.com/hashicorp/go-memdb/compare/v1.3.2...v1.3.3)

---
updated-dependencies:
- dependency-name: github.com/hashicorp/go-memdb
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-08-26 10:07:41 -04:00
Seth Hoenig 6b2655ad86 cleanup: create pointer.Compare helper function
This PR creates a pointer.Compare helper for comparing equality of
two pointers. Strictly only works with primitive types we know are
safe to dereference and compare using '=='.
2022-08-26 08:55:59 -05:00
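A helper like the one described in the commit above might look roughly like the following sketch. The name `Eq`, the `Primitive` constraint, and the exact set of permitted types are assumptions; the real `pointer.Compare` signature is not shown in this log.

```go
package main

import "fmt"

// Primitive lists types that are safe to dereference and compare
// with ==. The exact set is an assumption for this sketch.
type Primitive interface {
	~bool | ~string | ~int | ~int64 | ~uint64 | ~float64
}

// Eq reports whether a and b point at equal values. Two nil pointers
// are equal; a nil and a non-nil pointer are not.
func Eq[T Primitive](a, b *T) bool {
	if a == nil || b == nil {
		return a == b
	}
	return *a == *b
}

func main() {
	x, y := 4, 4
	var none *int
	fmt.Println(Eq(&x, &y), Eq(&x, none), Eq(none, none))
}
```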
dependabot[bot] 42792c4813
build(deps): bump github.com/hashicorp/go-hclog from 1.2.0 to 1.2.2 (#14208)
Bumps [github.com/hashicorp/go-hclog](https://github.com/hashicorp/go-hclog) from 1.2.0 to 1.2.2.
- [Release notes](https://github.com/hashicorp/go-hclog/releases)
- [Commits](https://github.com/hashicorp/go-hclog/compare/v1.2.0...v1.2.2)

---
updated-dependencies:
- dependency-name: github.com/hashicorp/go-hclog
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-08-26 09:31:54 -04:00
dependabot[bot] 1eb34c1099
build(deps): bump github.com/aws/aws-sdk-go from 1.42.27 to 1.44.84 (#14326)
Bumps [github.com/aws/aws-sdk-go](https://github.com/aws/aws-sdk-go) from 1.42.27 to 1.44.84.
- [Release notes](https://github.com/aws/aws-sdk-go/releases)
- [Changelog](https://github.com/aws/aws-sdk-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-go/compare/v1.42.27...v1.44.84)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-08-26 09:13:37 -04:00
Charlie Voiselle ad737d008b
SV API: return upserted variable to caller (#14325)
* Return created variable to caller in HTTP and Go APIs
* Update tests for returned values
2022-08-25 17:38:15 -04:00
dependabot[bot] 6d3389653b
build(deps): bump github.com/shirou/gopsutil/v3 from 3.21.12 to 3.22.7 (#14209)
* build(deps): bump github.com/shirou/gopsutil/v3 from 3.21.12 to 3.22.7

Bumps [github.com/shirou/gopsutil/v3](https://github.com/shirou/gopsutil) from 3.21.12 to 3.22.7.
- [Release notes](https://github.com/shirou/gopsutil/releases)
- [Commits](https://github.com/shirou/gopsutil/compare/v3.21.12...v3.22.7)

---
updated-dependencies:
- dependency-name: github.com/shirou/gopsutil/v3
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* changelog entry

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Tim Gross <tgross@hashicorp.com>
2022-08-25 14:15:41 -04:00
Seth Hoenig 38ad855ae7
Merge pull request #14230 from hashicorp/b-fix-cpuset-init
client: refactor cpuset manager initialization
2022-08-25 11:19:39 -05:00
Seth Hoenig 51384dd63f client: refactor cpuset manager initialization
This PR refactors the code path in Client startup for setting up the cpuset
cgroup manager (non-linux systems not affected).

Before, there was a logic bug where we would try to read the cpuset.cpus.effective
cgroup interface file before ensuring nomad's parent cgroup existed. Therefore that
file would not exist, and the list of usable cpus would be empty. Tasks started
thereafter would not have a value set for their cpuset.cpus.

The refactoring also fixes some less-than-ideal coding style: we now bootstrap
each cpuset manager type (v1/v2) within its own constructor. If something goes
awry during bootstrap (e.g. cgroups not enabled), the constructor returns the
noop implementation and logs a warning.

Fixes #14229
2022-08-25 11:18:43 -05:00
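The constructor pattern described in the commit above can be sketched as follows: the parent cgroup is created before `cpuset.cpus.effective` is read, and any bootstrap failure yields a no-op manager plus a logged warning. All type and function names here are hypothetical simplifications, not Nomad's actual types.

```go
package main

import (
	"fmt"
	"log"
	"os"
	"path/filepath"
	"strings"
)

// cpusetManager is a hypothetical, minimal manager interface.
type cpusetManager interface {
	Cpus() string
}

// noopManager is returned when bootstrap fails (e.g. cgroups not enabled).
type noopManager struct{}

func (noopManager) Cpus() string { return "" }

// v2Manager holds the usable cpus read during bootstrap.
type v2Manager struct{ cpus string }

func (m v2Manager) Cpus() string { return m.cpus }

// newV2Manager bootstraps inside the constructor: it ensures the parent
// cgroup exists first, and only then reads the effective cpuset.
func newV2Manager(parent string) cpusetManager {
	if err := os.MkdirAll(parent, 0o755); err != nil {
		log.Printf("warning: cpuset manager disabled: %v", err)
		return noopManager{}
	}
	raw, err := os.ReadFile(filepath.Join(parent, "cpuset.cpus.effective"))
	if err != nil {
		log.Printf("warning: cpuset manager disabled: %v", err)
		return noopManager{}
	}
	return v2Manager{cpus: strings.TrimSpace(string(raw))}
}

func main() {
	// Hypothetical parent directory; outside a real cgroup v2 mount the
	// interface file is absent and the no-op manager is returned.
	m := newV2Manager(filepath.Join(os.TempDir(), "example-nomad.slice"))
	fmt.Printf("%T %q\n", m, m.Cpus())
}
```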
Seth Hoenig fd9744b9eb
Merge pull request #14301 from hashicorp/b-fix-check-status-test-racey
testing: fix flakey check status test
2022-08-25 08:30:46 -05:00
James Rasell 601588df6b
Merge branch 'main' into f-gh-13120-sso-umbrella-merged-main 2022-08-25 12:14:29 +01:00
James Rasell 2293dcf35f
Merge pull request #14291 from hashicorp/f-gh-13120-sso-various-small-fixes
acl: three small fixes for CLI and state consistency
2022-08-25 12:05:32 +02:00
James Rasell 7a0798663d
acl: fix a bug where roles could be duplicated by name.
An ACL role's name must be unique; however, a bug meant multiple
roles of the same name could be created. This fixes that problem
with checks in the RPC handler and state store.
2022-08-25 09:20:43 +01:00
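A uniqueness check of the kind described in the commit above might be sketched like this, using an in-memory store with a secondary index on the role name. The `roleStore` type and its methods are assumptions for illustration and do not reflect Nomad's state store.

```go
package main

import "fmt"

// aclRole is a hypothetical, trimmed-down role.
type aclRole struct {
	ID   string
	Name string
}

// roleStore keeps roles by ID plus a name index used to enforce
// uniqueness of role names.
type roleStore struct {
	byID   map[string]aclRole
	byName map[string]string // role name -> role ID
}

func newRoleStore() *roleStore {
	return &roleStore{byID: map[string]aclRole{}, byName: map[string]string{}}
}

// upsert rejects a role whose name is already held by a different ID.
func (s *roleStore) upsert(r aclRole) error {
	if id, ok := s.byName[r.Name]; ok && id != r.ID {
		return fmt.Errorf("role with name %q already exists", r.Name)
	}
	// On rename, drop the stale name index entry.
	if old, ok := s.byID[r.ID]; ok && old.Name != r.Name {
		delete(s.byName, old.Name)
	}
	s.byID[r.ID] = r
	s.byName[r.Name] = r.ID
	return nil
}

func main() {
	s := newRoleStore()
	fmt.Println(s.upsert(aclRole{ID: "1", Name: "ops"}))
	fmt.Println(s.upsert(aclRole{ID: "2", Name: "ops"}))
}
```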
Luiz Aoqui 31ab7964bd
ui: task lifecycle restart all tasks (#14223)
Now that tasks that have finished running can be restarted, the UI needs
to use the actual task state to determine which CSS class to use when
rendering the task lifecycle chart element.
2022-08-24 18:43:44 -04:00
Luiz Aoqui e012d9411e
Task lifecycle restart (#14127)
* allocrunner: handle lifecycle when all tasks die

When all tasks die the Coordinator must transition to its terminal
state, coordinatorStatePoststop, to unblock poststop tasks. Since this
could happen at any time (for example, a prestart task dies), all states
must be able to transition to this terminal state.

* allocrunner: implement different alloc restarts

Add a new alloc restart mode where all tasks are restarted, even if they
have already exited. Also unifies the alloc restart logic to use the
implementation that restarts tasks concurrently and ignores
ErrTaskNotRunning errors since those are expected when restarting the
allocation.

* allocrunner: allow tasks to run again

Prevent the task runner Run() method from exiting to allow a dead task
to run again. When the task runner is signaled to restart, the function
will jump back to the MAIN loop and run it again.

The task runner determines if a task needs to run again based on two new
task events that were added to differentiate between a request to
restart a specific task, the tasks that are currently running, or all
tasks that have already run.

* api/cli: add support for all tasks alloc restart

Implement the new -all-tasks alloc restart CLI flag and its API
counterpart, AllTasks. The client endpoint calls the appropriate restart
method from the allocrunner depending on the restart parameters used.

* test: fix tasklifecycle Coordinator test

* allocrunner: kill taskrunners if all tasks are dead

When all non-poststop tasks are dead we need to kill the taskrunners so
we don't leak their goroutines, which are blocked in the alloc restart
loop. This also ensures the allocrunner exits on its own.

* taskrunner: fix tests that waited on WaitCh

Now that "dead" tasks may run again, the taskrunner Run() method will
not return when the task finishes running, so tests must wait for the
task state to be "dead" instead of using the WaitCh, since it won't be
closed until the taskrunner is killed.

* tests: add tests for all tasks alloc restart

* changelog: add entry for #14127

* taskrunner: fix restore logic.

The first implementation of the task runner restore process relied on
server data (`tr.Alloc().TerminalStatus()`) which may not be available
to the client at the time of restore.

It also had the incorrect code path. When restoring a dead task the
driver handle always needs to be cleared cleanly using `clearDriverHandle`
otherwise, after exiting the MAIN loop, the task may be killed by
`tr.handleKill`.

The fix is to store the state of the Run() loop in the task runner local
client state: if the task runner ever exits this loop cleanly (not with
a shutdown) it will never be able to run again. So if the Run() loop
starts with this local state flag set, it must exit early.

This local state flag is also being checked on task restart requests. If
the task is "dead" and its Run() loop is not active it will never be
able to run again.

* address code review requests

* apply more code review changes

* taskrunner: add different Restart modes

Using the task event to differentiate between the allocrunner restart
methods proved confusing for developers trying to understand how it all
worked.

So instead of relying on the event type, this commit separated the logic
of restarting a taskRunner into two methods:
- `Restart` will retain the current behaviour and will only restart
  the task if it's currently running.
- `ForceRestart` is the new method where a `dead` task is allowed to
  restart if its `Run()` method is still active. Callers will need to
  restart the allocRunner taskCoordinator to make sure it will allow the
  task to run again.

* minor fixes
2022-08-24 17:43:07 -04:00
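The split between `Restart` and `ForceRestart` described in the commit above can be illustrated with a heavily simplified sketch. The struct fields and the error variable below are hypothetical; the real task runner tracks considerably more state.

```go
package main

import (
	"errors"
	"fmt"
)

// errTaskNotRunning stands in for the ErrTaskNotRunning error named above.
var errTaskNotRunning = errors.New("task not running")

// taskRunner is a hypothetical model with just enough state to show
// the difference between the two restart modes.
type taskRunner struct {
	running       bool // the task process is currently running
	runLoopActive bool // the Run() loop has not exited
	restarts      int
}

// Restart keeps the original behaviour: only a running task restarts.
func (tr *taskRunner) Restart() error {
	if !tr.running {
		return errTaskNotRunning
	}
	tr.restarts++
	return nil
}

// ForceRestart also allows a dead task to restart, provided its Run()
// loop is still active.
func (tr *taskRunner) ForceRestart() error {
	if !tr.running && !tr.runLoopActive {
		return errTaskNotRunning
	}
	tr.running = true
	tr.restarts++
	return nil
}

func main() {
	dead := &taskRunner{running: false, runLoopActive: true}
	fmt.Println(dead.Restart(), dead.ForceRestart())
}
```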
Tim Gross c732b215f0
vault: detect namespace change in config reload (#14298)
The `namespace` field was not included in the equality check between old and new
Vault configurations, which meant that a Vault config change that only changed
the namespace would not be detected as a change and the clients would not be
reloaded.

Also, the comparison for boolean fields such as `enabled` and
`allow_unauthenticated` was on the pointer and not the value of that pointer,
which results in spurious reloads during real config reloads, a problem that is
easily missed in typical test scenarios.

Includes a minor refactor of the order of fields for `Copy` and `Merge` to match
the struct fields in hopes it makes it harder to make this mistake in the
future, as well as additional test coverage.
2022-08-24 17:03:29 -04:00
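Both problems described in the commit above can be reproduced in a small sketch: comparing `*bool` fields by pointer reports a difference between two configs holding equal values, and omitting `Namespace` hides a real change. The `vaultConfig` struct is a hypothetical two-field stand-in for the real configuration type.

```go
package main

import "fmt"

// vaultConfig is a hypothetical, trimmed-down config struct.
type vaultConfig struct {
	Enabled   *bool
	Namespace string
}

// equalBuggy compares the pointer itself and forgets Namespace.
func equalBuggy(a, b vaultConfig) bool {
	return a.Enabled == b.Enabled
}

// boolEq compares the values behind two *bool, treating nil == nil.
func boolEq(a, b *bool) bool {
	if a == nil || b == nil {
		return a == b
	}
	return *a == *b
}

// equalFixed compares pointed-to values and includes Namespace.
func equalFixed(a, b vaultConfig) bool {
	return boolEq(a.Enabled, b.Enabled) && a.Namespace == b.Namespace
}

func main() {
	t1, t2 := true, true
	a := vaultConfig{Enabled: &t1, Namespace: "ns1"}
	b := vaultConfig{Enabled: &t2, Namespace: "ns1"}
	c := vaultConfig{Enabled: &t1, Namespace: "ns2"}
	fmt.Println(equalBuggy(a, b), equalFixed(a, b)) // same values, distinct pointers
	fmt.Println(equalBuggy(a, c), equalFixed(a, c)) // only the namespace differs
}
```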
Seth Hoenig 6398548ebd
Merge pull request #14283 from hashicorp/f-java-corretto-test-case
drivers/java: add parsing test case for corretto 17
2022-08-24 15:28:20 -05:00
Seth Hoenig 5e18c7b5b2
Merge pull request #14297 from hashicorp/b-logmon-fork-mystery-bin
client/logmon: acquire executable in init block
2022-08-24 15:25:09 -05:00
Seth Hoenig ff59b90d41 testing: fix flakey check status test
This PR fixes a flakey test where we did not wait on the check
status to actually become failing (go too fast and you just get
a pending check).

Instead add a helper for waiting on any check in the alloc to become
the state we are looking for.
2022-08-24 15:11:41 -05:00
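The waiting helper described in the commit above follows a common polling pattern, sketched below. `waitFor` and its parameters are assumptions; the actual test helper in the repository is not shown in this log.

```go
package main

import (
	"fmt"
	"time"
)

// waitFor polls cond every interval until it returns true or the
// timeout elapses. It reports whether cond became true in time.
func waitFor(timeout, interval time.Duration, cond func() bool) bool {
	deadline := time.Now().Add(timeout)
	for {
		if cond() {
			return true
		}
		if !time.Now().Before(deadline) {
			return false
		}
		time.Sleep(interval)
	}
}

func main() {
	// Simulated check status that turns "failing" on the third poll.
	polls := 0
	status := func() string {
		polls++
		if polls < 3 {
			return "pending"
		}
		return "failing"
	}
	ok := waitFor(time.Second, 10*time.Millisecond, func() bool {
		return status() == "failing"
	})
	fmt.Println("reached failing state:", ok)
}
```

Polling for the target state, rather than asserting immediately, avoids observing the intermediate pending state that made the original test flaky.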
Seth Hoenig 062c817450 cleanup: move fs helpers into escapingfs 2022-08-24 14:45:34 -05:00
Seth Hoenig 423ea1a5c4 client/logmon: acquire executable in init block
This PR causes the logmon task runner to acquire the binary of the
Nomad executable in an 'init' block, so as to almost certainly get
the name while the nomad file still exists.

This is an attempt at fixing the case where a deleted Nomad file
(e.g. during upgrade) may be getting renamed with a mysterious
suffix first.

If this doesn't work, as a last resort we can literally just trim
the mystery string.

Fixes: #14079
2022-08-24 13:17:20 -05:00
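Resolving the executable path once at program start, as the commit above describes, can be sketched with the standard library's `os.Executable`. The variable names are assumptions, and this is not the logmon code itself.

```go
package main

import (
	"fmt"
	"os"
)

// nomadExe and nomadExeErr are hypothetical package-level variables
// holding the result of the one-time lookup.
var (
	nomadExe    string
	nomadExeErr error
)

// init runs before main, so the path is captured as early as possible,
// while the file on disk is most likely still in place.
func init() {
	nomadExe, nomadExeErr = os.Executable()
}

func main() {
	if nomadExeErr != nil {
		fmt.Println("could not resolve executable:", nomadExeErr)
		return
	}
	// Later code reuses the cached path instead of asking again.
	fmt.Println("executable:", nomadExe)
}
```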
Piotr Kazmierczak 7077d1f9aa
template: custom change_mode scripts (#13972)
This PR adds the functionality of allowing custom scripts to be executed on template change. Resolves #2707
2022-08-24 17:43:01 +02:00
Luiz Aoqui 848f2dcc22
changelog: update #14212 to breaking-change (#14292) 2022-08-24 11:36:53 -04:00
Seth Hoenig bff6c88683 cleanup: remove more copies of min/max from helper 2022-08-24 09:56:15 -05:00
Luiz Aoqui ea1802ffa0
deps: sync versions of go-discover in go.mod (#14269)
In #13491 the version of `go-discover` was updated in `go.mod` but the
comment above it mentions that it also needs to be updated in the
`replace` directive.
2022-08-24 10:32:13 -04:00
Piotr Kazmierczak 077b6e7098
docs: Update upgrade guide to reflect enterprise changes introduced in nomad-enterprise (#14212)
This PR documents a change made in the enterprise version of nomad that addresses the following issue:

When a user tries to filter audit logs, they do so with a stanza that looks like the following:

audit {
  enabled = true

  filter "remove deletes" {
    type = "HTTPEvent"
    endpoints  = ["*"]
    stages = ["OperationComplete"]
    operations = ["DELETE"]
  }
}

When specifying both an "endpoint" and a "stage", events with both a matching "endpoint" AND a matching "stage" will be filtered.

When specifying both an "endpoint" and an "operation", events with both a matching "endpoint" AND a matching "operation" will be filtered.

When specifying both a "stage" and an "operation", events with a matching "stage" OR a matching "operation" will be filtered.

The "OR" logic with stages and operations is unexpected and doesn't allow customers to be specific about which events they want to filter. For instance, the following use case is impossible to achieve: "I want to filter out all OperationReceived events that have the DELETE verb".
2022-08-24 16:31:49 +02:00
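The difference between the two matching semantics discussed in the commit above can be made concrete with a small sketch. It is one reading of the description, not the enterprise implementation: the types, the `*` wildcard handling, and the assumption that the changed behaviour combines all three fields with AND are labeled here as hypothetical.

```go
package main

import "fmt"

// event and filter are hypothetical stand-ins for an audit event and
// an audit filter block.
type event struct{ Endpoint, Stage, Operation string }
type filter struct{ Endpoints, Stages, Operations []string }

// matches reports whether v equals any pattern; "*" matches everything.
func matches(patterns []string, v string) bool {
	for _, p := range patterns {
		if p == "*" || p == v {
			return true
		}
	}
	return false
}

// filteredOR models the behaviour described as unexpected: stage and
// operation are combined with OR.
func filteredOR(f filter, e event) bool {
	return matches(f.Endpoints, e.Endpoint) &&
		(matches(f.Stages, e.Stage) || matches(f.Operations, e.Operation))
}

// filteredAND models the assumed alternative: every field must match.
func filteredAND(f filter, e event) bool {
	return matches(f.Endpoints, e.Endpoint) &&
		matches(f.Stages, e.Stage) &&
		matches(f.Operations, e.Operation)
}

func main() {
	f := filter{
		Endpoints:  []string{"*"},
		Stages:     []string{"OperationComplete"},
		Operations: []string{"DELETE"},
	}
	e := event{Endpoint: "/v1/jobs", Stage: "OperationReceived", Operation: "DELETE"}
	fmt.Println("OR filters it:", filteredOR(f, e))
	fmt.Println("AND filters it:", filteredAND(f, e))
}
```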