Commit graph

4286 commits

Author SHA1 Message Date
Florian Apolloner f66d61e17f
consul: Removed unused ConsulUsage.Kinds. (#11303) 2022-09-22 10:07:14 -05:00
Jorge Marey 584ddfe859
Add Namespace, Job and Group to envoy stats (#14311) 2022-09-22 10:38:21 -04:00
Seth Hoenig 2088ca3345
cleanup more helper updates (#14638)
* cleanup: refactor MapStringStringSliceValueSet to be cleaner

* cleanup: replace SliceStringToSet with actual set

* cleanup: replace SliceStringSubset with real set

* cleanup: replace SliceStringContains with slices.Contains

* cleanup: remove unused function SliceStringHasPrefix

* cleanup: fixup StringHasPrefixInSlice doc string

* cleanup: refactor SliceSetDisjoint to use real set

* cleanup: replace CompareSliceSetString with SliceSetEq

* cleanup: replace CompareMapStringString with maps.Equal

* cleanup: replace CopyMapStringString with CopyMap

* cleanup: replace CopyMapStringInterface with CopyMap

* cleanup: fixup more CopyMapStringString and CopyMapStringInt

* cleanup: replace CopySliceString with slices.Clone

* cleanup: remove unused CopySliceInt

* cleanup: refactor CopyMapStringSliceString to be generic as CopyMapOfSlice

* cleanup: replace CopyMap with maps.Clone

* cleanup: run go mod tidy
2022-09-21 14:53:25 -05:00
Michael Schurter bd4b4b8f66
Data race fixes in tests and a new semgrep rule (#14594)
* test: don't use loop vars in goroutines

fixes a data race in the test

* test: copy objects in statestore before mutating

fixes data race in test

* test: @lgfa29's segmgrep rule for loops/goroutines

Found 2 places where we were improperly using loop variables inside
goroutines.
2022-09-15 10:35:08 -07:00
Tim Gross 89dfdef95d
variables: handler should catch errors before conflicts (#14591) 2022-09-14 13:14:17 -04:00
Mahmood Ali a9d5e4c510
scheduler: stopped-yet-running allocs are still running (#10446)
* scheduler: stopped-yet-running allocs are still running

* scheduler: test new stopped-but-running logic

* test: assert nonoverlapping alloc behavior

Also add a simpler Wait test helper to improve line numbers and save few
lines of code.

* docs: tried my best to describe #10446

it's not concise... feedback welcome

* scheduler: fix test that allowed overlapping allocs

* devices: only free devices when ClientStatus is terminal

* test: output nicer failure message if err==nil

Co-authored-by: Mahmood Ali <mahmood@hashicorp.com>
Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
2022-09-13 12:52:47 -07:00
Seth Hoenig bf4dd30919
Merge pull request #14553 from hashicorp/f-nsd-check-watcher
servicedisco: implement check_restart support for nomad service checks
2022-09-13 09:55:51 -05:00
Charlie Voiselle 6ab59d2aa6
var: Correct 0-index CAS Deletes (#14555)
* Add missing 0 case for VarDeleteCAS, more comments
* Add tests for VarDeleteCAS
2022-09-13 10:12:08 -04:00
Seth Hoenig 9a943107c7 servicedisco: implement check_restart for nomad service checks
This PR implements support for check_restart for checks registered
in the Nomad service provider.

Unlike Consul, Nomad service checks never report a "warning" status,
and so the check_restart.ignore_warnings configuration is not valid
for Nomad service checks.
2022-09-13 08:59:23 -05:00
Seth Hoenig b960925939
Merge pull request #14546 from hashicorp/f-refactor-check-watcher
client: refactor check watcher to be reusable
2022-09-13 07:32:32 -05:00
Tim Gross 03312f3227
variables: restrict allowed paths for variables (#14547)
Restrict variable paths to RFC3986 URL-safe characters that don't conflict with
the use of characters "@" and "." in `template` blocks. This prevents users from
writing variables that will require tricky templating syntax or that they simply
won't be able to use.

Also restrict the length so that a user can't make queries in the state store
unusually expensive (as they are O(k) on the key length).
2022-09-12 16:37:33 -04:00
Seth Hoenig feff36f3f7 client: refactor check watcher to be reusable
This PR refactors agent/consul/check_watcher into client/serviceregistration,
and abstracts away the Consul-specific check lookups.

In doing so we should be able to reuse the existing check watcher logic for
also watching NSD checks in a followup PR.

A chunk of consul/unit_test.go is removed - we'll cover that in e2e tests
in a follow PR if needed. In the long run I'd like to remove this whole file.
2022-09-12 10:13:31 -05:00
Derek Strickland 5ca934015b
job_endpoint: check spec for all regions (#14519)
* job_endpoint: check spec for all regions
2022-09-12 09:24:26 -04:00
Charlie Voiselle b55112714f
Vars: CLI commands for var get, var put, var purge (#14400)
* Includes updates to `var init`
2022-09-09 17:55:20 -04:00
Charlie Voiselle e58998e218
Add client scheduling eligibility to heartbeat (#14483) 2022-09-08 14:31:36 -04:00
Tim Gross 3fc7482ecd
CSI: failed allocation should not block its own controller unpublish (#14484)
A Nomad user reported problems with CSI volumes associated with failed
allocations, where the Nomad server did not send a controller unpublish RPC.

The controller unpublish is skipped if other non-terminal allocations on the
same node claim the volume. The check has a bug where the allocation belonging
to the claim being freed was included in the check incorrectly. During a normal
allocation stop for job stop or a new version of the job, the allocation is
terminal. But allocations that fail are not yet marked terminal at the point in
time when the client sends the unpublish RPC to the server.

For CSI plugins that support controller attach/detach, this means that the
controller will not be able to detach the volume from the allocation's host and
the replacement claim will fail until a GC is run. This changeset fixes the
conditional so that the claim's own allocation is not included, and makes the
logic easier to read. Include a test case covering this path.

Also includes two minor extra bugfixes:

* Entities we get from the state store should always be copied before
altering. Ensure that we copy the volume in the top-level unpublish workflow
before handing off to the steps.

* The list stub object for volumes in `nomad/structs` did not match the stub
object in `api`. The `api` package also did not include the current
readers/writers fields that are expected by the UI. True up the two objects and
add the previously undocumented fields to the docs.
2022-09-08 13:30:05 -04:00
James Rasell 3fa8b0b270
client: fix RPC forwarding when querying checks for alloc. (#14498)
When querying the checks for an allocation, the request must be
forwarded to the agent that is running the allocation. If the
initial request is made to a server agent, the request can be made
directly to the client agent running the allocation. If the
request is made to a client agent not running the alloc, the
request needs to be forwarded to a server and then the correct
client.
2022-09-08 16:55:23 +02:00
Tim Gross 2a961af44c
test: fix concurrent map access in TestStatsFetcher (#14496)
The map of in-flight RPCs gets cleared by a goroutine in the test without first
locking it to make sure that it's not being accessed concurrently by the stats
fetcher itself. This can cause a panic in tests.
2022-09-08 10:41:15 -04:00
Tim Gross 5c57a84e99
autopilot: deflake tests (#14475)
Includes:

* Remove leader upgrade raft version test, as older versions of raft are now
  incompatible with our autopilot library.

* Remove attempt to assert initial non-voter status on the `PromoteNonVoter`
  test, as this happens too quickly to reliably detect.

* Unskip some previously-skipped tests which we should make stable.

* Remove the `consul/sdk` retry helper for these tests; this uses panic recovery
  in a kind of a clever/gross way to reduce LoC but it seems to introduce some
  timing issues in the process.

* Add more test step logging and reduce logging noise from the scheduler
  goroutines to make it easier to debug failing tests.

* Be more consistent about using the `waitForStableLeadership` helper so that we
  can assert the cluster is fully stable and not just that we've added peers.
2022-09-07 09:35:01 -04:00
James Rasell 962b1f78e8
core: clarify ACL token expiry GC messages to show global param. (#14466) 2022-09-06 15:42:45 +02:00
Kellen Fox 5086368a1e
Add a log line to help track node eligibility (#14125)
Co-authored-by: James Rasell <jrasell@hashicorp.com>
2022-09-06 14:03:33 +02:00
Yan 6e927fa125
warn destructive update only when count > 1 (#13103) 2022-09-02 15:30:06 -04:00
Tim Gross 7921f044e5
migrate autopilot implementation to raft-autopilot (#14441)
Nomad's original autopilot was importing from a private package in Consul. It
has been moved out to a shared library. Switch Nomad to use this library so that
we can eliminate the import of Consul, which is necessary to build Nomad ENT
with the current version of the Consul SDK. This also will let us pick up
autopilot improvements shared with Consul more easily.
2022-09-01 14:27:10 -04:00
Luiz Aoqui 19de803503
cli: ignore VaultToken when generating job diff (#14424) 2022-09-01 10:01:53 -04:00
James Rasell 4b9bcf94da
chore: remove use of "err" a log line context key for errors. (#14433)
Log lines which include an error should use the full term "error"
as the context key. This provides consistency across the codebase
and avoids a Go style which operators might not be aware of.
2022-09-01 15:06:10 +02:00
Luiz Aoqui dc6525336b
ci: fix TestNomad_BootstrapExpect_NonVoter test (#14407)
PR #12130 refactored the test to use the `wantPeers` helper, but this
function only returns the number of voting peers, which in this test
should be equal to 2.

I think the tests were passing back them because of a bug in Raft
(https://github.com/hashicorp/raft/pull/483) where a non-voting server
was able to transition to candidate state.

One possible evidence of this is that a successful test run would have
the following log line:

```
raft@v1.3.5/raft.go:1058: nomad.raft: updating configuration: command=AddVoter server-id=127.0.0.1:9101 server-addr=127.0.0.1:9101 servers="[{Suffrage:Voter ID:127.0.0.1:9107 Address:127.0.0.1:9107} {Suffrage:Voter ID:127.0.0.1:9105 Address:127.0.0.1:9105} {Suffrage:Voter ID:127.0.0.1:9103 Address:127.0.0.1:9103} {Suffrage:Voter ID:127.0.0.1:9101 Address:127.0.0.1:9101}]"
```

This commit reverts the test logic to check for peer count, regardless
of voting status.
2022-08-30 16:32:54 -04:00
Tim Gross 5784fb8c58
search: enforce correct ACL for search over variables (#14397) 2022-08-30 13:27:31 -04:00
Tim Gross c9d678a91a
keyring: wrap root key in key encryption key (#14388)
Update the on-disk format for the root key so that it's wrapped with a unique
per-key/per-server key encryption key. This is a bit of security theatre for the
current implementation, but it uses `go-kms-wrapping` as the interface for
wrapping the key. This provides a shim for future support of external KMS such
as cloud provider APIs or Vault transit encryption.

* Removes the JSON serialization extension we had on the `RootKey` struct; this
  struct is now only used for key replication and not for disk serialization, so
  we don't need this helper.

* Creates a helper for generating cryptographically random slices of bytes that
  properly accounts for short reads from the source.

* No observable functional changes outside of the on-disk format, so there are
  no test updates.
2022-08-30 10:59:25 -04:00
Seth Hoenig 52de2dc09d
Merge pull request #14290 from hashicorp/cleanup-more-helper-cleanup
cleanup: tidy up helper package some more
2022-08-30 08:19:48 -05:00
James Rasell 755b4745ed
Merge branch 'main' into f-gh-13120-sso-umbrella-merged-main 2022-08-30 08:59:13 +01:00
Seth Hoenig 3e1e2001b9
Merge pull request #14143 from hashicorp/cleanup-slice-sets-3
cleanup: more cleanup of slices that are really sets
2022-08-29 13:52:59 -05:00
Tim Gross 7d1eb2efd5
keyring: split structs to its own file (#14378) 2022-08-29 14:18:35 -04:00
Seth Hoenig 9d0e274f27 cleanup: cleanup more slice-set comparisons 2022-08-29 12:04:21 -05:00
Tim Gross 62a968f443
Merge pull request #14351 from hashicorp/variables-rename
Variables rename
2022-08-29 11:36:50 -04:00
Piotr Kazmierczak 5f353503e5
bugfix: fixed template validation panic in case of incorrect ChangeScript configuration (#14374)
Fixes #14367
2022-08-29 17:11:15 +02:00
Tim Gross 1dc053b917 rename SecureVariables to Variables throughout 2022-08-26 16:06:24 -04:00
Tim Gross dcfd31296b file rename 2022-08-26 16:06:24 -04:00
Seth Hoenig b87689d2d1
Merge pull request #14318 from hashicorp/cleanup-create-pointer-compare
cleanup: create pointer.Compare helper function
2022-08-26 09:15:41 -05:00
Seth Hoenig 6b2655ad86 cleanup: create pointer.Compare helper function
This PR creates a pointer.Compare helper for comparing equality of
two pointers. Strictly only works with primitive types we know are
safe to derefence and compare using '=='.
2022-08-26 08:55:59 -05:00
James Rasell 601588df6b
Merge branch 'main' into f-gh-13120-sso-umbrella-merged-main 2022-08-25 12:14:29 +01:00
James Rasell 7a0798663d
acl: fix a bug where roles could be duplicated by name.
An ACL roles name must be unique, however, a bug meant multiple
roles of the same same could be created. This fixes that problem
with checks in the RPC handler and state store.
2022-08-25 09:20:43 +01:00
Luiz Aoqui e012d9411e
Task lifecycle restart (#14127)
* allocrunner: handle lifecycle when all tasks die

When all tasks die the Coordinator must transition to its terminal
state, coordinatorStatePoststop, to unblock poststop tasks. Since this
could happen at any time (for example, a prestart task dies), all states
must be able to transition to this terminal state.

* allocrunner: implement different alloc restarts

Add a new alloc restart mode where all tasks are restarted, even if they
have already exited. Also unifies the alloc restart logic to use the
implementation that restarts tasks concurrently and ignores
ErrTaskNotRunning errors since those are expected when restarting the
allocation.

* allocrunner: allow tasks to run again

Prevent the task runner Run() method from exiting to allow a dead task
to run again. When the task runner is signaled to restart, the function
will jump back to the MAIN loop and run it again.

The task runner determines if a task needs to run again based on two new
task events that were added to differentiate between a request to
restart a specific task, the tasks that are currently running, or all
tasks that have already run.

* api/cli: add support for all tasks alloc restart

Implement the new -all-tasks alloc restart CLI flag and its API
counterpar, AllTasks. The client endpoint calls the appropriate restart
method from the allocrunner depending on the restart parameters used.

* test: fix tasklifecycle Coordinator test

* allocrunner: kill taskrunners if all tasks are dead

When all non-poststop tasks are dead we need to kill the taskrunners so
we don't leak their goroutines, which are blocked in the alloc restart
loop. This also ensures the allocrunner exits on its own.

* taskrunner: fix tests that waited on WaitCh

Now that "dead" tasks may run again, the taskrunner Run() method will
not return when the task finishes running, so tests must wait for the
task state to be "dead" instead of using the WaitCh, since it won't be
closed until the taskrunner is killed.

* tests: add tests for all tasks alloc restart

* changelog: add entry for #14127

* taskrunner: fix restore logic.

The first implementation of the task runner restore process relied on
server data (`tr.Alloc().TerminalStatus()`) which may not be available
to the client at the time of restore.

It also had the incorrect code path. When restoring a dead task the
driver handle always needs to be clear cleanly using `clearDriverHandle`
otherwise, after exiting the MAIN loop, the task may be killed by
`tr.handleKill`.

The fix is to store the state of the Run() loop in the task runner local
client state: if the task runner ever exits this loop cleanly (not with
a shutdown) it will never be able to run again. So if the Run() loops
starts with this local state flag set, it must exit early.

This local state flag is also being checked on task restart requests. If
the task is "dead" and its Run() loop is not active it will never be
able to run again.

* address code review requests

* apply more code review changes

* taskrunner: add different Restart modes

Using the task event to differentiate between the allocrunner restart
methods proved to be confusing for developers to understand how it all
worked.

So instead of relying on the event type, this commit separated the logic
of restarting an taskRunner into two methods:
- `Restart` will retain the current behaviour and only will only restart
  the task if it's currently running.
- `ForceRestart` is the new method where a `dead` task is allowed to
  restart if its `Run()` method is still active. Callers will need to
  restart the allocRunner taskCoordinator to make sure it will allow the
  task to run again.

* minor fixes
2022-08-24 17:43:07 -04:00
Tim Gross c732b215f0
vault: detect namespace change in config reload (#14298)
The `namespace` field was not included in the equality check between old and new
Vault configurations, which meant that a Vault config change that only changed
the namespace would not be detected as a change and the clients would not be
reloaded.

Also, the comparison for boolean fields such as `enabled` and
`allow_unauthenticated` was on the pointer and not the value of that pointer,
which results in spurious reloads in real config reload that is easily missed in
typical test scenarios.

Includes a minor refactor of the order of fields for `Copy` and `Merge` to match
the struct fields in hopes it makes it harder to make this mistake in the
future, as well as additional test coverage.
2022-08-24 17:03:29 -04:00
Piotr Kazmierczak 7077d1f9aa
template: custom change_mode scripts (#13972)
This PR adds the functionality of allowing custom scripts to be executed on template change. Resolves #2707
2022-08-24 17:43:01 +02:00
Seth Hoenig bff6c88683 cleanup: remove more copies of min/max from helper 2022-08-24 09:56:15 -05:00
James Rasell 9782d6d7ff
acl: allow tokens to lookup linked roles. (#14227)
When listing or reading an ACL role, roles linked to the ACL token
used for authentication can be returned to the caller.
2022-08-24 13:51:51 +02:00
Michael Schurter 0114bcfe5b
core: move LicenseConfig to shared file (#14247)
This moves LicenseConfig and its Copy method to a shared file so that it can be shared with enterprise code.
2022-08-23 13:44:10 -07:00
Luiz Aoqui 7a8cacc9ec
allocrunner: refactor task coordinator (#14009)
The current implementation for the task coordinator unblocks tasks by
performing destructive operations over its internal state (like closing
channels and deleting maps from keys).

This presents a problem in situations where we would like to revert the
state of a task, such as when restarting an allocation with tasks that
have already exited.

With this new implementation the task coordinator behaves more like a
finite state machine where task may be blocked/unblocked multiple times
by performing a state transition.

This initial part of the work only refactors the task coordinator and
is functionally equivalent to the previous implementation. Future work
will build upon this to provide bug fixes and enhancements.
2022-08-22 18:38:49 -04:00
Tim Gross bf57d76ec7
allow ACL policies to be associated with workload identity (#14140)
The original design for workload identities and ACLs allows for operators to
extend the automatic capabilities of a workload by using a specially-named
policy. This has shown to be potentially unsafe because of naming collisions, so
instead we'll allow operators to explicitly attach a policy to a workload
identity.

This changeset adds workload identity fields to ACL policy objects and threads
that all the way down to the command line. It also a new secondary index to the
ACL policy table on namespace and job so that claim resolution can efficiently
query for related policies.
2022-08-22 16:41:21 -04:00
Charlie Voiselle 29e63a6cb2
Make var get a blocking query as expected (#14205) 2022-08-22 16:37:21 -04:00
Luiz Aoqui dbffdca92e
template: use pointer values for gid and uid (#14203)
When a Nomad agent starts and loads jobs that already existed in the
cluster, the default template uid and gid was being set to 0, since this
is the zero value for int. This caused these jobs to fail in
environments where it was not possible to use 0, such as in Windows
clients.

In order to differentiate between an explicit 0 and a template where
these properties were not set we need to use a pointer.
2022-08-22 16:25:49 -04:00
James Rasell 2736cf0dfa
acl: make listing RPC and HTTP API a stub return object. (#14211)
Making the ACL Role listing return object a stub future-proofs the
endpoint. In the event the role object grows, we are not bound by
having to return all fields within the list endpoint or change the
signature of the endpoint to reduce the list return size.
2022-08-22 17:20:23 +02:00
James Rasell 802d005ef5
acl: add replication to ACL Roles from authoritative region. (#14176)
ACL Roles along with policies and global token will be replicated
from the authoritative region to all federated regions. This
involves a new replication loop running on the federated leader.

Policies and roles may be replicated at different times, meaning
the policies and role references may not be present within the
local state upon replication upsert. In order to bypass the RPC
and state check, a new RPC request parameter has been added. This
is used by the replication process; all other callers will trigger
the ACL role policy validation check.

There is a new ACL RPC endpoint to allow the reading of a set of
ACL Roles which is required by the replication process and matches
ACL Policies and Tokens. A bug within the ACL Role listing RPC has
also been fixed which returned incorrect data during blocking
queries where a deletion had occurred.
2022-08-22 08:54:07 +02:00
Michael Schurter 26637ab55d
core: fix race mutating jobs in scaling api (#14192)
Since the state store returns a pointer to the shared job structs in
memdb we must always copy it before mutating it and applying the new
version via raft. Otherwise if the rpc fails before the mutated job is
committed to raft (either due to validation, bug, crash, or other exit
condition), the leader server will have an updated copy of the job that
other servers will not have.
2022-08-19 15:46:54 -07:00
Michael Schurter 3b57df33e3
client: fix data races in config handling (#14139)
Before this change, Client had 2 copies of the config object: config and configCopy. There was no guidance around which to use where (other than configCopy's comment to pass it to alloc runners), both are shared among goroutines and mutated in data racy ways. At least at one point I think the idea was to have `config` be mutable and then grab a lock to overwrite `configCopy`'s pointer atomically. This would have allowed alloc runners to read their config copies in data race safe ways, but this isn't how the current implementation worked.

This change takes the following approach to safely handling configs in the client:

1. `Client.config` is the only copy of the config and all access must go through the `Client.configLock` mutex
2. Since the mutex *only protects the config pointer itself and not fields inside the Config struct:* all config mutation must be done on a *copy* of the config, and then Client's config pointer is overwritten while the mutex is acquired. Alloc runners and other goroutines with the old config pointer will not see config updates.
3. Deep copying is implemented on the Config struct to satisfy the previous approach. The TLS Keyloader is an exception because it has its own internal locking to support mutating in place. An unfortunate complication but one I couldn't find a way to untangle in a timely fashion.
4. To facilitate deep copying I made an *internally backward incompatible API change:* our `helper/funcs` used to turn containers (slices and maps) with 0 elements into nils. This probably saves a few memory allocations but makes it very easy to cause panics. Since my new config handling approach uses more copying, it became very difficult to ensure all code that used containers on configs could handle nils properly. Since this code has caused panics in the past, I fixed it: nil containers are copied as nil, but 0-element containers properly return a new 0-element container. No more "downgrading to nil!"
2022-08-18 16:32:04 -07:00
Derek Strickland 8dba52cee2
sentinel: add support for Nomad ACL Token and Namespace (#14171)
* sentinel: add ability to reference Nomad ACL Token and Namespace in Sentinel policies
2022-08-18 16:33:00 -04:00
Michael Schurter 1d1526cd24
events: fix race in acl event handling (#14188) 2022-08-18 11:39:14 -07:00
Piotr Kazmierczak b63944b5c1
cleanup: replace TypeToPtr helper methods with pointer.Of (#14151)
Bumping compile time requirement to go 1.18 allows us to simplify our pointer helper methods.
2022-08-17 18:26:34 +02:00
James Rasell 9e3f1581fb
core: add ACL role functionality to ACL tokens.
ACL tokens can now utilize ACL roles in order to provide API
authorization. Each ACL token can be created and linked to an
array of policies as well as an array of ACL role links. The link
can be provided via the role name or ID, but internally, is always
resolved to the ID as this is immutable whereas the name can be
changed by operators.

When resolving an ACL token, the policies linked from an ACL role
are unpacked and combined with the policy array to form the
complete auth set for the token.

The ACL token creation endpoint handles deduplicating ACL role
links as well as ensuring they exist within state.

When reading a token, Nomad will also ensure the ACL role link is
current. This handles ACL roles being deleted from under a token
from a UX standpoint.
2022-08-17 14:45:01 +01:00
Seth Hoenig bb58b94e15
Merge pull request #14141 from hashicorp/cleanup-slice-sets-2
cleanup: helper func for comparing slices of .Equals types
2022-08-16 15:12:39 -05:00
Michael Schurter cdf5a74998
core: fix data races in blocked eval chan handling (#14142)
Similar to the deployment watcher fix in #14121 - the server code loves these mutable structs so we need to guard access to the struct fields with locks.

Capturing ch := b.capacityChangeCh is sufficient to satisfy the data race detector, but I noticed it was also possible to leak goroutines:

Since the watchCapacity loop is in charge of receiving from capacityChangeCh and exits when stopCh is closed, senders to capacityChangeCh also must exit when stopCh is closed. Otherwise they may block forever if capacityChangeCh is full because it will never be received on again. I did not find evidence of this occurring in my meager smattering of prod goroutine dumps I have laying around, but this isn't surprising as the chan has a buffer of 8096! I would imagine that is sufficient to handle "late" sends and then just get GC'd away when the last reference to the old chan is dropped. This is just additional safety/correctness.
2022-08-16 12:33:53 -07:00
Seth Hoenig 47f5452825 cleanup: helper funcs for comparing slices of references 2022-08-16 13:47:47 -05:00
Michael Schurter 0c3cfb073a
nomad: fix race in Bootstrapped access (#14120)
* nomad: fix race in Bootstrapped access
2022-08-16 11:22:33 -07:00
Michael Schurter 285979e96c
deployments: fix data races (#14121)
* deployments: fix data races

Both priority and state related fields may be mutated concurrently and
need to be accessed with the lock acquired.
2022-08-16 10:50:40 -07:00
Seth Hoenig 7728cf5a9a
Merge pull request #14132 from hashicorp/build-update-go1.19
build: update to go1.19
2022-08-16 11:20:27 -05:00
Seth Hoenig b3ea68948b build: run gofmt on all go source files
Go 1.19 will forecefully format all your doc strings. To get this
out of the way, here is one big commit with all the changes gofmt
wants to make.
2022-08-16 11:14:11 -05:00
Seth Hoenig 56b0b456dc
Merge pull request #14102 from hashicorp/cleanup-mesh-gateway-value
cleanup: consul mesh gateway type need not be pointer
2022-08-16 10:07:16 -05:00
Tim Gross a4e89d72a8
secure vars: filter by path in List RPCs (#14036)
The List RPCs only checked the ACL for the Prefix argument of the request. Add
an ACL filter to the paginator for the List RPC.

Extend test coverage of ACLs in the List RPC and in the `acl` package, and add a
"deny" capability so that operators can deny specific paths or prefixes below an
allowed path.
2022-08-15 11:38:20 -04:00
Tim Gross 4005759d28
move secure variable conflict resolution to state store (#13922)
Move conflict resolution implementation into the state store with a new Apply RPC. 
This also makes the RPC for secure variables much more similar to Consul's KV, 
which will help us support soft deletes in a post-1.4.0 version of Nomad.

Reimplement quotas in the state store functions.

Co-authored-by: Charlie Voiselle <464492+angrycub@users.noreply.github.com>
2022-08-15 11:19:53 -04:00
Seth Hoenig f9355c29fb cleanup: consul mesh gateway type need not be pointer
This PR changes the use of structs.ConsulMeshGateway to value types
instead of via pointers. This will help in a follow up PR where we
cleanup a lot of custom comparison code with helper functions instead.
2022-08-13 11:26:58 -05:00
James Rasell 581a5bb6ad
rpc: add ACL Role RPC endpoint for CRUD actions.
New ACL Role RPC endpoints have been created to allow the creation,
update, read, and deletion of ACL roles. All endpoints require a
management token; in the future readers will also be allowed to
view roles associated to their ACL token.

The create endpoint in particular is responsible for deduplicating
ACL policy links and ensuring named policies are found within
state. This is done within the RPC handler so we perform a single
loop through the links for slight efficiency.
2022-08-11 08:43:50 +01:00
Seth Hoenig 3aaaedf52e cli: forward request for job validation to nomad leader
This PR changes the behavior of 'nomad job validate' to forward the
request to the nomad leader, rather than responding from any server.

This is because we need the leader when validating Vault tokens, since
the leader is the only server with an active vault client.
2022-08-10 14:34:04 -05:00
James Rasell e660c9a908
core: add ACL role state schema and functionality. (#13955)
This commit includes the new state schema for ACL roles along with
state interaction functions for CRUD actions.

The change also includes snapshot persist and restore
functionality and the addition of FSM messages for Raft updates
which will come via RPC endpoints.
2022-08-09 09:33:41 +02:00
Seth Hoenig 0b52c27a15
Merge pull request #14045 from Abirdcfly/main
fix minor unreachable code caused by t.Fatal
2022-08-08 11:47:02 -05:00
Abirdcfly d66943d4f7 fix minor unreachable code caused by t.Fatal
Signed-off-by: Abirdcfly <fp544037857@gmail.com>
2022-08-08 23:50:11 +08:00
Seth Hoenig 2b6bda49b9 core: automatically plumb task name into task-level services and checks 2022-08-05 12:42:41 -05:00
Seth Hoenig f6f26fb72c nsd: add support for setting request body in http checks
This PR adds support for settings check.body in checks of services
making use of Nomad's service provider.
2022-08-04 14:40:23 -05:00
Charles Z 7a8ec90fbe
allow unhealthy canaries without blocking autopromote (#14001) 2022-08-04 11:53:50 -04:00
Seth Hoenig dcda57e729 nsd: add support for setting headers on nomad service http checks
This PR enables setting of the headers block on services registered
into Nomad's service provider. Works just like the existing support
in Consul checks.
2022-08-03 10:06:44 -05:00
Seth Hoenig 067aa00a6a
Merge pull request #13953 from hashicorp/f-nsd-check-methods
nsd: add support for specifying check.method in nomad service checks
2022-08-03 08:28:38 -05:00
Piotr Kazmierczak 530280505f
client: enable specifying user/group permissions in the template stanza (#13755)
* Adds Uid/Gid parameters to template.

* Updated diff_test

* fixed order

* update jobspec and api

* removed obsolete code

* helper functions for jobspec parse test

* updated documentation

* adjusted API jobs test.

* propagate uid/gid setting to job_endpoint

* adjusted job_endpoint tests

* making uid/gid into pointers

* refactor

* updated documentation

* updated documentation

* Update client/allocrunner/taskrunner/template/template_test.go

Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>

* Update website/content/api-docs/json-jobs.mdx

Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>

* propagating documentation change from Luiz

* formatting

* changelog entry

* changed changelog entry

Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
2022-08-02 22:15:38 +02:00
Eric Weber cbce13c1ac
Add stage_publish_base_dir field to csi_plugin stanza of a job (#13919)
* Allow specification of CSI staging and publishing directory path
* Add website documentation for stage_publish_dir
* Replace erroneous reference to csi_plugin.mount_config with csi_plugin.mount_dir
* Avoid requiring CSI plugins to be redeployed after introducing StagePublishDir
2022-08-02 09:42:44 -04:00
Tim Gross e5ac6464f6
secure vars: enforce ENT quotas (OSS work) (#13951)
Move the secure variables quota enforcement calls into the state store to ensure
quota checks are atomic with quota updates (in the same transaction).

Switch to a machine-size int instead of a uint64 for quota tracking. The
ENT-side quota spec is described as int, and negative values have a meaning as
"not permitted at all". Using the same type for tracking will make it easier to
the math around checks, and uint64 is infeasibly large anyways.

Add secure vars to quota HTTP API and CLI outputs and API docs.
2022-08-02 09:32:09 -04:00
James Rasell 663aa92b7a
Merge branch 'main' into f-gh-13120-sso-umbrella 2022-08-02 08:30:03 +01:00
Seth Hoenig a4d4a76994 nsd: add support for specifying check.method in nomad service checks
Unblock 'check.method' in service validation. Add tests around making
sure this value gets plumbed through.
2022-08-01 16:13:48 -05:00
Tim Gross 04677d205e
block deleting namespace if it contains a secure variable (#13888)
When we delete a namespace, we check to ensure that there are no non-terminal
jobs or CSI volume, which also covers evals, allocs, etc. Secure variables are
also namespaces, so extend this check to them as well.
2022-07-22 10:06:35 -04:00
Seth Hoenig 5aaa31a6dc
Merge pull request #13882 from hashicorp/cleanup-onupdate-consts
cleanup: use constants for on_update values
2022-07-22 08:58:36 -05:00
Tim Gross c7a11a86c6
block deleting namespaces if the namespace contains a volume (#13880)
When we delete a namespace, we check to ensure that there are no non-terminal
jobs, which effectively covers evals, allocs, etc. CSI volumes are also
namespaced, so extend this check to cover CSI volumes.
2022-07-21 16:13:52 -04:00
Seth Hoenig d8fe1d10ba cleanup: use constants for on_update values 2022-07-21 13:09:47 -05:00
Seth Hoenig c61e779b48
Merge pull request #13715 from hashicorp/dev-nsd-checks
client: add support for checks in nomad services
2022-07-21 10:22:57 -05:00
Seth Hoenig 606e3ebdd4 client: updates from pr feedback 2022-07-21 09:54:27 -05:00
Seth Hoenig 8e6eeaa37e
Merge pull request #13869 from hashicorp/b-uniq-services-2
servicedisco: ensure service uniqueness in job validation
2022-07-21 08:24:24 -05:00
Tim Gross d11da1df5c
workload identity: use parent ID for dispatch/periodic jobs (#13748)
Workload identities grant implicit access to policies, and operators
will not want to craft separate policies for each invocation of a
periodic or dispatch job. Use the parent job's ID as the JobID claim.
2022-07-21 09:05:54 -04:00
Tim Gross 9c43c28575
search: use secure vars ACL policy for secure vars context (#13788)
The search RPC used a placeholder policy for searching within the secure
variables context. Now that we have ACL policies built for secure variables, we
can use them for search. Requires a new loose policy for checking if a token has
any secure variables access within a namespace, so that we can filter on
specific paths in the iterator.
2022-07-21 08:39:36 -04:00
Tim Gross 97a6346da0
keyring: use nanos for CreateTime in key metadata (#13849)
Most of our objects use int64 timestamps derived from `UnixNano()` instead of
`time.Time` objects. Switch the keyring metadata to use `UnixNano()` for
consistency across the API.
2022-07-20 14:46:57 -04:00
Tim Gross 428e23043c
secure vars: limit maximum size of variable data (#13743)
To discourage accidentally DoS'ing the cluster with secure variables
data, we're providing a very low limit to the maximum size of a given
secure variable. This currently matches the limit for dispatch
payloads.

In future versions, we may increase this limit or make it
configurable, once we have better metrics from real-world operators.
2022-07-20 14:46:43 -04:00
Seth Hoenig e5978a9cbf jobspec: ensure service uniqueness in job validation 2022-07-20 12:38:08 -05:00
Seth Hoenig d83aae253f cleanup: track task names and providers using set 2022-07-20 11:48:36 -05:00
Seth Hoenig bd2935ee54 cleanup: tweaks from cr feedback 2022-07-20 10:42:35 -05:00
Seth Hoenig 93cfeb177b cleanup: example refactoring out map[string]struct{} using set.Set
This PR is a little demo of using github.com/hashicorp/go-set to
replace the use of map[T]struct{} as a make-shift set.
2022-07-19 22:50:49 -05:00
Tim Gross ea38582b40
secure vars: rename automatically accessible vars path for jobs (#13848)
Tasks are automatically granted access to variables on a path that matches their
workload identity, with a well-known prefix. Change the prefix to `nomad/jobs`
to allow for future prefixes like `nomad/volumes` or `nomad/plugins`. Reserve
the prefix by emitting errors during validation.
2022-07-19 16:17:34 -04:00
James Rasell 9264f07cc1
core: add expired token garbage collection periodic jobs. (#13805)
Two new periodic core jobs have been added which handle removing
expired local and global tokens from state. The local core job is
run on every leader; the global core job is only run on the leader
within the authoritative region.
2022-07-19 15:37:46 +02:00
Tim Gross cfa2cb140e
fsm: one-time token expiration should be deterministic (#13737)
When applying a raft log to expire ACL tokens, we need to use a
timestamp provided by the leader so that the result is deterministic
across servers. Use leader's timestamp from RPC call
2022-07-18 14:19:29 -04:00
Seth Hoenig c23da281a1 metrics: even classless blocked evals get metrics
This PR fixes a bug where blocked evaluations with no class set would
not have metrics exported at the dc:class scope.

Fixes #13759
2022-07-15 14:12:44 -05:00
Tim Gross 05cd91155d
keyring: fix flake in replication-after-election test (#13749)
The test for simulating a key rotation across leader elections was
flaky because we weren't waiting for a leader election and was
checking the server configs rather than raft for which server was
currently the leader. Fixing the flake revealed a bug in the test that
we weren't ensuring the new leader was running its own replication, so
it wouldn't pick up the key material from the previous follower.
2022-07-15 11:09:09 -04:00
Tim Gross aa15e0fe7e
secure vars: updates should reduce quota tracking if smaller (#13742)
When secure variables are updated, we were adding the update to the
existing quota tracking without first checking whether it was an
update to an existing variable. In that case we need to add/subtract
only the difference between the new and existing quota usage.
2022-07-15 11:08:53 -04:00
James Rasell a8a8b1f84f
acl: add token expiry checking to ACL token resolution. (#13756)
This commit adds basic expiry checking when performing ACL token
resolution. This expiry checking is local to each server and does
not at this time take into account potential time skew on server
hosts.

A new error message has been created so clients whose token has
expired get a clear message, rather than a generic token not
found.

The ACL resolution tests have been refactored into table driven
tests, so additions are easier in the future.
2022-07-15 15:20:50 +02:00
Tim Gross 0cf8a580c7
search: refactor OSS/ENT split for ACL checks (#13760)
The split between OSS/ENT in ACL checks for the Search RPC has a lot
of repeated code that results in merge conflicts. Move most of the
logic into the shared code so that we can call out to thin functions
for ENT checks.
2022-07-14 11:31:08 -04:00
Tim Gross cc9fb1c876
keyring: upserting key metadata in FSM must be deterministic (#13733) 2022-07-14 08:38:14 -04:00
James Rasell 0cde3182eb
core: add ACL token expiry state, struct, and RPC handling. (#13718)
The ACL token state schema has been updated to utilise two new
indexes which track expiration of tokens that are configured with
an expiration TTL or time. A new state function allows listing
ACL expired tokens which will be used by internal garbage
collection.

The ACL endpoint has been modified so that all validation happens
within a single function call. This is easier to understand and
see at a glance. The ACL token validation now also includes logic
for expiry TTL and times. The ACL endpoint upsert tests have been
condensed into a single, table driven test.

There is a new token canonicalize which provides a single place
for token canonicalization, rather than logic spread in the RPC
handler.
2022-07-13 15:40:34 +02:00
Luiz Aoqui b656981cf0
Track plan rejection history and automatically mark clients as ineligible (#13421)
Plan rejections occur when the scheduler work and the leader plan
applier disagree on the feasibility of a plan. This may happen for valid
reasons: since Nomad does parallel scheduling, it is expected that
different workers will have a different state when computing placements.

As the final plan reaches the leader plan applier, it may no longer be
valid due to a concurrent scheduling taking up intended resources. In
these situations the plan applier will notify the worker that the plan
was rejected and that they should refresh their state before trying
again.

In some rare and unexpected circumstances it has been observed that
workers will repeatedly submit the same plan, even if they are always
rejected.

While the root cause is still unknown this mitigation has been put in
place. The plan applier will now track the history of plan rejections
per client and include in the plan result a list of node IDs that should
be set as ineligible if the number of rejections in a given time window
crosses a certain threshold. The window size and threshold value can be
adjusted in the server configuration.

To avoid marking several nodes as ineligible at one, the operation is rate
limited to 5 nodes every 30min, with an initial burst of 10 operations.
2022-07-12 18:40:20 -04:00
Seth Hoenig 297d386bdc client: add support for checks in nomad services
This PR adds support for specifying checks in services registered to
the built-in nomad service provider.

Currently only HTTP and TCP checks are supported, though more types
could be added later.
2022-07-12 17:09:50 -05:00
Michael Schurter 3e50f72fad
core: merge reserved_ports into host_networks (#13651)
Fixes #13505

This fixes #13505 by treating reserved_ports like we treat a lot of jobspec settings: merging settings from more global stanzas (client.reserved.reserved_ports) "down" into more specific stanzas (client.host_networks[].reserved_ports).

As discussed in #13505 there are other options, and since it's totally broken right now we have some flexibility:

Treat overlapping reserved_ports on addresses as invalid and refuse to start agents. However, I'm not sure there's a cohesive model we want to publish right now since so much 0.9-0.12 compat code still exists! We would have to explain to folks that if their -network-interface and host_network addresses overlapped, they could only specify reserved_ports in one place or the other?! It gets ugly.
Use the global client.reserved.reserved_ports value as the default and treat host_network[].reserverd_ports as overrides. My first suggestion in the issue, but @groggemans made me realize the addresses on the agent's interface (as configured by -network-interface) may overlap with host_networks, so you'd need to remove the global reserved_ports from addresses shared with a shared network?! This seemed really confusing and subtle for users to me.
So I think "merging down" creates the most expressive yet understandable approach. I've played around with it a bit, and it doesn't seem too surprising. The only frustrating part is how difficult it is to observe the available addresses and ports on a node! However that's a job for another PR.
2022-07-12 14:40:25 -07:00
Charlie Voiselle 6be7a41351
SV: CLI: var list command (#13707)
* SV CLI: var list
* Fix wildcard prefix filtering

Co-authored-by: Tim Gross <tgross@hashicorp.com>
2022-07-12 12:49:39 -04:00
Tim Gross 8054b3c9e6
secure vars: fix enterprise test by upserting the namespace (#13719)
In OSS we can upsert an allocation without worrying about whether that
alloc is in a namespace that actually exists, but in ENT that upsert
will add to the namespace's quotas. Ensure we're doing so in this
secure variables RPC test to fix the test breaking in the ENT repo.
2022-07-12 12:05:52 -04:00
Charlie Voiselle f4784e8d69
SV: fixes for namespace handling (#13705)
* ACL check namespace value in SecureVariable
* Error on wildcard namespace
2022-07-12 11:15:57 -04:00
James Rasell 0292f48396
server: add ACL token expiration config parameters. (#13667)
This commit adds configuration parameters to control ACL token
expirations. This includes both limits on the min and max TTL
expiration values, as well as a GC threshold for expired tokens.
2022-07-12 13:43:25 +02:00
Phil Renaud e9219a1ae0
Allow wildcard for Evaluations API (#13530)
* Failing test and TODO for wildcard

* Alias the namespace query parameter for Evals

* eval: fix list when using ACLs and * namespace

Apply the same verification process as in job, allocs and scaling
policy list endpoints to handle the eval list when using an ACL token
with limited namespace support but querying using the `*` wildcard
namespace.

* changelog: add entry for #13530

* ui: set namespace when querying eval

Evals have a unique UUID as ID, but when querying them the Nomad API
still expects a namespace query param, otherwise it assumes `default`.

Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
2022-07-11 16:42:17 -04:00
Tim Gross a5a9eedc81 core job for secure variables re-key (#13440)
When the `Full` flag is passed for key rotation, we kick off a core
job to decrypt and re-encrypt all the secure variables so that they
use the new key.
2022-07-11 13:34:06 -04:00
Charlie Voiselle 555ac432cd SV: CAS: Implement Check and Set for Delete and Upsert (#13429)
* SV: CAS
    * Implement Check and Set for Delete and Upsert
    * Reading the conflict from the state store
    * Update endpoint for new error text
    * Updated HTTP api tests
    * Conflicts to the HTTP api

* SV: structs: Update SV time to UnixNanos
    * update mock to UnixNano; refactor

* SV: encrypter: quote KeyID in error
* SV: mock: add mock for namespace w/ SV
2022-07-11 13:34:06 -04:00
Tim Gross 8a50d2c3e8 implement quota tracking for secure variablees (#13453)
We need to track per-namespace storage usage for secure variables even
in Nomad OSS so that a cluster can be seamlessly upgraded from OSS to
ENT without having to re-calculate quota usage.

Provide a hook in the upsert RPC for enforcement of quotas in
ENT. This will be a no-op in Nomad OSS.
2022-07-11 13:34:06 -04:00
Tim Gross eaf430bfd5 secure variable server configuration (#13307)
Add fields for configuring root key garbage collection and automatic
rotation. Fix the keystore path so that we write to a tempdir when in
dev mode.
2022-07-11 13:34:06 -04:00
Tim Gross 6300427228 core job for key rotation (#13309)
Extend the GC job to support periodic key rotation.

Update the GC process to safely support signed workload identity. We
can't GC any key used to sign a workload identity. Finding which key
was used to sign every allocation will be expensive, but there are not
that many keys. This lets us take a conservative approach: find the
oldest live allocation and ensure that we don't GC any key older than
that key.
2022-07-11 13:34:06 -04:00
Tim Gross 350fe3495c fix blocking query for Keyring.List RPC (#13384)
The blocking query for `Keyring.List` appended the keys for each pass
through the blocking query to the response. This results in mulitple
copies of keys in the response. Overwrite the `reply.Keys` field on
each pass through the blocking query to ensure we only get the
expected page of responses.
2022-07-11 13:34:05 -04:00
Tim Gross 83dc3ec758 secure variables ACL policies (#13294)
Adds a new policy block inside namespaces to control access to secure
variables on the basis of path, with support for globbing.

Splits out VerifyClaim from ResolveClaim.
The ServiceRegistration RPC only needs to be able to verify that a
claim is valid for some allocation in the store; it doesn't care about
implicit policies or capabilities. Split this out to its own method on
the server so that the SecureVariables RPC can reuse it as a separate
step from resolving policies (see next commit).

Support implicit policies based on workload identity
2022-07-11 13:34:05 -04:00
Tim Gross 160ad9b64f implement Encrypt/Decrypt methods of encrypter (#13375)
The `Encrypt` method generates an appropriately-sized nonce and uses
that buffer as the prefix for the ciphertext. This keeps the
ciphertext and nonce together for decryption, and reuses the buffer as
much as possible without presenting the temptation to reuse the
cleartext buffer owned by the caller.

We include the key ID as the "additional data" field that's used as an
extra input to the authentication signature, to provide additional
protection that a ciphertext originated with that key.

Refactors the locking for the keyring so that the public methods are
generally (with one commented exception) responsible for taking the
lock and then inner methods are assumed locked.
2022-07-11 13:34:05 -04:00
Charlie Voiselle 1fe080c6de Implement HTTP search API for Variables (#13257)
* Add Path only index for SecureVariables
* Add GetSecureVariablesByPrefix; refactor tests
* Add search for SecureVariables
* Add prefix search for secure variables
2022-07-11 13:34:05 -04:00
Charlie Voiselle 06c6a950c4 Secure Variables: Seperate Encrypted and Decrypted structs (#13355)
This PR splits SecureVariable into SecureVariableDecrypted and
SecureVariableEncrypted in order to use the type system to help
verify that cleartext secret material is not committed to file.

* Make Encrypt function return KeyID
* Split SecureVariable

Co-authored-by: Tim Gross <tgross@hashicorp.com>
2022-07-11 13:34:05 -04:00
Tim Gross bfcbc00f4e workload identity (#13223)
In order to support implicit ACL policies for tasks to get their own
secrets, each task would need to have its own ACL token. This would
add extra raft overhead as well as new garbage collection jobs for
cleaning up task-specific ACL tokens. Instead, Nomad will create a
workload Identity Claim for each task.

An Identity Claim is a JSON Web Token (JWT) signed by the server’s
private key and attached to an Allocation at the time a plan is
applied. The encoded JWT can be submitted as the X-Nomad-Token header
to replace ACL token secret IDs for the RPCs that support identity
claims.

Whenever a key is is added to a server’s keyring, it will use the key
as the seed for a Ed25519 public-private private keypair. That keypair
will be used for signing the JWT and for verifying the JWT.

This implementation is a ruthlessly minimal approach to support the
secure variables feature. When a JWT is verified, the allocation ID
will be checked against the Nomad state store, and non-existent or
terminal allocation IDs will cause the validation to be rejected. This
is sufficient to support the secure variables feature at launch
without requiring implementation of a background process to renew
soon-to-expire tokens.
2022-07-11 13:34:05 -04:00
Tim Gross 7055ce89b1 keyring replication (#13167)
Replication for the secure variables keyring. Because only key
metadata is stored in raft, we need to distribute key material
out-of-band from raft replication. A goroutine runs on each server and
watches for changes to the `RootKeyMeta`. When a new key is received,
attempt to fetch the key from the leader. If the leader doesn't have
the key (which may happen if a key is rotated right before a leader
transition), try to get the key from any peer.
2022-07-11 13:34:04 -04:00
Tim Gross d5a214484c core job for root key GC (#13199)
Inactive and unused keys older than a threshold will be periodically
garbage collected.
2022-07-11 13:34:04 -04:00
Tim Gross 5a85d96322 remove end-user algorithm selection (#13190)
After internal design review, we decided to remove exposing algorithm
choice to the end-user for the initial release. We'll solve nonce
rotation by forcing rotations automatically on key GC (in a core job,
not included in this changeset). Default to AES-256 GCM for the
following criteria:

* faster implementation when hardware acceleration is available
* FIPS compliant
* implementation in pure go
* post-quantum resistance

Also fixed a bug in the decoding from keystore and switched to a 
harder-to-misuse encoding method.
2022-07-11 13:34:04 -04:00
Tim Gross 973b474b3c provide state store query for variables by key ID (#13195)
The core jobs to garbage collect unused keys and perform full key
rotations will need to be able to query secure variables by key ID for
efficiency. Add an index to the state store and associated query
function and test.
2022-07-11 13:34:04 -04:00
Tim Gross f2ee585830 bootstrap keyring (#13124)
When a server becomes leader, it will check if there are any keys in
the state store, and create one if there is not. The key metadata will
be replicated via raft to all followers, who will then get the key
material via key replication (not implemented in this changeset).
2022-07-11 13:34:04 -04:00
Charlie Voiselle 3717688f3e Secure Variables: Variables - State store, FSM, RPC (#13098)
* Secure Variables: State Store
* Secure Variables: FSM
* Secure Variables: RPC
* Secure Variables: HTTP API

Co-authored-by: Tim Gross <tgross@hashicorp.com>
2022-07-11 13:34:04 -04:00
Tim Gross 05eef2b95c keystore serialization (#13106)
This changeset implements the keystore serialization/deserialization:

* Adds a JSON serialization extension for the `RootKey` struct, along with a metadata stub. When we serialize RootKey to the on-disk keystore, we want to base64 encode the key material but also exclude any frequently-changing fields which are stored in raft.
* Implements methods for loading/saving keys to the keystore.
* Implements methods for restoring the whole keystore from disk.
* Wires it all up with the `Keyring` RPC handlers and fixes up any fallout on tests.
2022-07-11 13:34:04 -04:00
Tim Gross 2f0fd556ad keyring RPC handlers (#13075)
Implement the upsert, list, delete, and rotate RPC handlers for the
secure variables keyring. Operations on the keyring itself are still
stubbed out.
2022-07-11 13:34:04 -04:00
Tim Gross b1dc6dcef0 keyring state store operations (#13016)
Implement the basic upsert, list, and delete operations for
`RootKeyMeta` needed by the Keyring RPCs.

This changeset also implements two convenience methods
`RootKeyMetaByID` and `GetActiveRootKeyMeta` which are useful for
testing but also will be needed to implement the rest of the RPCs.
2022-07-11 13:34:04 -04:00
Charlie Voiselle 2019eab2c8 Provide mock secure variables implementation (#12980)
* Add SecureVariable mock
* Add SecureVariableStub
* Add SecureVariable Copy and Stub funcs
2022-07-11 13:34:03 -04:00
Tim Gross d29e85d150 secure variables: initial state store (#12932)
Implement the core SecureVariable and RootKey structs in memdb,
provide the minimal skeleton for FSM, and a dummy storage and keyring
RPC endpoint.
2022-07-11 13:34:01 -04:00
Tim Gross b6dd1191b2
snapshot restore-from-archive streaming and filtering (#13658)
Stream snapshot to FSM when restoring from archive
The `RestoreFromArchive` helper decompresses the snapshot archive to a
temporary file before reading it into the FSM. For large snapshots
this performs a lot of disk IO. Stream decompress the snapshot as we
read it, without first writing to a temporary file.

Add bexpr filters to the `RestoreFromArchive` helper.
The operator can pass these as `-filter` arguments to `nomad operator
snapshot state` (and other commands in the future) to include only
desired data when reading the snapshot.
2022-07-11 10:48:00 -04:00
Seth Hoenig 239eaf9a29
Merge pull request #13626 from hashicorp/b-client-max-kill-timeout
client: enforce max_kill_timeout client configuration
2022-07-07 13:44:39 -05:00
Luiz Aoqui 85908415f9
state: fix eval list by prefix with * namespace (#13551) 2022-07-07 14:21:51 -04:00
Michael Schurter f21272065d
core: emit node evals only for sys jobs in dc (#12955)
Whenever a node joins the cluster, either for the first time or after
being `down`, we emit a evaluation for every system job to ensure all
applicable system jobs are running on the node.

This patch adds an optimization to skip creating evaluations for system
jobs not in the current node's DC. While the scheduler performs the same
feasability check, skipping the creation of the evaluation altogether
saves disk, network, and memory.
2022-07-06 14:35:18 -07:00
Seth Hoenig 5dd8aa3e27 client: enforce max_kill_timeout client configuration
This PR fixes a bug where client configuration max_kill_timeout was
not being enforced. The feature was introduced in 9f44780 but seems
to have been removed during the major drivers refactoring.

We can make sure the value is enforced by pluming it through the DriverHandler,
which now uses the lesser of the task.killTimeout or client.maxKillTimeout.
Also updates Event.SetKillTimeout to require both the task.killTimeout and
client.maxKillTimeout so that we don't make the mistake of using the wrong
value - as it was being given only the task.killTimeout before.
2022-07-06 15:29:38 -05:00
Luiz Aoqui a9a66ad018
api: apply new ACL check for wildcard namespace (#13608)
api: apply new ACL check for wildcard namespace

In #13606 the ACL check was refactored to better support the all
namespaces wildcard (`*`). This commit applies the changes to the jobs
and alloc list endpoints.
2022-07-06 16:17:16 -04:00
Luiz Aoqui 74c5578432
api: refactor ACL check for namespace wildcard (#13606)
Improve how the all namespaces wildcard (`*`) is handled when checking
ACL permissions. When using the wildcard namespace the `AllowNsOp` would
return false since it looks for a namespace called `*` to match.

This commit changes this behavior to return `true` when the queried
namespace is `*` and the token allows the operation in _any_ namespace.

Actual permission must be checked per object. The helper function
`AllowNsOpFunc` returns a function that can be used to make this
verification.
2022-07-06 15:22:30 -04:00
James Rasell 0c0b028a59
core: allow deleting of evaluations (#13492)
* core: add eval delete RPC and core functionality.

* agent: add eval delete HTTP endpoint.

* api: add eval delete API functionality.

* cli: add eval delete command.

* docs: add eval delete website documentation.
2022-07-06 16:30:11 +02:00
James Rasell 181b247384
core: allow pausing and un-pausing of leader broker routine (#13045)
* core: allow pause/un-pause of eval broker on region leader.

* agent: add ability to pause eval broker via scheduler config.

* cli: add operator scheduler commands to interact with config.

* api: add ability to pause eval broker via scheduler config

* e2e: add operator scheduler test for eval broker pause.

* docs: include new opertor scheduler CLI and pause eval API info.
2022-07-06 16:13:48 +02:00
Seth Hoenig 97726c2fd8
Merge pull request #12862 from hashicorp/f-choose-services
api: enable selecting subset of services using rendezvous hashing
2022-06-30 15:17:40 -05:00
Michael Schurter 1cc0ae8795
docs: fix Plan{,Result}.NodeUpdate comment (#13534)
It appears way back when this was first implemented in
9a917281af9c0a97a6c59575eaa52c5c86ffc60d, it was renamed from
NodeEvict (with a correct comment) to NodeUpdate. The comment was
changed from referring to only evictions to referring to "all allocs" in
the first sentence and "stop or evict" in the second.

This confuses every time I see it because I read the name (NodeUpdate)
and first sentence ("all the allocs") and assume this represents *all*
allocations... which isn't true.

I'm going to assume I'm the only one who doesn't read the 2nd sentence
and that's why this suboptimal wording has lasted 7 years, but can we
change it for my sake?
2022-06-30 12:47:14 -07:00
James Rasell d080eed9ae
client: fixed a problem calculating a service namespace. (#13493)
When calculating a services namespace for registration, the code
assumed the first task within the task array would include a
service block. This is incorrect as it is possible only a latter
task within the array contains a service definition.

This change fixes the logic, so we correctly search for a service
definition before identifying the namespace.
2022-06-28 09:47:28 +02:00
Shishir Mahajan 6ba8245283 Fix typo: orthogonal.
Signed-off-by: Shishir Mahajan <smahajan@roblox.com>
2022-06-27 12:12:51 -07:00
Seth Hoenig 9467bc9eb3 api: enable selecting subset of services using rendezvous hashing
This PR adds the 'choose' query parameter to the '/v1/service/<service>' endpoint.

The value of 'choose' is in the form '<number>|<key>', number is the number
of desired services and key is a value unique but consistent to the requester
(e.g. allocID).

Folks aren't really expected to use this API directly, but rather through consul-template
which will soon be getting a new helper function making use of this query parameter.

Example,

curl 'localhost:4646/v1/service/redis?choose=2|abc123'

Note: consul-templte v0.29.1 includes the necessary nomadServices functionality.
2022-06-25 10:37:37 -05:00
Tim Gross 4368dcc02f
fix deadlock in plan_apply (#13407)
The plan applier has to get a snapshot with a minimum index for the
plan it's working on in order to ensure consistency. Under heavy raft
loads, we can exceed the timeout. When this happens, we hit a bug
where the plan applier blocks waiting on the `indexCh` forever, and
all schedulers will block in `Plan.Submit`.

Closing the `indexCh` when the `asyncPlanWait` is done with it will
prevent the deadlock without impacting correctness of the previous
snapshot index.

This changeset includes the a PoC failing test that works by injecting
a large timeout into the state store. We need to turn this into a test
we can run normally without breaking the state store before we can
merge this PR.

Increase `snapshotMinIndex` timeout to 10s.
This timeout creates backpressure where any concurrent `Plan.Submit`
RPCs will block waiting for results. This sheds load across all
servers and gives raft some CPU to catch up, because schedulers won't
dequeue more work while waiting. Increase it to 10s based on
observations of large production clusters.
2022-06-23 12:06:27 -04:00
Grant Griffiths 99896da443
CSI: make plugin health_timeout configurable in csi_plugin stanza (#13340)
Signed-off-by: Grant Griffiths <ggriffiths@purestorage.com>
2022-06-14 10:04:16 -04:00
Tim Gross 9d5523a72d
CSI: skip node unpublish on GC'd or down nodes (#13301)
If the node has been GC'd or is down, we can't send it a node
unpublish. The CSI spec requires that we don't send the controller
unpublish before the node unpublish, but in the case where a node is
gone we can't know the final fate of the node unpublish step.

The `csi_hook` on the client will unpublish if the allocation has
stopped and if the host is terminated there's no mount for the volume
anyways. So we'll now assume that the node has unpublished at its
end. If it hasn't, any controller unpublish will potentially hang or
error and need to be retried.
2022-06-09 11:33:22 -04:00
James Rasell f5e78a3791
state: only update index on change when deleting evals. (#13227)
When deleting evaluations and allocations during a reap event, the
index table entries for evals and allocs was updated irregardless
of whether changes were made.

This change modifies the state logic so that the index table is
only modified when the corresponding table has actually been
modified. Along with matching expected behaviour, this change has
the potential to reduce the number of times blocking queries will
return without any real state change.
2022-06-07 11:56:43 +02:00
Lance Haig 4bf27d743d
Allow Operator Generated bootstrap token (#12520) 2022-06-03 07:37:24 -04:00
Huan Wang 7d15157635
adding support for customized ingress tls (#13184) 2022-06-02 18:43:58 -04:00
Seth Hoenig 0399b7e4c5
Merge pull request #12951 from jorgemarey/f-srv-tagged-addresses
Allow setting tagged addresses on services
2022-06-01 10:51:49 -05:00
Seth Hoenig 189176f052 consul: avoid reflection in comparing service map types 2022-06-01 10:22:00 -05:00
Tim Gross 6873670dd6
refactor index threshold calculation for core GC jobs (#13196)
Almost all GC jobs check the index of the objects being GC'd to see if
they're older than a configured threshold. This code was repeated six
times in `CoreScheduler` with only logging changes, so it seems safe
to extract it as its own method.
2022-06-01 11:12:20 -04:00
Seth Hoenig dca954faac build: update golangci-lint to v1.46.2
This version of golangci-lint improves support for generics, but also
is more strict in copy vs. loop for slice copying.
2022-05-31 23:32:01 +00:00
Seth Hoenig 54efec5dfe docs: add docs and tests for tagged_addresses 2022-05-31 13:02:48 -05:00
Jorge Marey f966614602 Allow setting tagged addresses on services 2022-05-31 10:06:55 -05:00
Seth Hoenig 4631045d83 connect: enable setting connect upstream destination namespace 2022-05-26 09:39:36 -05:00
Luiz Aoqui 769ff1dcc3
Merge pull request #13109 from hashicorp/merge-release-1.3.1-branch
Merge release 1.3.1 branch
2022-05-25 10:45:09 -04:00
Seth Hoenig 626a345fb2 ci: switch to 22.04 LTS for GHA Core CI tests 2022-05-25 08:19:40 -05:00
Michael Schurter 2965dc6a1a
artifact: fix numerous go-getter security issues
Fix numerous go-getter security issues:

- Add timeouts to http, git, and hg operations to prevent DoS
- Add size limit to http to prevent resource exhaustion
- Disable following symlinks in both artifacts and `job run`
- Stop performing initial HEAD request to avoid file corruption on
  retries and DoS opportunities.

**Approach**

Since Nomad has no ability to differentiate a DoS-via-large-artifact vs
a legitimate workload, all of the new limits are configurable at the
client agent level.

The max size of HTTP downloads is also exposed as a node attribute so
that if some workloads have large artifacts they can specify a high
limit in their jobspecs.

In the future all of this plumbing could be extended to enable/disable
specific getters or artifact downloading entirely on a per-node basis.
2022-05-24 16:29:39 -04:00
Luiz Aoqui 0a00059f3c
core: test duplicated blocked eval stats
In the original test, the eval generator would use a random value for
the job ID, resulting in an unxercised code path for duplicate blocked
evals.
2022-05-24 15:44:06 -04:00
Seth Hoenig a5943da0c7 core: add tests for blocked evals math 2022-05-24 09:05:18 -05:00
Seth Hoenig 0c145ac1e4 core: remove correct set of resources on blocked eval 2022-05-23 15:18:55 -05:00
Seth Hoenig fc58f4972c cli: correctly use and validate job with vault token set
This PR fixes `job validate` to respect '-vault-token', '$VAULT_TOKEN',
'-vault-namespace' if set.
2022-05-19 12:13:34 -05:00
Eng Zer Jun 97d1bc735c
test: use T.TempDir to create temporary test directory (#12853)
* test: use `T.TempDir` to create temporary test directory

This commit replaces `ioutil.TempDir` with `t.TempDir` in tests. The
directory created by `t.TempDir` is automatically removed when the test
and all its subtests complete.

Prior to this commit, temporary directory created using `ioutil.TempDir`
needs to be removed manually by calling `os.RemoveAll`, which is omitted
in some tests. The error handling boilerplate e.g.
	defer func() {
		if err := os.RemoveAll(dir); err != nil {
			t.Fatal(err)
		}
	}
is also tedious, but `t.TempDir` handles this for us nicely.

Reference: https://pkg.go.dev/testing#T.TempDir
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

* test: fix TestLogmon_Start_restart on Windows

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

* test: fix failing TestConsul_Integration

t.TempDir fails to perform the cleanup properly because the folder is
still in use

testing.go:967: TempDir RemoveAll cleanup: unlinkat /tmp/TestConsul_Integration2837567823/002/191a6f1a-5371-cf7c-da38-220fe85d10e5/web/secrets: device or resource busy

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2022-05-12 11:42:40 -04:00
James Rasell 9ea1a6faf6
fsm: add service registration snapshot persistence. (#12896) 2022-05-06 15:53:27 +02:00
James Rasell a05114fdac
core: add namespace to plan for node rejected log line. (#12868) 2022-05-05 10:56:40 +02:00
Tim Gross 45b238ec82
CSI: node drain should end once only plugins remain (#12846)
In #12324 we made it so that plugins wait until the node drain is
complete, as we do for system jobs. But we neglected to mark the node
drain as complete once only plugins (or system jobs) remaining, which
means that the node drain is left in a draining state until the
`deadline` time expires. This was incorrectly documented as expected
behavior in #12324.
2022-05-03 10:20:22 -04:00
Luiz Aoqui a8cc633156
vault: revert support for entity aliases (#12723)
After a more detailed analysis of this feature, the approach taken in
PR #12449 was found to be not ideal due to poor UX (users are
responsible for setting the entity alias they would like to use) and
issues around jobs potentially masquerading itself as another Vault
entity.
2022-04-22 10:46:34 -04:00
Seth Hoenig c4aab10e53 services: cr followup 2022-04-22 09:14:29 -05:00
Seth Hoenig 3fcac242c6 services: enable setting arbitrary address value in service registrations
This PR introduces the `address` field in the `service` block so that Nomad
or Consul services can be registered with a custom `.Address.` to advertise.

The address can be an IP address or domain name. If the `address` field is
set, the `service.address_mode` must be set in `auto` mode.
2022-04-22 09:14:29 -05:00
James Rasell 716b8e658b
api: Add support for filtering and pagination to the node list endpoint (#12727) 2022-04-21 17:04:33 +02:00
Derek Strickland 5e309f3f33
reconciler: Handle canaries when client disconnects (#12539)
* plan_apply: Allow node updates in disconnected node plans
* plan: Keep the job when persisting unknown allocs
* reconciler: stop unknown allocs when stopping all
* reconcile_util: reorder filtering to handle canaries; skip rescheduling unknown
* heartbeat: Fix bug in node heartbeating
2022-04-21 10:05:58 -04:00
James Rasell 257e1c4f96
autopilot: correctly return errors within state functions. (#12714) 2022-04-21 08:54:50 +02:00
James Rasell 010acce59f
job_hooks: add implicit constraint when using Consul for services. (#12602) 2022-04-20 14:09:13 +02:00
Seth Hoenig 411158acff
Merge pull request #12586 from hashicorp/f-local-si-token
connect: create SI tokens in local scope
2022-04-19 07:53:01 -05:00
chavacava eb1c42e643
QueryOptions.SetTimeToBlock should take pointer receiver
Fixes a bug where blocking queries that are retried don't have their blocking 
timeout reset, resulting in them running longer than expected.
2022-04-18 10:41:27 -04:00
Seth Hoenig df587d8263 docs: update documentation with connect acls changes
This PR updates the changelog, adds notes the 1.3 upgrade guide, and
updates the connect integration docs with documentation about the new
requirement on Consul ACL policies of Consul agent default anonymous ACL
tokens.
2022-04-18 08:22:33 -05:00
Jorge Marey 707c7f3a11 Change consul SI tokens to be local 2022-04-18 08:22:33 -05:00
Shishir f5121d261e
Add os to NodeListStub struct. (#12497)
* Add os to NodeListStub struct.

Signed-off-by: Shishir Mahajan <smahajan@roblox.com>

* Add os as a query param to /v1/nodes.

Signed-off-by: Shishir Mahajan <smahajan@roblox.com>

* Add test: os as a query param to /v1/nodes.

Signed-off-by: Shishir Mahajan <smahajan@roblox.com>
2022-04-15 17:22:45 -07:00
Tim Gross 826d9d47f9
CSI: replace structs->api with serialization extension (#12583)
The CSI HTTP API has to transform the CSI volume to redact secrets,
remove the claims fields, and to consolidate the allocation stubs into
a single slice of alloc stubs. This was done manually in #8590 but
this is a large amount of code and has proven both very bug prone
(see #8659, #8666, #8699, #8735, and #12150) and requires updating
lots of code every time we add a field to volumes or plugins.

In #10202 we introduce encoding improvements for the `Node` struct
that allow a more minimal transformation. Apply this same approach to
serializing `structs.CSIVolume` to API responses.

Also, the original reasoning behind #8590 for plugins no longer holds
because the counts are now denormalized within the state store, so we
can simply remove this transformation entirely.
2022-04-15 14:29:34 -04:00
Derek Strickland 4d3a0aae6d
heartbeat: Handle transitioning from disconnected to down (#12559) 2022-04-15 09:47:45 -04:00
Derek Strickland 0891218ee9
system_scheduler: support disconnected clients (#12555)
* structs: Add helper method for checking if alloc is configured to disconnect
* system_scheduler: Add support for disconnected clients
2022-04-15 09:31:32 -04:00
James Rasell 4cdc46ae75
service discovery: add pagination and filtering support to info requests (#12552)
* services: add pagination and filter support to info RPC.
* cli: add filter flag to service info command.
* docs: add pagination and filter details to services info API.
* paginator: minor updates to comment and func signature.
2022-04-13 07:41:44 +02:00
Luiz Aoqui 82027edb2f
add some godocs for the API pagination tokenizer options (#12547) 2022-04-12 10:27:22 -04:00
Tim Gross 4078e6ea0e
scripts: fix interpreter for bash (#12549)
Many of our scripts have a non-portable interpreter line for bash and
use bash-specific variables like `BASH_SOURCE`. Update the interpreter
line to be portable between various Linuxes and macOS without
complaint from posix shell users.
2022-04-12 10:08:21 -04:00
Yoan Blanc 5e8254beda
feat: remove dependency to consul/lib
Signed-off-by: Yoan Blanc <yoan@dosimple.ch>
2022-04-09 13:22:44 +02:00
Lars Lehtonen df1edf5cf4
nomad/state: fix dropped test errors (#12406) 2022-04-07 10:48:10 -04:00
Derek Strickland 065b3ed886
plan_apply: Add missing unit test for validating plans for disconnected clients (#12495) 2022-04-07 09:58:09 -04:00
Jasmine Dahilig 38efb3c8d8
metrics: emit stats for vault token next_renewal & last_renewal #5222 (#12435) 2022-04-06 10:03:11 -07:00