Commit graph

4408 commits

Author SHA1 Message Date
hc-github-team-nomad-core 552dad53f5
test: fix name of state service registration test file. (#18406) (#18422)
Co-authored-by: James Rasell <jrasell@users.noreply.github.com>
2023-09-07 11:00:13 +01:00
hc-github-team-nomad-core 286164288d
backport of commit b6f6541f5024fb847a230ecd612a15f4a4fc05a4 (#18416)
Co-authored-by: James Rasell <jrasell@users.noreply.github.com>
2023-09-07 09:14:56 +01:00
hc-github-team-nomad-core a9441eb589
backport of commit c28cd59655b88d64f841bed5da7996ae8d88772f (#18411)
Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
2023-09-06 11:32:42 -05:00
hc-github-team-nomad-core 1b2237d6a8
backport of commit 776a26bce7cf3a320fc7e7f4a6bf9da2b30f3da7 (#18375)
Co-authored-by: James Rasell <jrasell@users.noreply.github.com>
2023-09-01 10:25:08 +01:00
hc-github-team-nomad-core b93dc92ec2
backport of commit f187afab9f06b7489f7103c3e3c8eed72f210621 (#18350)
Co-authored-by: Gerard Nguyen <nguyenvanthao1991@gmail.com>
2023-08-28 19:14:45 -04:00
hc-github-team-nomad-core b0bece8a18
backport of commit da830b10463f1cc0a704ec4a4f66e35d4324d728 (#18337)
Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
2023-08-25 21:36:35 -04:00
James Rasell 3730b66d8c
test: use correct parallel test setup func (#18326) (#18330) 2023-08-25 14:48:06 +01:00
hc-github-team-nomad-core 4ebd0d251f
backport of commit f7a336d2ba95a362504d6094e581b8aeedbd554e (#18323)
Co-authored-by: James Rasell <jrasell@users.noreply.github.com>
2023-08-25 09:37:13 +01:00
hc-github-team-nomad-core e4c7388608
backport of commit 3e61b3a37df9ff0836b52ba5440106ad0f607dd7 (#18294)
Co-authored-by: Андрей Неустроев <99169437+aneustroev@users.noreply.github.com>
2023-08-22 16:01:24 -04:00
James Rasell 1397ec4ad6
nomad: remove custom max func and use Go 1.21.0 builtin (#18237) (#18251) 2023-08-17 16:17:43 +01:00
Luiz Aoqui d910457159 csi: prevent panic on volume delete (#18234)
When a CSI volume is deleted while its plugin is not running, the
function `volAndPluginLookup` returns a `nil` plugin value resulting in a
panic in the request handler.
2023-08-17 10:09:45 -04:00
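A minimal sketch of the guard this fix implies, using stand-in types and a hypothetical lookup rather than Nomad's actual CSI structs:

```go
package main

import (
	"errors"
	"fmt"
)

// Stand-ins for Nomad's CSI volume and plugin structs.
type Volume struct{ ID, PluginID string }
type Plugin struct{ ID string }

var ErrPluginNotRunning = errors.New("csi plugin is not running")

// lookup mimics volAndPluginLookup: when the plugin is down it can
// succeed while returning a nil plugin alongside the volume.
func lookup(volID string) (*Volume, *Plugin, error) {
	return &Volume{ID: volID, PluginID: "ebs0"}, nil, nil
}

func deleteVolume(volID string) error {
	vol, plugin, err := lookup(volID)
	if err != nil {
		return err
	}
	// The fix: return an error cleanly instead of dereferencing a
	// nil plugin in the request handler.
	if plugin == nil {
		return fmt.Errorf("deleting volume %q: %w", vol.ID, ErrPluginNotRunning)
	}
	fmt.Println("deleting via plugin", plugin.ID)
	return nil
}

func main() {
	fmt.Println(deleteVolume("vol0"))
}
```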
Tim Gross 0a19fe3b60 fix multiple overflow errors in exponential backoff (#18200)
We use capped exponential backoff in several places in the code when handling
failures. The code we've copy-and-pasted all over has a check to see if the
backoff is greater than the limit, but this check happens after the bitshift and
we always increment the number of attempts. This causes an overflow with a
fairly small number of failures (e.g., at one place I tested it occurs after only
24 iterations), resulting in a negative backoff which then never recovers. The
backoff becomes a tight loop consuming resources and/or DoS'ing a Nomad RPC
handler or an external API such as Vault. Note this doesn't occur in places
where we cap the number of iterations so the loop breaks (usually to return an
error), so long as the number of iterations is reasonable.

Introduce a helper with a check on the cap before the bitshift to avoid overflow in
all places where this can occur.

Fixes: #18199
Co-authored-by: stswidwinski <stan.swidwinski@gmail.com>
2023-08-15 14:39:09 -04:00
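A sketch of such a helper (name and signature assumed); the key point from the commit is that the cap is tested before any left shift, here by right-shifting the limit instead, which cannot overflow:

```go
package main

import (
	"fmt"
	"time"
)

// expBackoff returns base << attempt, capped at limit. Comparing
// base against limit >> attempt checks the cap *before* the bitshift,
// so no attempt count can wrap time.Duration into a negative value.
func expBackoff(base, limit time.Duration, attempt uint64) time.Duration {
	if base <= 0 || base >= limit>>attempt {
		return limit
	}
	return base << attempt
}

func main() {
	// With the old pattern, time.Second<<34 already wraps int64
	// nanoseconds; here every large attempt simply pins to the cap.
	for _, n := range []uint64{0, 5, 24, 64} {
		fmt.Println(n, expBackoff(time.Second, time.Minute, n))
	}
}
```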
Seth Hoenig a45b689d8e update go1.21 (#18184)
* build: update to go1.21

* go: eliminate helpers in favor of min/max

* build: run go mod tidy

* build: swap depguard for semgrep

* command: fixup broken tls error check on go1.21
2023-08-15 14:40:33 +02:00
Tim Gross a3a86a849a test: deflake node drain integration test (#18171)
The `TestDrainer_AllTypes_NoDeadline` test has been flaky. It looks like this
might be because the final update marking batch allocations complete improperly
updates the state store directly rather than going through an RPC. If the service jobs have
restarted in the meantime, the `allocClientStateSimulator` will have updated the
index on the allocations table and that will prevent the drainer from
unblocking (and being marked complete) when the batch jobs are written with an
earlier index.

This changeset attempts to fix that by making the update via RPC (as it normally
would be in real code).
2023-08-14 16:19:00 -04:00
Tim Gross 577d96034d test: deflake job endpoint registration test (#18170)
We've seen test flakiness in the `TestJobEndpoint_Register_NonOverlapping` test,
which asserts that we don't try to place allocations for blocked evals until
resources have been actually freed by setting the client status of the previous
alloc to complete.

The flaky assertion sorts the two allocations by CreateIndex, and this appears
to be a non-stable sort in the context of the test run, which results in
failures that shouldn't exist. There's no reason to sort the allocations instead
of just examining them by ID. This changeset does so.
2023-08-14 16:18:53 -04:00
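A small illustration of the fix, with stand-in types: when CreateIndex values tie, the sort order is not deterministic, so the test looks allocations up by ID instead:

```go
package main

import "fmt"

type Alloc struct {
	ID          string
	CreateIndex uint64
}

// byID avoids the flaky pattern of sorting two allocs by CreateIndex
// (ties leave the order unspecified) and just finds the one we want.
func byID(allocs []Alloc, id string) *Alloc {
	for i := range allocs {
		if allocs[i].ID == id {
			return &allocs[i]
		}
	}
	return nil
}

func main() {
	// Equal CreateIndex: a sort on this key has no stable winner.
	allocs := []Alloc{{"a1", 10}, {"a2", 10}}
	fmt.Println(byID(allocs, "a2").ID)
}
```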
Esteban Barrios 9f19d7c373 config: add configurable content security policy (#18085) 2023-08-14 14:25:21 -04:00
hc-github-team-nomad-core f812bccb4e
Backport of Tuning job versions retention. #17635 into release/1.6.x (#18169)
This pull request was automerged via backport-assistant
2023-08-07 13:48:09 -05:00
hc-github-team-nomad-core ebcdd4d82d
backport of commit 65501ff97aa2ec6fa3c4f53d3f8c6c80c6a0e8a3 (#18166)
This pull request was automerged via backport-assistant
2023-08-07 10:17:34 -05:00
Charlie Voiselle bac4d112d1 [dep] bump golang.org/x/exp (#18102)
Some refactoring had to be done in the getter and state packages where the
API changed in `slices`.

* Bump golang.org/x/exp
* Bump golang.org/x/exp in api
* Update job_endpoint_test
* [feedback] unexport sort function
2023-08-03 15:14:39 -04:00
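One example of the kind of mechanical change a bump like this forces, assuming it picked up the well-known x/exp/slices comparator change from a less-bool to a three-way int function:

```go
package main

import (
	"fmt"
	"strings"

	"golang.org/x/exp/slices"
)

func main() {
	jobs := []string{"web", "api", "batch"}

	// Before the bump, SortFunc took a less(a, b) bool comparator:
	//   slices.SortFunc(jobs, func(a, b string) bool { return a < b })
	// After it, SortFunc wants a cmp(a, b) int three-way comparator.
	slices.SortFunc(jobs, strings.Compare)

	fmt.Println(jobs) // [api batch web]
}
```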
hc-github-team-nomad-core 2ed92e0c6c
Backport of feature: Add new field render_templates on restart block into release/1.6.x (#18094)
This pull request was automerged via backport-assistant
2023-07-28 13:54:00 -05:00
hc-github-team-nomad-core 4f087674f4
backport of commit 7fe432042eaa0a97c0aaa40d302055eb18e8a9b0 (#18040)
This pull request was automerged via backport-assistant
2023-07-24 02:28:28 -05:00
hc-github-team-nomad-core 30260f06e8
Backport of state: canonicalize namespace on restore into release/1.6.x (#18018)
This pull request was automerged via backport-assistant
2023-07-20 15:05:16 -05:00
hc-github-team-nomad-core e891026755
Backport of CSI: improve controller RPC reliability into release/1.6.x (#18015)
This pull request was automerged via backport-assistant
2023-07-20 13:52:27 -05:00
hc-github-team-nomad-core c67a225882 Prepare for next release 2023-07-18 18:51:15 +00:00
hc-github-team-nomad-core 609a97cfab Generate files for 1.6.0 release 2023-07-18 18:51:11 +00:00
Tim Gross e8bfef8148 search: fix ACL filtering for plugins and variables
ACL permissions for the search endpoints are done in three passes. The
first (the `sufficientSearchPerms` method) is for performance and coarsely
rejects requests based on the passed-in context parameter if the user has no
permissions to any object in that context. The second (the
`filteredSearchContexts` method) filters out contexts based on whether the user
has permissions either to the requested namespace or again by context (to catch
the "all" context). Finally, when iterating over the objects available, we do
the usual filtering in the iterator.

Internal testing found several bugs in this filtering:
* CSI plugins can be searched by any authenticated user.
* Variables can be searched if the user has `job:read` permissions to the
  variable's namespace instead of `variable:list`.
* Variables cannot be searched by wildcard namespace.

This is an information leak of the plugin names and variable paths, which we
don't consider privileged information but intended to protect anyway.

This changeset fixes these bugs by ensuring CSI plugins are filtered in the first-
and second-pass ACL filters, and changes variables to check `variable:list` in the
second-pass filter unless the wildcard namespace is passed (at which point we fall
back to filtering in the iterator).

Fixes: CVE-2023-3300
Fixes: #17906
2023-07-18 12:09:55 -04:00
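A toy model of the second-pass filter the changeset describes; the capability names and the `allowed` helper are illustrative, not Nomad's ACL API:

```go
package main

import "fmt"

type ACL struct{ caps map[string]bool }

func (a *ACL) allowed(capability string) bool { return a.caps[capability] }

// filterContexts keeps only the search contexts the caller may see.
func filterContexts(a *ACL, ns string, contexts []string) []string {
	var out []string
	for _, ctx := range contexts {
		switch ctx {
		case "plugins":
			// Fix: plugins need a read capability, not mere authentication.
			if a.allowed("plugin:read") {
				out = append(out, ctx)
			}
		case "variables":
			// Fix: check variable:list rather than job:read; with the
			// wildcard namespace, defer to the per-object iterator filter.
			if ns == "*" || a.allowed("variable:list") {
				out = append(out, ctx)
			}
		}
	}
	return out
}

func main() {
	authedOnly := &ACL{caps: map[string]bool{}}
	fmt.Println(filterContexts(authedOnly, "default", []string{"plugins", "variables"})) // []
}
```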
hc-github-team-nomad-core 0951fe1c50
backport of commit 0a5e90120b18ff450457463d6bcee68ec6804bb0 (#17900)
This pull request was automerged via backport-assistant
2023-07-11 10:00:05 -05:00
Lance Haig 0455389534
Add the ability to customise the details of the CA (#17309)
Co-authored-by: James Rasell <jrasell@users.noreply.github.com>
2023-07-11 08:53:09 +01:00
Tim Gross ad7355e58b
CSI: persist previous mounts on client to restore during restart (#17840)
When claiming a CSI volume, we need to ensure the CSI node plugin is running
before we send any CSI RPCs. This extends even to the controller publish RPC
because it requires the storage provider's "external node ID" for the
client. This primarily impacts client restarts but also is a problem if the node
plugin exits (and fingerprints) while the allocation that needs a CSI volume
claim is being placed.

Unfortunately there's no mapping of volume to plugin ID available in the
jobspec, so we don't have enough information to wait on plugins until we either
get the volume from the server or retrieve the plugin ID from data we've
persisted on the client.

If we always require getting the volume from the server before making the claim,
a client restart for disconnected clients will cause all the allocations that
need CSI volumes to fail. Even while connected, checking in with the server to
verify the volume's plugin before trying to make a claim RPC is inherently racy,
so we'll leave that case as-is and it will fail the claim if the node plugin
needed to support a newly-placed allocation is flapping such that the node
fingerprint is changing.

This changeset persists a minimum subset of data about the volume and its plugin
in the client state DB, and retrieves that data during the CSI hook's prerun to
avoid re-claiming and remounting the volume unnecessarily.

This changeset also updates the RPC handler to use the external node ID from the
claim whenever it is available.

Fixes: #13028
2023-07-10 13:20:15 -04:00
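A sketch of the "minimum subset of data" idea, with field names assumed for illustration rather than taken from the client's actual state schema:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mountInfo is the kind of per-volume record a client could persist
// in its state DB so the CSI hook's prerun can wait on the right
// plugin and skip a redundant re-claim and remount after restart.
type mountInfo struct {
	VolumeID       string `json:"volume_id"`
	PluginID       string `json:"plugin_id"`
	ExternalNodeID string `json:"external_node_id"` // reused by controller publish RPCs
	MountPoint     string `json:"mount_point"`
}

func main() {
	// On claim: serialize into the state DB (boltdb in the real client).
	rec := mountInfo{"vol0", "ebs0", "i-abc123", "/csi/per-alloc/vol0"}
	buf, _ := json.Marshal(rec)

	// On restart: restore in prerun instead of re-claiming the volume.
	var restored mountInfo
	_ = json.Unmarshal(buf, &restored)
	fmt.Printf("%+v\n", restored)
}
```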
Tim Gross 5025731ebe
consul: handle "not found" errors from Consul when deleting tokens (#17847)
In Consul 1.15.0, the Delete Token API was changed so as to return an error when
deleting a non-existent ACL token. This means that if Nomad successfully deletes
the token but fails to persist that fact, it will get stuck trying to delete a
non-existent token forever.

Update the token deletion function to ignore "not found" errors and treat them
as successful deletions.

Fixes: #17833
2023-07-07 16:22:13 -04:00
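A hedged sketch of the tolerant delete; the exact error text matched here is an assumption about Consul's response, not a quote from Nomad's code:

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// deleteToken treats Consul's "not found" response as a successful
// deletion, so a delete whose success was never persisted can't wedge
// Nomad in an endless retry loop against a token that's already gone.
func deleteToken(del func(accessorID string) error, accessorID string) error {
	err := del(accessorID)
	if err != nil && strings.Contains(err.Error(), "ACL not found") {
		return nil // already deleted: count it as success
	}
	return err
}

func main() {
	gone := func(string) error {
		return errors.New("Unexpected response code: 403 (ACL not found)")
	}
	fmt.Println(deleteToken(gone, "8f246b77-f3e1-...")) // <nil>
}
```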
James Rasell 45073e8a05
job: ensure node pool is canonicalized for state restores. (#17765) 2023-06-30 07:37:22 +01:00
nicoche 649831c1d3
deploymentwatcher: fail early whenever possible (#17341)
Given a deployment that has a `progress_deadline`, if a task group runs
out of reschedule attempts, allow it to fail at this time instead of
waiting until the `progress_deadline` is reached.

Fixes: #17260
2023-06-26 14:01:03 -04:00
James Rasell 74ab0badb4
test: add drain config tests. (#17724) 2023-06-26 16:23:13 +01:00
Luiz Aoqui 66962b2b28
np: fix list of jobs for node pool all (#17705)
Unlike nodes, jobs are allowed to be registered in the node pool `all`,
in which case all nodes are used for evaluating placements. When listing
jobs for the `all` node pool only those that are explicitly in this node
pool should be returned.
2023-06-23 15:47:53 -04:00
grembo 7936c1e33f
Add disable_file parameter to job's vault stanza (#13343)
This complements the `env` parameter, so that the operator can author
tasks that don't share their Vault token with the workload when using 
`image` filesystem isolation. As a result, more powerful tokens can be used 
in a job definition, allowing it to use template stanzas to issue all kinds of 
secrets (database secrets, Vault tokens with very specific policies, etc.), 
without sharing that issuing power with the task itself.

This is accomplished by creating a directory called `private` within
the task's working directory, which shares many properties of
the `secrets` directory (tmpfs where possible, not accessible by
`nomad alloc fs` or Nomad's web UI), but isn't mounted into/bound to the
container.

If the `disable_file` parameter is set to `false` (its default), the Vault token
is also written to the NOMAD_SECRETS_DIR, so the default behavior is
backwards compatible. Even if the operator never changes the default,
they will still benefit from the improved behavior of Nomad never reading
the token back in from that (potentially altered) location.
2023-06-23 15:15:04 -04:00
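A sketch of the write path this describes, with directory and flag names assumed for illustration (the real task runner hook is more involved):

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeVaultToken always places the token in the task's private/ dir
// (tmpfs where possible, not mounted into the container) and only
// additionally in secrets/ when disable_file is false, the
// backwards-compatible default.
func writeVaultToken(taskDir, token string, disableFile bool) error {
	private := filepath.Join(taskDir, "private", "vault_token")
	if err := os.WriteFile(private, []byte(token), 0o600); err != nil {
		return err
	}
	if disableFile {
		return nil // the workload never sees the token on disk
	}
	// Legacy location inside NOMAD_SECRETS_DIR, visible to the task.
	secrets := filepath.Join(taskDir, "secrets", "vault_token")
	return os.WriteFile(secrets, []byte(token), 0o600)
}

func main() {
	dir, _ := os.MkdirTemp("", "task")
	_ = os.MkdirAll(filepath.Join(dir, "private"), 0o700)
	_ = os.MkdirAll(filepath.Join(dir, "secrets"), 0o700)
	fmt.Println(writeVaultToken(dir, "s.example", true))
}
```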
James Rasell 78cdf0d0d8
server: remove unused endpoints struct. (#17665) 2023-06-23 08:20:33 +01:00
Luiz Aoqui 8f05eaaa68
np: check for license on RPC endpoints (#17656) 2023-06-22 12:52:20 -04:00
Tim Gross 11216d09af
client: send node secret with every client-to-server RPC (#16799)
In Nomad 1.5.3 we fixed a security bug that allowed bypass of ACL checks if the
request came through a client node first. But this fix knowingly broke the
identification of many client-to-server RPCs. These will now be measured as if
they were anonymous. The reason for this is that many client-to-server RPCs do
not send the node secret and instead rely on the protection of mTLS.

This changeset ensures that the node secret is being sent with every
client-to-server RPC request. In a future version of Nomad we can add
enforcement on the server side, but this was left out of this changeset to
reduce risks to the safe upgrade path.

Sending the node secret as an auth token introduces a new problem during initial
introduction of a client. Clients send many RPCs concurrently with
`Node.Register`, but until the node is registered the node secret is unknown to
the server and will be rejected as invalid. This causes permission denied
errors.

To fix that, this changeset introduces a gate on having successfully made a
`Node.Register` RPC before any other RPCs can be sent (except for `Status.Ping`,
which we need earlier but which also ignores the error because that handler
doesn't do an authorization check). This ensures that we only send requests with
a node secret already known to the server. This also makes client startup a
little easier to reason about because we know `Node.Register` must succeed
first, and it should make for a good place to hook in future plans for secure
introduction of nodes. The tradeoff is that an existing client that has running
allocs will take slightly longer (a second or two) to transition to ready after
a restart, because the transition in `Node.UpdateStatus` is gated at the server
by first submitting `Node.UpdateAlloc` with client alloc updates.
2023-06-22 11:06:49 -04:00
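A minimal sketch of the registration-gate pattern this describes, using a close-once channel; the names are illustrative, not the client's actual implementation:

```go
package main

import (
	"fmt"
	"sync"
)

// registrationGate blocks RPC senders until Node.Register has
// succeeded once, so every later request carries a node secret the
// server already knows. Status.Ping would bypass the gate.
type registrationGate struct {
	once sync.Once
	done chan struct{}
}

func newRegistrationGate() *registrationGate {
	return &registrationGate{done: make(chan struct{})}
}

// markRegistered is called after the first successful Node.Register.
func (g *registrationGate) markRegistered() {
	g.once.Do(func() { close(g.done) })
}

// wait blocks an RPC (e.g. Node.UpdateAlloc) until registration.
func (g *registrationGate) wait() { <-g.done }

func main() {
	g := newRegistrationGate()
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		g.wait()
		fmt.Println("sending authenticated client RPC")
	}()
	g.markRegistered() // Node.Register succeeded
	wg.Wait()
}
```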
James Rasell 4e2d019639
variables: remove unused state store functions. (#17660) 2023-06-22 13:54:58 +01:00
James Rasell 71fdd7e891
core: use faster concatenation for alloc name generation. (#17591) 2023-06-22 07:46:28 +01:00
Luiz Aoqui ac08fc751b
node pools: apply node pool scheduler configuration (#17598) 2023-06-21 20:31:50 -04:00
Tim Gross ff9ba8ff73
scheduler: tolerate having only one dynamic port available (#17619)
If the dynamic port range for a node is set so that the min is equal to the max,
there's only one port available and this passes config validation. But the
scheduler panics when it tries to pick a random port. Only add the randomness
when there's more than one to pick from.

Adds a test for the behavior but also adjusts the commentary on a couple of the
existing tests that made it seem like this case was already covered if you
didn't look too closely.

Fixes: #17585
2023-06-20 13:29:25 -04:00
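The shape of the fix, sketched with a hypothetical function name: a plausible reading is that the random draw over a zero-width range panics (rand.Intn panics on a non-positive argument), so a min == max range must short-circuit before it:

```go
package main

import (
	"fmt"
	"math/rand"
)

// pickDynamicPort only reaches for randomness when the configured
// range holds more than one port.
func pickDynamicPort(min, max int) int {
	if min == max {
		return min // exactly one port available: nothing to randomize
	}
	return min + rand.Intn(max-min+1)
}

func main() {
	fmt.Println(pickDynamicPort(20000, 20000)) // previously a scheduler panic
	fmt.Println(pickDynamicPort(20000, 32000))
}
```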
Luiz Aoqui 2f5df1d8a4
test: add MultiregionMinJob mock (#17614) 2023-06-20 10:57:02 -04:00
James Rasell 86e4c6cb9d
state: move variables tests to use must library. (#17609) 2023-06-20 15:46:16 +01:00
James Rasell 68df578c73
state: remove vague scaling event schema todo item. (#17610) 2023-06-20 15:22:11 +01:00
Luiz Aoqui a56b10e857
chore: fix typo and copyright header (#17605) 2023-06-20 10:09:47 -04:00
Luiz Aoqui cfb3bb517f
np: scheduler configuration updates (#17575)
* jobspec: rename node pool scheduler_configuration

In HCL specifications we usually call configuration blocks `config`
instead of `configuration`.

* np: add memory oversubscription config

* np: make scheduler config ENT
2023-06-19 11:41:46 -04:00
Luiz Aoqui d5aa72190f
node pools: namespace integration (#17562)
Add structs and fields to support the Nomad Pools Governance Enterprise
feature of controlling node pool access via namespaces.

Nomad Enterprise allows users to specify a default node pool to be used
by jobs that don't specify one. In order to accomplish this, it's
necessary to distinguish between a job that explicitly uses the
`default` node pool and one that did not specify any.

If the `default` node pool is set during job canonicalization it's
impossible to make this distinction, so this commit allows a job to have an
empty node pool value during registration and sets it to `default` in the
admission controller mutator.

In order to guarantee state consistency, the state store validates that
the job node pool is set and exists before inserting it.
2023-06-16 16:30:22 -04:00
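A sketch of the canonicalization ordering described, with the Enterprise hook reduced to a parameter for illustration:

```go
package main

import "fmt"

type Job struct{ NodePool string }

// admissionMutate runs after registration has accepted an empty node
// pool, so "explicitly default" and "unset" remain distinguishable up
// to this point; only here does the empty value get filled in.
func admissionMutate(j *Job, namespaceDefault string) {
	if j.NodePool != "" {
		return // user was explicit, including an explicit "default"
	}
	if namespaceDefault != "" {
		j.NodePool = namespaceDefault // ENT: namespace-governed default
		return
	}
	j.NodePool = "default"
}

func main() {
	j := &Job{}
	admissionMutate(j, "")
	fmt.Println(j.NodePool) // default
}
```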
Luiz Aoqui bdc7f3305f
rpc: fix log message in Node.UpdateStatus (#17537) 2023-06-14 16:51:46 -04:00
Luiz Aoqui bc17cffaef
node pool: node pool upsert on multiregion node register (#17503)
When registering a node with a new node pool in a non-authoritative
region we can't create the node pool because this new pool will not be
replicated to other regions.

This commit modifies the node registration logic to only allow automatic
node pool creation in the authoritative region.

In non-authoritative regions, the client is registered, but the node
pool is not created. The client is kept in the `initializing` status until
its node pool is created in the authoritative region and replicated to
the client's region.
2023-06-13 11:28:28 -04:00