Commit graph

831 commits

Author SHA1 Message Date
Luiz Aoqui c29a87b875
plugin: add missing fields to TaskConfig (#16434) 2023-03-13 15:58:16 -04:00
Michael Schurter 8da636c6d5
build: update from go1.20.1 to go1.20.2 (#16427)
* build: update from go1.20.1 to go1.20.2

Note that the CVE fixed in go1.20.2 does *not* impact Nomad.

https://github.com/golang/go/issues/58647
2023-03-13 09:47:07 -07:00
Tim Gross 1cf28996e7 acl: prevent privilege escalation via workload identity
ACL policies can be associated with a job so that the job's Workload Identity
can have expanded access to other policy objects, including other
variables. Policies set on the variables the job automatically has access to
were ignored, but this includes policies with `deny` capabilities.

Additionally, when resolving claims for a workload identity without any attached
policies, the `ResolveClaims` method returned a `nil` ACL object, which is
treated similarly to a management token. While this was safe in Nomad 1.4.x,
when the workload identity token was exposed to the task via the `identity`
block, this allows a user with `submit-job` capabilities to escalate their
privileges.

We originally implemented automatic workload access to Variables as a separate
code path in the Variables RPC endpoint so that we don't have to generate
on-the-fly policies that blow up the ACL policy cache. This is fairly brittle
but also the behavior around wildcard paths in policies different from the rest
of our ACL polices, which is hard to reason about.

Add an `ACLClaim` parameter to the `AllowVariableOperation` method so that we
can push all this logic into the `acl` package and the behavior can be
consistent. This will allow a `deny` policy to override automatic access (and
probably speed up checks of non-automatic variable access).
2023-03-13 11:13:27 -04:00
Luiz Aoqui 7305a374e3
allocrunner: fix health check monitoring for Consul services (#16402)
Services must be interpolated to replace runtime variables before they
can be compared against the values returned by Consul.
2023-03-10 14:43:31 -05:00
Juana De La Cuesta 5089f13f1d
cli: add -json and -t flag for alloc checks command (#16405)
* cli: add -json flag to alloc checks for completion

* CLI: Expand test to include testing the json flag for allocation checks

* Documentation: Add the checks command

* Documentation: Add example for alloc check command

* Update website/content/docs/commands/alloc/checks.mdx

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

* CLI: Add template flag to alloc checks command

* Update website/content/docs/commands/alloc/checks.mdx

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

* CLI: Extend test to include -t flag for alloc checks

* func: add changelog for added flags to alloc checks

* cli[doc]: Make usage section on alloc checks clearer

* Update website/content/docs/commands/alloc/checks.mdx

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

* Delete modd.conf

* cli[doc]: add -t flag to command description for alloc checks

---------

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>
Co-authored-by: Juanita De La Cuesta Morales <juanita.delacuestamorales@juanita.delacuestamorales-LHQ7X0QG9X>
2023-03-10 16:58:53 +01:00
Michael Schurter 0021b282ef
env/aws: update ec2 cpu info data (#16417)
Update AWS EC2 CPU tables using `make ec2info`
2023-03-09 14:33:21 -08:00
Luiz Aoqui 1aceff7806
cli: remove hard requirement on list-jobs (#16380)
Most job subcommands allow for job ID prefix match as a convenience
functionality so users don't have to type the full job ID.

But this introduces a hard ACL requirement that the token used to run
these commands have the `list-jobs` permission, even if the token has
enough permission to execute the basic command action and the user
passed an exact job ID.

This change softens this requirement by not failing the prefix match in
case the request results in a permission denied error and instead using
the information passed by the user directly.
2023-03-09 15:00:04 -05:00
Tim Gross 99d46e5a49
scheduling: prevent self-collision in dynamic port network offerings (#16401)
When the scheduler tries to find a placement for a new allocation, it iterates
over a subset of nodes. For each node, we populate a `NetworkIndex` bitmap with
the ports of all existing allocations and any other allocations already proposed
as part of this same evaluation via its `SetAllocs` method. Then we make an
"ask" of the `NetworkIndex` in `AssignPorts` for any ports we need and receive
an "offer" in return. The offer will include both static ports and any dynamic
port assignments.

The `AssignPorts` method was written to support group networks, and it shares
code that selects dynamic ports with the original `AssignTaskNetwork`
code. `AssignTaskNetwork` can request multiple ports from the bitmap at a
time. But `AssignPorts` requests them one at a time and does not account for
possible collisions, and doesn't return an error in that case.

What happens next varies:

1. If the scheduler doesn't place the allocation on that node, the port
   conflict is thrown away and there's no problem.
2. If the node is picked and this is the only allocation (or last allocation),
   the plan applier will reject the plan when it calls `SetAllocs`, as we'd expect.
3. If the node is picked and there are additional allocations in the same eval
   that iterate over the same node, their call to `SetAllocs` will detect the
   impossible state and the node will be rejected. This can have the puzzling
   behavior where a second task group for the job without any networking at all
   can hit a port collision error!

It looks like this bug has existed since we implemented group networks, but
there are several factors that add up to making the issue rare for many users
yet frustratingly frequent for others:

* You're more likely to hit this bug the more tightly packed your range for
  dynamic ports is. With 12000 ports in the range by default, many clusters can
  avoid this for a long time.
* You're more likely to hit case (3) for jobs with lots of allocations or if a
  scheduler has to iterate over a large number of nodes, such as with system jobs,
  jobs with `spread` blocks, or (sometimes) jobs using `unique` constraints.

For unlucky combinations of these factors, it's possible that case (3) happens
repeatedly, preventing scheduling of a given job until a client state
change (ex. restarting the agent so all its allocations are rescheduled
elsewhere) re-opens the range of dynamic ports available.

This changeset:

* Fixes the bug by accounting for collisions in dynamic port selection in
  `AssignPorts`.
* Adds test coverage for `AssignPorts`, expands coverage of this case for the
  deprecated `AssignTaskNetwork`, and tightens the dynamic port range in a
  scheduler test for spread scheduling to more easily detect this kind of problem
  in the future.
* Adds a `String()` method to `Bitmap` so that any future "screaming" log lines
  have a human-readable list of used ports.
2023-03-09 10:09:54 -05:00
Seth Hoenig ff4503aac6
client: disable running artifact downloader as nobody (#16375)
* client: disable running artifact downloader as nobody

This PR reverts a change from Nomad 1.5 where artifact downloads were
executed as the nobody user on Linux systems. This was done as an attempt
to improve the security model of artifact downloading where third party
tools such as git or mercurial would be run as the root user with all
the security implications thereof.

However, doing so conflicts with Nomad's own advice for securing the
Client data directory - which when setup with the recommended directory
permissions structure prevents artifact downloads from working as intended.

Artifact downloads are at least still now executed as a child process of
the Nomad agent, and on modern Linux systems make use of the kernel Landlock
feature for limiting filesystem access of the child process.

* docs: update upgrade guide for 1.5.1 sandboxing

* docs: add cl

* docs: add title to upgrade guide fix
2023-03-08 15:58:43 -06:00
Phil Renaud 54bb97f299
Outage recovery link fix (#16365) 2023-03-07 15:52:26 -05:00
Seth Hoenig 835365d2a4
docker: fix bug where network pause containers would be erroneously reconciled (#16352)
* docker: fix bug where network pause containers would be erroneously gc'd

* docker: cl: thread context from driver into pause container restoration
2023-03-07 12:17:32 -06:00
James Rasell 7507c92139
cli: support json and t on acl binding-rule info command. (#16357) 2023-03-07 18:27:02 +01:00
Tim Gross a2ceab3d8c
scheduler: correctly detect inplace update with wildcard datacenters (#16362)
Wildcard datacenters introduced a bug where a job with any wildcard datacenters
will always be treated as a destructive update when we check whether a
datacenter has been removed from the jobspec.

Includes updating the helper so that callers don't have to loop over the job's
datacenters.
2023-03-07 10:05:59 -05:00
Phil Renaud edf59597d2
[ui] Fix: Wildcard-datacenter system/sysbatch jobs stopped showing client links/chart (#16274)
* Fix for wildcard DC sys/sysbatch jobs

* A few extra modules for wildcard DC in systemish jobs

* doesMatchPattern moved to its own util as match-glob

* DC glob lookup using matchGlob

* PR feedback
2023-03-06 10:06:31 -05:00
Luiz Aoqui 2a1a790820
client: don't emit task shutdown delay event if not waiting (#16281) 2023-03-03 18:22:06 -05:00
Luiz Aoqui 3f1ea9da4b
api: set last index and request time on alloc stop (#16319)
Some of the methods in `Allocations()` incorrectly use the `putQuery`
in API calls where `put` is more appropriate since they are not reading
information back. These methods are also not returning request metadata
such as `LastIndex` back to callers, which can be useful to have in some
scenarios.

They also provide poor developer experience as they take an
`*api.Allocation` struct when only the allocation ID is necessary. This
can lead consumers to make unnecessary API calls to fetch the full
allocation.

Fixing these problems require updating the methods' signatures so they
take `*WriteOptions` instead of `*QueryOptions` and return `*WriteMeta`,
but this is a breaking change that requires advanced notice to consumers.

This commit adds a future breaking change notice and also fixes the
`Stop` method so it properly returns request metadata in a backwards
compatible way.
2023-03-03 15:52:41 -05:00
Tim Gross 3c0eaba9db
remove backcompat support for non-atomic job registration (#16305)
In Nomad 0.12.1 we introduced atomic job registration/deregistration, where the
new eval was written in the same raft entry. Backwards-compatibility checks were
supposed to have been removed in Nomad 1.1.0, but we missed that. This is long
safe to remove.
2023-03-03 15:52:22 -05:00
Luiz Aoqui 1d051d834d
cli: use shared logic for resolving job prefix (#16306)
Several `nomad job` subcommands had duplicate or slightly similar logic
for resolving a job ID from a CLI argument prefix, while others did not
have this functionality at all.

This commit pulls the shared logic to the command Meta and updates all
`nomad job` subcommands to use it.
2023-03-03 14:43:20 -05:00
Tim Gross 8747059b86
service: fix regression in task access to list/read endpoint (#16316)
When native service discovery was added, we used the node secret as the auth
token. Once Workload Identity was added in Nomad 1.4.x we needed to use the
claim token for `template` blocks, and so we allowed valid claims to bypass the
ACL policy check to preserve the existing behavior. (Invalid claims are still
rejected, so this didn't widen any security boundary.)

In reworking authentication for 1.5.0, we unintentionally removed this
bypass. For WIs without a policy attached to their job, everything works as
expected because the resulting `acl.ACL` is nil. But once a policy is attached
to the job the `acl.ACL` is no longer nil and this causes permissions errors.

Fix the regression by adding back the bypass for valid claims. In future work,
we should strongly consider getting turning the implicit policies into real
`ACLPolicy` objects (even if not stored in state) so that we don't have these
kind of brittle exceptions to the auth code.
2023-03-03 11:41:19 -05:00
Valentino 1f9d11feff
Add namespace argument to the job verification help text (#16243) 2023-03-02 16:42:14 -05:00
Dao Thanh Tung ed31e0a5f5
cli: sort Node value in nomad operator raft list-peers command (#16221)
Signed-off-by: dttung2905 <ttdao.2015@accountancy.smu.edu.sg>
2023-03-02 16:16:30 -05:00
Phil Renaud 93574ce085
[ui, helios] Toast Component (#16099)
* Template and styles

* @type to @color on flash messages

* Notifications service as wrapper

* Test cases updated for new notifs
2023-03-02 13:52:16 -05:00
Tim Gross 0e1b554299
handle FSM.Apply errors in raftApply (#16287)
The signature of the `raftApply` function requires that the caller unwrap the
first returned value (the response from `FSM.Apply`) to see if it's an
error. This puts the burden on the caller to remember to check two different
places for errors, and we've done so inconsistently.

Update `raftApply` to do the unwrapping for us and return any `FSM.Apply` error
as the error value. Similar work was done in Consul in
https://github.com/hashicorp/consul/pull/9991. This eliminates some boilerplate
and surfaces a few minor bugs in the process:

* job deregistrations of already-GC'd jobs were still emitting evals
* reconcile job summaries does not return scheduler errors
* node updates did not report errors associated with inconsistent service
  discovery or CSI plugin states

Note that although _most_ of the `FSM.Apply` functions return only errors (which
makes it tempting to remove the first return value entirely), there are few that
return `bool` for some reason and Variables relies on the response value for
proper CAS checking.
2023-03-02 13:51:09 -05:00
Tim Gross bb4880ec13
client: use RPC address and not serf after initial Consul discovery (#16217)
Nomad servers can advertise independent IP addresses for `serf` and
`rpc`. Somewhat unexpectedly, the `serf` address is also used for both Serf and
server-to-server RPC communication (including Raft RPC). The address advertised
for `rpc` is only used for client-to-server RPC. This split was introduced
intentionally in Nomad 0.8.

When clients are using Consul discovery for connecting to servers, they get an
initial discovery set from Consul and use the correct `rpc` tag in Consul to get
a list of adddresses for servers. The client then makes a `Status.Peers` RPC to
get the list of those servers that are raft peers. But this endpoint is shared
between servers and clients, and provides the address used for Raft.

Most of the time this is harmless because servers will bind on 0.0.0.0 anyways.,
But in topologies where servers are on a private network and clients are on
separate subnets (or even public subnets), clients will make initial contact
with the server to get the list of peers but then populate their local server
set with unreachable addresses.

Cluster administrators can work around this problem by using `server_join` with
specific IP addresses (or DNS names), because the `Node.UpdateStatus` endpoint
returns the correct set of RPC addresses when updating the node. So once a
client has registered, it will get the correct set of RPC addresses.

This changeset updates the client logic to query `Status.Members` instead of
`Status.Peers`, and then extract the correctly advertised address and port from
the response body.
2023-03-02 13:36:45 -05:00
Daniel Bennett 39e3a1ac3e
build/cli: Add BuildDate (#16216)
* build: add BuildDate to version info

will be used in enterprise to compare to license expiration time

* cli: multi-line version output, add BuildDate

before:
$ nomad version
Nomad v1.4.3 (coolfakecommithashomgoshsuchacoolonewoww)

after:
$ nomad version
Nomad v1.5.0-dev
BuildDate 2023-02-17T19:29:26Z
Revision coolfakecommithashomgoshsuchacoolonewoww

compare consul:
$ consul version
Consul v1.14.4
Revision dae670fe
Build Date 2023-01-26T15:47:10Z
Protocol 2 spoken by default, blah blah blah...

and vault:
$ vault version
Vault v1.12.3 (209b3dd99fe8ca320340d08c70cff5f620261f9b), built 2023-02-02T09:07:27Z

* docs: update version command output
2023-02-27 11:27:40 -06:00
Tim Gross 79844048e6
populate Nomad token for task runner update hooks (#16266)
The `TaskUpdateRequest` struct we send to task runner update hooks was not
populating the Nomad token that we get from the task runner (which we do for the
Vault token). This results in task runner hooks like the template hook
overwriting the Nomad token with the zero value for the token. This causes
in-place updates of a task to break templates (but not other uses that rely on
identity but don't currently bother to update it, like the identity hook).
2023-02-27 10:48:13 -05:00
Tim Gross 4c9688271a
CSI: fix potential state store corruptions (#16256)
The `CSIVolume` struct has references to allocations that are "denormalized"; we
don't store them on the `CSIVolume` struct but hydrate them on read. Tests
detecting potential state store corruptions found two locations where we're not
copying the volume before denormalizing:

* When garbage collecting CSI volume claims.
* When checking if it's safe to force-deregister the volume.

There are no known user-visible problems associated with these bugs but both
have the potential of mutating volume claims outside of a FSM transaction. This
changeset also cleans up state mutations in some CSI tests so as to avoid having
working tests cover up potential future bugs.
2023-02-27 08:47:08 -05:00
Seth Hoenig 61404b2551
services: Set Nomad's User-Agent by default on HTTP checks for nomad services (#16248) 2023-02-23 08:10:42 -06:00
Seth Hoenig 804f9fdb93
services: ensure task group is set on service hook (#16240)
This PR fixes a bug where the task group information was not being set
on the serviceHook.AllocInfo struct, which is needed later on for calculating
the CheckID of a nomad service check. The CheckID is calculated independently
from multiple callsites, and the information being passed in must be consistent,
including the group name.

The workload.AllocInfo.Group was not set at this callsite, due to the bug fixed in this PR.
 https://github.com/hashicorp/nomad/blob/main/client/serviceregistration/nsd/nsd.go#L114
2023-02-22 10:22:48 -06:00
Seth Hoenig c9ffd1274b
api: fix a panic and tweak some exported types (#16237)
This PR
 - fixes a panic in GetItems when looking up a variable that does not exist.
 - deprecates GetItems in favor of GetVariableItems which avoids returning a pointer to a map
 - deprecates ErrVariableNotFound in favor of ErrVariablePathNotFound which is an actual error type
 - does some minor code cleanup to make linters happier
2023-02-22 08:17:22 -06:00
Alessio Perugini 4e9ec24b22
Allow configurable range of Job priorities (#16084) 2023-02-17 09:23:13 -05:00
Charlie Voiselle c28c0eb6bc
[cli] var put - Add extension parsing to second argument when file (#16181) 2023-02-16 13:43:01 -05:00
visweshs123 fbc51dd190
csi: add option to configure CSIVolumeClaimGCInterval (#16195) 2023-02-16 10:41:15 -05:00
Tim Gross 4fabad7f61
cli: fmt -check should return early on diff (#16174)
The `nomad fmt -check` command incorrectly writes to file because we didn't
return before writing the file on a diff. Fix this bug and update the command
internals to differentiate between the write-to-file and write-to-stdout code
paths, which are activated by different combinations of options and flags.

The docstring for the `-list` and `-write` flags is also unclear and can be
easily misread to be the opposite of the actual behavior. Clarify this and fix
up the docs to match.

This changeset also refactors the tests quite a bit so as to make the test
outputs clear when something is incorrect.
2023-02-15 14:06:31 -05:00
Seth Hoenig 5d325decca
cgutil: handle panic from runc helper method (#16180)
This PR wraps the cgroups.IsCgroup2UnifiedMode() helper method from
runc in a defer/recover block because it might panic in some cases.

Upstream fix in: https://github.com/opencontainers/runc/pull/3745

Closes #16179
2023-02-14 15:09:43 -06:00
Tim Gross e8243e0820
build: update to go1.20.1 (#16182) 2023-02-14 15:03:26 -05:00
Will Nicholson 4dc83757a6
eventstream: Handle missing policy documents in event streams (#15495)
Fixes https://github.com/hashicorp/nomad/issues/15493

Co-authored-by: Tim Gross <tgross@hashicorp.com>
2023-02-14 11:27:39 -05:00
Seth Hoenig 165791dd89
artifact: protect against unbounded artifact decompression (1.5.0) (#16151)
* artifact: protect against unbounded artifact decompression

Starting with 1.5.0, set defaut values for artifact decompression limits.

artifact.decompression_size_limit (default "100GB") - the maximum amount of
data that will be decompressed before triggering an error and cancelling
the operation

artifact.decompression_file_count_limit (default 4096) - the maximum number
of files that will be decompressed before triggering an error and
cancelling the operation.

* artifact: assert limits cannot be nil in validation
2023-02-14 09:28:39 -06:00
Pierre Cauchois 74cf372e20
api: fix missing Node Status "disconnected" in API (#16166) 2023-02-14 09:43:23 -05:00
Charlie Voiselle d93ba0cf32
Add warnings to var put for non-alphanumeric keys. (#15933)
* Warn when Items key isn't directly accessible

Go template requires that map keys are alphanumeric for direct access
using the dotted reference syntax. This warns users when they create
keys that run afoul of this requirement.

- cli: use regex to detect invalid indentifiers in var keys
- test: fix slash in escape test case
- api: share warning formatting function between API and CLI
- ui: warn if var key has characters other than _, letter, or number

---------
Co-authored-by: Charlie Voiselle <464492+angrycub@users.noreply.github.com>
Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
2023-02-13 16:14:59 -05:00
Dao Thanh Tung 6d6b46f2cb
Fix unbold header Device Group Attributes and remove unused var in func nodeCSIVolumeNames (#16138)
* Fix  unbold header and remove unused var in  func
Signed-off-by: dttung2905 <ttdao.2015@accountancy.smu.edu.sg>

* Add CHANGELOG file
Signed-off-by: dttung2905 <ttdao.2015@accountancy.smu.edu.sg>

* Apply suggestions from review <Charlie Voiselle>

---------

Signed-off-by: dttung2905 <ttdao.2015@accountancy.smu.edu.sg>
Co-authored-by: Charlie Voiselle <464492+angrycub@users.noreply.github.com>
Co-authored-by: Tim Gross <tgross@hashicorp.com>
2023-02-13 15:33:12 -05:00
Charlie Voiselle 65ce3ec8de
[core] Do not start the plugin loader on non-clients (#16111)
The plugin loader loads task and device driver plugins which are not
used on server nodes.
2023-02-10 15:33:16 -05:00
Tim Gross 65c7e149d3
eval broker: use write lock when reaping cancelable evals (#16112)
The eval broker's `Cancelable` method used by the cancelable eval reaper mutates
the slice of cancelable evals by removing a batch at a time from the slice. But
this method unsafely uses a read lock despite this mutation. Under normal
workloads this is likely to be safe but when the eval broker is under the heavy
load this feature is intended to fix, we're likely to have a race
condition. Switch this to a write lock, like the other locks that mutate the
eval broker state.

This changeset also adjusts the timeout to allow poorly-sized Actions runners
more time to schedule the appropriate goroutines. The test has also been updated
to use `shoenig/test/wait` so we can have sensible reporting of the results
rather than just a timeout error when things go wrong.
2023-02-10 10:40:41 -05:00
Seth Hoenig af28ac1610
users: create cache for user lookups (#16100)
* users: create cache for user lookups

This PR introduces a global cache for OS user lookups. This should
relieve pressure on the OS domain/directory lookups, which would be
queried more now that Task API exists.

Hits are cached for 1 hour, and misses are cached for 1 minute. These
values are fairly arbitrary - we can tweak them if there is any reason to.

Closes #16010

* users: delete expired negative entry from cache
2023-02-09 08:37:50 -06:00
Seth Hoenig 0e7bf87ee1
deps: upgrade to hashicorp/golang-lru/v2 (#16085) 2023-02-08 15:20:33 -06:00
Michael Schurter aa9d035c58
docs: cleanup 1.5 changelog entries (#16094) 2023-02-08 12:10:08 -08:00
Michael Schurter 35d65c7c7e
Dynamic Node Metadata (#15844)
Fixes #14617
Dynamic Node Metadata allows Nomad users, and their jobs, to update Node metadata through an API. Currently Node metadata is only reloaded when a Client agent is restarted.

Includes new UI for editing metadata as well.

---------

Co-authored-by: Phil Renaud <phil.renaud@hashicorp.com>
2023-02-07 14:42:25 -08:00
Charlie Voiselle 31a289891d
Add sprig for command templates (#9053)
Adds the sprig functions to the template funcmap prepended with `sprig_` to match the behavior in consul-template
2023-02-07 14:07:20 -05:00
Seth Hoenig 590ae08752
main: remove deprecated uses of rand.Seed (#16074)
* main: remove deprecated uses of rand.Seed

go1.20 deprecates rand.Seed, and seeds the rand package
automatically. Remove cases where we seed the random package,
and cleanup the one case where we intentionally create a
known random source.

* cl: update cl

* mod: update go mod
2023-02-07 09:19:38 -06:00
Tim Gross 8a7d6b0cde
cli: remove deprecated keyring and keygen commands (#16068)
These command were marked as deprecated in 1.4.0 with intent to remove in
1.5.0. Remove them and clean up the docs.
2023-02-07 09:49:52 -05:00
Dao Thanh Tung ae720fe28d
Add -json and -t flag for nomad acl token create command (#16055)
Signed-off-by: dttung2905 <ttdao.2015@accountancy.smu.edu.sg>
2023-02-07 12:05:41 +01:00
Seth Hoenig 68894bdc62
docker: disable driver when running as non-root on cgroups v2 hosts (#16063)
* docker: disable driver when running as non-root on cgroups v2 hosts

This PR modifies the docker driver to behave like exec when being run
as a non-root user on a host machine with cgroups v2 enabled. Because
of how cpu resources are managed by the Nomad client, the nomad agent
must be run as root to manage docker-created cgroups.

* cl: update cl
2023-02-06 14:09:19 -06:00
Michael Schurter 0a496c845e
Task API via Unix Domain Socket (#15864)
This change introduces the Task API: a portable way for tasks to access Nomad's HTTP API. This particular implementation uses a Unix Domain Socket and, unlike the agent's HTTP API, always requires authentication even if ACLs are disabled.

This PR contains the core feature and tests but followup work is required for the following TODO items:

- Docs - might do in a followup since dynamic node metadata / task api / workload id all need to interlink
- Unit tests for auth middleware
- Caching for auth middleware
- Rate limiting on negative lookups for auth middleware

---------

Co-authored-by: Seth Hoenig <shoenig@duck.com>
2023-02-06 11:31:22 -08:00
Seth Hoenig 911700ffea
build: update to go1.20 (#16029)
* build: update to go1.20

* build: use stringy go1.20 in circle yaml

* tests: handle new x509 certificate error structure in go1.20

* cl: add cl entry
2023-02-03 08:14:53 -06:00
Phil Renaud d3c351d2d2
Label for the Web UI (#16006)
* Demoable state

* Demo mirage color

* Label as a block with foreground and background colours

* Test mock updates

* Go test updated

* Documentation update for label support
2023-02-02 16:29:04 -05:00
Tim Gross 19a2c065f4
System and sysbatch jobs always have zero index (#16030)
Service jobs should have unique allocation Names, derived from the
Job.ID. System jobs do not have unique allocation Names because the index is
intended to indicated the instance out of a desired count size. Because system
jobs do not have an explicit count but the results are based on the targeted
nodes, the index is less informative and this was intentionally omitted from the
original design.

Update docs to make it clear that NOMAD_ALLOC_INDEX is always zero for 
system/sysbatch jobs

Validate that `volume.per_alloc` is incompatible with system/sysbatch jobs.
System and sysbatch jobs always have a `NOMAD_ALLOC_INDEX` of 0. So
interpolation via `per_alloc` will not work as soon as there's more than one
allocation placed. Validate against this on job submission.
2023-02-02 16:18:01 -05:00
Daniel Bennett 335f0a5371
docs: how to troubleshoot consul connect envoy (#15908)
* largely a doc-ification of this commit message:
  d47678074bf8ae9ff2da3c91d0729bf03aee8446
  this doesn't spell out all the possible failure modes,
  but should be a good starting point for folks.

* connect: add doc link to envoy bootstrap error

* add Unwrap() to RecoverableError
  mainly for easier testing
2023-02-02 14:20:26 -06:00
Charlie Voiselle cc6f4719f1
Add option to expose workload token to task (#15755)
Add `identity` jobspec block to expose workload identity tokens to tasks.

---------

Co-authored-by: Anders <mail@anars.dk>
Co-authored-by: Tim Gross <tgross@hashicorp.com>
Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
2023-02-02 10:59:14 -08:00
Daniel Bennett dc9c8d4e47
Change job init default to example.nomad.hcl and recommend in docs (#15997)
recommend .nomad.hcl for job files instead of .nomad (without .hcl)
* nomad job init -> example.nomad.hcl
* update docs
2023-02-02 11:47:47 -06:00
Tim Gross 971a286ea3
cli: Fix a panic in deployment status when scheduling is slow (#16011)
If a deployment fails, the `deployment status` command can get a nil deployment
when it checks for a rollback deployment if there isn't one (or at least not one
at the time of the query). Fix the panic.
2023-02-02 12:34:44 -05:00
Phil Renaud 3db9f11c37
[feat] Nomad Job Templates (#15746)
* Extend variables under the nomad path prefix to allow for job-templates (#15570)

* Extend variables under the nomad path prefix to allow for job-templates

* Add job-templates to error message hinting

* RadioCard component for Job Templates (#15582)

* chore: add

* test: component API

* ui: component template

* refact: remove  bc naming collission

* styles: remove SASS var causing conflicts

* Disallow specific variable at nomad/job-templates (#15681)

* Disallows variables at exactly nomad/job-templates

* idiomatic refactor

* Expanding nomad job init to accept a template flag (#15571)

* Adding a string flag for templates on job init

* data-down actions-up version of a custom template editor within variable

* Dont force grid on job template editor

* list-templates flag started

* Correctly slice from end of path name

* Pre-review cleanup

* Variable form acceptance test for job template editing

* Some review cleanup

* List Job templates test

* Example from template test

* Using must.assertions instead of require etc

* ui: add choose template button (#15596)

* ui: add new routes

* chore: update file directory

* ui: add choose template button

* test: button and page navigation

* refact: update var name

* ui: use `Button` component from `HDS` (#15607)

* ui: integrate  buttons

* refact: remove  helper

* ui: remove icons on non-tertiary buttons

* refact: update normalize method for key/value pairs (#15612)

* `revert`: `onCancel` for `JobDefinition`

The `onCancel` method isn't included in the component API for `JobEditor` and the primary cancel behavior exists outside of the component. With the exception of the `JobDefinition` page where we include this button in the top right of the component instead of next to the `Plan` button.

* style: increase button size

* style: keep lime green

* ui: select template (#15613)

* ui: deprecate unused component

* ui: deprecate tests

* ui: jobs.run.templates.index

* ui: update logic to handle templates

* refact: revert key/value changes

* style: padding for cards + buttons

* temp: fixtures for mirage testing

* Revert "refact: revert key/value changes"

This reverts commit 124e95d12140be38fc921f7e15243034092c4063.

* ui: guard template for unsaved job

* ui: handle reading template variable

* Revert "refact: update normalize method for key/value pairs (#15612)"

This reverts commit 6f5ffc9b610702aee7c47fbff742cc81f819ab74.

* revert: remove test fixtures

* revert: prettier problems

* refact: test doesnt need filter expression

* styling: button sizes and responsive cards

* refact: remove route guarding

* ui: update variable adapter

* refact: remove model editing behavior

* refact: model should query variables to populate editor

* ui: clear qp on exit

* refact: cleanup deprecated API

* refact: query all namespaces

* refact: deprecate action

* ui: rely on  collection

* refact: patch deprecate transition API

* refact: patch test to expect namespace qp

* styling: padding, conditionals

* ui: flashMessage on 404

* test: update for o(n+1) query

* ui: create new job template (#15744)

* refact: remove unused code

* refact: add type safety

* test: select template flow

* test: add data-test attrs

* chore: remove dead code

* test: create new job flow

* ui: add create button

* ui: create job template

* refact: no need for wildcard

* refact:  record instead of delete

* styling: spacing

* ui: add error handling and form validation to job create template (#15767)

* ui: handle server side errors

* ui: show error to prevent duplicate

* refact: conditional namespace

* ui: save as template flow (#15787)

* bug:  patches failing tests associated with `pretender` (#15812)

* refact: update assertion

* refact: test set-up

* ui: job templates manager view (#15815)

* ui: manager list view

* test: edit flow

* refact: deprecate column-helper

* ui: template edit and delete flow (#15823)

* ui: manager list view

* refact: update title

* refact: update permissions

* ui: template edit page

* bug: typo

* refact: update toast messages

* bug:  clear selections on exit (#15827)

* bug:  clear controllers on exit

* test: mirage config changes (#15828)

* refact: deprecate column-helper

* style: update z-index for HDS

* Revert "style: update z-index for HDS"

This reverts commit d3d87ceab6d083f7164941587448607838944fc1.

* refact: update delete button

* refact: edit redirect

* refact: patch reactivity issues

* styling: fixed width

* refact: override defaults

* styling: edit text causing overflow

* styling:  add inline text

Co-authored-by: Phil Renaud <phil.renaud@hashicorp.com>

* bug: edit `text` to `template`

Co-authored-by: Phil Renaud <phil.renaud@hashicorp.com>

Co-authored-by: Phil Renaud <phil.renaud@hashicorp.com>

* test:  delete flow job templates (#15896)

* refact: edit names

* bug:  set correct ref to store

* chore: trim whitespace:

* test: delete flow

* bug: reactively update view (#15904)

* Initialized default jobs (#15856)

* Initialized default jobs

* More jobs scaffolded

* Better commenting on a couple example job specs

* Adapter doing the work

* fall back to epic config

* Label format helper and custom serialization logic

* Test updates to account for a never-empty state

* Test suite uses settled and maintain RecordArray in adapter return

* Updates to hello-world and variables example jobspecs

* Parameterized job gets optional payload output

* Formatting changes for param and service discovery job templates

* Multi-group service discovery job

* Basic test for default templates (#15965)

* Basic test for default templates

* Percy snapshot for manage page

* Some late-breaking design changes

* Some copy edits to the header paragraphs for job templates (#15967)

* Added some init options for job templates (#15994)

* Async method for populating default job templates from the variable adapter

---------

Co-authored-by: Jai <41024828+ChaiWithJai@users.noreply.github.com>
2023-02-02 10:37:40 -05:00
Charlie Voiselle 4caac1a92f
client: Add option to enable hairpinMode on Nomad bridge (#15961)
* Add `bridge_network_hairpin_mode` client config setting
* Add node attribute: `nomad.bridge.hairpin_mode`
* Changed format string to use `%q` to escape user provided data
* Add test to validate template JSON for developer safety

Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
2023-02-02 10:12:15 -05:00
jmwilkinson 37834dffda
Allow wildcard datacenters to be specified in job file (#11170)
Also allows for default value of `datacenters = ["*"]`
2023-02-02 09:57:45 -05:00
Luiz Aoqui 7c47b576cd
changelog: fix entries for #15522 and #15819 (#15998) 2023-02-01 18:03:39 -05:00
Tim Gross 0abf0b948b
job parsing: fix panic when variable validation is missing condition (#16018) 2023-02-01 16:41:03 -05:00
Tristan Pemble 5440965260
fix(#13844): canonicalize job to avoid nil pointer deference (#13845) 2023-02-01 16:01:28 -05:00
Seth Hoenig ca7ead191e
consul: restore consul token when reverting a job (#15996)
* consul: reset consul token on job during registration of a reversion

* e2e: add test for reverting a job with a consul service

* cl: fixup cl entry
2023-02-01 14:02:45 -06:00
James Rasell 9e8325d63c
acl: fix a bug in token creation when parsing expiration TTLs. (#15999)
The ACL token decoding was not correctly handling time duration
syntax such as "1h" which forced people to use the nanosecond
representation via the HTTP API.

The change adds an unmarshal function which allows this syntax to
be used, along with other styles correctly.
2023-02-01 17:43:41 +01:00
James Rasell 67acfd9f6b
acl: return 400 not 404 code when creating an invalid policy. (#16000) 2023-02-01 17:40:15 +01:00
Mike Nomitch 80848b202e
Increases max variable size to 64KiB from 16KiB (#15983) 2023-01-31 13:32:36 -05:00
stswidwinski 16eefbbf4d
GC: ensure no leakage of evaluations for batch jobs. (#15097)
Prior to 2409f72 the code compared the modification index of a job to itself. Afterwards, the code compared the creation index of the job to itself. In either case there should never be a case of re-parenting of allocs causing the evaluation to trivially always result in false, which leads to unreclaimable memory.

Prior to this change allocations and evaluations for batch jobs were never garbage collected until the batch job was explicitly stopped. The new `batch_eval_gc_threshold` server configuration controls how often they are collected. The default threshold is `24h`.
2023-01-31 13:32:14 -05:00
Seth Hoenig 139f2c0b0f
docker: set force=true on remove image to handle images referenced by multiple tags (#15962)
* docker: set force=true on remove image to handle images referenced by multiple tags

This PR changes our call of docker client RemoveImage() to RemoveImageExtended with
the Force=true option set. This fixes a bug where an image referenced by more than
one tag could never be garbage collected by Nomad. The Force option only applies to
stopped containers; it does not affect running workloads.

* docker: add note about image_delay and multiple tags
2023-01-31 07:53:18 -06:00
Yorick Gersie d94f22bee2
Ensure infra_image gets proper label used for reconciliation (#15898)
* Ensure infra_image gets proper label used for reconciliation

Currently infra containers are not cleaned up as part of the dangling container
cleanup routine. The reason is that Nomad checks if a container is a Nomad owned
container by verifying the existence of the: `com.hashicorp.nomad.alloc_id` label.

Ensure we set this label on the infra container as well.

* fix unit test

* changelog: add entry

---------

Co-authored-by: Seth Hoenig <shoenig@duck.com>
2023-01-30 09:46:45 -06:00
Jorge Marey d1c9aad762
Rename fields on proxyConfig (#15541)
* Change api Fields for expose and paths

* Add changelog entry

* changelog: add deprecation notes about connect fields

* api: minor style tweaks

---------

Co-authored-by: Seth Hoenig <shoenig@duck.com>
2023-01-30 09:31:16 -06:00
dependabot[bot] bb79824a20
build(deps): bump github.com/docker/docker from 20.10.21+incompatible to 20.10.23+incompatible (#15848)
* build(deps): bump github.com/docker/docker

Bumps [github.com/docker/docker](https://github.com/docker/docker) from 20.10.21+incompatible to 20.10.23+incompatible.
- [Release notes](https://github.com/docker/docker/releases)
- [Commits](https://github.com/docker/docker/compare/v20.10.21...v20.10.23)

---
updated-dependencies:
- dependency-name: github.com/docker/docker
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* changelog: add entry for docker/docker

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Seth Hoenig <shoenig@duck.com>
2023-01-30 09:10:06 -06:00
舍我其谁 3abb453bd0
volume: Add the missing option propagation_mode (#15626) 2023-01-30 09:32:07 -05:00
Seth Hoenig 074b76e3bf
consul: check for acceptable service identity on consul tokens (#15928)
When registering a job with a service and 'consul.allow_unauthenticated=false',
we scan the given Consul token for an acceptable policy or role with an
acceptable policy, but did not scan for an acceptable service identity (which
is backed by an acceptable virtual policy). This PR updates our consul token
validation to also accept a matching service identity when registering a service
into Consul.

Fixes #15902
2023-01-27 18:15:51 -06:00
Seth Hoenig 0fac4e19b3
client: always run alloc cleanup hooks on final update (#15855)
* client: run alloc pre-kill hooks on last pass despite no live tasks

This PR fixes a bug where alloc pre-kill hooks were not run in the
edge case where there are no live tasks remaining, but it is also
the final update to process for the (terminal) allocation. We need
to run cleanup hooks here, otherwise they will not run until the
allocation gets garbage collected (i.e. via Destroy()), possibly
at a distant time in the future.

Fixes #15477

* client: do not run ar cleanup hooks if client is shutting down
2023-01-27 09:59:31 -06:00
Luiz Aoqui de87cdc816
template: restore driver handle on update (#15915)
When the template hook Update() method is called it may recreate the
template manager if the Nomad or Vault token has been updated.

This caused the new template manager did not have a driver handler
because this was only being set on the Poststart hook, which is not
called for inplace updates.
2023-01-27 10:55:59 -05:00
Luiz Aoqui 09fc054c82
ui: fix alloc memory stats to match CLI output (#15909) 2023-01-26 17:08:13 -05:00
Luiz Aoqui bb323ef3de
ui: fix navigation for namespaced jobs in search and job version (#15906) 2023-01-26 16:03:07 -05:00
Seth Hoenig 7375fd40fc
nsd: block on removal of services (#15862)
* nsd: block on removal of services

This PR uses a WaitGroup to ensure workload removals are complete
before returning from ServiceRegistrationHandler.RemoveWorkload of
the nomad service provider. The de-registration of individual services
still occurs asynchrously, but we must block on the parent removal
call so that we do not race with further operations on the same set
of services - e.g. in the case of a task restart where we de-register
and then re-register the services in quick succession.

Fixes #15032

* nsd: add e2e test for initial failing check and restart
2023-01-26 08:17:57 -06:00
Yorick Gersie 2a5c423ae0
Allow per_alloc to be used with host volumes (#15780)
Disallowing per_alloc for host volumes in some cases makes life of a nomad user much harder.
When we rely on the NOMAD_ALLOC_INDEX for any configuration that needs to be re-used across
restarts we need to make sure allocation placement is consistent. With CSI volumes we can
use the `per_alloc` feature but for some reason this is explicitly disabled for host volumes.

Ensure host volumes understand the concept of per_alloc
2023-01-26 09:14:47 -05:00
Tim Gross 6677a103c2
metrics: measure rate of RPC requests that serve API (#15876)
This changeset configures the RPC rate metrics that were added in #15515 to all
the RPCs that support authenticated HTTP API requests. These endpoints already
configured with pre-forwarding authentication in #15870, and a handful of others
were done already as part of the proof-of-concept work. So this changeset is
entirely copy-and-pasting one method call into a whole mess of handlers.

Upcoming PRs will wire up pre-forwarding auth and rate metrics for the remaining
set of RPCs that have no API consumers or aren't authenticated, in smaller
chunks that can be more thoughtfully reviewed.
2023-01-25 16:37:24 -05:00
Luiz Aoqui 3479e2231f
core: enforce strict steps for clients reconnect (#15808)
When a Nomad client that is running an allocation with
`max_client_disconnect` set misses a heartbeat the Nomad server will
update its status to `disconnected`.

Upon reconnecting, the client will make three main RPC calls:

- `Node.UpdateStatus` is used to set the client status to `ready`.
- `Node.UpdateAlloc` is used to update the client-side information about
  allocations, such as their `ClientStatus`, task states etc.
- `Node.Register` is used to upsert the entire node information,
  including its status.

These calls are made concurrently and are also running in parallel with
the scheduler. Depending on the order they run the scheduler may end up
with incomplete data when reconciling allocations.

For example, a client disconnects and its replacement allocation cannot
be placed anywhere else, so there's a pending eval waiting for
resources.

When this client comes back the order of events may be:

1. Client calls `Node.UpdateStatus` and is now `ready`.
2. Scheduler reconciles allocations and places the replacement alloc to
   the client. The client is now assigned two allocations: the original
   alloc that is still `unknown` and the replacement that is `pending`.
3. Client calls `Node.UpdateAlloc` and updates the original alloc to
   `running`.
4. Scheduler notices too many allocs and stops the replacement.

This creates unnecessary placements or, in a different order of events,
may leave the job without any allocations running until the whole state
is updated and reconciled.

To avoid problems like this clients must update _all_ of its relevant
information before they can be considered `ready` and available for
scheduling.

To achieve this goal the RPC endpoints mentioned above have been
modified to enforce strict steps for nodes reconnecting:

- `Node.Register` does not set the client status anymore.
- `Node.UpdateStatus` sets the reconnecting client to the `initializing`
  status until it successfully calls `Node.UpdateAlloc`.

These changes are done server-side to avoid the need of additional
coordination between clients and servers. Clients are kept oblivious of
these changes and will keep making these calls as they normally would.

The verification of whether allocations have been updates is done by
storing and comparing the Raft index of the last time the client missed
a heartbeat and the last time it updated its allocations.
2023-01-25 15:53:59 -05:00
Tim Gross f3f64af821
WI: allow workloads to use RPCs associated with HTTP API (#15870)
This changeset allows Workload Identities to authenticate to all the RPCs that
support HTTP API endpoints, for use with PR #15864.

* Extends the work done for pre-forwarding authentication to all RPCs that
  support a HTTP API endpoint.
* Consolidates the auth helpers used by the CSI, Service Registration, and Node
  endpoints that are currently used to support both tokens and client secrets.

Intentionally excluded from this changeset:
* The Variables endpoint still has custom handling because of the implicit
  policies. Ideally we'll figure out an efficient way to resolve those into real
  policies and then we can get rid of that custom handling.
* The RPCs that don't currently support auth tokens (i.e. those that don't
  support HTTP endpoints) have not been updated with the new pre-forwarding auth
  We'll be doing this under a separate PR to support RPC rate metrics.
2023-01-25 14:33:06 -05:00
Nick Wales 825af1f62a
docker: add option for Windows isolation modes (#15819) 2023-01-24 16:31:48 -05:00
Karl Johann Schubert b773a1b77f
client: add disk_total_mb and disk_free_mb config options (#15852) 2023-01-24 09:14:22 -05:00
Michael Schurter 92c7d96e0a
Add INFO task even log line and make logmon less noisy (#15842)
* client: log task events at INFO level

Fixes #15840

Example INFO level client logs with this enabled:

```
[INFO]  client: node registration complete
[INFO]  client.alloc_runner.task_runner: Task event: alloc_id=b3dab5a9-91fd-da9a-ae89-ef7f1eceaf51 task=sleepy type=Received msg="Task received by client" failed=false
[INFO]  client.alloc_runner.task_runner: Task event: alloc_id=b3dab5a9-91fd-da9a-ae89-ef7f1eceaf51 task=sleepy type="Task Setup" msg="Building Task Directory" failed=false
[WARN]  client.alloc_runner.task_runner.task_hook.logmon: plugin configured with a nil SecureConfig: alloc_id=b3dab5a9-91fd-da9a-ae89-ef7f1eceaf51 task=sleepy
[INFO]  client.alloc_runner.task_runner.task_hook.logmon.nomad: opening fifo: alloc_id=b3dab5a9-91fd-da9a-ae89-ef7f1eceaf51 task=sleepy path=/tmp/NomadClient2414238708/b3dab5a9-91fd-da9a-ae89-ef7f1eceaf51/alloc/logs/.sleepy.stdout.fifo @module=logmon timestamp=2023-01-20T11:19:34.275-0800
[INFO]  client.alloc_runner.task_runner.task_hook.logmon.nomad: opening fifo: alloc_id=b3dab5a9-91fd-da9a-ae89-ef7f1eceaf51 task=sleepy @module=logmon path=/tmp/NomadClient2414238708/b3dab5a9-91fd-da9a-ae89-ef7f1eceaf51/alloc/logs/.sleepy.stderr.fifo timestamp=2023-01-20T11:19:34.275-0800
[INFO]  client.driver_mgr.raw_exec: starting task: driver=raw_exec driver_cfg="{Command:/bin/bash Args:[-c sleep 1000]}"
[WARN]  client.driver_mgr.raw_exec.executor: plugin configured with a nil SecureConfig: alloc_id=b3dab5a9-91fd-da9a-ae89-ef7f1eceaf51 driver=raw_exec task_name=sleepy
[INFO]  client.alloc_runner.task_runner: Task event: alloc_id=b3dab5a9-91fd-da9a-ae89-ef7f1eceaf51 task=sleepy type=Started msg="Task started by client" failed=false
[INFO]  client.alloc_runner.task_runner: Task event: alloc_id=b3dab5a9-91fd-da9a-ae89-ef7f1eceaf51 task=sleepy type=Killing msg="Sent interrupt. Waiting 5s before force killing" failed=false
[INFO]  client.driver_mgr.raw_exec.executor: plugin process exited: alloc_id=b3dab5a9-91fd-da9a-ae89-ef7f1eceaf51 driver=raw_exec task_name=sleepy path=/home/schmichael/go/bin/nomad pid=27668
[INFO]  client.alloc_runner.task_runner: Task event: alloc_id=b3dab5a9-91fd-da9a-ae89-ef7f1eceaf51 task=sleepy type=Terminated msg="Exit Code: 130, Signal: 2" failed=false
[INFO]  client.alloc_runner.task_runner: Task event: alloc_id=b3dab5a9-91fd-da9a-ae89-ef7f1eceaf51 task=sleepy type=Killed msg="Task successfully killed" failed=false
[INFO]  client.alloc_runner.task_runner.task_hook.logmon: plugin process exited: alloc_id=b3dab5a9-91fd-da9a-ae89-ef7f1eceaf51 task=sleepy path=/home/schmichael/go/bin/nomad pid=27653
[INFO]  client.gc: marking allocation for GC: alloc_id=b3dab5a9-91fd-da9a-ae89-ef7f1eceaf51
```

So task events will approximately *double* the number of per-task log
lines, but I think they add a lot of value.

* client: drop logmon 'opening' from debug->info

Cannot imagine why users care and removes 2 log lines per task
invocation.

```

[INFO]  client: node registration complete
[INFO]  client.alloc_runner.task_runner: Task event: alloc_id=1cafb2dc-302e-2c92-7845-f56618bc8648 task=sleepy type=Received msg="Task received by client" failed=false
[INFO]  client.alloc_runner.task_runner: Task event: alloc_id=1cafb2dc-302e-2c92-7845-f56618bc8648 task=sleepy type="Task Setup" msg="Building Task Directory" failed=false
<<< 2 "opening fifo" lines elided here >>>
[WARN]  client.alloc_runner.task_runner.task_hook.logmon: plugin configured with a nil SecureConfig: alloc_id=1cafb2dc-302e-2c92-7845-f56618bc8648 task=sleepy
[INFO]  client.driver_mgr.raw_exec: starting task: driver=raw_exec driver_cfg="{Command:/bin/bash Args:[-c sleep 1000]}"
[WARN]  client.driver_mgr.raw_exec.executor: plugin configured with a nil SecureConfig: alloc_id=1cafb2dc-302e-2c92-7845-f56618bc8648 driver=raw_exec task_name=sleepy
[INFO]  client.alloc_runner.task_runner: Task event: alloc_id=1cafb2dc-302e-2c92-7845-f56618bc8648 task=sleepy type=Started msg="Task started by client" failed=false
```

* docs: add changelog for #15842
2023-01-20 14:35:00 -08:00
Tim Gross a51149736d
Rename nomad.broker.total_blocked metric (#15835)
This changeset fixes a long-standing point of confusion in metrics emitted by
the eval broker. The eval broker has a queue of "blocked" evals that are waiting
for an in-flight ("unacked") eval of the same job to be completed. But this
"blocked" state is not the same as the `blocked` status that we write to raft
and expose in the Nomad API to end users. There's a second metric
`nomad.blocked_eval.total_blocked` that refers to evaluations in that
state. This has caused ongoing confusion in major customer incidents and even in
our own documentation! (Fixed in this PR.)

There's little functional change in this PR aside from the name of the metric
emitted, but there's a bit refactoring to clean up the names in `eval_broker.go`
so that there aren't name collisions and multiple names for the same
state. Changes included are:
* Everything that was previously called "pending" referred to entities that were
  associated witht he "ready" metric. These are all now called "ready" to match
  the metric.
* Everything named "blocked" in `eval_broker.go` is now named "pending", except
  for a couple of comments that actually refer to blocked RPCs.
* Added a note to the upgrade guide docs for 1.5.0.
* Fixed the scheduling performance metrics docs because the description for
  `nomad.broker.total_blocked` was actually the description for
  `nomad.blocked_eval.total_blocked`.
2023-01-20 14:23:56 -05:00
Charlie Voiselle 5ea1d8a970
Add raft snapshot configuration options (#15522)
* Add config elements
* Wire in snapshot configuration to raft
* Add hot reload of raft config
* Add documentation for new raft settings
* Add changelog
2023-01-20 14:21:51 -05:00
Seth Hoenig d2d8ebbeba
consul: correctly interpret missing consul checks as unhealthy (#15822)
* consul: correctly understand missing consul checks as unhealthy

This PR fixes a bug where Nomad assumed any registered Checks would exist
in the service registration coming back from Consul. In some cases, the
Consul may be slow in processing the check registration, and the response
object would not contain checks. Nomad would then scan the empty response
looking for Checks with failing health status, finding none, and then
marking a task/alloc as healthy.

In reality, we must always use Nomad's view of what checks should exist as
the source of truth, and compare that with the response Consul gives us,
making sure they match, before scanning the Consul response for failing
check statuses.

Fixes #15536

* consul: minor CR refactor using maps not sets

* consul: observe transition from healthy to unhealthy checks

* consul: spell healthy correctly
2023-01-19 14:01:12 -06:00
James Rasell 94aba987c6
changelog: add feature entry for SSO OIDC (#15821) 2023-01-19 16:48:04 +01:00
Dao Thanh Tung e2ae6d62e1
fix bug in nomad fmt -check does not return error code (#15797) 2023-01-17 09:15:34 -05:00
Benjamin Buzbee 13cc30ebeb
Return buffered text from log endpoint if decoding fails (#15558)
To see why I think this is a good change lets look at why I am making it

My disk was full, which means GC was happening agressively. So by the
time I called the logging endpoint from the SDK, the logs were GC'd

The error I was getting before was:
```
invalid character 'i' in literal false (expecting 'l')
```

Now the error I get is:
```
failed to decode log endpoint response as JSON: "failed to list entries: open /tmp/nomad.data.4219353875/alloc/f11fee50-2b66-a7a2-d3ec-8442cb3d557a/alloc/logs: no such file or directory"
```

Still not super descriptive but much more debugable
2023-01-16 10:39:56 +01:00
Phil Renaud d588aabca6
[ui] Fixes logger height issue when sidebar has events (#15759)
* Fixes logger height issue when sidebar has events

* Much simpler grid method for height calc
2023-01-13 12:16:02 -05:00
Seth Hoenig 8cd77c14a2
env/aws: update ec2 cpu info data (#15770) 2023-01-13 09:58:23 -06:00
Seth Hoenig a8d40ce26b
build: update to go 1.19.5 (#15769) 2023-01-13 09:57:32 -06:00
dependabot[bot] 094caaabdf
build(deps): bump github.com/containerd/containerd from 1.6.6 to 1.6.12 (#15726)
* build(deps): bump github.com/containerd/containerd from 1.6.6 to 1.6.12

Bumps [github.com/containerd/containerd](https://github.com/containerd/containerd) from 1.6.6 to 1.6.12.
- [Release notes](https://github.com/containerd/containerd/releases)
- [Changelog](https://github.com/containerd/containerd/blob/main/RELEASES.md)
- [Commits](https://github.com/containerd/containerd/compare/v1.6.6...v1.6.12)

---
updated-dependencies:
- dependency-name: github.com/containerd/containerd
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

* cl: add cl for containerd/containerd

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Seth Hoenig <shoenig@duck.com>
2023-01-13 09:22:41 -06:00
Seth Hoenig fe7795ce16
consul/connect: support for proxy upstreams opaque config (#15761)
This PR adds support for configuring `proxy.upstreams[].config` for
Consul Connect upstreams. This is an opaque config value to Nomad -
the data is passed directly to Consul and is unknown to Nomad.
2023-01-12 08:20:54 -06:00
Anthony Davis 1c32471805
Fix rejoin_after_leave behavior (#15552) 2023-01-11 16:39:24 -05:00
Daniel Bennett 7d1059b5ae
connect: ingress gateway validation for http hosts and wildcards (#15749)
* connect: fix non-"tcp" ingress gateway validation

changes apply to http, http2, and grpc:
* if "hosts" is excluded, consul will use its default domain
  e.g. <service-name>.ingress.dc1.consul
* can't set hosts with "*" service name
* test http2 and grpc too
2023-01-11 11:52:32 -06:00
Seth Hoenig 719eee8112
consul: add client configuration for grpc_ca_file (#15701)
* [no ci] first pass at plumbing grpc_ca_file

* consul: add support for grpc_ca_file for tls grpc connections in consul 1.14+

This PR adds client config to Nomad for specifying consul.grpc_ca_file

These changes combined with https://github.com/hashicorp/consul/pull/15913 should
finally enable Nomad users to upgrade to Consul 1.14+ and use tls grpc connections.

* consul: add cl entgry for grpc_ca_file

* docs: mention grpc_tls changes due to Consul 1.14
2023-01-11 09:34:28 -06:00
Dao Thanh Tung 09b25d71b8
cli: Add a nomad operator client state command (#15469)
Signed-off-by: dttung2905 <ttdao.2015@accountancy.smu.edu.sg>
2023-01-11 10:03:31 -05:00
Phil Renaud 76bed82192
[ui] Show task events in the sidebar (#15733)
* Add task events to task logs sidebar

* Max-heighting inner table when present for nice looking borders
2023-01-10 17:02:21 -05:00
Phil Renaud 4e16ccc5fa
Basic sidebar expander (#15735) 2023-01-10 16:35:53 -05:00
Luiz Aoqui ed5fccc183
scheduler: allow using device ID as attribute (#15455)
Devices are fingerprinted as groups of similar devices. This prevented
specifying specific device by their ID in constraint and affinity rules.

This commit introduces the `${device.ids}` attribute that returns a
comma separated list of IDs that are part of the device group. Users can
then use the set operators to write rules.
2023-01-10 14:28:23 -05:00
Seth Hoenig 83450c8762
vault: configure user agent on Nomad vault clients (#15745)
* vault: configure user agent on Nomad vault clients

This PR attempts to set the User-Agent header on each Vault API client
created by Nomad. Still need to figure a way to set User-Agent on the
Vault client created internally by consul-template.

* vault: fixup find-and-replace gone awry
2023-01-10 10:39:45 -06:00
Seth Hoenig 2868a45982
docker: configure restart policy for networking pause container (#15732)
This PR modifies the configuration of the networking pause contaier to include
the "unless-stopped" restart policy. The pause container should always be
restored into a running state until Nomad itself issues a stop command for the
container.

This is not a _perfect_ fix for #12216 but it should cover the 99% use case -
where a pause container gets accidently stopped / killed for some reason. There
is still a possibility where the pause container and main task container are
stopped and started in the order where the bad behavior persists, but this is
fundamentally unavoidable due to how docker itself abstracts and manages the
underlying network namespace referenced by the containers.

Closes #12216
2023-01-10 07:50:09 -06:00
Dao Thanh Tung ca2f509e82
agent: Make agent syslog log level inherit from Nomad agent log (#15625) 2023-01-04 09:38:06 -05:00
Tim Gross 8859e1bff1
csi: Fix parsing of '=' in secrets at command line and HTTP (#15670)
The command line flag parsing and the HTTP header parsing for CSI secrets
incorrectly split at more than one '=' rune, making it impossible to use secrets
that included that rune.
2023-01-03 16:28:38 -05:00
Dao Thanh Tung 53cd1b4871
fix: stale querystring parameter value as boolean (#15605)
* Add changes to make stale querystring param boolean

Signed-off-by: dttung2905 <ttdao.2015@accountancy.smu.edu.sg>

* Make error message more consistent

Signed-off-by: dttung2905 <ttdao.2015@accountancy.smu.edu.sg>

* Changes from code review + Adding CHANGELOG file

Signed-off-by: dttung2905 <ttdao.2015@accountancy.smu.edu.sg>

* Changes from code review to use github.com/shoenig/test package

Signed-off-by: dttung2905 <ttdao.2015@accountancy.smu.edu.sg>

* Change must.Nil() to must.NoError()

Signed-off-by: dttung2905 <ttdao.2015@accountancy.smu.edu.sg>

* Minor fix on the import order

Signed-off-by: dttung2905 <ttdao.2015@accountancy.smu.edu.sg>

* Fix existing code format too

Signed-off-by: dttung2905 <ttdao.2015@accountancy.smu.edu.sg>

* Minor changes addressing code review feedbacks

Signed-off-by: dttung2905 <ttdao.2015@accountancy.smu.edu.sg>

* swap must.EqOp() order of param provided

Signed-off-by: dttung2905 <ttdao.2015@accountancy.smu.edu.sg>

Signed-off-by: dttung2905 <ttdao.2015@accountancy.smu.edu.sg>
2023-01-01 13:04:14 -06:00
Danish Prakash dc81568f93
command/job_stop: accept multiple jobs, stop concurrently (#12582)
* command/job_stop: accept multiple jobs, stop concurrently

Signed-off-by: danishprakash <grafitykoncept@gmail.com>

* command/job_stop_test: add test for multiple job stops

Signed-off-by: danishprakash <grafitykoncept@gmail.com>

* improve output, add changelog and docs

Signed-off-by: danishprakash <grafitykoncept@gmail.com>
Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
2022-12-16 15:46:58 -08:00
Phil Renaud dce8717866
[ui] Token management interface on policy pages (#15435)
* basic-functionality demo for token CRUD

* Styling for tokens crud

* Tokens crud styles

* Expires, not expiry

* Mobile styles etc

* Refresh and redirect rules for policy save and token creation

* Delete method and associated serializer change

* Ability-checking for tokens

* Update policies acceptance tests to reflect new redirect rules

* Token ability unit tests

* Mirage config methods for token crud

* Token CRUD acceptance tests

* A couple visual diff snapshots

* Add and Delete abilities referenced for token operations

* Changing timeouts and adding a copy to clipboard action

* replaced accessor with secret when copying to clipboard

* PR comments addressed

* Simplified error passing for policy editor
2022-12-15 13:11:28 -05:00
Tim Gross 989d7d9fcf
csi: avoid a nil pointer when handling plugin events (#15518)
If a plugin crashes quickly enough, we can get into a situation where the
deregister function is called before it's ever registered. Safely handle the
resulting nil pointer in the dynamic registry by not emitting a plugin event,
but also update the plugin event handler to tolerate nil pointers in case we
wire it up elsewhere in the future.
2022-12-12 08:42:57 -05:00
Seth Hoenig 51a2212d3d
client: sandbox go-getter subprocess with landlock (#15328)
* client: sandbox go-getter subprocess with landlock

This PR re-implements the getter package for artifact downloads as a subprocess.

Key changes include

On all platforms, run getter as a child process of the Nomad agent.
On Linux platforms running as root, run the child process as the nobody user.
On supporting Linux kernels, uses landlock for filesystem isolation (via go-landlock).
On all platforms, restrict environment variables of the child process to a static set.
notably TMP/TEMP now points within the allocation's task directory
kernel.landlock attribute is fingerprinted (version number or unavailable)
These changes make Nomad client more resilient against a faulty go-getter implementation that may panic, and more secure against bad actors attempting to use artifact downloads as a privilege escalation vector.

Adds new e2e/artifact suite for ensuring artifact downloading works.

TODO: Windows git test (need to modify the image, etc... followup PR)

* landlock: fixup items from cr

* cr: fixup tests and go.mod file
2022-12-07 16:02:25 -06:00
Phil Renaud ce0ffdd077
[ui] Policies UI (#13976)
Co-authored-by: Mike Nomitch <mail@mikenomitch.com>
2022-12-06 12:45:36 -05:00
Seth Hoenig 3ed37b0b1d
fingerprint: add fingerprinting for CNI plugins presense and version (#15452)
This PR adds a fingerprinter to set the attribute
"plugins.cni.version.<name>" => "<version>"

for each CNI plugin in <client>.cni_path (/opt/cni/bin by default).
2022-12-05 14:22:47 -06:00
Phil Renaud 541ca94576
[ui] Adding canary_tags the web UI (#15458)
* Adding canary_tags to anyplace we show service tags

* CSS moved and tabs to spaces
2022-12-05 14:50:17 -05:00
Phil Renaud df749ff54a
Add namespaces to exec window (#15454) 2022-12-02 15:38:01 -05:00
Seth Hoenig 119f7b1cd1
consul: fixup expected consul tagged_addresses when using ipv6 (#15411)
This PR is a continuation of #14917, where we missed the ipv6 cases.

Consul auto-inserts tagged_addresses for keys
- lan_ipv4
- wan_ipv4
- lan_ipv6
- wan_ipv6

even though the service registration coming from Nomad does not contain such
elements. When doing the differential between services Nomad expects to be
registered vs. the services actually registered into Consul, we must first
purge these automatically inserted tagged_addresses if they do not exist in
the Nomad view of the Consul service.
2022-12-01 07:38:30 -06:00
dependabot[bot] 944a7dbb70
build(deps): bump google.golang.org/grpc from 1.50.1 to 1.51.0 (#15402)
* build(deps): bump google.golang.org/grpc from 1.50.1 to 1.51.0

Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go) from 1.50.1 to 1.51.0.
- [Release notes](https://github.com/grpc/grpc-go/releases)
- [Commits](https://github.com/grpc/grpc-go/compare/v1.50.1...v1.51.0)

---
updated-dependencies:
- dependency-name: google.golang.org/grpc
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* changelog: add entry for #15402

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
2022-11-29 14:55:17 -05:00
Seth Hoenig a65fbeb3b3
client: manually cleanup leaked iptables rules (#15407)
This PR adds a secondary path for cleaning up iptables created for an allocation
when the normal CNI library fails to do so. This typically happens when the state
of the pause container is unexpected - e.g. deleted out of band from Nomad. Before,
the iptables rules would be leaked which could lead to unexpected nat routing
behavior later on (in addition to leaked resources). With this change, we scan
for the rules created on behalf of the allocation being GC'd and delete them.

Fixes #6385
2022-11-28 11:32:16 -06:00
Phil Renaud ffd16dfec6
[ui, epic] SSO and Auth improvements (#15110)
* Top nav auth dropdown (#15055)

* Basic dropdown styles

* Some cleanup

* delog

* Default nomad hover state styles

* Component separation-of-concerns and acceptance tests for auth dropdown

* lintfix

* [ui, sso] Handle token expiry 500s (#15073)

* Handle error states generally

* Dont direct, just redirect

* no longer need explicit error on controller

* Redirect on token-doesnt-exist

* Forgot to import our time lib

* Linting on _blank

* Redirect tests

* changelog

* [ui, sso] warn user about pending token expiry (#15091)

* Handle error states generally

* Dont direct, just redirect

* no longer need explicit error on controller

* Linting on _blank

* Custom notification actions and shift the template to within an else block

* Lintfix

* Make the closeAction optional

* changelog

* Add a mirage token that will always expire in 11 minutes

* Test for token expiry with ember concurrency waiters

* concurrency handling for earlier test, and button redirect test

* [ui] if ACLs are disabled, remove the Sign In link from the top of the UI (#15114)

* Remove top nav link if ACLs disabled

* Change to an enabled-by-default model since you get no agent config when ACLs are disabled but you lack a token

* PR feedback addressed; down with double negative conditionals

* lintfix

* ember getter instead of ?.prop

* [SSO] Auth Methods and Mock OIDC Flow (#15155)

* Big ol first pass at a redirect sign in flow

* dont recursively add queryparams on redirect

* Passing state and code qps

* In which I go off the deep end and embed a faux provider page in the nomad ui

* Buggy but self-contained flow

* Flow auto-delay added and a little more polish to resetting token

* secret passing turned to accessor passing

* Handle SSO Failure

* General cleanup and test fix

* Lintfix

* SSO flow acceptance tests

* Percy snapshots added

* Explicitly note the OIDC test route is mirage only

* Handling failure case for complete-auth

* Leentfeex

* Tokens page styles (#15273)

* styling and moving columns around

* autofocus and enter press handling

* Styles refined

* Split up manager and regular tests

* Standardizing to a binary status state

* Serialize auth-methods response to use "name" as primary key (#15380)

* Serializer for unique-by-name

* Use @classic because of class extension
2022-11-28 10:44:52 -05:00
Luiz Aoqui 8f91be26ab
scheduler: create placements for non-register MRD (#15325)
* scheduler: create placements for non-register MRD

For multiregion jobs, the scheduler does not create placements on
registration because the deployment must wait for the other regions.
Once of these regions will then trigger the deployment to run.

Currently, this is done in the scheduler by considering any eval for a
multiregion job as "paused" since it's expected that another region will
eventually unpause it.

This becomes a problem where evals not triggered by a job registration
happen, such as on a node update. These types of regional changes do not
have other regions waiting to progress the deployment, and so they were
never resulting in placements.

The fix is to create a deployment at job registration time. This
additional piece of state allows the scheduler to differentiate between
a multiregion change, where there are other regions engaged in the
deployment so no placements are required, from a regional change, where
the scheduler does need to create placements.

This deployment starts in the new "initializing" status to signal to the
scheduler that it needs to compute the initial deployment state. The
multiregion deployment will wait until this deployment state is
persisted and its starts is set to "pending". Without this state
transition it's possible to hit a race condition where the plan applier
and the deployment watcher may step of each other and overwrite their
changes.

* changelog: add entry for #15325
2022-11-25 12:45:34 -05:00
Piotr Kazmierczak 9c85315bd2
bugfix: typos in acl role commands (#15382)
Co-authored-by: James Rasell <jrasell@users.noreply.github.com>
2022-11-25 10:28:33 +01:00
Tim Gross 8657695322
scheduler: set job on system stack for CSI feasibility check (#15372)
When the scheduler checks feasibility of each node, it creates a "stack" which
carries attributes of the job and task group it needs to check feasibility
for. The `system` and `sysbatch` scheduler use a different stack than `service`
and `batch` jobs. This stack was missing the call to set the job ID and
namespace for the CSI check. This prevents CSI volumes from being scheduled for
system jobs whenever the volume is in a non-default namespace.

Set the job ID and namespace to match the generic scheduler.
2022-11-23 16:47:35 -05:00
Jack 62f7de7ed5
cli: wait flag for use with deployment status -monitor (#15262) 2022-11-23 16:36:13 -05:00
Sam 4689822628
Fix missing host header in http check (#15337) 2022-11-23 08:58:13 -05:00
Phil Renaud 3189826a5b
Task sub row alignment changes (#15363) 2022-11-22 15:49:50 -05:00
Lance Haig 0263e7af34
Add command "nomad tls" (#14296) 2022-11-22 14:12:07 -05:00
James Rasell e2a2ea68fc
client: accommodate Consul 1.14.0 gRPC and agent self changes. (#15309)
* client: accommodate Consul 1.14.0 gRPC and agent self changes.

Consul 1.14.0 changed the way in which gRPC listeners are
configured, particularly when using TLS. Prior to the change, a
single listener was responsible for handling plain-text and
encrypted gRPC requests. In 1.14.0 and beyond, separate listeners
will be used for each, defaulting to 8502 and 8503 for plain-text
and TLS respectively.

The change means that Nomad’s Consul Connect integration would not
work when integrated with Consul clusters using TLS and running
1.14.0 or greater.

The Nomad Consul fingerprinter identifies the gRPC port Consul has
exposed using the "DebugConfig.GRPCPort" value from Consul’s
“/v1/agent/self” endpoint. In Consul 1.14.0 and greater, this only
represents the plain-text gRPC port which is likely to be disbaled
in clusters running TLS. In order to fix this issue, Nomad now
takes into account the Consul version and configured scheme to
optionally use “DebugConfig.GRPCTLSPort” value from Consul’s agent
self return.

The “consul_grcp_socket” allocrunner hook has also been updated so
that the fingerprinted gRPC port attribute is passed in. This
provides a better fallback method, when the operator does not
configure the “consul.grpc_address” option.

* docs: modify Consul Connect entries to detail 1.14.0 changes.

* changelog: add entry for #15309

* fixup: tidy tests and clean version match from review feedback.

* fixup: use strings tolower func.
2022-11-21 09:19:09 -06:00
Seth Hoenig bf4b5f9a8d
consul: add trace logging around service registrations (#15311)
This PR adds trace logging around the differential done between a Nomad service
registration and its corresponding Consul service registration, in an effort
to shed light on why a service registration request is being made.
2022-11-21 08:03:56 -06:00
Phil Renaud 11dc19b307
[ui] Show Consul Connect upstreams / on update info in sidebar (#15324)
* Added consul connect icon and sidebar info

* Show icon to the right of name
2022-11-18 22:49:10 -05:00
James Rasell 3225cf77b6
api: ensure all request body decode error return a 400 status code. (#15252) 2022-11-18 17:04:33 +01:00
stswidwinski 7b6e856a29
Add mount propagation to protobuf definition of mounts (#15096)
* Add mount propagation to protobuf definition of mounts

* Fix formatting

* Add mount propagation to the simple roundtrip test.

* changelog: add entry for #15096

Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
2022-11-17 18:14:59 -05:00
Tim Gross d0f9e887f7
autopilot: include only servers from the same region (#15290)
When we migrated to the updated autopilot library in Nomad 1.4.0, the interface
for finding servers changed. Previously autopilot would get the serf members and
call `IsServer` on each of them, leaving it up to the implementor to filter out
clients (and in Nomad's case, other regions). But in the "new" autopilot
library, the equivalent interface is `KnownServers` for which we did not filter
by region. This causes spurious attempts for the cross-region stats fetching,
which results in TLS errors and a lot of log noise.

Filter the member set by region to fix the regression.
2022-11-17 12:09:36 -05:00
stswidwinski 75f80e2fdd
Fix goroutine leakage (#15180)
* Fix goroutine leakage

* cl: add cl entry

Co-authored-by: Seth Hoenig <shoenig@duck.com>
2022-11-17 09:47:11 -06:00
Tim Gross dd3a07302e
keyring: update handle to state inside replication loop (#15227)
* keyring: update handle to state inside replication loop

When keyring replication starts, we take a handle to the state store. But
whenever a snapshot is restored, this handle is invalidated and no longer points
to a state store that is receiving new keys. This leaks a bunch of memory too!

In addition to operator-initiated restores, when fresh servers are added to
existing clusters with large-enough state, the keyring replication can get
started quickly enough that it's running before the snapshot from the existing
clusters have been restored.

Fix this by updating the handle to the state store on each pass.
2022-11-17 08:40:12 -05:00
Tim Gross 6415fb4284
eval broker: shed all but one blocked eval per job after ack (#14621)
When an evaluation is acknowledged by a scheduler, the resulting plan is
guaranteed to cover up to the `waitIndex` set by the worker based on the most
recent evaluation for that job in the state store. At that point, we no longer
need to retain blocked evaluations in the broker that are older than that index.

Move all but the highest priority / highest `ModifyIndex` blocked eval into a
canceled set. When the `Eval.Ack` RPC returns from the eval broker it will
signal a reap of a batch of cancelable evals to write to raft. This paces the
cancelations limited by how frequently the schedulers are acknowledging evals;
this should reduce the risk of cancelations from overwhelming raft relative to
scheduler progress. In order to avoid straggling batches when the cluster is
quiet, we also include a periodic sweep through the cancelable list.
2022-11-16 16:10:11 -05:00
Tim Gross 37134a4a37
eval delete: move batching of deletes into RPC handler and state (#15117)
During unusual outage recovery scenarios on large clusters, a backlog of
millions of evaluations can appear. In these cases, the `eval delete` command can
put excessive load on the cluster by listing large sets of evals to extract the
IDs and then sending larges batches of IDs. Although the command's batch size
was carefully tuned, we still need to be JSON deserialize, re-serialize to
MessagePack, send the log entries through raft, and get the FSM applied.

To improve performance of this recovery case, move the batching process into the
RPC handler and the state store. The design here is a little weird, so let's
look a the failed options first:

* A naive solution here would be to just send the filter as the raft request and
  let the FSM apply delete the whole set in a single operation. Benchmarking with
  1M evals on a 3 node cluster demonstrated this can block the FSM apply for
  several minutes, which puts the cluster at risk if there's a leadership
  failover (the barrier write can't be made while this apply is in-flight).

* A less naive but still bad solution would be to have the RPC handler filter
  and paginate, and then hand a list of IDs to the existing raft log
  entry. Benchmarks showed this blocked the FSM apply for 20-30s at a time and
  took roughly an hour to complete.

Instead, we're filtering and paginating in the RPC handler to find a page token,
and then passing both the filter and page token in the raft log. The FSM apply
recreates the paginator using the filter and page token to get roughly the same
page of evaluations, which it then deletes. The pagination process is fairly
cheap (only abut 5% of the total FSM apply time), so counter-intuitively this
rework ends up being much faster. A benchmark of 1M evaluations showed this
blocked the FSM apply for 20-30ms at a time (typical for normal operations) and
completes in less than 4 minutes.

Note that, as with the existing design, this delete is not consistent: a new
evaluation inserted "behind" the cursor of the pagination will fail to be
deleted.
2022-11-14 14:08:13 -05:00
Charlie Voiselle c73fb51d3a
[bug] Return a spec on reconnect (#15214)
client: fixed a bug where non-`docker` tasks with network isolation would leak network namespaces and iptables rules if the client was restarted while they were running
2022-11-11 13:27:36 -05:00
Seth Hoenig 21237d8337
client: avoid unconsumed channel in timer construction (#15215)
* client: avoid unconsumed channel in timer construction

This PR fixes a bug introduced in #11983 where a Timer initialized with 0
duration causes an immediate tick, even if Reset is called before reading the
channel. The fix is to avoid doing that, instead creating a Timer with a non-zero
initial wait time, and then immediately calling Stop.

* pr: remove redundant stop
2022-11-11 09:31:34 -06:00
Tim Gross eabbcebdd4
exec: allow running commands from host volume (#14851)
The exec driver and other drivers derived from the shared executor check the
path of the command before handing off to libcontainer to ensure that the
command doesn't escape the sandbox. But we don't check any host volume mounts,
which should be safe to use as a source for executables if we're letting the
user mount them to the container in the first place.

Check the mount config to verify the executable lives in the mount's host path,
but then return an absolute path within the mount's task path so that we can hand
that off to libcontainer to run.

Includes a good bit of refactoring here because the anchoring of the final task
path has different code paths for inside the task dir vs inside a mount. But
I've fleshed out the test coverage of this a good bit to ensure we haven't
created any regressions in the process.
2022-11-11 09:51:15 -05:00
Piotr Kazmierczak 4851f9e68a
acl: sso auth method schema and store functions (#15191)
This PR implements ACLAuthMethod type, acl_auth_methods table schema and crud state store methods. It also updates nomadSnapshot.Persist and nomadSnapshot.Restore methods in order for them to work with the new table, and adds two new Raft messages: ACLAuthMethodsUpsertRequestType and ACLAuthMethodsDeleteRequestType

This PR is part of the SSO work captured under ☂️ ticket #13120.
2022-11-10 19:42:41 +01:00
Seth Hoenig 6e3309ebc6
template: protect use of template manager with a lock (#15192)
This PR protects access to `templateHook.templateManager` with its lock. So
far we have not been able to reproduce the panic - but it seems either Poststart
is running without a Prestart being run first (should be impossible), or the
Update hook is running concurrently with Poststart, nil-ing out the templateManager
in a race with Poststart.

Fixes #15189
2022-11-10 08:30:27 -06:00
Derek Strickland 80b6f27efd
api: remove mapstructure tags fromPort struct (#12916)
This PR solves a defect in the deserialization of api.Port structs when returning structs from theEventStream.

Previously, the api.Port struct's fields were decorated with both mapstructure and hcl tags to support the network.port stanza's use of the keyword static when posting a static port value. This works fine when posting a job and when retrieving any struct that has an embedded api.Port instance as long as the value is deserialized using JSON decoding. The EventStream, however, uses mapstructure to decode event payloads in the api package. mapstructure expects an underlying field named static which does not exist. The result was that the Port.Value field would always be set to 0.

Upon further inspection, a few things became apparent.

The struct already has hcl tags that support the indirection during job submission.
Serialization/deserialization with both the json and hcl packages produce the desired result.
The use of of the mapstructure tags provided no value as the Port struct contains only fields with primitive types.
This PR:

Removes the mapstructure tags from the api.Port structs
Updates the job parsing logic to use hcl instead of mapstructure when decoding Port instances.
Closes #11044

Co-authored-by: DerekStrickland <dstrickland@hashicorp.com>
Co-authored-by: Piotr Kazmierczak <470696+pkazmierczak@users.noreply.github.com>
2022-11-08 11:26:28 +01:00
Drew Gonzales aac9404ee5
server: add git revision to serf tags (#9159) 2022-11-07 10:34:33 -05:00
Phil Renaud 85521c49c4
[ui] Remove animation from task logs sidebar (#15146)
* Remove animation from task logs sidebar

* changelog
2022-11-07 10:11:18 -05:00
Tim Gross 9e1c0b46d8
API for Eval.Count (#15147)
Add a new `Eval.Count` RPC and associated HTTP API endpoints. This API is
designed to support interactive use in the `nomad eval delete` command to get a
count of evals expected to be deleted before doing so.

The state store operations to do this sort of thing are somewhat expensive, but
it's cheaper than serializing a big list of evals to JSON. Note that although it
seems like this could be done as an extra parameter and response field on
`Eval.List`, having it as its own endpoint avoids having to change the response
body shape and lets us avoid handling the legacy filter params supported by
`Eval.List`.
2022-11-07 08:53:19 -05:00