Commit Graph

212 Commits

Author SHA1 Message Date
Tim Gross 951661db04 CSI: resolve invalid claim states (#11890)
* csi: resolve invalid claim states on read

It's currently possible for CSI volumes to be claimed by allocations
that no longer exist. This changeset asserts a reasonable state at
the state store level by registering these nil allocations as "past
claims" on any read. This will cause any pass through the periodic GC
or volumewatcher to trigger the unpublishing workflow for those claims.

* csi: make feasibility check errors more understandable

When the feasibility checker finds we have no free write claims, it
checks to see if any of those claims are for the job we're currently
scheduling (so that earlier versions of a job can't block claims for
new versions) and reports a conflict if the volume can't be scheduled
so that the user can fix their claims. But when the checker hits a
claim that has a GCd allocation, the state is recoverable by the
server once claim reaping completes and no user intervention is
required; the blocked eval should complete. Differentiate the
scheduler error produced by these two conditions.
2022-01-28 14:43:35 -05:00
Tim Gross 4e559c6255 csi: update leader's ACL in volumewatcher (#11891)
The volumewatcher that runs on the leader needs to make RPC calls
rather than writing to raft (as we do in the deploymentwatcher)
because the unpublish workflow needs to make RPC calls to the
clients. This requires that the volumewatcher has access to the
leader's ACL token.

But when leadership transitions, the new leader creates a new leader
ACL token. This ACL token needs to be passed into the volumewatcher
when we enable it, otherwise the volumewatcher can find itself with a
stale token.
2022-01-28 14:43:27 -05:00
Derek Strickland 460416e787 Update IsEmpty to check for pre-1.2.4 fields (#11930) 2022-01-28 14:41:49 -05:00
Dave May 330d24a873
cli: Add event stream capture to nomad operator debug (#11865) 2022-01-17 21:35:51 -05:00
Michael Schurter 99c863f909
cli: improve debug error messages (#11507)
Improves `nomad debug` error messages when contacting agents that do not
have /v1/agent/host endpoints (the endpoint was added in v0.12.0)

Part of #9568 and manually tested against Nomad v0.8.7.

Hopefully isRedirectError can be reused for more cases listed in #9568
2022-01-17 11:15:17 -05:00
Luiz Aoqui 57a61b50f4
changelog: add entry for #11793 (#11862) 2022-01-17 11:08:29 -05:00
James Rasell 29fb0ddb4e
Merge pull request #11849 from hashicorp/b-changelog-11848
changelog: add entry for #11848
2022-01-17 09:35:10 +01:00
Jai 7ad9ec1143
Merge pull request #11820 from hashicorp/f-ui/alloc-legend
feat:  add links to legend items in `allocation-summary`
2022-01-14 14:02:54 -05:00
Tim Gross d7756f8cdb
csi: volume deregistration should require exact ID (#11852)
The command line client sends a specific volume ID, but this isn't
enforced at the API level and we were incorrectly using a prefix match
for volume deregistration, resulting in cases where a volume with a
shorter ID that's a prefix of another volume would be deregistered
instead of the intended volume.
2022-01-14 12:26:03 -05:00
Tim Gross 33f7c6cba4
csi: when warning for multiple prefix matches, use full ID (#11853)
When the `volume deregister` or `volume detach` commands get an ID
prefix that matches multiple volumes, show the full length of the
volume IDs in the list of volumes shown so so that the user can select
the correct one.
2022-01-14 12:25:48 -05:00
Tim Gross 9c4864badd
freebsd: build fix for ARM7 32-bit (#11854)
The size of `stat_t` fields is architecture dependent, which was
reportedly causing a build failure on FreeBSD ARM7 32-bit
systems. This changeset matches the behavior we have on Linux.
2022-01-14 12:25:32 -05:00
Tim Gross 73d0779858
drivers: set world-readable permissions on copied resolv.conf (#11856)
When we copy the system DNS to a task's `resolv.conf`, we should set
the permissions as world-readable so that unprivileged users within
the task can read it.
2022-01-14 12:25:23 -05:00
Jai Bhagat 764936a768 chore: add changelog 2022-01-14 10:23:09 -05:00
James Rasell 10405e9c6b
changelog: add entry for #11848 2022-01-14 13:40:50 +01:00
Luiz Aoqui d48e50da9a
Fix log level parsing from lines that include a timestamp (#11838) 2022-01-13 09:56:35 -05:00
Luiz Aoqui c7ae13a1f3
Fix ACL requirements for job details UI (#11672) 2022-01-12 21:26:02 -05:00
Michael Schurter ebadaabc71 doc: add changelog for #11830 2022-01-12 14:21:47 -08:00
Tim Gross b62da8fc9a
docs: improve changelog for PR #11783 (#11818) 2022-01-11 11:54:12 -05:00
Tim Gross 1a5973184e
docs: changelog for PR #11783 (#11812) 2022-01-10 16:39:21 -05:00
Alessandro De Blasis e647549ecf
metrics: added `mapped_file` metric (#11500)
Signed-off-by: Alessandro De Blasis <alex@deblasis.net>
Co-authored-by: Nate <37554478+servusdei2018@users.noreply.github.com>
2022-01-10 15:35:19 -05:00
Derek Strickland 0a8e03f0f7
Expose Consul template configuration parameters (#11606)
This PR exposes the following existing`consul-template` configuration options to Nomad jobspec authors in the `{job.group.task.template}` stanza.

- `wait`

It also exposes the following`consul-template` configuration to Nomad operators in the `{client.template}` stanza.

- `max_stale`
- `block_query_wait`
- `consul_retry`
- `vault_retry` 
- `wait` 

Finally, it adds the following new Nomad-specific configuration to the `{client.template}` stanza that allows Operators to set bounds on what `jobspec` authors configure.

- `wait_bounds`

Co-authored-by: Tim Gross <tgross@hashicorp.com>
Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
2022-01-10 10:19:07 -05:00
Tim Gross 32f150d469
docs: new scheduler metrics (#11790)
* Fixed name of `nomad.scheduler.allocs.reschedule` metric
* Added new metrics to metrics reference documentation
* Expanded definitions of "waiting" metrics
* Changelog entry for #10236 and #10237
2022-01-07 09:51:15 -05:00
Charlie Voiselle 98a240cd99
Make number of scheduler workers reloadable (#11593)
## Development Environment Changes
* Added stringer to build deps

## New HTTP APIs
* Added scheduler worker config API
* Added scheduler worker info API

## New Internals
* (Scheduler)Worker API refactor—Start(), Stop(), Pause(), Resume()
* Update shutdown to use context
* Add mutex for contended server data
    - `workerLock` for the `workers` slice
    - `workerConfigLock` for the `Server.Config.NumSchedulers` and
      `Server.Config.EnabledSchedulers` values

## Other
* Adding docs for scheduler worker api
* Add changelog message

Co-authored-by: Derek Strickland <1111455+DerekStrickland@users.noreply.github.com>
2022-01-06 11:56:13 -05:00
Michael Schurter 1af8d47de2
Merge pull request #11744 from hashicorp/b-node-copy
Fix Node.Copy()
2022-01-05 17:01:53 -08:00
Jai c7e581d879
Merge pull request #11590 from hashicorp/e-ui/breadcrumbs-service
Refactor:  Breadcrumbs Service
2022-01-05 17:46:48 -05:00
Tim Gross 51f512a3e6
csi: reap unused volume claims at leadership transitions (#11776)
When `volumewatcher.Watcher` starts on the leader, it starts a watch
on every volume and triggers a reap of unused claims on any change to
that volume. But if a reaping is in-flight during leadership
transitions, it will fail and the event that triggered the reap will
be dropped. Perform one reap of unused claims at the start of the
watcher so that leadership transitions don't drop this event.
2022-01-05 11:40:20 -05:00
Arkadiusz ffb174b596
Fix log streaming missing frames (#11721)
Perform one more read after receiving cancel when streaming file from the allocation API
2022-01-04 14:07:16 -05:00
Tim Gross 2806dc2bd7
docs/tests for multiple HTTP address config (#11760) 2022-01-03 10:17:13 -05:00
Tim Gross 395628efe1
api: paginate deployment list and accept wildcard namespace (#11743)
Add `per_page` and `next_token` handling to `Deployment.List` RPC, and
allow the use of a wildcard namespace for namespace filtering.
2022-01-03 08:36:02 -05:00
Michael Schurter c4d03815e1 add changelog for Node.Copy fix 2021-12-23 12:34:05 -08:00
Tim Gross 265e488ab4
task runner: fix goroutine leak in prestart hook (#11741)
The task runner prestart hooks take a `joincontext` so they have the
option to exit early if either of two contexts are canceled: from
killing the task or client shutdown. Some tasks exit without being
shutdown from the server, so neither of the joined contexts ever gets
canceled and we leak the `joincontext` (48 bytes) and its internal
goroutine. This primarily impacts batch jobs and any task that fails
or completes early such as non-sidecar prestart lifecycle tasks.
Cancel the `joincontext` after the prestart call exits to fix the
leak.
2021-12-23 11:50:51 -05:00
Luiz Aoqui 4bdd2c84e3
fix host network reserved port fingerprint (#11728) 2021-12-22 15:29:54 -05:00
Luiz Aoqui 40093f97cd
api: support namespace wildcard in CSI volume list (#11724) 2021-12-21 17:19:45 -05:00
Shishir 65eab35412
Add support for setting pids_limit in docker plugin config. (#11526) 2021-12-21 13:31:34 -05:00
Tim Gross b0c3b99b03
scheduler: fix quadratic performance with spread blocks (#11712)
When the scheduler picks a node for each evaluation, the
`LimitIterator` provides at most 2 eligible nodes for the
`MaxScoreIterator` to choose from. This keeps scheduling fast while
producing acceptable results because the results are binpacked.

Jobs with a `spread` block (or node affinity) remove this limit in
order to produce correct spread scoring. This means that every
allocation within a job with a `spread` block is evaluated against
_all_ eligible nodes. Operators of large clusters have reported that
jobs with `spread` blocks that are eligible on a large number of nodes
can take longer than the nack timeout to evaluate (60s). Typical
evaluations are processed in milliseconds.

In practice, it's not necessary to evaluate every eligible node for
every allocation on large clusters, because the `RandomIterator` at
the base of the scheduler stack produces enough variation in each pass
that the likelihood of an uneven spread is negligible. Note that
feasibility is checked before the limit, so this only impacts the
number of _eligible_ nodes available for scoring, not the total number
of nodes.

This changeset sets the iterator limit for "large" `spread` block and
node affinity jobs to be equal to the number of desired
allocations. This brings an example problematic job evaluation down
from ~3min to ~10s. The included tests ensure that we have acceptable
spread results across a variety of large cluster topologies.
2021-12-21 10:10:01 -05:00
Jai Bhagat 1887e34b75 chore: edit mirage scenario to populate csi 2021-12-21 07:42:23 -05:00
Luiz Aoqui 3d3b5a2c8e
changelog: add entries for #11555, #11557, and #11687 (#11706) 2021-12-20 13:45:20 -05:00
Tim Gross e046bb31e9
api: respect wildcard in evaluations list API (#11710) 2021-12-20 12:23:50 -05:00
Jai 93d5ef596f
Merge pull request #11545 from hashicorp/f-ui/add-alloc-filters-on-table
Add Allocation Filters in Client View
2021-12-18 09:39:53 -05:00
Jai 2b7fb2c5bd
Merge pull request #11544 from hashicorp/f-ui/add-filters-to-allocs
Add filters to Allocations
2021-12-18 09:38:28 -05:00
Luiz Aoqui e067b3d75f
changelog: fix entry for #11544 2021-12-17 18:57:54 -05:00
Luiz Aoqui a1c4536523
changelog: add entry for #11545 2021-12-17 18:49:56 -05:00
Tim Gross f2615992a4
cli: unhide advanced operator raft debugging commands (#11682)
The `nomad operator raft` and `nomad operator snapshot state`
subcommands for inspecting on-disk raft state were hidden and
undocumented. Expose and document these so that advanced operators
have support for these tools.
2021-12-16 10:32:11 -05:00
Tim Gross 536e3c5282
`nomad eval list` command (#11675)
Use the new filtering and pagination capabilities of the `Eval.List`
RPC to provide filtering and pagination at the command line.

Also includes note that `nomad eval status -json` is deprecated and
will be replaced with a single evaluation view in a future version of
Nomad.
2021-12-15 11:58:38 -05:00
Tim Gross f8a133a810
cli: ensure `-stale` flag is respected by `nomad operator debug` (#11678)
When a cluster doesn't have a leader, the `nomad operator debug`
command can safely use stale queries to gracefully degrade the
consistency of almost all its queries. The query parameter for these
API calls was not being set by the command.

Some `api` package queries do not include `QueryOptions` because
they target a specific agent, but they can potentially be forwarded to
other agents. If there is no leader, these forwarded queries will
fail. Provide methods to call these APIs with `QueryOptions`.
2021-12-15 10:44:03 -05:00
Luiz Aoqui 05bb65779c
api: return error when `LicenseGet` status is not `200` (#11644) 2021-12-14 19:47:09 -05:00
Tim Gross a0cf5db797
provide `-no-shutdown-delay` flag for job/alloc stop (#11596)
Some operators use very long group/task `shutdown_delay` settings to
safely drain network connections to their workloads after service
deregistration. But during incident response, they may want to cause
that drain to be skipped so they can quickly shed load.

Provide a `-no-shutdown-delay` flag on the `nomad alloc stop` and
`nomad job stop` commands that bypasses the delay. This sets a new
desired transition state on the affected allocations that the
allocation/task runner will identify during pre-kill on the client.

Note (as documented here) that using this flag will almost always
result in failed inbound network connections for workloads as the
tasks will exit before clients receive updated service discovery
information and won't be gracefully drained.
2021-12-13 14:54:53 -05:00
Tim Gross 5a68373e7f Version 1.2.3
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJhs7QgAAoJELC0QQl2hbZ2IQQP/3aKKgsptB0IPGx4vAAlIfMY
 IUyj9KdQ0SRN4B0C4h/T3CxqIhFPGmrV2RkOtEpDyBJuTUbH4FBjCscsKePFON+g
 Kfk/SoP05AQSksXFiKVK99UxUjg43SdqvatwnmLH4hafapbq5mMouTkBho+i05xK
 n6853DOwoq5qsPs6ihwRddRtpduozBKWLBMoBUm3syf8erWX0dafU5WszvLvG16R
 YJxTNr0nwQFhKDfY1CFUHJglj1s521poA9Zj6Xa1fNnIQ2JdKW1kElPUXmra1w7X
 0Wussv4fgJAetTO2bz0+IeuQf+EzxQ7vKDklt4ORypXkwiC9h7x2ZNCKRL+GReyU
 wUnzccXBfOsgpvW5EAoNXCGOQa6c2+uvHAAd62AAqljLh+B+yDJysvPobihfbSsu
 E2kXJEd3N6GndDjFfzUaYPhhGkvBaPUTNxybSaaREShJ7a7c8tedxfMpNYt1RwGz
 llJEoeZZketwjEFLEHp9xjNeqXdAXyrqCkluMvy+foU72HaRPFc0tlDnRsqirZ0p
 hBxLxPp5oM4V/RegTa3z8P4J0kMSvCdCE4bPNgyiEJDmvxRYDVk5YorLTCDTGrWU
 4WO7fue0bOwhGBYWRfAWzfpoHrCvRLto2vVdtBaFwlzmGP8j/QjM8ANrGyiJeiuY
 IPZSM93pAAcWQEV9id/E
 =G3In
 -----END PGP SIGNATURE-----

Merge tag 'v1.2.3' into merge-release-1.2.3-branch

Version 1.2.3
2021-12-13 10:12:07 -05:00
Tim Gross 46e1d29298 golang security update 1.17.5 2021-12-10 13:50:22 -05:00
Tim Gross 624ecab901
evaluations list pagination and filtering (#11648)
API queries can request pagination using the `NextToken` and `PerPage`
fields of `QueryOptions`, when supported by the underlying API.

Add a `NextToken` field to the `structs.QueryMeta` so that we have a
common field across RPCs to tell the caller where to resume paging
from on their next API call. Include this field on the `api.QueryMeta`
as well so that it's available for future versions of List HTTP APIs
that wrap the response with `QueryMeta` rather than returning a simple
list of structs. In the meantime callers can get the `X-Nomad-NextToken`.

Add pagination to the `Eval.List` RPC by checking for pagination token
and page size in `QueryOptions`. This will allow resuming from the
last ID seen so long as the query parameters and the state store
itself are unchanged between requests.

Add filtering by job ID or evaluation status over the results we get
out of the state store.

Parse the query parameters of the `Eval.List` API into the arguments
expected for filtering in the RPC call.
2021-12-10 13:43:03 -05:00