Commit graph

785 commits

Author SHA1 Message Date
Phil Renaud 7f7f764c5a
[ui] Fixed: Evaluations sidebar/response not scrollable (#16960)
* Sets up a CSS grid for Evaluations sidebar

* Flex seems more sensible for this actually

* Tighten up the header margin

* Percy found a diff; the expand button wasnt showing for view logs sidebar
2023-04-27 09:49:18 -04:00
James Rasell 4d2c1403c2
scale: do not allow scaling of jobs with type system. (#16969) 2023-04-25 15:47:44 +01:00
Phil Renaud 7dbebe9a93
[ui, feature] Job Page Redesign (#16932)
* [ui] Service job status panel (#16134)

* it begins

* Hacky demo enabled

* Still very hacky but seems deece

* Floor of at least 3 must be shown

* Width from on-high

* Other statuses considered

* More sensible allocTypes listing

* Beginnings of a legend

* Total number of allocs running now maps over job.groups

* Lintfix

* base the number of slots to hold open on actual tallies, which should never exceed totalAllocs

* Versions get yer versions here

* Versions lookin like versions

* Mirage fixup

* Adds Remaining as an alloc chart status and adds historical status option

* Get tests passing again by making job status static for a sec

* Historical status panel click actions moved into their own component class

* job detail tests plz chill

* Testing if percy is fickle

* Hyper-specfic on summary distribution bar identifier

* Perhaps the 2nd allocSummary item no longer exists with the more accurate afterCreate data

* UI Test eschewing the page pattern

* Bones of a new acceptance test

* Track width changes explicitly with window-resize

* testlintfix

* Alloc counting tests

* Alloc grouping test

* Alloc grouping with complex resizing

* Refined the list of showable statuses

* PR feedback addressed

* renamed allocation-row to allocation-status-row

* [ui, job status] Make panel status mode a queryParam (#16345)

* queryParam changing

* Test for QP in panel

* Adding @tracked to legacy controller

* Move the job of switching to Historical out to larger context

* integration test mock passed func

* [ui] Service job deployment status panel (#16383)

* A very fast and loose deployment panel

* Removing Unknown status from the panel

* Set up oldAllocs list in constructor, rather than as a getter/tracked var

* Small amount of template cleanup

* Refactored latest-deployment new logic back into panel.js

* Revert now-unused latest-deployment component

* margin bottom when ungrouped also

* Basic integration tests for job deployment status panel

* Updates complete alloc colour to green for new visualizations only (#16618)

* Updates complete alloc colour to green for new visualizations only

* Pale green instead of dark green for viz in general

* [ui] Job Deployment Status: History and Update Props (#16518)

* Deployment history wooooooo

* Styled deployment history

* Update Params

* lintfix

* Types and groups for updateParams

* Live-updating history

* Harden with types, error states, and pending states

* Refactor updateParams to use trigger component

* [ui] Deployment History search (#16608)

* Functioning searchbox

* Some nice animations for history items

* History search test

* Fixing up some old mirage conventions

* some a11y rule override to account for scss keyframes

* Split panel into deploying and steady components

* HandleError passed from job index

* gridified panel elements

* TotalAllocs added to deploying.js

* Width perc to px

* [ui] Splitting deployment allocs by status, health, and canary status (#16766)

* Initial attempt with lots of scratchpad work

* Style mods per UI discussion

* Fix canary overflow bug

* Dont show canary or health for steady/prev-alloc blocks

* Steady state

* Thanks Julie

* Fixes steady-state versions

* Legen, wait for it...

* Test fixes now that we have a minimum block size

* PR prep

* Shimmer effect on pending and unplaced allocs (#16801)

* Shimmer effect on pending and unplaced

* Dont show animation in the legend

* [ui, deployments] Linking allocblocks and legends to allocation / allocations index routes (#16821)

* Conditional link-to component and basic linking to allocations and allocation routes

* Job versions filter added to allocations index page

* Steady state legends link

* Legend links

* Badge count links for versions

* Fix: faded class on steady-state legend items

* version link now wont show completed ones

* Fix a11y violations with link labels

* Combining some template conditional logic

* [ui, deployments] Conversions on long nanosecond update params (#16882)

* Conversions on long nanosecond nums

* Early return in updateParamGroups comp prop

* [ui, deployments] Mirage Actively Deploying Job and Deployment Integration Tests (#16888)

* Start of deployment alloc test scaffolding

* Bit of test cleanup and canary for ungrouped allocs

* Flakey but more robust integrations for deployment panel

* De-flake acceptance tests and add an actively deploying job to mirage

* Jitter-less alloc status distribution removes my bad math

* bugfix caused by summary.desiredTotal non-null

* More interesting mirage active deployment alloc breakdown

* Further tests for previous-allocs row

* Previous alloc legend tests

* Percy snapshots added to integration test

* changelog
2023-04-24 22:45:39 -04:00
Seth Hoenig 753c17c9de
services: un-mark group services as deregistered if restart hook runs (#16905)
* services: un-mark group services as deregistered if restart hook runs

This PR may fix a bug where group services will never be deregistered if the
group undergoes a task restart.

* e2e: add test case for restart and deregister group service

* cl: add cl

* e2e: add wait for service list call
2023-04-24 14:24:51 -05:00
Tim Gross 72cbe53f19
logs: allow disabling log collection in jobspec (#16962)
Some Nomad users ship application logs out-of-band via syslog. For these users
having `logmon` (and `docker_logger`) running is unnecessary overhead. Allow
disabling the logmon and pointing the task's stdout/stderr to /dev/null.

This changeset is the first of several incremental improvements to log
collection short of full-on logging plugins. The next step will likely be to
extend the internal-only task driver configuration so that cluster
administrators can turn off log collection for the entire driver.

---

Fixes: #11175

Co-authored-by: Thomas Weber <towe75@googlemail.com>
2023-04-24 10:00:27 -04:00
valodzka 379497a484
fix host port handling for ipv6 (#16723) 2023-04-20 19:53:20 -07:00
Etienne Bruines 1e3531b978
cni: fix plugin fingerprinting versions (#16776)
CNI plugins v1.2.0 and above output a second line, containing supported protocol versions.
2023-04-20 18:44:39 -07:00
Luiz Aoqui a1ba068e1f
cli: fix panic on job plan when -diff=false (#16944)
PR #14492 introduced a new check to return 0 when the `nomad job plan`
command returns a diff of type `None`.

But the `-diff` CLI flag was also being used to control whether the plan
request should return the diff of not instead of just controlling if the
diff was printed.

This means that when `-diff=false` is set the response does not include
any diff information, and so the new check panics.

This commit fixes the problem by always requesting a diff and using the
`-diff` only for controlling output, as it's currently documented.
2023-04-20 17:33:29 -07:00
astudentofblake 42c4c8d5ea
fix: added landlock access to /usr/libexec for getter (#16900) 2023-04-20 11:16:04 -05:00
claire labry d2beea3435
changelog: add changelog update for vendor label for linux packaging (#16071) 2023-04-19 08:14:14 -07:00
Luiz Aoqui fb588fcbb8
allocrunner: prevent panic on network manager (#16921) 2023-04-18 13:39:13 -07:00
Charlie Voiselle 9e8f2a937c
[scheduler] Honor false for distinct hosts constraint (#16907)
* Honor value for distinct_hosts constraint
* Add test for feasibility checking for `false`
---------
Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
2023-04-17 17:43:56 -04:00
Tim Gross 04e049caed
license: show Terminated field in license get command (#16892) 2023-04-17 09:01:43 -04:00
Tim Gross 62548616d4
client: allow drain_on_shutdown configuration (#16827)
Adds a new configuration to clients to optionally allow them to drain their
workloads on shutdown. The client sends the `Node.UpdateDrain` RPC targeting
itself and then monitors the drain state as seen by the server until the drain
is complete or the deadline expires. If it loses connection with the server, it
will monitor local client status instead to ensure allocations are stopped
before exiting.
2023-04-14 15:35:32 -04:00
Tim Gross 5a9abdc469
drain: use client status to determine drain is complete (#14348)
If an allocation is slow to stop because of `kill_timeout` or `shutdown_delay`,
the node drain is marked as complete prematurely, even though drain monitoring
will continue to report allocation migrations. This impacts the UI or API
clients that monitor node draining to shut down nodes.

This changeset updates the behavior to wait until the client status of all
drained allocs are terminal before marking the node as done draining.
2023-04-13 08:55:28 -04:00
Seth Hoenig ec1a8ae12a
deps: update docker to 23.0.3 (#16862)
* [no ci] deps: update docker to 23.0.3

This PR brings our docker/docker dependency (which is hosted at github.com/moby/moby)
up to 23.0.3 (forward about 2 years). Refactored our use of docker/libnetwork to
reference the package in its new home, which is docker/docker/libnetwork (it is
no longer an independent repository). Some minor nearby test case cleanup as well.

* add cl
2023-04-12 14:13:36 -05:00
Juana De La Cuesta 8302085384
Deployment Status Command Does Not Respect -namespace Wildcard (#16792)
* func: add namespace support for list deployment

* func: add wildcard to namespace filter for deployments

* Update deployment_endpoint.go

* style: use must instead of require or asseert

* style: rename paginator to avoid clash with import

* style: add changelog entry

* fix: add missing parameter for upsert jobs
2023-04-12 11:02:14 +02:00
James Rasell bc01d47071
consul/connect: fixed a bug where restarting proxy tasks failed. (#16815)
The first start of a Consul Connect proxy sidecar triggers a run
of the envoy_version hook which modifies the task config image
entry. The modification takes into account a number of factors to
correctly populate this. Importantly, once the hook has run, it
marks itself as done so the taskrunner will not execute it again.

When the client receives a non-destructive update for the
allocation which the proxy sidecar is a member of, it will update
and overwrite the task definition within the taskerunner. In doing
so it overwrite the modification performed by the hook. If the
allocation is restarted, the envoy_version hook will be skipped as
it previously marked itself as done, and therefore the sidecar
config image is incorrect and causes a driver error.

The fix removes the hook in marking itself as done to the view of
the taskrunner.
2023-04-11 15:56:03 +01:00
Seth Hoenig ba728f8f97
api: enable support for setting original job source (#16763)
* api: enable support for setting original source alongside job

This PR adds support for setting job source material along with
the registration of a job.

This includes a new HTTP endpoint and a new RPC endpoint for
making queries for the original source of a job. The
HTTP endpoint is /v1/job/<id>/submission?version=<version> and
the RPC method is Job.GetJobSubmission.

The job source (if submitted, and doing so is always optional), is
stored in the job_submission memdb table, separately from the
actual job. This way we do not incur overhead of reading the large
string field throughout normal job operations.

The server config now includes job_max_source_size for configuring
the maximum size the job source may be, before the server simply
drops the source material. This should help prevent Bad Things from
happening when huge jobs are submitted. If the value is set to 0,
all job source material will be dropped.

* api: avoid writing var content to disk for parsing

* api: move submission validation into RPC layer

* api: return an error if updating a job submission without namespace or job id

* api: be exact about the job index we associate a submission with (modify)

* api: reword api docs scheduling

* api: prune all but the last 6 job submissions

* api: protect against nil job submission in job validation

* api: set max job source size in test server

* api: fixups from pr
2023-04-11 08:45:08 -05:00
Daniel Bennett fa33ee567a
gracefully recover tasks that use csi node plugins (#16809)
new WaitForPlugin() called during csiHook.Prerun,
so that on startup, clients can recover running
tasks that use CSI volumes, instead of them being
terminated and rescheduled because they need a
node plugin that is "not found" *yet*, only because
the plugin task has not yet been recovered.
2023-04-10 17:15:33 -05:00
Tim Gross 1335543731
ephemeral disk: migrate should imply sticky (#16826)
The `ephemeral_disk` block's `migrate` field allows for best-effort migration of
the ephemeral disk data to new nodes. The documentation says the `migrate` field
is only respected if `sticky=true`, but in fact if client ACLs are not set the
data is migrated even if `sticky=false`.

The existing behavior when client ACLs are disabled has existed since the early
implementation, so "fixing" that case now would silently break backwards
compatibility. Additionally, having `migrate` not imply `sticky` seems
nonsensical: it suggests that if we place on a new node we migrate the data but
if we place on the same node, we throw the data away!

Update so that `migrate=true` implies `sticky=true` as follows:

* The failure mode when client ACLs are enabled comes from the server not passing
  along a migration token. Update the server so that the server provides a
  migration token whenever `migrate=true` and not just when `sticky=true` too.
* Update the scheduler so that `migrate` implies `sticky`.
* Update the client so that we check for `migrate || sticky` where appropriate.
* Refactor the E2E tests to move them off the old framework and make the intention
  of the test more clear.
2023-04-07 16:33:45 -04:00
Michael Schurter a8b379f962
docker: default device.container_path to host_path (#16811)
* docker: default device.container_path to host_path

Matches docker cli behavior.

Fixes #16754
2023-04-06 14:44:33 -07:00
Tim Gross 6f2b9266bc
Merge pull request #16794 from hashicorp/post-1.5.3-release
Post 1.5.3 release
2023-04-05 13:02:37 -04:00
the-nando f541f2e59b
Do not set attributes when spawning the getter child (#16791)
* Do not set attributes when spawning the getter child

* Cleanup

* Cleanup

---------

Co-authored-by: the-nando <the-nando@invalid.local>
2023-04-05 11:47:51 -05:00
Tim Gross 66a01bb35a upgrade go to 1.20.3 2023-04-05 12:18:19 -04:00
Tim Gross 8278f23042 acl: fix ACL bypass for anon requests that pass thru client HTTP
Requests without an ACL token that pass thru the client's HTTP API are treated
as though they come from the client itself. This allows bypass of ACLs on RPC
requests where ACL permissions are checked (like `Job.Register`). Invalid tokens
are correctly rejected.

Fix the bypass by only setting a client ID on the identity if we have a valid node secret.

Note that this changeset will break rate metrics for RPCs sent by clients
without a client secret such as `Node.GetClientAllocs`; these requests will be
recorded as anonymous.

Future work should:
* Ensure the node secret is sent with all client-driven RPCs except
  `Node.Register` which is TOFU.
* Create a new `acl.ACL` object from client requests so that we
  can enforce ACLs for all endpoints in a uniform way that's less error-prone.~
2023-04-05 12:17:51 -04:00
Juana De La Cuesta 9b4871fece
Prevent kill_timeout greater than progress_deadline (#16761)
* func: add validation for kill timeout smaller than progress dealine

* style: add changelog

* style: typo in changelog

* style: remove refactored test

* Update .changelog/16761.txt

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

* Update nomad/structs/structs.go

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

---------

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>
2023-04-04 18:17:10 +02:00
James Rasell cb6ba80f0f
cli: stream both stdout and stderr when following an alloc. (#16556)
This update changes the behaviour when following logs from an
allocation, so that both stdout and stderr files streamed when the
operator supplies the follow flag. The previous behaviour is held
when all other flags and situations are provided.

Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
2023-04-04 10:42:27 +01:00
Georgy Buranov ca80546ef7
take maximum processor Mhz (#16740)
* take maximum processor Mhz

* remove break

* cl: add cl for 16740

---------

Co-authored-by: Seth Hoenig <shoenig@duck.com>
2023-03-31 11:25:32 -05:00
Horacio Monsalvo 20372b1721
connect: add meta on ConsulSidecarService (#16705)
Co-authored-by: Sol-Stiep <sol.stiep@southworks.com>
2023-03-30 16:09:28 -04:00
Piotr Kazmierczak acfc266c30 acl: JWT changelog entry and typo fix 2023-03-30 09:40:11 +02:00
Tim Gross 76284a09a0
docker: move pause container recovery to after SetConfig (#16713)
When we added recovery of pause containers in #16352 we called the recovery
function from the plugin factory function. But in our plugin setup protocol, a
plugin isn't ready for use until we call `SetConfig`. This meant that
recovering pause containers was always done with the default
config. Setting up the Docker client only happens once, so setting the wrong
config in the recovery function also means that all other Docker API calls will
use the default config.

Move the `recoveryPauseContainers` call into the `SetConfig`. Fix the error
handling so that we return any error but also don't log when the context is
canceled, which happens twice during normal startup as we fingerprint the
driver.
2023-03-29 16:20:37 -04:00
dependabot[bot] afa9608475
build(deps): bump github.com/opencontainers/runc from 1.1.4 to 1.1.5 (#16712)
* build(deps): bump github.com/opencontainers/runc from 1.1.4 to 1.1.5

Bumps [github.com/opencontainers/runc](https://github.com/opencontainers/runc) from 1.1.4 to 1.1.5.
- [Release notes](https://github.com/opencontainers/runc/releases)
- [Changelog](https://github.com/opencontainers/runc/blob/v1.1.5/CHANGELOG.md)
- [Commits](https://github.com/opencontainers/runc/compare/v1.1.4...v1.1.5)

---
updated-dependencies:
- dependency-name: github.com/opencontainers/runc
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* changelog entry

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Tim Gross <tgross@hashicorp.com>
2023-03-29 15:05:05 -04:00
Elvis Pranskevichus 11a9bb6ce7
drivers/exec: Fix handling of capabilities for unprivileged tasks (#16643)
Currently, the `exec` driver is only setting the Bounding set, which is
not sufficient to actually enable the requisite capabilities for the
task process.  In order for the capabilities to survive `execve`
performed by libcontainer, the `Permitted`, `Inheritable`, and `Ambient`
sets must also be set.

Per CAPABILITIES (7):

> Ambient: This is a set of capabilities that are preserved across an
> execve(2) of a program that is not privileged.  The ambient capability
> set obeys the invariant that no capability can ever be ambient if it
> is not both permitted and inheritable.
2023-03-28 12:12:55 -04:00
Seth Hoenig 87f4b71df0
client/fingerprint: correctly fingerprint E/P cores of Apple Silicon chips (#16672)
* client/fingerprint: correctly fingerprint E/P cores of Apple Silicon chips

This PR adds detection of asymetric core types (Power & Efficiency) (P/E)
when running on M1/M2 Apple Silicon CPUs. This functionality is provided
by shoenig/go-m1cpu which makes use of the Apple IOKit framework to read
undocumented registers containing CPU performance data. Currently working
on getting that functionality merged upstream into gopsutil, but gopsutil
would still not support detecting P vs E cores like this PR does.

Also refactors the CPUFingerprinter code to handle the mixed core
types, now setting power vs efficiency cpu attributes.

For now the scheduler is still unaware of mixed core types - on Apple
platforms tasks cannot reserve cores anyway so it doesn't matter, but
at least now the total CPU shares available will be correct.

Future work should include adding support for detecting P/E cores on
the latest and upcoming Intel chips, where computation of total cpu shares
is currently incorrect. For that, we should also include updating the
scheduler to be core-type aware, so that tasks of resources.cores on Linux
platforms can be assigned the correct number of CPU shares for the core
type(s) they have been assigned.

node attributes before

cpu.arch                  = arm64
cpu.modelname             = Apple M2 Pro
cpu.numcores              = 12
cpu.reservablecores       = 0
cpu.totalcompute          = 1000

node attributes after

cpu.arch                  = arm64
cpu.frequency.efficiency  = 2424
cpu.frequency.power       = 3504
cpu.modelname             = Apple M2 Pro
cpu.numcores.efficiency   = 4
cpu.numcores.power        = 8
cpu.reservablecores       = 0
cpu.totalcompute          = 37728

* fingerprint/cpu: follow up cr items
2023-03-28 08:27:58 -05:00
Juana De La Cuesta 320884b8ee
Multiple instances of a periodic job are run simultaneously, when prohibit_overlap is true (#16583)
* Multiple instances of a periodic job are run simultaneously, when prohibit_overlap is true
Fixes #11052
When restoring periodic dispatcher, all periodic jobs are forced without checking for previous childre.

* Multiple instances of a periodic job are run simultaneously, when prohibit_overlap is true
Fixes #11052
When restoring periodic dispatcher, all periodic jobs are forced without checking for previous children.

* style: refactor force run function

* fix: remove defer and inline unlock for speed optimization

* Update nomad/leader.go

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

* Update nomad/leader_test.go

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

* Update nomad/leader_test.go

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

* Update nomad/leader_test.go

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

* Update nomad/leader_test.go

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

* Update nomad/leader_test.go

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

* Update nomad/leader_test.go

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

* Update nomad/leader_test.go

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

* style: refactor tests to use must

* Update nomad/leader_test.go

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

* Update nomad/leader_test.go

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

* Update nomad/leader_test.go

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

* Update nomad/leader_test.go

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

* Update nomad/leader_test.go

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

* fix: move back from defer to calling unlock before returning.

createEval cant be called with the lock on

* style: refactor test to use must

* added new entry to changelog and update comments

---------

Co-authored-by: James Rasell <jrasell@hashicorp.com>
Co-authored-by: James Rasell <jrasell@users.noreply.github.com>
2023-03-27 17:25:05 +02:00
Luiz Aoqui 8070882c4b
scheduler: fix reconciliation of reconnecting allocs (#16609)
When a disconnect client reconnects the `allocReconciler` must find the
allocations that were created to replace the original disconnected
allocations.

This process was being done in only a subset of non-terminal untainted
allocations, meaning that, if the replacement allocations were not in
this state the reconciler didn't stop them, leaving the job in an
inconsistent state.

This inconsistency is only solved in a future job evaluation, but at
that point the allocation is considered reconnected and so the specific
reconnection logic was not applied, leading to unexpected outcomes.

This commit fixes the problem by running reconnecting allocation
reconciliation logic earlier into the process, leaving the rest of the
reconciler oblivious of reconnecting allocations.

It also uses the full set of allocations to search for replacements,
stopping them even if they are not in the `untainted` set.

The system `SystemScheduler` is not affected by this bug because
disconnected clients don't trigger replacements: every eligible client
is already running an allocation.
2023-03-24 19:38:31 -04:00
Luiz Aoqui e5d31bca61
cli: job restart command (#16278)
Implement the new `nomad job restart` command that allows operators to
restart allocations tasks or reschedule then entire allocation.

Restarts can be batched to target multiple allocations in parallel.
Between each batch the command can stop and hold for a predefined time
or until the user confirms that the process should proceed.

This implements the "Stateless Restarts" alternative from the original
RFC
(https://gist.github.com/schmichael/e0b8b2ec1eb146301175fd87ddd46180).
The original concept is still worth implementing, as it allows this
functionality to be exposed over an API that can be consumed by the
Nomad UI and other clients. But the implementation turned out to be more
complex than we initially expected so we thought it would be better to
release a stateless CLI-based implementation first to gather feedback
and validate the restart behaviour.

Co-authored-by: Shishir Mahajan <smahajan@roblox.com>
2023-03-23 18:28:26 -04:00
Phil Renaud 11de45d17b
[ui] Copyable server and client attribute values (#16548)
* Copyable server and client attribute values

* Changelog
2023-03-22 15:05:01 -04:00
Luiz Aoqui 518fd610b3
changelog: update #16427 to improvement (#16565)
The security fix in Go 1.20.2 does not apply to Nomad.
2023-03-20 21:24:53 -04:00
Michael Schurter f8884d8b52
client/metadata: fix crasher caused by AllowStale = false (#16549)
Fixes #16517

Given a 3 Server cluster with at least 1 Client connected to Follower 1:

If a NodeMeta.{Apply,Read} for the Client request is received by
Follower 1 with `AllowStale = false` the Follower will forward the
request to the Leader.

The Leader, not being connected to the target Client, will forward the
RPC to Follower 1.

Follower 1, seeing AllowStale=false, will forward the request to the
Leader.

The Leader, not being connected to... well hoppefully you get the
picture: an infinite loop occurs.
2023-03-20 16:32:32 -07:00
Phil Renaud ccce4b68f2
[ui] Perform common job tasks with keyboard shortcuts (#16378)
* Throw your mouse into traffic

* Add node metadata with a shortcut

* Re-labelled

* Adds a toast notification to job start/stop on keyboard shortcut

* Typo fix
2023-03-20 09:24:39 -04:00
Juana De La Cuesta 47be374bbd
Add -json flag to quota inspect command (#16478)
* Added  and  flag to  command

* cli[style]: small refactor to avoid confussion with tmpl variable

* Update inspect.mdx

* cli: add changelog entry

* Update .changelog/16478.txt

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

* Update command/quota_inspect.go

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

---------

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>
2023-03-20 10:40:51 +01:00
Juana De La Cuesta ed44f50091
cli: add -json and -t flags to quota status command (#16485)
* cli: add json and t flags to quota status command

* cli: add entry to changelog

* Update command/quota_status.go

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

---------

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>
2023-03-20 10:39:56 +01:00
Juana De La Cuesta eeb3766575
cli: Add json and -t flags to server members command (#16444)
* cli: Add  and  flags to server members

* Update website/content/docs/commands/server/members.mdx

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

* Update website/content/docs/commands/server/members.mdx

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

* cli: update the server memebers tests to use must

* cli: add flags addition to changelog

---------

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>
2023-03-20 10:39:24 +01:00
Piotr Kazmierczak 0a2b425eb5
cli: nomad login command should not require a -type flag and should respect default auth method (#16504)
nomad login command does not need to know ACL Auth Method's type, since all
method names are unique. 

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>
2023-03-17 19:14:28 +01:00
Seth Hoenig 07543f8bdf
nsd: always set deregister flag after deregistration of group (#16289)
* services: always set deregister flag after deregistration of group

This PR fixes a bug where the group service hook's deregister flag was
not set in some cases, causing the hook to attempt deregistrations twice
during job updates (alloc replacement).

In the tests ... we used to assert on the wrong behvior (remove twice) which
has now been corrected to assert we remove only once.

This bug was "silent" in the Consul provider world because the error logs for
double deregistration only show up in Consul logs; with the Nomad provider the
error logs are in the Nomad agent logs.

* services: cleanup group service hook tests
2023-03-17 09:44:21 -05:00
Tim Gross ec47b245d0
client: don't use Status RPC for Consul discovery (#16490)
In #16217 we switched clients using Consul discovery to the `Status.Members`
endpoint for getting the list of servers so that we're using the correct
address. This endpoint has an authorization gate, so this fails if the anonymous
policy doesn't have `node:read`. We also can't check the `AuthToken` for the
request for the client secret, because the client hasn't yet registered so the
server doesn't have anything to compare against.

Instead of hitting the `Status.Peers` or `Status.Members` RPC endpoint, use the
Consul response directly. Update the `registerNode` method to handle the list of
servers we get back in the response; if we get a "no servers" or "no path to
region" response we'll kick off discovery again and retry immediately rather
than waiting 15s.
2023-03-16 15:38:33 -04:00
Seth Hoenig 5b1970468e
artifact: git needs more files for private repositories (#16508)
* landlock: git needs more files for private repositories

This PR fixes artifact downloading so that git may work when cloning from
private repositories. It needs

- file read on /etc/passwd
- dir read on /root/.ssh
- file write on /root/.ssh/known_hosts

Add these rules to the landlock rules for the artifact sandbox.

* cr: use nonexistent instead of devnull

Co-authored-by: Michael Schurter <mschurter@hashicorp.com>

* cr: use go-homdir for looking up home directory

* pr: pull go-homedir into explicit require

* cr: fixup homedir tests in homeless root cases

* cl: fix root test for real

---------

Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
2023-03-16 12:22:25 -05:00
Seth Hoenig ed7177de76
scheduler: annotate tasksUpdated with reason and purge DeepEquals (#16421)
* scheduler: annotate tasksUpdated with reason and purge DeepEquals

* cr: move opaque into helper

* cr: swap affinity/spread hashing for slice equal

* contributing: update checklist-jobspec with notes about struct methods

* cr: add more cases to wait config equal method

* cr: use reflect when comparing envoy config blocks

* cl: add cl
2023-03-14 09:46:00 -05:00