Commit graph

25039 commits

Author SHA1 Message Date
am-ak 3ca370dd03
docs: fix broken link in security model docs (#17812)
correcting a broken link under "similar to consul" and correcting list formatting under "general mechanisms"
2023-07-06 10:01:36 -04:00
Patric Stout ebb363d43e
metrics: add "total_ticks_count" for CPU metrics (#17579)
This counter tells you the total amount of ticks for that CPU
entry since the start of Nomad.
2023-07-05 10:28:55 -04:00
deverton-godaddy f44793d377
[api] Add NetworkStatus to allocation response (#17280)
Service discovery or mesh network systems consuming the Nomad event stream or API need to know the CNI assigned IP for the allocation. This data is returned by the underlying Nomad API but isn't mapped in the response struct.
2023-07-04 19:35:38 -04:00
James Rasell 4289de5986
docs: fix up constraint jobspec HCL format. (#17795) 2023-07-04 13:33:46 +01:00
Phil Renaud 5cc0b39683
Report shows a 3rd party browser extension puts a banner at the top of page and awkwardly shifts nav; this fixes that (#17783) 2023-06-30 17:09:42 -04:00
Phil Renaud d559072e48
[ui] Text wrap long lines of code and logs (#17754)
* Text and code wrapping as a localStorage var

* task-log uses wrapping and kb shortcut

* Word wrap keyboard labels

* Wrapper as a toggle not a button

* Changelog and fixed an extra space trailing log lines

* Moves toggle to inside

* Acceptance tests for ww and toggle click
2023-06-30 17:07:57 -04:00
Tim Gross e7cc7f2123
docs: clarify network topology requirements for clients (#17779)
The requirements for client-to-server and client-to-client topologies are not
well-documented in the production install requirements docs. Document that
clients make connections to servers (and not the other way around), and that
clients don't need to communicate with each other (with some exceptions).

Fixes: #17631
2023-06-30 10:46:29 -04:00
James Rasell 45073e8a05
job: ensure node pool is canonicalized for state restores. (#17765) 2023-06-30 07:37:22 +01:00
Sarah Thompson 8c6cc5b1d8
Update the revision used by the docker build action. (#17755)
Update the revision used by the docker action. This should always reflect the commit that's being built as this may differ from the default <github.sha> that the workflow was invoked at.

Goes with https://github.com/hashicorp/actions-docker-build/pull/59 - and should not be merged until this PR is merged and a new version of the action is cut.
2023-06-29 09:19:54 -04:00
Phil Renaud 01d6a94eac
[ui] HCL-in-UI: Re-arrange buttons, add save-as-file (#17752)
* Move buttons over as expected

* Let a user download file locally

* test mock fns for jobeditor

* Changelog
2023-06-28 21:57:03 -04:00
Daniel Bennett 77a8d79bb5
e2e: use DNS instead of HTTP to get my_public_ipv4 (#17759) 2023-06-28 13:11:57 -05:00
Tim Gross 70eca4e7f9
Merge pull request #17758 from hashicorp/post-release-1.6.0-beta.1
Post release 1.6.0 beta.1
2023-06-28 12:26:00 -04:00
hc-github-team-nomad-core 6f5401ae3a Prepare for next release 2023-06-28 11:06:28 -04:00
hc-github-team-nomad-core 5c703a49b1 Generate files for 1.6.0-beta.1 release 2023-06-28 11:06:20 -04:00
Tim Gross 6a24ffac1c release: submit build workflow from the file on the release's own branch 2023-06-28 11:06:13 -04:00
Tim Gross 06c7974120 Prepare release 1.6.0-beta.1 2023-06-28 11:06:05 -04:00
Phil Renaud 5f8c4b3d48
Link to allocations.allocation by ID reference, not by model (#17753) 2023-06-28 10:00:59 -04:00
Phil Renaud bd1ec095d3
[ui] Move Placement Failures notification above job status panel (#17750)
* Moves the Placement Failures box above job status, should it exist

* Move it for non-service job-types as well
2023-06-27 19:32:51 -04:00
Phil Renaud 7a60c69b73
[ui] links to allocations explicitly go through their route model hook (#17737)
* links to allocations explicitly go through their route model hook

* Acceptance test to make sure alloc clicking loads alloc endpoint obj
2023-06-27 10:01:50 -04:00
Seth Hoenig cfb7efc478
fix changelog entry typo (#17743) 2023-06-27 08:02:06 -05:00
Seth Hoenig 4771690582
deps: update cronexpr to capture license file in SBOM tools (#17733) 2023-06-27 07:58:20 -05:00
Juana De La Cuesta 28b66d2400
Update checklist-rpc-endpoint.md (#17698)
* Update checklist-rpc-endpoint.md

* Update checklist-rpc-endpoint.md

* Update contributing/checklist-rpc-endpoint.md

Co-authored-by: Tim Gross <tgross@hashicorp.com>
2023-06-27 10:52:38 +02:00
Phil Renaud 32af971bcb
Node Pools moved to after Type in jobs index columns (#17738) 2023-06-26 17:00:01 -04:00
Seth Hoenig d590123637
drivers/docker: refactor use of clients in docker driver (#17731)
* drivers/docker: refactor use of clients in docker driver

This PR refactors how we manage the two underlying clients used by the
docker driver for communicating with the docker daemon. We keep two clients
- one with a hard-coded timeout that applies to all operations no matter
what, intended for use with short lived / async calls to docker. The other
has no timeout and is the responsibility of the caller to set a context
that will ensure the call eventually terminates.

The use of these two clients has been confusing and mistakes were made
in a number of places where calls were making use of the wrong client.

This PR makes it so that a user must explicitly call a function to get
the client that makes sense for that use case.

Fixes #17023

* cr: followup items
2023-06-26 15:21:42 -05:00
sejalapeno 4c6906d873
Update allocations.go (#17726)
* Update allocations.go

updated missing client status "unknown" #17688

* changelog

* Update .changelog/17726.txt

adding relevant desc.

Co-authored-by: Seth Hoenig <shoenig@duck.com>
2023-06-26 13:33:29 -05:00
nicoche 649831c1d3
deploymentwatcher: fail early whenever possible (#17341)
Given a deployment that has a `progress_deadline`, if a task group runs
out of reschedule attempts, allow it to fail at this time instead of
waiting until the `progress_deadline` is reached.

Fixes: #17260
2023-06-26 14:01:03 -04:00
Phil Renaud 81edceb2de
[ui] alignment and spacing for job status panel (#17708)
* CSS alignment and spacing for job status panel

* Only fade the count, not the legend icon, when count is 0

* Unrounded version corners

* changelog

* css has to only remove border radius when count is present

* Seed stabilization for services test

* Try consolidating the testfixes from before

* Total test isolation and bonus logs

* Drop the isolation but keep the logs

* Remove bonus logging
2023-06-26 12:18:12 -04:00
hashicorp-copywrite[bot] e901340c3f
[COMPLIANCE] Add Copyright and License Headers (#17732)
Co-authored-by: hashicorp-copywrite[bot] <110428419+hashicorp-copywrite[bot]@users.noreply.github.com>
2023-06-26 11:11:17 -05:00
dependabot[bot] 05a8ccff26
build(deps): bump github.com/containerd/go-cni from 1.1.7 to 1.1.9 (#17582) 2023-06-26 16:47:20 +01:00
James Rasell 74ab0badb4
test: add drain config tests. (#17724) 2023-06-26 16:23:13 +01:00
Seth Hoenig 2e2c578298
e2e: refactor pids isolation tests (#17717)
This PR refactors some old PID isolation tests to make use of the e2e/v3
packages. Should be quite a bit easier to read. Adds 'alloc exec' capability
to the jobs3 package.
2023-06-26 09:51:18 -05:00
Tim Gross f65a925096
adjust prioritized client updates (#17541)
In #17354 we made client updates prioritized to reduce client-to-server
traffic. When the client has no previously-acknowledged update we assume that
the update is of typical priority; although we don't know that for sure, in
practice an allocation will never become healthy quickly enough that the first
update we send is the update saying the alloc is healthy.

But that doesn't account for allocations that quickly fail in an unrecoverable
way because of allocrunner hook failures, and it'd be nice to be able to send
those failure states to the server more quickly. This changeset does so and adds
some extra comments on reasoning behind priority.
2023-06-26 09:14:24 -04:00
dependabot[bot] e93af16008
build(deps): bump github.com/opencontainers/runtime-spec (#17719)
Bumps [github.com/opencontainers/runtime-spec](https://github.com/opencontainers/runtime-spec) from 1.0.3-0.20210326190908-1c3f411f0417 to 1.1.0-rc.3.
- [Release notes](https://github.com/opencontainers/runtime-spec/releases)
- [Changelog](https://github.com/opencontainers/runtime-spec/blob/main/ChangeLog)
- [Commits](https://github.com/opencontainers/runtime-spec/commits/v1.1.0-rc.3)

---
updated-dependencies:
- dependency-name: github.com/opencontainers/runtime-spec
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-06-26 08:03:50 -05:00
Piotr Kazmierczak abd2252115
chore: gofmt docker driver handle.go (#17721) 2023-06-26 10:38:23 +02:00
Johan Forssell 9174f38f8c
drivers: OOM kill logging for Docker driver (#17518)
Explicit error log of the docker ID and container image name
2023-06-26 10:13:23 +02:00
Tim Gross 926b3030d7
cli: fix broken node pool jobs test (#17715)
In #17705 we fixed a bug in the treatment of the "all" node pool for the `node
pool jobs` command but missed a test in the CLI.
2023-06-23 14:10:45 -07:00
Tim Gross 1432af9a88
docs: clarify drain's -force flag behavior with system/CSI jobs (#17703)
If you use `nomad node drain -force`, the drain deadline is set to -1ns. If you
have not prevented system and CSI node plugin allocations from being drained
with `-ignore-system`, they will be immediately drained as well. This is
typically not safe for CSI node plugins.

Also fix some broken links.

Fixes: #17696
2023-06-23 16:38:11 -04:00
Luiz Aoqui 9aa9779d80
api: prevent panic on job plan (#17689)
Check for a nil job ID to prevent a panic when calling Jobs().Plan().
2023-06-23 16:20:52 -04:00
Luiz Aoqui d62c34b9f9
build: add Docker image (#17017)
Co-authored-by: Daniel Kimsey <90741+dekimsey@users.noreply.github.com>
2023-06-23 15:57:09 -04:00
Luiz Aoqui 66962b2b28
np: fix list of jobs for node pool all (#17705)
Unlike nodes, jobs are allowed to be registered in the node pool `all`,
in which case all nodes are used for evaluating placements. When listing
jobs for the `all` node pool only those that are explicitly in this node
pool should be returned.
2023-06-23 15:47:53 -04:00
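
The listing behavior described in #17705 amounts to matching on the job's own node pool, with no special case for `all`. A minimal sketch, using a hypothetical `job` type and `jobsInPool` helper rather than Nomad's state store:

```go
package main

import "fmt"

// job is a hypothetical record holding only the fields the filter needs.
type job struct {
	ID       string
	NodePool string
}

// jobsInPool returns the jobs registered explicitly in pool. The pool named
// "all" gets no special treatment: only jobs whose NodePool is literally
// "all" are returned for it.
func jobsInPool(jobs []job, pool string) []job {
	var out []job
	for _, j := range jobs {
		if j.NodePool == pool {
			out = append(out, j)
		}
	}
	return out
}

func main() {
	jobs := []job{
		{ID: "web", NodePool: "default"},
		{ID: "monitor", NodePool: "all"},
		{ID: "batch", NodePool: "gpu"},
	}
	for _, j := range jobsInPool(jobs, "all") {
		fmt.Println(j.ID)
	}
}
```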
Luiz Aoqui 3398d32000
changelog: add entry for node pools (#17707) 2023-06-23 15:47:35 -04:00
Tim Gross 12d5eab2d1
docs: split out unsupported versions in changelog (#17704)
Our changelog has become large enough that GitHub's rendering is very slow,
resulting in error pages ("angry unicorns"). Split out the older unsupported
versions of Nomad into their own file so that we only need to render the most
recent versions, while keeping the older versions relatively searchable by
having them in a single file.
2023-06-23 15:17:57 -04:00
grembo 7936c1e33f
Add disable_file parameter to job's vault stanza (#13343)
This complements the `env` parameter, so that the operator can author
tasks that don't share their Vault token with the workload when using 
`image` filesystem isolation. As a result, more powerful tokens can be used 
in a job definition, allowing it to use template stanzas to issue all kinds of 
secrets (database secrets, Vault tokens with very specific policies, etc.), 
without sharing that issuing power with the task itself.

This is accomplished by creating a directory called `private` within
the task's working directory, which shares many properties of
the `secrets` directory (tmpfs where possible, not accessible by
`nomad alloc fs` or Nomad's web UI), but isn't mounted into/bound to the
container.

If the `disable_file` parameter is set to `false` (its default), the Vault token
is also written to the NOMAD_SECRETS_DIR, so the default behavior is
backwards compatible. Even if the operator never changes the default,
they will still benefit from the improved behavior of Nomad never reading
the token back in from that - potentially altered - location.
2023-06-23 15:15:04 -04:00
Michael Lange faa3377a56
Merge pull request #17691 from hashicorp/f/missing-chart-stories
[UI] Missing chart stories
2023-06-23 08:17:34 -07:00
James Rasell b9440965db
client: remove unused nsd check allocation result diff func (#17695) 2023-06-23 15:26:06 +01:00
Seth Hoenig 2c7877658c
e2e: create a v3/ set of packages for creating Nomad e2e tests (#17620)
* e2e: create a v3/ set of packages for creating Nomad e2e tests

This PR creates an experimental set of packages under `e2e/v3/` for crafting
Nomad e2e tests. Unlike previous generations, this is an attempt at providing
a way to create tests in a declarative (ish) pattern, with a focus on being
easy to use, easy to clean up, and easy to debug.

@shoenig is just trying this out to see how it goes.

Lots of features need to be implemented.
Many more docs need to be written.
Breaking changes are to be expected.
There are known and unknown bugs.
No warranty.

Quick run of `example` with verbose logging.

```shell
➜ NOMAD_E2E_VERBOSE=1 go test -v
=== RUN   TestExample
=== RUN   TestExample/testSleep
    util3.go:25: register (service) job: "sleep-809"
    util3.go:25: checking eval: 9f0ae04d-7259-9333-3763-44d0592d03a1, status: pending
    util3.go:25: checking eval: 9f0ae04d-7259-9333-3763-44d0592d03a1, status: complete
    util3.go:25: checking deployment: a85ad2f8-269c-6620-d390-8eac7a9c397d, status: running
    util3.go:25: checking deployment: a85ad2f8-269c-6620-d390-8eac7a9c397d, status: running
    util3.go:25: checking deployment: a85ad2f8-269c-6620-d390-8eac7a9c397d, status: running
    util3.go:25: checking deployment: a85ad2f8-269c-6620-d390-8eac7a9c397d, status: running
    util3.go:25: checking deployment: a85ad2f8-269c-6620-d390-8eac7a9c397d, status: successful
    util3.go:25: deployment a85ad2f8-269c-6620-d390-8eac7a9c397d was a success
    util3.go:25: deregister job "sleep-809"
    util3.go:25: system gc
=== RUN   TestExample/testNamespace
    util3.go:25: apply namespace "example-291"
    util3.go:25: register (service) job: "sleep-967"
    util3.go:25: checking eval: a2a2303a-adf1-2621-042e-a9654292e569, status: pending
    util3.go:25: checking eval: a2a2303a-adf1-2621-042e-a9654292e569, status: complete
    util3.go:25: checking deployment: 3395e9a8-3ffc-8990-d5b8-cc0ce311f302, status: running
    util3.go:25: checking deployment: 3395e9a8-3ffc-8990-d5b8-cc0ce311f302, status: running
    util3.go:25: checking deployment: 3395e9a8-3ffc-8990-d5b8-cc0ce311f302, status: running
    util3.go:25: checking deployment: 3395e9a8-3ffc-8990-d5b8-cc0ce311f302, status: successful
    util3.go:25: deployment 3395e9a8-3ffc-8990-d5b8-cc0ce311f302 was a success
    util3.go:25: deregister job "sleep-967"
    util3.go:25: system gc
    util3.go:25: cleanup namespace "example-291"
=== RUN   TestExample/testEnv
    util3.go:25: register (batch) job: "env-582"
    util3.go:25: checking eval: 600f3bce-ea17-6d13-9d20-9d9eb2a784f7, status: pending
    util3.go:25: checking eval: 600f3bce-ea17-6d13-9d20-9d9eb2a784f7, status: complete
    util3.go:25: deregister job "env-582"
    util3.go:25: system gc
--- PASS: TestExample (10.08s)
    --- PASS: TestExample/testSleep (5.02s)
    --- PASS: TestExample/testNamespace (4.02s)
    --- PASS: TestExample/testEnv (1.03s)
PASS
ok      github.com/hashicorp/nomad/e2e/example  10.079s
```

* cluster3: use filter for kernel.name instead of filtering manually
2023-06-23 09:10:49 -05:00
James Rasell 78cdf0d0d8
server: remove unused endpoints struct. (#17665) 2023-06-23 08:20:33 +01:00
Luiz Aoqui f785da4748
ci: fix flaky UI test (#17676) 2023-06-22 23:07:36 -04:00
Michael Lange 41f6f7e04f TopoViz story that is sourced from Mirage
Unfortunately due to the split build nature of the ember app and
storybook it isn't possible to import mirage in the storybook context to
control scenarios via a knob :(
2023-06-22 16:55:36 -07:00
Michael Lange 85371941c4 Full TopoViz story 2023-06-22 16:55:25 -07:00