Commit graph

20437 commits

Author SHA1 Message Date
Joel May 13faf0d79e Allow client.cpu_total_compute to override attr.cpu.totalcompute 2021-01-07 15:31:11 -05:00
Seth Hoenig 09c13b0066
Merge pull request #9751 from hashicorp/b-envoyv-segfault
consul/connect: fix panic during in-place upgrade with connect jobs
2021-01-07 14:22:27 -06:00
Tim Gross 4eafcb06ef changelog: add entry for GH-9050 2021-01-07 15:01:04 -05:00
Seth Hoenig 303856183c consul/connect: fix panic during in-place upgrade with connect jobs
When upgrading from Nomad v0.12.x to v1.0.x, Nomad client will panic on
startup if the node is running Connect enabled jobs. This is caused by
a missing piece of plumbing of the Consul Proxies API interface during the
client restore process.

Fixes #9738
2021-01-07 13:24:24 -06:00
Michael Lange 304378565c
Merge pull request #9690 from hashicorp/docs-wtdd-update-ui-api-docs
WTDD: Update UI api docs
2021-01-07 10:52:45 -08:00
Kent 'picat' Gruber f0d1c4092b
Update go-getter to v1.5.2 with support for vhost style S3 paths (#9349) 2021-01-07 13:34:28 -05:00
Jeff Escalante 8b4f6b40e4
Merge pull request #9748 from hashicorp/docs-zs.build-time-code-highlight-revised
Add build-time highlighting to code blocks
2021-01-07 13:33:49 -05:00
Michael Lange 674707e349 Update the page param default to 1 instead of 0 2021-01-07 09:59:09 -08:00
Michael Lange aa8e209c2e Typo fixes
Co-authored-by: Buck Doyle <buck@hashicorp.com>
2021-01-07 09:59:08 -08:00
Michael Lange 46a0435cf2 Update 'Node' to 'Client' which is used throughout the UI 2021-01-07 09:59:08 -08:00
Michael Lange 761b7a1cef Add missing faceted search query params 2021-01-07 09:59:08 -08:00
Michael Lange 2f05f06ecd Remove no-longer-true enterprise warning 2021-01-07 09:59:08 -08:00
Michael Lange 549f2f77ab Remove version introduction
0.7 is ancient at this point. Now it's as if the UI has always existed.
2021-01-07 09:59:07 -08:00
Michael Lange 75f304bbc2 Add missing routes to the UI API doc 2021-01-07 09:59:07 -08:00
Michael Lange d9b8f6d411
Merge pull request #9733 from hashicorp/b-ui/topo-viz-old-agent
UI: Guard against nodes running an old version of the Nomad agent
2021-01-07 09:27:14 -08:00
Zach Shilton 2a9f9aa8d3
Remove broken shell-session highlighting 2021-01-07 11:57:09 -05:00
Zach Shilton caa30ca097
Add build-time highlighting to code blocks 2021-01-07 11:48:02 -05:00
James Rasell 005e15afbc
Merge pull request #9744 from hashicorp/f-add-namespace-e2e-oss
e2e: move namespace tests into OSS.
2021-01-07 17:36:09 +01:00
Nick Ethier 6705f845f2
Merge pull request #9739 from hashicorp/b-alloc-netmode-ports
Use port's `to` value when building service address under 'alloc' addr_mode
2021-01-07 09:16:27 -05:00
Kdu Bonalume 425ad5892d Fix missing link for Consul integration
Add a link back to configuration/consul in the `service` parameter section of the `group` stanza.
2021-01-07 09:02:43 -05:00
Nick Ethier 7a6aab10bb
Apply suggestions from code review
Co-authored-by: Tim Gross <tgross@hashicorp.com>
2021-01-07 08:53:54 -05:00
James Rasell 02b9d9da87
e2e: move namespace tests into OSS. 2021-01-07 09:15:43 +01:00
Jeff Escalante f791725736
Merge pull request #9743 from hashicorp/je.fix-edit-page-links
hotfix: fix 'edit this page' links
2021-01-06 19:09:46 -05:00
Jeff Escalante 8c04e22ce4
fix 'edit this page' links 2021-01-06 19:01:32 -05:00
Mahmood Ali 050ad6b6f4
tests: deflake test-api job (#9742)
Deflake test-api job, currently failing at around 7.6% (44 out of 578
workflows), by ensuring that test nomad agents use a small dedicated port
range that doesn't conflict with the kernel ephemeral range.

The failures are disproportionately related to port allocation, where a
nomad agent fails to start when the http port is already bound to
another process. The failures are intermittent and aren't specific to any
test in particular. The following is a representative failure:
https://app.circleci.com/pipelines/github/hashicorp/nomad/13995/workflows/6cf6eb38-f93c-46f8-8aa0-f61e62fe7694/jobs/128169
.

Upon investigation, the issue seems to be that the api freeport library
picks a port block within 10,000-14,500, but that overlaps with the
kernel ephemeral range 32,769-60,999! So, freeport may allocate a free
port to the nomad agent, just to be used by another process before the
nomad agent starts!

This happened for example in
https://app.circleci.com/pipelines/github/hashicorp/nomad/14111/workflows/e1fcd7ff-f0e0-4796-8719-f57f510b1ffa/jobs/129684
.  `freeport` allocated port 41662 to serf, but `google_accounts`
raced to use it to connect to the CircleCI VM metadata service.

We avoid such races by using a dedicated port range that's disjoint from
the kernel ephemeral port range.
2021-01-06 16:18:28 -05:00
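
The disjointness requirement described in the commit above can be illustrated with a minimal Go sketch. This is not the freeport library's code: the `portRange` type and the dedicated range `20000-32000` are hypothetical and chosen only for illustration. The ephemeral range `32769-60999` and port `41662` are taken from the commit message.

```go
package main

import "fmt"

// portRange is an inclusive range of port numbers (hypothetical helper type).
type portRange struct {
	lo, hi int
}

// overlaps reports whether two inclusive ranges share at least one port.
func (r portRange) overlaps(o portRange) bool {
	return r.lo <= o.hi && o.lo <= r.hi
}

// contains reports whether port p falls inside the range.
func (r portRange) contains(p int) bool {
	return r.lo <= p && p <= r.hi
}

func main() {
	// Kernel ephemeral range as quoted in the commit message.
	ephemeral := portRange{32769, 60999}
	// Assumed dedicated test range, picked only for this example.
	dedicated := portRange{20000, 32000}

	fmt.Println(dedicated.overlaps(ephemeral)) // prints false
	fmt.Println(ephemeral.contains(41662))     // prints true
}
```

A port handed out from a range that overlaps the ephemeral range can be claimed by any outgoing connection before the agent binds it, which is the race the commit describes.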
Mahmood Ali 00be4fc63c
tests: deflake TestTaskRunner_StatsHook_Periodic (#9734)
This PR deflakes TestTaskRunner_StatsHook_Periodic tests and adds backoff when the driver closes the channel.

TestTaskRunner_StatsHook_Periodic is currently the most flaky test - failing ~4% of the time (20 out of 486 workflows). A sample failure: https://app.circleci.com/pipelines/github/hashicorp/nomad/14028/workflows/957b674f-cbcc-4228-96d9-1094fdee5b9c/jobs/128563 .

This change has two components:

First, it updates the StatsHook so that it backs off when the stats channel is closed. In the context of the test, where the mock driver emits a single stats update and closes the channel, the test may make tens of thousands of updates during the period. In a real deployment, if a driver doesn't implement the stats handler properly, or when a task finishes, we may generate far too many Stats queries in a tight loop. Here, the backoff reduces these queries. I've added a failing test that shows 154,458 stats updates within 500ms in https://app.circleci.com/pipelines/github/hashicorp/nomad/14092/workflows/50672445-392d-4661-b19e-e3561ed32746/jobs/129423 .

Second, the test ignores the first stats update after a task exit. Due to the asynchronicity of updates and channel/context use, it's possible that an update is enqueued while the test marks the task as exited, resulting in a spurious update.
2021-01-06 16:03:00 -05:00
Nick Ethier bb060bd46b command/agent/consul: remove duplicated tests 2021-01-06 14:11:31 -05:00
Kris Hicks 868ba0cea5
consul: Refactor parts of UpdateWorkload (#9737)
This removes modification of ops in methods that UpdateWorkload calls, keeping
them local to UpdateWorkload. It also includes some rewrites of checkRegs for
clarity.
2021-01-06 11:11:28 -08:00
Nick Ethier ab01e19df3 command/agent/consul: use port's `to` value when building service address under 'alloc' addr_mode 2021-01-06 13:52:48 -05:00
Jeff Escalante 220a24f973
Merge pull request #9638 from hashicorp/je.mdx-remote
implement mdx remote
2021-01-06 11:55:05 -05:00
Michael Lange 6d985f05bb When the topo viz filters out nodes, report this to the user via warning alert 2021-01-05 17:11:17 -08:00
Michael Lange 9b488adbf2 Filter out nodes that don't have NodeResources from the topo viz 2021-01-05 17:11:17 -08:00
Jeff Escalante af3759fc84
fix intro navigation 2021-01-05 19:40:19 -05:00
Jeff Escalante 48bfdbc565
fix style references 2021-01-05 19:07:55 -05:00
Jeff Escalante 2c9b689c8e
touch up dependencies 2021-01-05 19:03:18 -05:00
Jeff Escalante eaaafd9dd4
implement mdx remote 2021-01-05 19:02:39 -05:00
Michael Lange e2c063fa0f
Merge pull request #9697 from hashicorp/docs-wtdd-cleanup-images
docs: Remove unused images + intro pages
2021-01-05 14:22:52 -08:00
Michael Schurter 8345bd273c
Merge pull request #9732 from hashicorp/docs-cleanup
Docs cleanup
2021-01-05 13:10:36 -08:00
Michael Lange 5a37cfded0
Merge pull request #9692 from hashicorp/docs-wtdd-jobs-field-reference
docs: Enumerate job statuses and types in docs
2021-01-05 11:46:22 -08:00
Michael Schurter d5b858680b docs: mention client can be non-root
I want to strike a balance here:

- On the one hand there are use cases (raw_exec or Docker only) where
  running Nomad clients as an unprivileged user is *preferable.*
- On the other hand running Nomad clients as root is our main and best
  tested environment. So I want to leave that a strong recommendation.
2021-01-05 11:08:23 -08:00
Michael Schurter 70bf5456d3 docs: remove old client.options migration note
There's no reason to give the `client.options` to `plugins` migration
top billing on the client configuration page. Remove it and clarify the
more appropriately placed note down below.
2021-01-05 10:56:14 -08:00
Michael Schurter 5d7713bb4e docs: remove duplicate PreserveCounts entry 2021-01-05 10:41:38 -08:00
Michael Schurter b91b941b79
Merge pull request #9731 from hashicorp/docs-upgrade-102
docs: note that 1.0.2 is unreleased
2021-01-05 10:12:25 -08:00
Michael Schurter 1cede6cff8 docs: note that 1.0.2 is unreleased 2021-01-05 10:06:39 -08:00
Kris Hicks be6580bd28
Refactor alloc exec command (#9718)
This re-arranges the alloc exec Run implementation to have validation
hoisted as high as possible.
2021-01-05 09:33:04 -08:00
Luiz Aoqui af5fd4bb89
Merge pull request #9429 from hashicorp/docs-update-autoscale-policy
docs: update file scaling policy docs
2021-01-05 11:10:11 -05:00
Seth Hoenig b88996b2dd
Merge pull request #9727 from hashicorp/b-empty-ingress-proxy
consul/connect: avoid NPE from unset connect gateway proxy
2021-01-05 10:09:46 -06:00
Seth Hoenig 3636ff8f80
Merge pull request #9726 from hashicorp/f-add-tf-lockfile
e2e: add terraform lockfile
2021-01-05 10:03:38 -06:00
Seth Hoenig c46a503fdb docs: update changelog 2021-01-05 09:31:22 -06:00
Seth Hoenig 6c9366986b consul/connect: avoid NPE from unset connect gateway proxy
Submitting a job with an ingress gateway in host networking mode
with an absent gateway.proxy block would cause the Nomad client
to panic on NPE.

The consul registration bits would assume the proxy stanza was
not nil, but it could be if the user does not supply any manually
configured envoy proxy settings.

Check the proxy field is not nil before using it.

Fixes #9669
2021-01-05 09:27:01 -06:00
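
The nil check described in the commit above follows a common Go pattern, sketched below. The `gateway` and `gatewayProxy` types, the `BindAddresses` field, and the `bindAddresses` function are simplified, hypothetical stand-ins and not Nomad's actual structs or registration code.

```go
package main

import "fmt"

// gatewayProxy is a simplified stand-in for user-supplied proxy settings.
type gatewayProxy struct {
	BindAddresses map[string]string
}

// gateway is a simplified stand-in for a Connect gateway definition.
type gateway struct {
	Proxy *gatewayProxy // nil when the job omits the proxy block
}

// bindAddresses returns the configured bind addresses, or an empty map when
// the gateway or its proxy block is absent, rather than dereferencing nil.
func bindAddresses(g *gateway) map[string]string {
	if g == nil || g.Proxy == nil {
		return map[string]string{}
	}
	return g.Proxy.BindAddresses
}

func main() {
	// A gateway with no proxy block no longer causes a nil pointer panic.
	noProxy := &gateway{}
	fmt.Println(len(bindAddresses(noProxy))) // prints 0

	withProxy := &gateway{Proxy: &gatewayProxy{
		BindAddresses: map[string]string{"web": "0.0.0.0:8080"},
	}}
	fmt.Println(bindAddresses(withProxy)["web"]) // prints 0.0.0.0:8080
}
```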