open-nomad

Commit Graph

Author	SHA1	Message	Date
James Rasell	d6cab8aa14	Merge pull request #9767 from hashicorp/f-e2e-job-scaling-suite e2e: add job scaling test suite.	2021-01-11 18:35:07 +01:00
Tim Gross	d78b4fc1a1	safely handle existing net namespace in default network manager When a client restarts, the network_hook's prerun will call `CreateNetwork`. Drivers that don't implement their own network manager will fall back to the default network manager, which doesn't handle the case where the network namespace is being recreated safely. This results in an error and the task being restarted for `exec` tasks with `network` blocks (this also impacts the community `containerd` and probably other community task drivers). If we get an error when attempting to create the namespace and that error is because the file already exists and is locked by its process, then we'll return a `nil` error with the `created` flag set to false, just as we do with the `docker` driver.	2021-01-11 11:31:03 -05:00
Seth Hoenig	43880dadd5	Merge pull request #9765 from hashicorp/f-bump-connect-examples command: bump connect examples to v3	2021-01-11 10:22:58 -06:00
Seth Hoenig	64a8b795f2	Merge pull request #9766 from hashicorp/f-bump-cni-plugins-version cni: bump CNI plugins version to v0.9.0	2021-01-11 09:59:43 -06:00
Tim Gross	f97505e384	e2e: remove deprecated terraform syntax Also bumps patch versions of some TF modules	2021-01-11 08:25:22 -05:00
James Rasell	4374d99071	e2e: add job scaling test suite.	2021-01-11 11:34:19 +01:00
Seth Hoenig	fc5f48d936	cni: bump CNI version to v0.9.0 https://github.com/containernetworking/plugins/releases/tag/v0.9.0 Also make the copy-paste install instructions work with arm64 for a better OOTB experience (AWS Graviton, Pi 4's).	2021-01-10 18:03:27 -06:00
Seth Hoenig	207fe378ce	docs: update countdash examples to v3	2021-01-10 17:19:39 -06:00
Seth Hoenig	36da162619	command: generate bindata assetfs	2021-01-10 17:09:08 -06:00
Seth Hoenig	456868c166	command: bump connect examples to v3 Nomad v1.0+ combined with Consul 1.9+ support launching Envoy v1.16+ which is the first version of envoy to support arm64 platforms out of the box. By rebuilding our example docker containers for connect to be multiplatform between amd64 and arm64, Nomad can provide a nicer user experience for those trying out Connect on arm64 machines (e.g. AWS Graviton instances or Raspberry Pi 4's). This has been done for the countdash examples at v3. https://hub.docker.com/layers/hashicorpnomad/counter-dashboard/v3/images/sha256-94e323587bc372ba1b6ca5c58dc23e291e9d26787b50e71025f1c8967dfbcd07?context=repo https://hub.docker.com/layers/hashicorpnomad/counter-api/v3/images/sha256-16a9e9e08082985a635c9edd0f258b084153c6c7831a9b41d34bde78c308b65c?context=repo The connect-native examples are now also multiplatform at v5, but we don't have them built into `job init`.	2021-01-10 16:54:31 -06:00
Chris Baker	cdfe5a50ff	Merge pull request #9761 from hashicorp/b-9758-enforce-policy-on-scale in Job.Scale, ensure that new count is within [min,max] configured in scaling policy	2021-01-08 15:49:38 -06:00
Chris Baker	3546469205	nicer error message	2021-01-08 21:13:29 +00:00
Jeff Escalante	f4e68cedc1	update dependencies (#9760)	2021-01-08 15:46:31 -05:00
Buck Doyle	2589f7360c	Add documentation for exec websocket (#9679)	2021-01-08 14:01:06 -06:00
Chris Baker	d43e0d10c0	appease the linter and fix an incorrect test	2021-01-08 19:38:25 +00:00
Chris Baker	a53e54d7a6	changelog for 9761	2021-01-08 19:26:42 +00:00
Chris Baker	49effd5840	in Job.Scale, ensure that new count is within [min,max] configured in scaling policy resolves #9758	2021-01-08 19:24:36 +00:00
Drew Bailey	c87adfac62	persist shared ports during inplace updates (#9736) AllocatedSharedResources were not being copied over to the new allocation struct the scheduler makes during inplace updates. This caused downstream issues after the plan was applied, namely the shared ports were dropped causing issues with service registration/deregistration. test that shared ports are preserved change log, also carry over shared network copy networks	2021-01-08 09:00:41 -05:00
Tim Gross	5b9a98d25a	docs: clarify default behavior of docker userns_mode	2021-01-08 08:22:39 -05:00
Chulki Lee	b7b23e9955	Fix HCL2 link	2021-01-08 08:19:06 -05:00
James Rasell	108fa33393	Merge pull request #9747 from hashicorp/f-e2e-scaling-policy-suite e2e: add ScalingPolicies test suite with initial test case.	2021-01-08 10:51:48 +01:00
Michael Lange	1fabd3240c	Merge pull request #9614 from hashicorp/dependabot/npm_and_yarn/ui/ini-1.3.7 build(deps): bump ini from 1.3.5 to 1.3.7 in /ui	2021-01-07 14:10:03 -08:00
Tim Gross	cb0c4b1d0b	changelog entry for #9532	2021-01-07 15:44:13 -05:00
Joel May	13faf0d79e	Allow client.cpu_total_compute to override attr.cpu.totalcompute	2021-01-07 15:31:11 -05:00
Seth Hoenig	09c13b0066	Merge pull request #9751 from hashicorp/b-envoyv-segfault consul/connect: fix panic during in-place upgrade with connect jobs	2021-01-07 14:22:27 -06:00
Tim Gross	4eafcb06ef	changelog: add entry for GH-9050	2021-01-07 15:01:04 -05:00
Seth Hoenig	303856183c	consul/connect: fix panic during in-place upgrade with connect jobs When upgrading from Nomad v0.12.x to v1.0.x, Nomad client will panic on startup if the node is running Connect enabled jobs. This is caused by a missing piece of plumbing of the Consul Proxies API interface during the client restore process. Fixes #9738	2021-01-07 13:24:24 -06:00
Michael Lange	304378565c	Merge pull request #9690 from hashicorp/docs-wtdd-update-ui-api-docs WTDD: Update UI api docs	2021-01-07 10:52:45 -08:00
Kent 'picat' Gruber	f0d1c4092b	Update go-getter to v1.5.2 with support for vhost style S3 paths (#9349)	2021-01-07 13:34:28 -05:00
Jeff Escalante	8b4f6b40e4	Merge pull request #9748 from hashicorp/docs-zs.build-time-code-highlight-revised Add build-time highlighting to code blocks	2021-01-07 13:33:49 -05:00
Michael Lange	674707e349	Update the page param default to 1 instead of 0	2021-01-07 09:59:09 -08:00
Michael Lange	aa8e209c2e	Typo fixes Co-authored-by: Buck Doyle <buck@hashicorp.com>	2021-01-07 09:59:08 -08:00
Michael Lange	46a0435cf2	Update 'Node' to 'Client' which is used throughout the UI	2021-01-07 09:59:08 -08:00
Michael Lange	761b7a1cef	Add missing faceted search query params	2021-01-07 09:59:08 -08:00
Michael Lange	2f05f06ecd	Remove no longer true enterprise warning	2021-01-07 09:59:08 -08:00
Michael Lange	549f2f77ab	Remove version introduction 0.7 is ancient at this point. Now it's as if the UI has always existed.	2021-01-07 09:59:07 -08:00
Michael Lange	75f304bbc2	Add missing routes to the UI API doc	2021-01-07 09:59:07 -08:00
Michael Lange	d9b8f6d411	Merge pull request #9733 from hashicorp/b-ui/topo-viz-old-agent UI: Guard against nodes running an old version of the Nomad agent	2021-01-07 09:27:14 -08:00
Zach Shilton	2a9f9aa8d3	Remove broken shell-session highlighting	2021-01-07 11:57:09 -05:00
Zach Shilton	caa30ca097	Add build-time highlighting to code blocks	2021-01-07 11:48:02 -05:00
James Rasell	005e15afbc	Merge pull request #9744 from hashicorp/f-add-namespace-e2e-oss e2e: move namespace tests into OSS.	2021-01-07 17:36:09 +01:00
Nick Ethier	6705f845f2	Merge pull request #9739 from hashicorp/b-alloc-netmode-ports Use port's to value when building service address under 'alloc' addr_mode	2021-01-07 09:16:27 -05:00
Kdu Bonalume	425ad5892d	Fix missing link for Consul integration Add a link back to configuration/consul in the `service` parameter section of the `group` stanza.	2021-01-07 09:02:43 -05:00
Nick Ethier	7a6aab10bb	Apply suggestions from code review Co-authored-by: Tim Gross <tgross@hashicorp.com>	2021-01-07 08:53:54 -05:00
James Rasell	b087d68736	e2e: add ScalingPolicies test suite with initial test case.	2021-01-07 14:39:55 +01:00
James Rasell	02b9d9da87	e2e: move namespace tests into OSS.	2021-01-07 09:15:43 +01:00
Jeff Escalante	f791725736	Merge pull request #9743 from hashicorp/je.fix-edit-page-links hotfix: fix 'edit this page' links	2021-01-06 19:09:46 -05:00
Jeff Escalante	8c04e22ce4	fix 'edit this page' links	2021-01-06 19:01:32 -05:00
Mahmood Ali	050ad6b6f4	tests: deflake test-api job (#9742) Deflake test-api job, currently failing at around 7.6% (44 out of 578 workflows), by ensuring that test nomad agent use a small dedicated port range that doesn't conflict with the kernel ephemeral range. The failures are disproportionately related to port allocation, where a nomad agent fails to start when the http port is already bound to another process. The failures are intermittent and aren't specific to any test in particular. The following is a representative failure: https://app.circleci.com/pipelines/github/hashicorp/nomad/13995/workflows/6cf6eb38-f93c-46f8-8aa0-f61e62fe7694/jobs/128169 . Upon investigation, the issue seems to be that the api freeport library picks a port block within 10,000-14,500, but that overlaps with the kernel ephemeral range 32,769-60,999! So, freeport may allocate a free port to the nomad agent, just to be used by another process before the nomad agent starts! This happened for example in https://app.circleci.com/pipelines/github/hashicorp/nomad/14111/workflows/e1fcd7ff-f0e0-4796-8719-f57f510b1ffa/jobs/129684 . `freeport` allocated port 41662 to serf, but `google_accounts` raced to use it to connect to the CircleCI vm metadata service. We avoid such races by using a dedicated port range that's disjoint from the kernel ephemeral port range.	2021-01-06 16:18:28 -05:00
Mahmood Ali	00be4fc63c	tests: deflake TestTaskRunner_StatsHook_Periodic (#9734) This PR deflakes TestTaskRunner_StatsHook_Periodic tests and adds backoff when the driver closes the channel. TestTaskRunner_StatsHook_Periodic is currently the most flaky test - failing ~4% of the time (20 out of 486 workflows). A sample failure: https://app.circleci.com/pipelines/github/hashicorp/nomad/14028/workflows/957b674f-cbcc-4228-96d9-1094fdee5b9c/jobs/128563 . This change has two components: First, it updates the StatsHook so that it backs off when stats channel is closed. In the context of the test where the mock driver emits a single stats update and closes the channel, the test may make tens of thousands of updates during the period. In real context, if a driver doesn't implement the stats handler properly or when a task finishes, we may generate way too many Stats queries in a tight loop. Here, the backoff reduces these queries. I've added a failing test that shows 154,458 stats updates within 500ms in https://app.circleci.com/pipelines/github/hashicorp/nomad/14092/workflows/50672445-392d-4661-b19e-e3561ed32746/jobs/129423 . Second, the test ignores the first stats update after a task exit. Due to the asynchronicity of updates and channel/context use, it's possible that an update is enqueued while the test marks the task as exited, resulting in a spurious update.	2021-01-06 16:03:00 -05:00

20462 Commits

All Branches