Commit graph

19553 commits

Author SHA1 Message Date
Mahmood Ali f4450db775 tests: use system path
On host with systemd-resolved, we copy /run/systemd/resolve/resolv.conf
actually.
2020-10-01 10:23:19 -04:00
Mahmood Ali 567597e108 Compare to the correct host network setting
In systemd-resolved hosts with no DNS customizations, the docker driver
DNS setting should be compared to /run/systemd/resolve/resolv.conf while
exec/java drivers should be compared to /etc/resolv.conf.

When system-resolved is enabled, /etc/resolv.conf is a stub that points
to 127.0.0.53. Docker avoids this stub because this address isn't
accessible from the container.  The exec/java drivers that don't create
network isolations use the stub though in the default configuration.
2020-10-01 10:23:14 -04:00
Mahmood Ali f4b0aa0c1c tests: copy permissions when copying files
On the failover path, copy the permission bits (a.k.a. file mode),
specially the execution bit.
2020-10-01 10:23:14 -04:00
Mahmood Ali cd060db42a tests: ignore empty cgroup
My latest Vagrant box contains an empty cgroup name that isn't used for
isolation:

```
$ cat /proc/self/cgroup  | grep ::
0::/user.slice/user-1000.slice/session-17.scope
```
2020-10-01 10:23:13 -04:00
Mahmood Ali eb5ff115a5
Merge pull request #8992 from hashicorp/b-tests-failover-copy
tests: failover to copying when symlinking fails
2020-09-30 09:22:24 -05:00
Mahmood Ali 91376cccf2 tests: failover to copying when symlinking fails
Symlinking busybox may fail when the test code and the test temporary
directory live on different volumes/partitions; so we should copy
instead.  This situation arises in the Vagrant setup, where the code
repository live on special file sharing volume.

Somewhat unrelated, remove `f.Sync()` invocation from a test copyFile
helper function.  Sync is useful only for crash recovery, and isn't
necessary in our test setup.  The sync invocation is a significant
overhead as it requires the OS to flush any cached writes to disk.
2020-09-30 09:58:22 -04:00
Tim Gross 4bab91b81b
e2e: ensure tests are constrained to Linux (#8990)
Until we have LCOW support in the E2E environment (which requires a Windows
2019 test target), we need to constrain E2E tests to the appropriate kernel
2020-09-30 09:43:30 -04:00
Tim Gross e49410e97b
e2e: cleanup errors should use assert, not require (#8989)
The E2E framework wraps testify's `require` so that by default we can stop
tests on errors, but the cleanup functions should use `assert` so that we
continue to try to cleanup the test environment even if there's a failure.
2020-09-30 09:00:37 -04:00
Lars Lehtonen 03abe3c890
client: fix test umask (#8987) 2020-09-30 08:09:41 -04:00
Luiz Aoqui 32c6a99ff3
Merge pull request #8985 from hashicorp/f-scaling-policy-type
Add scaling policy type
2020-09-29 18:18:24 -04:00
Luiz Aoqui 88d4eecfd0
add scaling policy type 2020-09-29 17:57:46 -04:00
Dave May eaa4f6faf5
Merge pull request #8922 from hashicorp/dmay-raftutil-path
Raftutil cleanup, plus helper function to find raft.db
2020-09-29 15:12:32 -04:00
davemay99 a7fc6e9b30 return explicit error if not found/empty path falls through 2020-09-29 14:55:28 -04:00
Tim Gross 9853653f2a
docs: quota apply does not have a 'name' param (#8983) 2020-09-29 13:58:30 -04:00
Charlie Voiselle 9d85195361
[docs] Update redirects and links for learn.hashicorp.com (#8598)
* Fix links to ACL guides
* Managing Nomad guide links; links in jsx pages
* job updates guide URLS
* node-drain guide URLS
* outage recovery guide links
* fix guide links - sentinel
* fix guide links - namespaces
* fix guide links - quotas
* fix guide links - autopilot
* more guide links.
* more guide links - continued.
* Updating redirects for learn
* Getting Started
* Load Balancing Guides
* update redirects for ui guide
* Consolidate spark redirects to point to GH repo
* operating job update part 1
* finish operating job links; operations guides links.
* finish guide redirects
* coalesce EOL redirects for spark guides.
* one last link
* Checked links and found a few more stray links
* Found more .htmls
* Fixup links for new HC websites
* Post-merge fixups
* linkcheck caught missing ids
2020-09-29 12:48:32 -04:00
Mahmood Ali 77ac8c4991
Merge pull request #8981 from hashicorp/b-vagrant-vault-setup
Update Vagrant consul/vault deps
2020-09-29 11:40:02 -05:00
Mahmood Ali 2e9e8ccc24
Merge pull request #8982 from hashicorp/b-exec-dns-resolv
drivers/exec: fix DNS resolution in systemd hosts
2020-09-29 11:39:43 -05:00
Seth Hoenig 6d9a6786e5
Merge pull request #8933 from jf/fix_docker_stopsignal
drivers/docker/driver.go: change default signal for docker driver to SIGTERM?
2020-09-29 10:51:04 -05:00
Mahmood Ali 7ddf4b2902 drivers/exec: fix DNS resolution in systemd hosts
Host with systemd-resolved have `/etc/resolv.conf` is a symlink
to `/run/systemd/resolve/stub-resolv.conf`. By bind-mounting
/etc/resolv.conf only, the exec container DNS resolution fail very badly.

This change fixes DNS resolution by binding /run/systemd/resolve as
well.

Note that this assumes that the systemd resolver (default to 127.0.0.53) is
accessible within the container. This is the case here because exec
containers share the same network namespace by default.

Jobs with custom network dns configurations are not affected, and Nomad
will continue to use the job dns settings rather than host one.
2020-09-29 11:33:51 -04:00
Mahmood Ali 1bac665cbf Update Vagrant consul/vault deps
Update Vagrant setup to use the consul/vault versions used in CI.
2020-09-29 11:06:53 -04:00
Tim Gross fa1fa623f2
e2e: rework rescheduling progress deadline test (#8958)
Eliminate sources of randomness in the progress deadline test and clarify the
purpose of the test to check for progress deadline updates.
2020-09-29 11:02:16 -04:00
Seth Hoenig 6dd862b927
Merge pull request #8977 from hashicorp/b-script-check-task
consul: fix validation of task in group-level script-checks
2020-09-29 09:29:02 -05:00
davemay99 a592186947 satisfy the linter 2020-09-29 01:37:15 -04:00
davemay99 bb555b249f reverting export of fixTime 2020-09-29 01:21:03 -04:00
davemay99 61c12e3e18 finish refactoring walk to search for any file 2020-09-29 01:17:10 -04:00
Tim Gross 6489c5f626
e2e: namespace support for CLI helpers (#8978)
Required to support tests for namespaces and other ENT features.
2020-09-28 16:37:34 -04:00
Seth Hoenig af9543c997 consul: fix validation of task in group-level script-checks
When defining a script-check in a group-level service, Nomad needs to
know which task is associated with the check so that it can use the
correct task driver to execute the check.

This PR fixes two bugs:
1) validate service.task or service.check.task is configured
2) make service.check.task inherit service.task if it is itself unset

Fixes #8952
2020-09-28 15:02:59 -05:00
Tim Gross b12938a9fb
command: fix a typo in the help text for namespaces (#8975) 2020-09-28 12:23:25 -04:00
Tim Gross 6bed4ec45b
e2e: ENT placeholder for namespace/quotas tests (#8973) 2020-09-28 11:23:37 -04:00
Tim Gross 1311f32f1b
e2e: test for host volumes and Docker volumes (#8972)
Exercises host volume and Docker volume functionality for the `exec` and `docker`
task driver, particularly around mounting locations within the container and
how this can be used with `template`.
2020-09-28 11:14:13 -04:00
Tim Gross 566dae7b19
e2e: add flag to bootstrap Nomad ACLs (#8961)
Adds a `nomad_acls` flag to our Terraform stack that bootstraps Nomad ACLs via
a `local-exec` provider. There's no way to set the `NOMAD_TOKEN` in the Nomad
TF provider if we're bootstrapping in the same Terraform stack, so instead of
using `resource.nomad_acl_token`, we also bootstrap a wide-open anonymous
policy. The resulting management token is exported as an environment var with
`$(terraform output environment)` and tests that want stricter ACLs will be
able to write them using that token.

This should also provide a basis to do similar work with Consul ACLs in the
future.
2020-09-28 09:22:36 -04:00
davemay99 f2b3536da2 refactor functions to find raft.db 2020-09-24 19:00:53 -04:00
davemay99 4fcdbb9dee logging tweaks 2020-09-24 17:30:50 -04:00
davemay99 48f8ee1d11 export FixTime to allow external use 2020-09-24 16:47:58 -04:00
Nick Ethier e75a3f349b
command: remove mbits from quota hcl (#8740) 2020-09-24 11:44:59 -04:00
Tim Gross 7023f59ca3
ci: don't run UI tests for e2e branches (#8956) 2020-09-24 08:59:56 -04:00
Seth Hoenig 19e4a2e1ad
Merge pull request #8957 from hashicorp/f-pause-container-arch
drivers/docker: detect arch for default docker plugin infra_image
2020-09-23 15:21:06 -05:00
Seth Hoenig fd2a31a331 drivers/docker: detect arch for default infra_image
The 'docker.config.infra_image' would default to an amd64 container.
It is possible to reference the correct image for a platform using
the `runtime.GOARCH` variable, eliminating the need to explicitly set
the `infra_image` on non-amd64 platforms.

Also upgrade to Google's pause container version 3.1 from 3.0, which
includes some enhancements around process management.

Fixes #8926
2020-09-23 13:54:30 -05:00
Tim Gross 15d3f5ea7e
e2e: remove unused migrations test (#8955)
The areas of the code this test exercised were merged in with the node
drain tests.
2020-09-23 14:50:15 -04:00
Tim Gross 147b16243d
e2e: use more recent instance type (#8954)
Newer EC2 instances are both cheaper and have generally better
performance.

The dnsmasq configuration had a hard-coded interface name, so in order to
accomodate instances with more recent networking that result in so-called
predictable interface names, the dnsmasq configuration needs to be replaced at
runtime with userdata to select the default interface.
2020-09-23 14:27:52 -04:00
Tim Gross 1fc525ec1e
e2e: add flags for provisioning Nomad Enterprise (#8929) 2020-09-23 10:39:04 -04:00
Tim Gross e78fc7ff93
ci: reduce merge conflict potential in CircleCI config (#8951)
We currently have two jobs commented out in ENT because we don't have the test
target configured there. So having the filter attached to one of those jobs
means changes to filters always create a merge conflict.
2020-09-23 09:35:23 -04:00
Ryan Oaks 7333335b1c
Merge pull request #8942 from hashicorp/ro.docs-component-update
docs: Update docs platform components
2020-09-22 16:21:35 -04:00
Tim Gross 3fb377b513
docs: fix escaping in job run env vars (#8944) 2020-09-22 14:32:29 -04:00
Pete Woods 81fa2a01fc
Add node "status", "scheduling eligibility" to all client metrics (#8925)
- We previously added these to the client host metrics, but it's useful to have them on all client metrics.
- e.g. so you can exclude draining nodes from charts showing your fleet size.
2020-09-22 13:53:50 -04:00
Neil Mock f749de8543
Fix multi-interface networking in the system scheduler (#8822) 2020-09-22 12:54:34 -04:00
Michael Schurter eb48e6a788
Merge pull request #8745 from hashicorp/b-fsm-logs
core: improve job deregister error logging
2020-09-22 09:42:07 -07:00
Ryan Oaks 950de562a1 Update docs platform components 2020-09-22 12:08:57 -04:00
Michael Lange 6187e3116e
Merge pull request #8931 from hashicorp/f-ui/children-jobs-quality-of-life
UI: Launched jobs quality of life improvements
2020-09-22 07:38:54 -07:00
Michael Schurter c79c0ace97 docs: add #8745 to changelog 2020-09-21 08:59:47 -07:00