Exercises host volume and Docker volume functionality for the `exec` and `docker`
task driver, particularly around mounting locations within the container and
how this can be used with `template`.
Adds a `nomad_acls` flag to our Terraform stack that bootstraps Nomad ACLs via
a `local-exec` provider. There's no way to set the `NOMAD_TOKEN` in the Nomad
TF provider if we're bootstrapping in the same Terraform stack, so instead of
using `resource.nomad_acl_token`, we also bootstrap a wide-open anonymous
policy. The resulting management token is exported as an environment var with
`$(terraform output environment)` and tests that want stricter ACLs will be
able to write them using that token.
This should also provide a basis to do similar work with Consul ACLs in the
future.
The 'docker.config.infra_image' would default to an amd64 container.
It is possible to reference the correct image for a platform using
the `runtime.GOARCH` variable, eliminating the need to explicitly set
the `infra_image` on non-amd64 platforms.
Also upgrade to Google's pause container version 3.1 from 3.0, which
includes some enhancements around process management.
Fixes#8926
Newer EC2 instances are both cheaper and have generally better
performance.
The dnsmasq configuration had a hard-coded interface name, so in order to
accomodate instances with more recent networking that result in so-called
predictable interface names, the dnsmasq configuration needs to be replaced at
runtime with userdata to select the default interface.
We currently have two jobs commented out in ENT because we don't have the test
target configured there. So having the filter attached to one of those jobs
means changes to filters always create a merge conflict.
- We previously added these to the client host metrics, but it's useful to have them on all client metrics.
- e.g. so you can exclude draining nodes from charts showing your fleet size.
Noticed this error in some production logs, and they were far from
helpful. Changes:
1. Include job ID in logs
2. Wrap errors and log once instead of double log lines
3. Test fsm error handling behavior
The conditional around some of the rescheduling tests was backwards, where we
were waiting for allocations to be rescheduled but testing for a count of
0. The test was passing but flaky because if the check happened quickly enough
before the scheduler rescheduled the allocations, it would pass.
Have Terraform run the target-specific `provision.sh`/`provision.ps1` script
rather than the test runner code which needs to be customized for each
distro. Use Terraform's detection of variable value changes so that we can
re-run the provisioning without having to re-install Nomad on those specific
hosts that need it changed.
Allow the configuration "profile" (well-known directory) to be set by a
Terraform variable. The default configurations are installed during Packer
build time, and symlinked into the live configuration directory by the
provision script. Detect changes in the file contents so that we only upload
custom configuration files that have changed between Terraform runs
Branches for the e2e code base can't have impact on the unit tests, so running
those tests just extends the time it takes to ship e2e updates. This changeset
updates the CircleCI config so that e2e branches run linting, build the
binary, and run the e2e unit tests (currently just vault compatibility).
The current implementation measures RPC request timeout only against
config.RPCHoldTimeout, which is fine for non-blocking requests but will
almost surely be exceeded by long-poll requests that block for minutes
at a time.
This adds an HasTimedOut method on the RPCInfo interface that takes into
account whether the request is blocking, its maximum wait time, and the
RPCHoldTimeout.
* remove outdated references to envchain in documentation
* add new host volume locations in userdata
* don't exit the entire script during provisioning, just return