The e2e framework instantiates clients for Nomad/Consul but the
provisioning of the actual Nomad cluster is left to Terraform. The
Terraform provisioning process uses `remote-exec` to deploy specific
versions of Nomad so that we don't have to bake an AMI every time we
want to test a new version. But Terraform treats the resulting
instances as immutable, so we can't use the same tooling to update the
version of Nomad in-place. This is a prerequisite for upgrade testing.
This changeset extends the e2e framework to provide the option of
deploying Nomad (and, in the future, Consul/Vault) with specific
versions to running infrastructure. This initial implementation is
focused on deploying to a single cluster via `ssh` (because that's our
current need), but provides interfaces to hook the test run at the
start of the run, the start of each suite, or the start of a given
test case.
Terraform work includes:
* provides Terraform output that written to JSON used by the framework
to configure provisioning via `terraform output provisioning`.
* provides Terraform output that can be used by test operators to
configure their shell via `$(terraform output environment)`
* drops `remote-exec` provisioning steps from Terraform
* makes changes to the deployment scripts to ensure they can be run
multiple times w/ different versions against the same host.
I originally planned to add component documentation, but as this dragged on and I found that JSDoc-to-Markdown sometimes needed hand-tuning, I decided to skip it and focus on replicating what was already present in Freestyle. Adding documentation is a finite task that can be revisited in the future.
My goal was to migrate everything from Freestyle with as few changes as possible. Some adaptations that I found necessary:
• the DelayedArray and DelayedTruth utilities that delay component rendering until slightly after initial render because without them:
◦ charts were rendering with zero width
◦ the JSON viewer was rendering with empty content
• Storybook in Ember renders components in a routerless/controllerless context by default, so some component stories needed changes:
◦ table pagination/sorting stories access to query params, which necessitates some reaching into Ember internals to start routing and dynamically generate a Storybook route/controller to render components into
◦ some stories have a faux controller as part of their Storybook context that hosts setInterval-linked dynamic computed properties
• some jiggery-pokery with anchor tags
◦ inert href='#' had to become href='javascript:;
◦ links that are actually meant to navigate need target='_parent' so they don’t navigate inside the Storybook iframe
Maybe some of these could be addressed by fixes in ember-cli-storybook but I’m wary of digging around in there any more than I already have, as I’ve lost a lot of time to Storybook confusion and frustrations already 😞
The STORYBOOK=true environment variable tweaks some environment settings to get things working as expected in the Storybook context.
I chose to:
• use angle bracket invocation within stories rather than have to migrate them soon after having moved to Storybook
• keep Freestyle around for now for its palette and typeface components
Group service checks cannot interpolate task fields, because the task
fields are not available at the time the script check hook is created
for the group service. When f31482a was merged this e2e test began
failing because we are now correctly matching the script check ID to
the service ID, which revealed this jobspec was invalid.
This builds on API changes in #6017 and #6021 to conditionally turn off the
“Run Job” button based on the current token’s capabilities, or the capabilities
of the anonymous policy if no token is present.
If you try to visit the job-run route directly, it redirects to the job list.
This changeset is part of the work to improve our E2E provisioning
process to allow our upgrade tests:
* Move more of the setup into the AMI image creation so it's a little
more obvious to provisioning config authors which bits are essential
to deploying a specific version of Nomad.
* Make the service file update do a systemd daemon-reload so that we
can update an already-running cluster with the same script we use to
deploy it initially.
Modernize Vault integration/e2e test a bit:
- Download from releases.hashicorp.com instead of using a hardcoded list
- Remove old unused make target e2e-test
- Use NOMAD_E2E env var instead of -integration flag
- Add a README
On my machine with ~250 Mbps internet it takes ~400s to download all
Vault binaries.
The system command includes gc and reconcile-summaries subcommands
which covers all currently available system API calls. The help
information is largely pulled from the current Nomad website API
documentation.
Passes in agent enable_debug config to nomad server and client configs.
This allows for rpc endpoints to have more granular control if they
should be enabled or not in combination with ACLs.
enable debug on client test
When multiple Connect-enabled task groups start on the same client
node, a race condition in the CNI plugins for creating iptables chains
causes one of the tasks to fail. We upstreamed a patch to CNI plugins
to make iptables chain creation idempotent.
This changeset updates end-to-end testing, development tooling, and
documentation to use 0.8.4 which includes our patch.
Now that alloc.Canonicalize() is called in all alloc sources in the
client (i.e. on state restore and RPC fetching), we no longer need to
check alloc.TaskResources.
alloc.AllocatedResources is always non-nil through alloc runner.
Though, early on, we check for alloc validity, so NewTaskRunner and
TaskEnv must still check. `TestClient_AddAllocError` test validates
that behavior.