Commit Graph

163 Commits

Author SHA1 Message Date
Seth Hoenig dfc3b067ea
e2e: fix 1 of 4 client disconnect tests (#15357)
This PR modifies the disconnect helper job to run as root, which is necesary
for manipulating iptables as it does. Also re-organizes the final test logic
to wait for client re-connect before looking for the replacement (3rd) allocation
in case that client was needed to run the alloc (also giving the sheduler more
time to do its thing).

Skips the other 3 tests, which fail and I cannot yet figure out what is going on.
2022-11-22 08:51:53 -06:00
Seth Hoenig eaf842b226
e2e: jammy image needs latest java lts (#15323) 2022-11-18 14:36:36 -06:00
Seth Hoenig 845ff10281
e2e: disable systemd stub dns in jammy image (#15286) 2022-11-17 09:50:44 -06:00
Seth Hoenig 45ff0765c7
e2e: swap bionic image for jammy (#15220) 2022-11-16 10:37:18 -06:00
James Rasell 5d0cc93939
e2e: add acl test for token expiration. (#14418)
In order to add an E2E test to cover token expiration, the server
config has been updated to include a low minimum allowed TTL
value. For ease of reading, the max value is also set.
2022-09-01 09:36:09 +02:00
James Rasell 17a467020c
e2e: add terraform init commands to readme doc. (#13655) 2022-07-08 16:52:35 +02:00
Tim Gross c4d92205b4
E2E: provide options for reverse proxy for web UI (#12671)
Our E2E test environment is deployed with mTLS, but it's impractical
for us to use mTLS in headless browsers for automated testing (or even
in manual testing). Provide certificates for proxying the web UI via
Nginx. This proxy uses client certs for proxying to the HTTP endpoint
and a self-signed cert for the browser-facing endpoint. We can accept
certificate errors in the automated tests we'll be adding in the next
step of this work.
2022-04-19 16:55:05 -04:00
Tim Gross 70c262eb95
E2E: terraform provisioner upgrades (#12652)
While working on infrastructure for testing the UI in E2E, we needed
to upgrade the certificate provider. Performing a provider upgrade via
the TF `init -upgrade` brought in updates for the file and AWS
providers as well. These updates include deprecating the use of
`sensitive_content` fields, removing CA algorithm parameters that can
be inferred from keys, and removing the requirement to manually
specify AWS assume role parameters in the provider config if they're
available in the calling environment's AWS config file (as they are
via doormat or our E2E environment).
2022-04-19 14:27:14 -04:00
Derek Strickland 3f871973f9
Update E2E terraform output command (#12561) 2022-04-13 16:46:09 -04:00
Tim Gross 4078e6ea0e
scripts: fix interpreter for bash (#12549)
Many of our scripts have a non-portable interpreter line for bash and
use bash-specific variables like `BASH_SOURCE`. Update the interpreter
line to be portable between various Linuxes and macOS without
complaint from posix shell users.
2022-04-12 10:08:21 -04:00
Tim Gross 19703e3316
E2E: test exercising node drain behavior for CSI volumes (#12384) 2022-03-29 11:19:23 -04:00
Tim Gross 5c7f2bad0b
E2E: namespace HCP vault and consul policies to avoid collisions (#12386)
Concurrent E2E runs can collide when provisioning policies on HCP
Consul and HCP Vault. Namespace these by the test run name, as we do
for most everything else.
2022-03-25 16:05:59 -04:00
Tim Gross 9f05d62338
E2E with HCP Consul/Vault (#12267)
Use HCP Consul and HCP Vault for the Consul and Vault clusters used in E2E testing. This has the following benefits:

* Without the need to support mTLS bootstrapping for Consul and Vault, we can simplify the mTLS configuration by leaning on Terraform instead of janky bash shell scripting.
* Vault bootstrapping is no longer required, so we can eliminate even more janky shell scripting
* Our E2E exercises HCP, which is important to us as an organization
* With the reduction in configurability, we can simplify the Terraform configuration and drop the complicated `provision.sh`/`provision.ps1` scripts we were using previously. We can template Nomad configuration files and upload them with the `file` provisioner.
* Packer builds for Linux and Windows become much simpler.

tl;dr way less janky shell scripting!
2022-03-18 09:27:28 -04:00
Tim Gross 4c4895e19c
e2e: configure prometheus for mTLS for `Metrics` suite (#12181)
The `Metrics` suite uses prometheus to scrape Nomad metrics so that
we're testing the full user experience of extracting metrics from
Nomad. With the addition of mTLS, we need to make sure prometheus also
has mTLS configuration because the metrics endpoint is protected.

Update the Nomad client configuration and prometheus job to bind-mount
the client's certs into the task so that the job can use these certs
to scrape the server. This is a temporary solution that gets the job
passing; we should give the job its own certificates (issued by
Vault?) when we've done some of the infrastructure rework we'd like.
2022-03-04 08:55:06 -05:00
Tim Gross f2a4ad0949
CSI: implement support for topology (#12129) 2022-03-01 10:15:46 -05:00
Luiz Aoqui 3c8381bf85
e2e: enable Consul HTTPS port and always restart Nomad systemd unit 2022-01-18 16:56:26 -05:00
Tim Gross ae04e540e6
hclfmt on some config files (#11611) 2021-12-02 15:25:46 -05:00
Derek Strickland 8a5aa0cd8a
Fix Vault E2E TLS config (#11483)
* Update e2e/terraform configuration for Vault and default to mtls=true
2021-12-02 12:20:09 -05:00
James Rasell 30273d9256
e2e: fix provisioning when ACLs and TLS enabled; no nightly TLS. 2021-09-16 17:15:41 +02:00
Luiz Aoqui f30c024a40
e2e: use absolute path for mTLS env vars (#11126) 2021-09-03 12:59:21 -04:00
Mahmood Ali fec0adbb0e
Support mTLS clusters for e2e testing (#11092)
This allows us to spin up e2e clusters with mTLS configured for all HashiCorp services, i.e. Nomad, Consul, and Vault. Used it for testing #11089 .

mTLS is disabled by default. I have not updated Windows provisioning scripts yet - Windows also lacks ACL support from before. I intend to follow up for them in another round.
2021-08-30 10:18:16 -04:00
James Rasell 73ab63cf68
test: update e2e and dev scripts to use cni plugins v1.0.0 2021-08-27 11:14:47 +02:00
Michael Schurter 319650d481 e2e: use api.ipify.org
ipv4.icanhazip.com returns ipv6 addresses
2021-06-07 15:12:42 -07:00
Mahmood Ali 5258ae480b remove unused Spark security group rules 2021-06-04 11:49:43 -04:00
Mahmood Ali b852dc5eb8 e2e: pass nomad_url variable 2021-06-04 10:32:51 -04:00
Mahmood Ali 71936e1b27 e2e: NOMAD_VERSION is not set when installing url 2021-06-04 10:31:37 -04:00
Mahmood Ali d0768bb999 restrict ingress ip 2021-06-04 10:31:35 -04:00
Luiz Aoqui 139c5e8df9
e2e: fix terraform output environment command instruction (#10674) 2021-06-01 10:10:12 -04:00
Mahmood Ali 0853d48927
e2e: Spin clusters with custom url binaries (#10656)
Ease spinning up a cluster, where binaries are fetched from arbitrary
urls.  These could be CircleCI `build-binaries` job artifacts, or
presigned S3 urls.

Co-authored-by: Tim Gross <tgross@hashicorp.com>
2021-05-25 13:47:39 -04:00
Tim Gross 709b92c5a8 e2e: update TF lockfile 2021-05-18 09:35:57 -04:00
Tim Gross d4465f01ac E2E: remove references to nomad_sha 2021-05-10 16:42:39 -04:00
Michael Schurter 0eb5d5136f e2e: use public_ip in packer 2021-04-27 15:07:03 -07:00
Michael Schurter e62795798d core: propagate remote task handles
Add a new driver capability: RemoteTasks.

When a task is run by a driver with RemoteTasks set, its TaskHandle will
be propagated to the server in its allocation's TaskState. If the task
is replaced due to a down node or draining, its TaskHandle will be
propagated to its replacement allocation.

This allows tasks to be scheduled in remote systems whose lifecycles are
disconnected from the Nomad node's lifecycle.

See https://github.com/hashicorp/nomad-driver-ecs for an example ECS
remote task driver.
2021-04-27 15:07:03 -07:00
Seth Hoenig 198e0d9f24 e2e: get consul ent in e2e packer builds
Using Consul Enterprise is going to be necessary for testing Nomad's
Consul Namespace integration in Nomad v1.1 in e2e.
2021-04-14 12:05:55 -06:00
Tim Gross a84eca0136 E2E: remove broken Move-Item call during Windows provisioning
The archive does not include the `pkg/windows_amd64` path and unpacking the
archive happens in the installation directory.
2021-04-09 09:49:42 -04:00
Tim Gross f4ccb360ef E2E: use remote-exec via TF0.14.7+
The E2E provisioning used local-exec to call ssh in a for loop in a hacky
workaround https://github.com/hashicorp/terraform/issues/25634, which
prevented remote-exec from working on Windows. Move to a newer version of
Terraform that fixes the remote-exec bug to make provisioning more reliable
and observable.

Note that Windows remote-exec needs to include the `powershell` call itself,
unlike Unix-alike remote-exec.
2021-04-08 16:03:06 -04:00
Tim Gross da89103c5c E2E: extend CSI test to cover create and snapshot workflows
Split the EBS and EFS tests out into their own test cases:
* EBS exercises the Controller RPCs, including the create/snapshot workflow.
* EFS exercises only the Node RPCs, and assumes we have an existing volume
that gets registered, rather than created.
2021-04-08 12:55:36 -04:00
Drew Bailey 64084f3209
e2e allow setting an enterprise license environment variable (#10233)
* allow setting an enterprise license environment variable

* update comment

* address pr comments
2021-03-25 14:35:55 -04:00
Tim Gross fa25e048b2
CSI: unique volume per allocation
Add a `PerAlloc` field to volume requests that directs the scheduler to test
feasibility for volumes with a source ID that includes the allocation index
suffix (ex. `[0]`), rather than the exact source ID.

Read the `PerAlloc` field when making the volume claim at the client to
determine if the allocation index suffix (ex. `[0]`) should be added to the
volume source ID.
2021-03-18 15:35:11 -04:00
Tim Gross 2a2e36690a docs: swap master for main in Nomad repo 2021-03-08 14:26:31 -05:00
Mahmood Ali ff8d67fae2
Merge pull request #9935 from hashicorp/e2e-segment-e2e-clusters
e2e: segment e2e clusters
2021-03-01 09:23:21 -05:00
Seth Hoenig d2cd605995 dist: place systemd unit options correctly
This PR places StartLimitIntervalSec and StartLimitBurst in the
Unit section of systemd unit files, rather than the Service section.

https://www.freedesktop.org/software/systemd/man/systemd.unit.html

Fixes #10065
2021-02-22 19:23:00 -06:00
Drew Bailey c152757d38
E2e/fix periodic (#10047)
* fix periodic

* update periodic to not use template

nomad job inspect no longer returns an apiliststub so the required fields to query job summary are no longer there, parse cli output instead

* rm tmp makefile entry

* fix typo

* revert makefile change
2021-02-18 12:21:53 -05:00
Seth Hoenig 45e0e70a50 consul/connect: enable custom sidecars to use expose checks
This PR enables jobs configured with a custom sidecar_task to make
use of the `service.expose` feature for creating checks on services
in the service mesh. Before we would check that sidecar_task had not
been set (indicating that something other than envoy may be in use,
which would not support envoy's expose feature). However Consul has
not added support for anything other than envoy and probably never
will, so having the restriction in place seems like an unnecessary
hindrance. If Consul ever does support something other than Envoy,
they will likely find a way to provide the expose feature anyway.

Fixes #9854
2021-02-09 10:49:37 -06:00
Chris Baker b1bb8a760e e2e packer build: upgrade jdk to java 14 2021-02-02 17:33:48 +00:00
Mahmood Ali 45889f9f55 e2e: segment e2e clusters
Ensure that the e2e clusters are isolated and never attempt to autojoin
with another e2e cluster.

This ensures that each cluster servers have a unique `ConsulAutoJoin`,
to be used for discovery.
2021-02-01 08:04:21 -05:00
Mahmood Ali 0aafd9af64 e2e: Fix build script and pass shellcheck 2021-01-26 09:11:37 -05:00
Mahmood Ali 4397eda209
Merge pull request #9798 from hashicorp/e2e-terraform-tweaks-20200113
This PR makes two ergonomics changes, meant to get e2e builds more reproducible and ease changes.

### AMI Management

First, we pin the server AMIs to the commits associated with the build.  No more using the latest AMI a developer build in a test branch, or accidentally using a stale AMI because we forgot to build one!  Packer is to tag the AMI images with the commit sha used to generate the image, and then Terraform would look up only the AMIs associated with that sha. To minimize churn, we use the SHA associated with the latest Packer configurations, rather than SHA of all.

This has few benefits: reproducibility and avoiding accidental AMI changes and contamination of changes across branches. Also, the change is a stepping stone to an e2e pipeline that builds new AMIs automatically if Packer files changed.

The downside is that new AMIs will be generated even for irrelevant changes (e.g. spelling, commits), but I suspect that's OK. Also, an engineer will be forced to build the AMI whenever they change Packer files while iterating on e2e scripts; this hasn't been an issue for me yet, and I'll be open for iterating on that later if it proves to be an issue.

### Config Files and Packer

Second, this PR moves e2e config hcl management to Terraform instead of Packer. Currently, the config files live in `./terraform/config`, but they are baked into the servers by Packer and changes are ignored.  This current behavior surprised me, as I spent a bit of time debugging why my config changes weren't applied.  Having Terraform manage them would ease engineer's iteration.  Also, make Packer management more consistent (Packer only works `e2e/terraform/packer`), and easing the logic for AMI change detection.

The config directory is very small (100KB), and having it as an upload step adds negligible time to `terraform apply`.
2021-01-25 13:20:28 -05:00
Mahmood Ali 39da228964 update readme about profiles and packer build 2021-01-25 11:40:26 -05:00
Drew Bailey 630babb886
prevent double job status update (#9768)
* Prevent Job Statuses from being calculated twice

https://github.com/hashicorp/nomad/pull/8435 introduced atomic eval
insertion iwth job (de-)registration. This change removes a now obsolete
guard which checked if the index was equal to the job.CreateIndex, which
would empty the status. Now that the job regisration eval insetion is
atomic with the registration this check is no longer necessary to set
the job statuses correctly.

* test to ensure only single job event for job register

* periodic e2e

* separate job update summary step

* fix updatejobstability to use copy instead of modified reference of job

* update envoygatewaybindaddresses copy to prevent job diff on null vs empty

* set ConsulGatewayBindAddress to empty map instead of nil

fix nil assertions for empty map

rm unnecessary guard
2021-01-22 09:18:17 -05:00