open-nomad

Author	SHA1	Message	Date
Tim Gross	727277793b	e2e: bootstrap vault and provision Nomad with vault tokens (#9010) Provisions vault with the policies described in the Nomad Vault integration guide, and drops a configuration file for Nomad vault server configuration with its token. The vault root token is exposed to the E2E runner so that tests can write additional policies to vault.	2020-10-05 09:28:37 -04:00
Tim Gross	b6292528fe	e2e: tfvars.dev file must override default tfvars file (#9005) The `-var-file` flag for loading variables into Terraform overlays the default variables file if present. This means that variables that are set in the default variables file will take precedence if the overlay file does not have them set. Set `nomad_acls` and `nomad_enterprise` to `false` in the dev cluster.	2020-10-02 08:02:37 -04:00
Tim Gross	566dae7b19	e2e: add flag to bootstrap Nomad ACLs (#8961) Adds a `nomad_acls` flag to our Terraform stack that bootstraps Nomad ACLs via a `local-exec` provisioner. There's no way to set the `NOMAD_TOKEN` in the Nomad TF provider if we're bootstrapping in the same Terraform stack, so instead of using `resource.nomad_acl_token`, we also bootstrap a wide-open anonymous policy. The resulting management token is exported as an environment var with `$(terraform output environment)` and tests that want stricter ACLs will be able to write them using that token. This should also provide a basis to do similar work with Consul ACLs in the future.	2020-09-28 09:22:36 -04:00
Tim Gross	147b16243d	e2e: use more recent instance type (#8954) Newer EC2 instances are both cheaper and have generally better performance. The dnsmasq configuration had a hard-coded interface name, so in order to accommodate instances with more recent networking that result in so-called predictable interface names, the dnsmasq configuration needs to be replaced at runtime with userdata to select the default interface.	2020-09-23 14:27:52 -04:00
Tim Gross	1fc525ec1e	e2e: add flags for provisioning Nomad Enterprise (#8929)	2020-09-23 10:39:04 -04:00
Tim Gross	3da61545d5	make sure dev-cluster has the option to run windows config (#8928)	2020-09-18 16:41:35 -04:00
Tim Gross	ea1f6408bf	e2e: remove unused framework provisioning code (#8908)	2020-09-18 11:46:47 -04:00
Tim Gross	c413fa5e49	e2e: test script for Terraform logic (#8907)	2020-09-18 11:46:40 -04:00
Tim Gross	9d37233eaf	e2e: provision cluster entirely through Terraform (#8748) Have Terraform run the target-specific `provision.sh`/`provision.ps1` script rather than the test runner code which needs to be customized for each distro. Use Terraform's detection of variable value changes so that we can re-run the provisioning without having to re-install Nomad on those specific hosts that need it changed. Allow the configuration "profile" (well-known directory) to be set by a Terraform variable. The default configurations are installed during Packer build time, and symlinked into the live configuration directory by the provision script. Detect changes in the file contents so that we only upload custom configuration files that have changed between Terraform runs.	2020-09-18 11:27:24 -04:00
Tim Gross	990fcf7be4	e2e: documentation and minor tweaks to configs (#8912) * remove outdated references to envchain in documentation * add new host volume locations in userdata * don't exit the entire script during provisioning, just return	2020-09-17 09:20:18 -04:00
Michael Schurter	5f3a71d0b9	docs: update scripts to 0.12.4	2020-09-09 15:22:37 -07:00
Tim Gross	a47b1c1081	e2e: move configurations into profile-specific directories (#8828) This changeset stages upcoming E2E provisioning improvements work. It splits the existing shared configuration directory into 3 profiles: * "full-cluster": the set of configurations currently in use * "dev-cluster": a simplified set of mostly existing configurations that weren't in use. * "custom": an empty profile for developers to keep non-standard configurations during complex feature development. The tooling to switch between profiles will be in a later changeset. Also drops some unused configuration knobs from the provisioning scripts to make the next stage of work easier.	2020-09-04 11:23:32 -04:00
Tim Gross	93c1093274	e2e: remove unused EBS volumes and depends_on (#8827) Our provisioning process for E2E doesn't require the `depends_on` fields to be set for client instances, so dropping that field allows all instances to be started in parallel. We don't use the extra EBS volumes (they aren't even mounted), so remove them to reduce costs.	2020-09-04 10:25:59 -04:00
Tim Gross	0577b03479	e2e: minor rename and cleanup (#8824)	2020-09-04 08:51:22 -04:00
Tim Gross	e6cdd8e0c0	e2e: consolidate cloud-specific Consul configs (#8823) The `-recursor` flag in the Consul service unit files is specific to a given cloud, but we already have cloud-specific configuration files. Consolidate all the cloud-specific items into the config.	2020-09-04 08:51:15 -04:00
Tim Gross	bc6ad011fe	e2e: Linux AMI setup cleanup (#8821) As we add new Linux targets for E2E, the existing setup.sh script will be used only for Ubuntu. Rather than have the service and config files echo'd from the script, move them into files we upload so they can be reused. Includes some general noise reduction in the setup.sh script and removal of unused bits.	2020-09-03 16:30:58 -04:00
Tim Gross	3a382f599f	e2e: minor TF refactor to split out vars and outputs (#8752)	2020-08-26 17:00:36 -04:00
Tim Gross	8c8b91e7b9	e2e: move systemd unit files into Packer build (#8751)	2020-08-26 16:45:09 -04:00
Tim Gross	693a8a2613	e2e: fix platform path for installing for Linux from s3 (#8708)	2020-08-21 09:20:09 -04:00
Tim Gross	b23150057a	E2E: move Nomad installation to script on remote hosts (#8706) This changeset moves the installation of Nomad binaries out of the provisioning framework and into scripts that are installed on the remote host during AMI builds. This provides a few advantages: * The provisioning framework can be reduced in scope (with the goal of moving most of it into the Terraform stack entirely). * The scripts can be arbitrarily complex if we don't have to stuff them into ssh commands, so it's easier to make them idempotent. In this changeset, the scripts check the version of the existing binary and don't re-download when using the `--nomad_sha` or `--nomad_version` flags. * The scripts can be OS/distro specific, which helps in building new test targets.	2020-08-20 16:10:00 -04:00
Tim Gross	0fd4a05b2f	E2E AMI cleanup (#8697) * move CNI install/podman config to build-time * move DNS config to userdata * consolidate apt updates for performance	2020-08-20 10:09:31 -04:00
Tim Gross	9a3caa49db	e2e: remove unused spark dependency (#8695)	2020-08-19 14:59:36 -04:00
Tim Gross	a49732816c	migrate AMI builds to new account (#8674)	2020-08-19 08:20:59 -04:00
Tim Gross	d810dab50b	migrate E2E test runs to new AWS account (#8676)	2020-08-18 14:24:34 -04:00
Tim Gross	d0b03cad7c	e2e: give containers access to dnsmasq DNS (#8536) By default, Docker containers get /etc/resolv.conf bound into the container with the localhost entry stripped out. In order to resolve using the host's dnsmasq, we need to make sure the container uses the docker0 IP as its nameserver and that dnsmasq is listening on that port and forwarding to either the AWS VPC DNS (so that we can query private resources like EFS) or to the Consul DNS.	2020-07-24 14:09:18 -04:00
Drew Bailey	01b01f7cac	use latest podman release (#8403)	2020-07-09 09:28:53 -04:00
Drew Bailey	327843acfa	base podman e2e test and provisioning updates (#8104) * initial setup for terraform to install podman task driver podman * Update e2e provisioning to support root podman Excludes setup for rootless podman. updates source ami to ubuntu 18.04 Installs podman and configures podman varlink base podman test ensure client status running revert terraform directory changes * back out random go-discover go mod change * include podman varlink docs * address comments	2020-06-03 14:06:58 -04:00
Tim Gross	932710ad7d	e2e: upgrade CNI to 0.8.6 (#7956)	2020-05-14 09:29:11 -04:00
Seth Hoenig	623c804046	e2e: upgrade consul in packer setup to 1.7.3 from 1.6.1 There have been a number of bug fixes and features particularly around Connect that will help us in Nomad's e2e tests. Upgrade Consul in our packer builder so e2e can make use of the new version.	2020-05-11 11:17:28 -06:00
Tim Gross	4935b304a0	e2e: add helper to Makefile for local file deployments (#7822)	2020-04-28 16:15:58 -04:00
Tim Gross	ab3086a1f4	e2e: testing reliability (#7701) * pin CSI plugin versions * ensure failing CSI tests clean up * allow NOMAD_SHA env var to override makefile	2020-04-13 10:25:24 -04:00
Mahmood Ali	c8eddb9f6b	fixup! e2e: add a convenient creation script	2020-04-09 11:04:26 -04:00
Mahmood Ali	8a4937d9ce	e2e: add a convenient creation script Add a convenience Makefile for creating e2e environment for manual debugging.	2020-04-09 10:54:30 -04:00
Tim Gross	bde13dfc0c	e2e: have TF write-out HCL for CSI volume registration (#7599)	2020-04-02 12:16:43 -04:00
Tim Gross	cd1c6173f4	csi: e2e tests for EBS and EFS plugins (#7343) This changeset provides two basic e2e tests for CSI plugins targeting common AWS use cases. The EBS test launches the EBS plugin (controller + nodes) and registers an EBS volume as a Nomad CSI volume. We deploy a job that writes to the volume, stop that job, and reuse the volume for another job which should be able to read the data written by the first job. The EFS test launches the EFS plugin (nodes-only) and registers an EFS volume as a Nomad CSI volume. We deploy a job that writes to the volume, stop that job, and reuse the volume for another job which should be able to read the data written by the first job. The writer jobs mount the CSI volume at a location within the alloc dir.	2020-03-23 13:59:18 -04:00
Mahmood Ali	857ddf7aaf	e2e: use unique CSI token Use a unique per-cluster efs creation token, as https://www.terraform.io/docs/providers/aws/r/efs_file_system.html#creation_token. Using a static value prevents having multiple test clusters. [ci skip]	2020-03-15 21:55:26 -04:00
Tim Gross	79222c36bf	e2e: add EBS and EFS volumes for testing CSI (#7266) This changeset adds volumes but does not mount them to instances so that we can test the mounting ("staging") via CSI plugins. The CSI plugins themselves will be installed as Nomad jobs. In order to ensure we can always mount the EFS volume, this changeset pins the deployment of the cluster to a specific subnet. In future work we should spread the cluster out among several AZs and test that behavior explicitly.	2020-03-04 10:44:51 -05:00
Tim Gross	940110b2de	e2e: improve provisioning defaults and documentation (#7062) This changeset improves the ergonomics of running the Nomad e2e test provisioning process by defaulting to a blank `nomad_sha` in the Terraform configuration. By default, a user will now need to pass in one of the Nomad version flags. But they won't have to manually edit the `provisioning.json` file for the common case of deploying a released version of Nomad, and won't need to put dummy values for `nomad_sha`. Includes general documentation improvements.	2020-02-04 10:37:00 -05:00
Tim Gross	7681f09ae4	e2e: packer builds should not be public (#6998)	2020-01-27 16:28:25 -05:00
Tim Gross	457e3ad5c6	e2e: document e2e provisioning process (#6976)	2020-01-22 16:55:17 -05:00
Tim Gross	2edbdfc8be	e2e: update framework to allow deploying Nomad (#6969) The e2e framework instantiates clients for Nomad/Consul but the provisioning of the actual Nomad cluster is left to Terraform. The Terraform provisioning process uses `remote-exec` to deploy specific versions of Nomad so that we don't have to bake an AMI every time we want to test a new version. But Terraform treats the resulting instances as immutable, so we can't use the same tooling to update the version of Nomad in-place. This is a prerequisite for upgrade testing. This changeset extends the e2e framework to provide the option of deploying Nomad (and, in the future, Consul/Vault) with specific versions to running infrastructure. This initial implementation is focused on deploying to a single cluster via `ssh` (because that's our current need), but provides interfaces to hook the test run at the start of the run, the start of each suite, or the start of a given test case. Terraform work includes: * provides Terraform output that is written to JSON used by the framework to configure provisioning via `terraform output provisioning`. * provides Terraform output that can be used by test operators to configure their shell via `$(terraform output environment)` * drops `remote-exec` provisioning steps from Terraform * makes changes to the deployment scripts to ensure they can be run multiple times w/ different versions against the same host.	2020-01-22 08:48:52 -05:00
Tim Gross	1e600d573d	e2e: improve reusability of provisioning scripts (#6942) This changeset is part of the work to improve our E2E provisioning process to allow our upgrade tests: * Move more of the setup into the AMI image creation so it's a little more obvious to provisioning config authors which bits are essential to deploying a specific version of Nomad. * Make the service file update do a systemd daemon-reload so that we can update an already-running cluster with the same script we use to deploy it initially.	2020-01-16 09:29:36 -05:00
Tim Gross	b5bcfb533b	upgrade CNI plugins to 0.8.4 (#6921) When multiple Connect-enabled task groups start on the same client node, a race condition in the CNI plugins for creating iptables chains causes one of the tasks to fail. We upstreamed a patch to CNI plugins to make iptables chain creation idempotent. This changeset updates end-to-end testing, development tooling, and documentation to use 0.8.4 which includes our patch.	2020-01-09 10:57:07 -05:00
Tim Gross	9b2b4da3a4	e2e: run client/allocs metrics nightly tests vs Windows (#6850) Adds Windows targets to the client/allocs metrics tests. Removes the `allocstats` test, which covers less than these tests and is now redundant. Adds a firewall rule to our Windows instances so that the prometheus server can scrape the Nomad HTTP API for metrics.	2019-12-16 08:34:17 -05:00
Tim Gross	c9d92f845f	e2e: add a Windows client to test runner (#6735) * Adds a constraint to prevent tests from landing on Windows * Improve Terraform output for mixed windows/linux clients * Makes some Windows client config fixes from 0.10.2 testing	2019-11-25 13:31:00 -05:00
Tim Gross	e012c2b5bf	Infrastructure for Windows e2e testing (#6584) Includes: * baseline Windows AMI * initial pass at Terraform configurations * OpenSSH for Windows Using OpenSSH is a lot nicer for Nomad developers than winrm would be, plus it lets us avoid passing around the Windows password in the clear. Note that now we're copying up all the provisioning scripts and configs as a zipped bundle because TF's file provisioner dies in the middle of pushing up multiple files (whereas `scp -r` works fine). We're also running all the provisioning scripts inside the userdata by polling for the zip file to show up (gross!). This is because `remote-exec` provisioners are failing on Windows with the same symptoms as: https://github.com/hashicorp/terraform/issues/17728 If we can't fix this, it'll prevent us from having multiple Windows clients running until TF supports count interpolation in the `template_file`, which is planned for a later 0.12 release.	2019-11-19 11:06:10 -05:00
Drew Bailey	2befab6900	Merge pull request #6573 from hashicorp/update-cci-consul updates default consul version to 1.6.1	2019-11-07 11:01:22 -05:00
Drew Bailey	1c2af019c6	update vagrant & packer consul versions	2019-11-07 10:13:14 -05:00
Tim Gross	3e9ae481ce	e2e: refactor Consul configurations (#6559) Ensure that we're reusing the base configuration between client and servers without the possibility of drift. Reduce the amount of `sed` mangling of the configuration file, and make recommended changes from `shellcheck` for this section of the provisioning script. Fixes some rebase errors on the Nomad config as well.	2019-10-28 09:27:40 -04:00
Tim Gross	ba7e7413ef	e2e: refactor Nomad configuration (#6560) Share base configuration for telemetry and consul. Have the server configurations respect the `var.server_count` config. Make changes recommended by `shellcheck` in the provisioning scripts for this section. Switch to OS/arch-tagged release bundles on S3 for compatibility with adding Windows builds in the near future.	2019-10-28 08:21:02 -04:00

85 commits