open-nomad

Author	SHA1	Message	Date
Tim Gross	2edbdfc8be	e2e: update framework to allow deploying Nomad (#6969) The e2e framework instantiates clients for Nomad/Consul but the provisioning of the actual Nomad cluster is left to Terraform. The Terraform provisioning process uses `remote-exec` to deploy specific versions of Nomad so that we don't have to bake an AMI every time we want to test a new version. But Terraform treats the resulting instances as immutable, so we can't use the same tooling to update the version of Nomad in-place. This is a prerequisite for upgrade testing. This changeset extends the e2e framework to provide the option of deploying Nomad (and, in the future, Consul/Vault) with specific versions to running infrastructure. This initial implementation is focused on deploying to a single cluster via `ssh` (because that's our current need), but provides interfaces to hook the test run at the start of the run, the start of each suite, or the start of a given test case. Terraform work includes: * provides Terraform output that is written to JSON and used by the framework to configure provisioning via `terraform output provisioning`. * provides Terraform output that can be used by test operators to configure their shell via `$(terraform output environment)` * drops `remote-exec` provisioning steps from Terraform * makes changes to the deployment scripts to ensure they can be run multiple times w/ different versions against the same host.	2020-01-22 08:48:52 -05:00
Tim Gross	1e600d573d	e2e: improve reusability of provisioning scripts (#6942) This changeset is part of the work to improve our E2E provisioning process to allow our upgrade tests: * Move more of the setup into the AMI image creation so it's a little more obvious to provisioning config authors which bits are essential to deploying a specific version of Nomad. * Make the service file update do a systemd daemon-reload so that we can update an already-running cluster with the same script we use to deploy it initially.	2020-01-16 09:29:36 -05:00
Tim Gross	b5bcfb533b	upgrade CNI plugins to 0.8.4 (#6921) When multiple Connect-enabled task groups start on the same client node, a race condition in the CNI plugins for creating iptables chains causes one of the tasks to fail. We upstreamed a patch to CNI plugins to make iptables chain creation idempotent. This changeset updates end-to-end testing, development tooling, and documentation to use 0.8.4 which includes our patch.	2020-01-09 10:57:07 -05:00
Tim Gross	9b2b4da3a4	e2e: run client/allocs metrics nightly tests vs Windows (#6850) Adds Windows targets to the client/allocs metrics tests. Removes the `allocstats` test, which covers less than these tests and is now redundant. Adds a firewall rule to our Windows instances so that the prometheus server can scrape the Nomad HTTP API for metrics.	2019-12-16 08:34:17 -05:00
Tim Gross	c9d92f845f	e2e: add a Windows client to test runner (#6735) * Adds a constraint to prevent tests from landing on Windows * Improve Terraform output for mixed windows/linux clients * Makes some Windows client config fixes from 0.10.2 testing	2019-11-25 13:31:00 -05:00
Tim Gross	e012c2b5bf	Infrastructure for Windows e2e testing (#6584) Includes: * baseline Windows AMI * initial pass at Terraform configurations * OpenSSH for Windows Using OpenSSH is a lot nicer for Nomad developers than winrm would be, plus it lets us avoid passing around the Windows password in the clear. Note that now we're copying up all the provisioning scripts and configs as a zipped bundle because TF's file provisioner dies in the middle of pushing up multiple files (whereas `scp -r` works fine). We're also running all the provisioning scripts inside the userdata by polling for the zip file to show up (gross!). This is because `remote-exec` provisioners are failing on Windows with the same symptoms as: https://github.com/hashicorp/terraform/issues/17728 If we can't fix this, it'll prevent us from having multiple Windows clients running until TF supports count interpolation in the `template_file`, which is planned for a later 0.12 release.	2019-11-19 11:06:10 -05:00
Tim Gross	3e9ae481ce	e2e: refactor Consul configurations (#6559) Ensure that we're reusing the base configuration between client and servers without the possibility of drift. Reduce the amount of `sed` mangling of the configuration file, and make recommended changes from `shellcheck` for this section of the provisioning script. Fixes some rebase errors on the Nomad config as well.	2019-10-28 09:27:40 -04:00
Tim Gross	ba7e7413ef	e2e: refactor Nomad configuration (#6560) Share base configuration for telemetry and consul. Have the server configurations respect the `var.server_count` config. Make changes recommended by `shellcheck` in the provisioning scripts for this section. Switch to OS/arch-tagged release bundles on S3 for compatibility with adding Windows builds in the near future.	2019-10-28 08:21:02 -04:00
Tim Gross	8be403f47b	e2e: refactor Vault configuration (#6561) Match the configuration directory layout we're using for Consul and other services. Make recommended changes from `shellcheck` for this section of the provisioning script.	2019-10-25 15:29:01 -04:00
Tim Gross	87b3abddd3	e2e: use sockaddr for IP address configuration (#6548) Update the Consul and Vault configs to take advantage of their included `go-sockaddr` library for getting the IP addresses we need in a portable way. This particularly avoids problems with "predictable" interface names provided by systemd. Also adds the `sockaddr` binary to the Packer build so we can use it in our provisioning scripts.	2019-10-25 14:08:38 -04:00
Tim Gross	efbd680d4e	e2e: split Packer build scripts from TF provisioning (#6542) Make a clear split between Packer and Terraform provisioning steps: the scripts in the `packer/linux` directory are run when we build the AMI whereas the scripts in `shared` are run at Terraform provisioning time. Merging all runtime provisioning scripts into a single script for each of server/client solves the following: * Userdata scripts can't take arguments, they can only be templated and that means we have to do TF escaping in bash/powershell scripts. * TF provisioning scripts race with userdata scripts.	2019-10-25 08:08:24 -04:00
Tim Gross	15e912ddd6	e2e: move remote-exec inline to script (#6488) A failing script in a `remote-exec` provisioner's `inline` stanza won't fail the provisioning step. This lets us continue on to execute tests against potentially broken deployments, rather than letting us know the provisioning itself failed.	2019-10-14 10:23:41 -04:00
Michael Schurter	228899c32f	e2e: test demo job for connect	2019-09-04 12:40:08 -07:00
Danielle Lancashire	d454dab39b	chore: Format hcl configurations	2019-07-20 16:55:07 +02:00
Michael Schurter	803aa62b7a	systemd: set a high but non-infinite fd limit	2019-07-02 09:13:24 -07:00
Michael Schurter	2b7f398726	e2e: fix nomad service for systemd<230	2019-05-14 10:53:26 -07:00
Michael Schurter	a1c3ce36bc	Merge pull request #5647 from hashicorp/e2e-tf E2E Test Terraform/Packer Improvements	2019-05-06 15:42:52 -07:00
Mahmood Ali	bfc907827c	docs: update s3 urls to use virtual bucket style In response to https://forums.aws.amazon.com/ann.jspa?annID=6776	2019-05-06 10:39:51 -04:00
Michael Schurter	93f3ac7a9c	e2e: explain these scripts are for packer It took me way too long to figure out these weren't used by TF.	2019-05-03 07:55:28 -07:00
Michael Schurter	19889d6468	e2e: update deps and install nomad in packer Nomad on the packer image will be overwritten by the sha specified in the TF var, but including a base version on the packer image makes the image valid for independent use.	2019-05-03 07:53:08 -07:00
Michael Schurter	13b62a68f7	e2e: enable systemd units so they start on boot	2019-05-03 07:52:03 -07:00
Michael Schurter	30db07cccb	docs: sync systemd unit files; update deploy guide The systemd configs spread across our repo were fairly out of sync. This should get them on our best practices. The deployment guide also had some strange things like running Nomad as a non-root user. It would be fine for servers but completely breaks clients. For simplicity I simply removed the non-root user references.	2019-03-19 15:18:12 -07:00
Preetha Appan	419a2682b0	update to Consul 1.4.0 for e2e tests	2019-01-24 09:52:15 -06:00
Preetha Appan	1ba4674ce2	suggestions from code review	2018-12-17 15:06:22 -06:00
Jack Pearkes	844d981e77	Terraform configs for e2e tests	2018-12-17 11:40:09 -06:00

25 commits