open-nomad

Commit Graph

Author	SHA1	Message	Date
Seth Hoenig	d2cd605995	dist: place systemd unit options correctly This PR places StartLimitIntervalSec and StartLimitBurst in the Unit section of systemd unit files, rather than the Service section. https://www.freedesktop.org/software/systemd/man/systemd.unit.html Fixes #10065	2021-02-22 19:23:00 -06:00
Chris Baker	b1bb8a760e	e2e packer build: upgrade jdk to java 14	2021-02-02 17:33:48 +00:00
Mahmood Ali	0aafd9af64	e2e: Fix build script and pass shellcheck	2021-01-26 09:11:37 -05:00
Mahmood Ali	39da228964	update readme about profiles and packer build	2021-01-25 11:40:26 -05:00
Mahmood Ali	76ce6306a4	add helper for building ami	2021-01-15 10:49:13 -05:00
Mahmood Ali	e51651c34a	set sha	2021-01-15 10:49:13 -05:00
Mahmood Ali	82637715cf	change ami naming	2021-01-15 10:49:12 -05:00
Mahmood Ali	0af1509a77	move config files to terraform	2021-01-15 10:49:12 -05:00
Seth Hoenig	fc5f48d936	cni: bump CNI version to v0.9.0 https://github.com/containernetworking/plugins/releases/tag/v0.9.0 Also make the copy-paste install instructions work with arm64 for a better OOTB experience (AWS Graviton, Pi 4's).	2021-01-10 18:03:27 -06:00
Tim Gross	26f4ee7fb1	e2e: dnsmasq configuration fixes * systemd units require absolute paths * ensure directory exists for dnsmasq	2021-01-04 15:40:57 -05:00
Tim Gross	00bc6a7d13	e2e: move dnsmasq config into dnsmasq service unit (#9660) Our dnsmasq configuration needs host-specific data that we can't configure in the AMI build. But configuring this in userdata leads to a race between userdata execution, docker.service startup, and dnsmasq.service startup. So rather than letting dnsmasq come up with incorrect configuration and then modifying it after the fact, do the configuration in the service's prestart, and have it kick off a Docker restart when we're done.	2020-12-17 10:33:19 -05:00
Seth Hoenig	ad5918f754	e2e: upgrade terraform consul to 1.9.0	2020-12-03 13:01:14 -06:00
Tim Gross	d686a51d60	e2e: prevent Ubuntu startup race conditions (#9428) The cloud-init configuration runs on boot, which can result in a race condition between that and service startup. This has caused provisioning failures because Nomad expects the userdata to have configured a host volume directory. Diagnosing this was also compounded by a warning being fired by systemd for the Nomad unit file. * Update the location of the `StartLimitIntervalSec` field to its post-systemd-230 location. * Ensure that the weekly AMI build is up-to-date to reduce the risk of unexpected system software changes. * Move the host volume to a directory we can set up at AMI build time rather than in userdata.	2020-11-23 12:29:08 -05:00
Tim Gross	08ae13d3b9	e2e: Windows provisioning improvements (#9246) Small changes to the Windows 2016 Packer build for debuggability of provisioning: * improve verbosity of powershell error handling * remove unused "tools" installation * use ssh communicator for Packer to improve Packer build times and eliminate deprecated winrm remote access (unavailable from current macOS)	2020-11-09 13:29:40 -05:00
Tim Gross	dc8e20206d	E2E: switch packer build files to HCL2 (#9219) Build configuration files need comments, and JSON is also just the worst, isn't it? Upgrade our E2E packer configs to use the new HCL2 syntax.	2020-10-29 10:03:39 -04:00
Tim Gross	57f694ff2e	E2E: AMI software version bumps and cleanup (#9213) * remove unused vault installation from Windows AMI * match Windows and Linux Consul versions * bump AMI base Nomad to current stable	2020-10-29 08:27:50 -04:00
Tim Gross	99c2a2df00	e2e: reduce risk of flaky Ubuntu AMI build (#9207) The base Ubuntu AMI modifies apt sources during cloud-init. But the Packer build can potentially start the setup script before that work is done, resulting in errors trying to install base system dependencies like `dnsmasq`. Delay the setup long enough to lose the race with cloud-init.	2020-10-28 15:13:44 -04:00
Tim Gross	7e4a35ad7e	e2e: use more specific names for OS/distros (#9204) We intend to expand the nightly E2E test to cover multiple distros and platforms. Change the naming structure for "Linux client" to the more precise "Ubuntu Bionic", and "Windows" to "Windows 2016" to make it easier to add new targets without additional refactoring.	2020-10-28 12:58:00 -04:00
Tim Gross	76f1f5e5df	e2e: use AMI filter for Ubuntu packer image (#9086) Instead of hard-coding the base AMI for our Packer image for Ubuntu, use the latest from Canonical so that we always have their current kernel patches.	2020-10-14 11:22:33 -04:00
Tim Gross	727277793b	e2e: bootstrap vault and provision Nomad with vault tokens (#9010) Provisions vault with the policies described in the Nomad Vault integration guide, and drops a configuration file for Nomad vault server configuration with its token. The vault root token is exposed to the E2E runner so that tests can write additional policies to vault.	2020-10-05 09:28:37 -04:00
Tim Gross	566dae7b19	e2e: add flag to bootstrap Nomad ACLs (#8961) Adds a `nomad_acls` flag to our Terraform stack that bootstraps Nomad ACLs via a `local-exec` provider. There's no way to set the `NOMAD_TOKEN` in the Nomad TF provider if we're bootstrapping in the same Terraform stack, so instead of using `resource.nomad_acl_token`, we also bootstrap a wide-open anonymous policy. The resulting management token is exported as an environment var with `$(terraform output environment)` and tests that want stricter ACLs will be able to write them using that token. This should also provide a basis to do similar work with Consul ACLs in the future.	2020-09-28 09:22:36 -04:00
Tim Gross	1fc525ec1e	e2e: add flags for provisioning Nomad Enterprise (#8929)	2020-09-23 10:39:04 -04:00
Tim Gross	9d37233eaf	e2e: provision cluster entirely through Terraform (#8748) Have Terraform run the target-specific `provision.sh`/`provision.ps1` script rather than the test runner code which needs to be customized for each distro. Use Terraform's detection of variable value changes so that we can re-run the provisioning without having to re-install Nomad on those specific hosts that need it changed. Allow the configuration "profile" (well-known directory) to be set by a Terraform variable. The default configurations are installed during Packer build time, and symlinked into the live configuration directory by the provision script. Detect changes in the file contents so that we only upload custom configuration files that have changed between Terraform runs	2020-09-18 11:27:24 -04:00
Tim Gross	990fcf7be4	e2e: documentation and minor tweaks to configs (#8912) * remove outdated references to envchain in documentation * add new host volume locations in userdata * don't exit the entire script during provisioning, just return	2020-09-17 09:20:18 -04:00
Michael Schurter	5f3a71d0b9	docs: update scripts to 0.12.4	2020-09-09 15:22:37 -07:00
Tim Gross	0577b03479	e2e: minor rename and cleanup (#8824)	2020-09-04 08:51:22 -04:00
Tim Gross	e6cdd8e0c0	e2e: consolidate cloud-specific Consul configs (#8823) The `-recursor` flag in the Consul service unit files is specific to a given cloud, but we already have cloud-specific configuration files. Consolidate all the cloud-specific items into the config.	2020-09-04 08:51:15 -04:00
Tim Gross	bc6ad011fe	e2e: Linux AMI setup cleanup (#8821) As we add new Linux targets for E2E, the existing setup.sh script will be used only for Ubuntu. Rather than have the service and config files echo'd from the script, move them into files we upload so they can be reused. Includes some general noise reduction in the setup.sh script and removal of unused bits.	2020-09-03 16:30:58 -04:00
Tim Gross	8c8b91e7b9	e2e: move systemd unit files into Packer build (#8751)	2020-08-26 16:45:09 -04:00
Tim Gross	693a8a2613	e2e: fix platform path for installing for Linux from s3 (#8708)	2020-08-21 09:20:09 -04:00
Tim Gross	b23150057a	E2E: move Nomad installation to script on remote hosts (#8706) This changeset moves the installation of Nomad binaries out of the provisioning framework and into scripts that are installed on the remote host during AMI builds. This provides a few advantages: * The provisioning framework can be reduced in scope (with the goal of moving most of it into the Terraform stack entirely). * The scripts can be arbitrarily complex if we don't have to stuff them into ssh commands, so it's easier to make them idempotent. In this changeset, the scripts check the version of the existing binary and don't re-download when using the `--nomad_sha` or `--nomad_version` flags. * The scripts can be OS/distro specific, which helps in building new test targets.	2020-08-20 16:10:00 -04:00
Tim Gross	0fd4a05b2f	E2E AMI cleanup (#8697) * move CNI install/podman config to build-time * move DNS config to userdata * consolidate apt updates for performance	2020-08-20 10:09:31 -04:00
Tim Gross	9a3caa49db	e2e: remove unused spark dependency (#8695)	2020-08-19 14:59:36 -04:00
Tim Gross	a49732816c	migrate AMI builds to new account (#8674)	2020-08-19 08:20:59 -04:00
Tim Gross	d0b03cad7c	e2e: give containers access to dnsmasq DNS (#8536) By default, Docker containers get /etc/resolv.conf bound into the container with the localhost entry stripped out. In order to resolve using the host's dnsmasq, we need to make sure the container uses the docker0 IP as its nameserver and that dnsmasq is listening on that port and forwarding to either the AWS VPC DNS (so that we can query private resources like EFS) or to the Consul DNS.	2020-07-24 14:09:18 -04:00
Drew Bailey	01b01f7cac	use latest podman release (#8403)	2020-07-09 09:28:53 -04:00
Drew Bailey	327843acfa	base podman e2e test and provisioning updates (#8104) * initial setup for terraform to install podman task driver podman * Update e2e provisioning to support root podman Excludes setup for rootless podman. updates source ami to ubuntu 18.04 Installs podman and configures podman varlink base podman test ensure client status running revert terraform directory changes * back out random go-discover go mod change * include podman varlink docs * address comments	2020-06-03 14:06:58 -04:00
Seth Hoenig	623c804046	e2e: upgrade consul in packer setup to 1.7.3 from 1.6.1 There have been a number of bug fixes and features particularly around Connect that will help us in Nomad's e2e tests. Upgrade Consul in our packer builder so e2e can make use of the new version.	2020-05-11 11:17:28 -06:00
Tim Gross	79222c36bf	e2e: add EBS and EFS volumes for testing CSI (#7266) This changeset adds volumes but does not mount them to instances so that we can test the mounting ("staging") via CSI plugins. The CSI plugins themselves will be installed as Nomad jobs. In order to ensure we can always mount the EFS volume, this changeset pins the deployment of the cluster to a specific subnet. In future work we should spread the cluster out among several AZs and test that behavior explicitly.	2020-03-04 10:44:51 -05:00
Tim Gross	7681f09ae4	e2e: packer builds should not be public (#6998)	2020-01-27 16:28:25 -05:00
Tim Gross	1e600d573d	e2e: improve reusability of provisioning scripts (#6942) This changeset is part of the work to improve our E2E provisioning process to allow our upgrade tests: * Move more of the setup into the AMI image creation so it's a little more obvious to provisioning config authors which bits are essential to deploying a specific version of Nomad. * Make the service file update do a systemd daemon-reload so that we can update an already-running cluster with the same script we use to deploy it initially.	2020-01-16 09:29:36 -05:00
Tim Gross	e012c2b5bf	Infrastructure for Windows e2e testing (#6584) Includes: * baseline Windows AMI * initial pass at Terraform configurations * OpenSSH for Windows Using OpenSSH is a lot nicer for Nomad developers than winrm would be, plus it lets us avoid passing around the Windows password in the clear. Note that now we're copying up all the provisioning scripts and configs as a zipped bundle because TF's file provisioner dies in the middle of pushing up multiple files (whereas `scp -r` works fine). We're also running all the provisioning scripts inside the userdata by polling for the zip file to show up (gross!). This is because `remote-exec` provisioners are failing on Windows with the same symptoms as: https://github.com/hashicorp/terraform/issues/17728 If we can't fix this, it'll prevent us from having multiple Windows clients running until TF supports count interpolation in the `template_file`, which is planned for a later 0.12 release.	2019-11-19 11:06:10 -05:00
Drew Bailey	1c2af019c6	update vagrant & packer consul versions	2019-11-07 10:13:14 -05:00
Tim Gross	87b3abddd3	e2e: use sockaddr for IP address configuration (#6548) Update the Consul and Vault configs to take advantage of their included `go-sockaddr` library for getting the IP addresses we need in a portable way. This particularly avoids problems with "predictable" interface names provided by systemd. Also adds the `sockaddr` binary to the Packer build so we can use it in our provisioning scripts.	2019-10-25 14:08:38 -04:00
Tim Gross	efbd680d4e	e2e: split Packer build scripts from TF provisioning (#6542) Make a clear split between Packer and Terraform provisioning steps: the scripts in the `packer/linux` directory are run when we build the AMI, whereas the scripts in `shared` are run at Terraform provisioning time. Merging all runtime provisioning scripts into a single script for each of server/client solves the following: * Userdata scripts can't take arguments, they can only be templated and that means we have to do TF escaping in bash/powershell scripts. * TF provisioning scripts race with userdata scripts.	2019-10-25 08:08:24 -04:00

45 Commits