open-nomad

Author	SHA1	Message	Date
Drew Bailey	7e78d4a607	e2e license smoke test (#10242 )	2021-03-26 13:21:47 -04:00
Mahmood Ali	dbc3850358	Merge pull request #10145 from hashicorp/b-periodic-init-status periodic: always reset periodic children status	2021-03-26 09:19:08 -04:00
Drew Bailey	64084f3209	e2e allow setting an enterprise license environment variable (#10233 ) * allow setting an enterprise license environment variable * update comment * address pr comments	2021-03-25 14:35:55 -04:00
Mahmood Ali	e643742a38	Add a test for parameterized summary counts	2021-03-25 11:27:09 -04:00
Tim Gross	46223e190e	E2E: bump AWS CSI driver versions	2021-03-24 14:17:38 -04:00
Tim Gross	0e774d40f5	E2E: CSI test should use expected unique-volume name	2021-03-23 08:34:17 -04:00
Tim Gross	fa25e048b2	CSI: unique volume per allocation Add a `PerAlloc` field to volume requests that directs the scheduler to test feasibility for volumes with a source ID that includes the allocation index suffix (ex. `[0]`), rather than the exact source ID. Read the `PerAlloc` field when making the volume claim at the client to determine if the allocation index suffix (ex. `[0]`) should be added to the volume source ID.	2021-03-18 15:35:11 -04:00
Charlie Voiselle	0473f35003	Fixup uses of `sanity` (#10187 ) * Fixup uses of `sanity` * Remove unnecessary comments. These checks are better explained by earlier comments about the context of the test. Per @tgross, moved the tests together to better reinforce the overall shared context. * Update nomad/fsm_test.go	2021-03-16 18:05:08 -04:00
Tim Gross	2a2e36690a	docs: swap master for main in Nomad repo	2021-03-08 14:26:31 -05:00
Mahmood Ali	ff8d67fae2	Merge pull request #9935 from hashicorp/e2e-segment-e2e-clusters e2e: segment e2e clusters	2021-03-01 09:23:21 -05:00
Drew Bailey	86d9e1ff90	Merge pull request #9955 from hashicorp/on-update-services Service and Check on_update configuration option (readiness checks)	2021-02-24 10:11:05 -05:00
Seth Hoenig	d2cd605995	dist: place systemd unit options correctly This PR places StartLimitIntervalSec and StartLimitBurst in the Unit section of systemd unit files, rather than the Service section. https://www.freedesktop.org/software/systemd/man/systemd.unit.html Fixes #10065	2021-02-22 19:23:00 -06:00
Drew Bailey	c152757d38	E2e/fix periodic (#10047 ) * fix periodic * update periodic to not use template nomad job inspect no longer returns an apiliststub so the required fields to query job summary are no longer there, parse cli output instead * rm tmp makefile entry * fix typo * revert makefile change	2021-02-18 12:21:53 -05:00
James Rasell	f95e45b80c	e2e: account for race condition in periodic dispatch test.	2021-02-11 11:08:48 +01:00
Seth Hoenig	7d6e81e9e4	Merge pull request #9990 from hashicorp/f-nsiso-task drivers/exec+java: Add task configuration to restore previous PID/IPC isolation behavior	2021-02-09 13:29:14 -06:00
Seth Hoenig	45e0e70a50	consul/connect: enable custom sidecars to use expose checks This PR enables jobs configured with a custom sidecar_task to make use of the `service.expose` feature for creating checks on services in the service mesh. Before we would check that sidecar_task had not been set (indicating that something other than envoy may be in use, which would not support envoy's expose feature). However Consul has not added support for anything other than envoy and probably never will, so having the restriction in place seems like an unnecessary hindrance. If Consul ever does support something other than Envoy, they will likely find a way to provide the expose feature anyway. Fixes #9854	2021-02-09 10:49:37 -06:00
Seth Hoenig	8ee9835923	drivers/exec+java: Add task configuration to restore previous PID/IPC isolation behavior This PR adds pid_mode and ipc_mode options to the exec and java task driver config options. By default these will defer to the default_pid_mode and default_ipc_mode agent plugin options created in #9969. Setting these values to "host" mode disables isolation for the task. Doing so is not recommended, but may be necessary to support legacy job configurations. Closes #9970	2021-02-08 14:26:35 -06:00
Drew Bailey	b5585882e4	address pr comments	2021-02-08 13:43:05 -05:00
Drew Bailey	b0cf3ffa54	on_update check_restart e2e	2021-02-08 10:49:25 -05:00
Drew Bailey	8507d54e3b	e2e test for on_update service checks check_restart not compatible with on_update=ignore reword caveat	2021-02-08 08:32:40 -05:00
Chris Baker	b1bb8a760e	e2e packer build: upgrade jdk to java 14	2021-02-02 17:33:48 +00:00
Mahmood Ali	45889f9f55	e2e: segment e2e clusters Ensure that the e2e clusters are isolated and never attempt to autojoin with another e2e cluster. This ensures that each cluster servers have a unique `ConsulAutoJoin`, to be used for discovery.	2021-02-01 08:04:21 -05:00
Chris Baker	ce68ee164b	Version 1.0.3 Merge tag 'v1.0.3' into post-release-1.0.3 Version 1.0.3	2021-01-29 19:30:08 +00:00
Chris Baker	2632b81124	lint some nomad HCL job specs	2021-01-28 12:03:19 +00:00
Chris Baker	2adf0f12d6	e2e: java driver isolation tests	2021-01-28 12:03:19 +00:00
Chris Baker	aa55df0413	additional e2e utils for multi-task allocs	2021-01-28 12:03:19 +00:00
Kris Hicks	d67b77f38e	Add a little comment	2021-01-28 12:03:19 +00:00
Kris Hicks	5cf972d2e7	Add test for alloc exec	2021-01-28 12:03:19 +00:00
Kris Hicks	2db8aa2a52	Add e2e test for raw exec	2021-01-28 12:03:19 +00:00
Kris Hicks	87188f04de	Add PID namespacing and e2e test	2021-01-28 12:03:19 +00:00
Mahmood Ali	c92bb342e1	e2e: skip node drain deadline/force tests	2021-01-27 08:42:16 -05:00
Mahmood Ali	b12e8912a9	e2e: use f.NoError instead of requires	2021-01-27 08:36:23 -05:00
Mahmood Ali	1ac8b32e08	e2e: Disable Connect tests The connect tests are very disruptive: restart consul/nomad agents with new tokens. The test seems particularly flaky, failing 32 times out of 73 in my sample. The tests are particularly problematic because they are disruptive and affect other tests. On failure, the nomad or consul agent on the client can get into a wedged state, so health/deployment info in subsequent tests may be wrong. In some cases, the node will be deemed as fail, and then the subsequent tests may fail when the node is deemed lost and the test allocations get migrated unexpectedly.	2021-01-26 10:01:14 -05:00
Mahmood Ali	36ce1e73eb	e2e: deflake nodedrain test The nodedrain deadline test asserts that all allocations are migrated by the deadline. However, when the deadline is short (e.g. 10s), the test may fail because of scheduler/client-propagation delays. In one failing test, it took ~15s from the RPC call to the moment the scheduler issued the migration update, and then 3 seconds for the alloc to be stopped. Here, I increase the timeouts to avoid such false positives.	2021-01-26 10:01:14 -05:00
Mahmood Ali	cf8f6f07d7	e2e: vault increase timeout Increase the timeout for vaultsecrets. As the default interval is 0.1s, 10 retries mean it only retries for one second, a very short time for some waiting scenarios in the test (e.g. starting allocs, etc).	2021-01-26 10:01:14 -05:00
Mahmood Ali	94ad40907c	e2e: prefer testutil.WaitForResultRetries Prefer testutil.WaitForResultRetries, which emits more descriptive errors on failures. `require.Eventually` fails with an opaque "Condition never satisfied" error message.	2021-01-26 10:01:14 -05:00
Mahmood Ali	f3f8f15b7b	e2e: special case "Unexpected EOF" errors This is an attempt at deflaking the e2e exec tests, and a way to improve messages. The e2e tests occasionally fail with "unexpected EOF" even though the exec output matches expectations. I suspect there is a race in handling EOF in server/http handling. Here, we special case this error and ensure we get all failures, to help debug the case better.	2021-01-26 10:01:14 -05:00
Mahmood Ali	925d9ce952	e2e: tweak failure messages Tweak the error messages for the flakiest tests, so that on test failure, we get more output	2021-01-26 09:16:48 -05:00
Mahmood Ali	6aa3dec6cc	e2e: use testify requires instead of t.Fatal testify requires offer better error message that is easier to notice when seeing a wall of text in the builds.	2021-01-26 09:14:47 -05:00
Mahmood Ali	236b4055a7	e2e: deflake consul/CheckRestart test Ensure we pass the alloc ID to status. Otherwise, the test may fail if there is another spurious allocation running from another test.	2021-01-26 09:12:20 -05:00
Mahmood Ali	0aafd9af64	e2e: Fix build script and pass shellcheck	2021-01-26 09:11:37 -05:00
Mahmood Ali	4397eda209	Merge pull request #9798 from hashicorp/e2e-terraform-tweaks-20200113 This PR makes two ergonomics changes, meant to get e2e builds more reproducible and ease changes. ### AMI Management First, we pin the server AMIs to the commits associated with the build. No more using the latest AMI a developer build in a test branch, or accidentally using a stale AMI because we forgot to build one! Packer is to tag the AMI images with the commit sha used to generate the image, and then Terraform would look up only the AMIs associated with that sha. To minimize churn, we use the SHA associated with the latest Packer configurations, rather than SHA of all. This has few benefits: reproducibility and avoiding accidental AMI changes and contamination of changes across branches. Also, the change is a stepping stone to an e2e pipeline that builds new AMIs automatically if Packer files changed. The downside is that new AMIs will be generated even for irrelevant changes (e.g. spelling, commits), but I suspect that's OK. Also, an engineer will be forced to build the AMI whenever they change Packer files while iterating on e2e scripts; this hasn't been an issue for me yet, and I'll be open for iterating on that later if it proves to be an issue. ### Config Files and Packer Second, this PR moves e2e config hcl management to Terraform instead of Packer. Currently, the config files live in `./terraform/config`, but they are baked into the servers by Packer and changes are ignored. This current behavior surprised me, as I spent a bit of time debugging why my config changes weren't applied. Having Terraform manage them would ease engineer's iteration. Also, make Packer management more consistent (Packer only works `e2e/terraform/packer`), and easing the logic for AMI change detection. The config directory is very small (100KB), and having it as an upload step adds negligible time to `terraform apply`.	2021-01-25 13:20:28 -05:00
Mahmood Ali	39da228964	update readme about profiles and packer build	2021-01-25 11:40:26 -05:00
Seth Hoenig	8b05efcf88	consul/connect: Add support for Connect terminating gateways This PR implements Nomad built-in support for running Consul Connect terminating gateways. Such a gateway can be used by services running inside the service mesh to access "legacy" services running outside the service mesh while still making use of Consul's service identity based networking and ACL policies. https://www.consul.io/docs/connect/gateways/terminating-gateway These gateways are declared as part of a task group level service definition within the connect stanza. service { connect { gateway { proxy { // envoy proxy configuration } terminating { // terminating-gateway configuration entry } } } } Currently Envoy is the only supported gateway implementation in Consul. The gateway task can be customized by configuring the connect.sidecar_task block. When the gateway.terminating field is set, Nomad will write/update the Configuration Entry into Consul on job submission. Because CEs are global in scope and there may be more than one Nomad cluster communicating with Consul, there is an assumption that any terminating gateway defined in Nomad for a particular service will be the same among Nomad clusters. Gateways require Consul 1.8.0+, checked by a node constraint. Closes #9445	2021-01-25 10:36:04 -06:00
Tim Gross	0b49e3da12	e2e: added tests for check restart behavior	2021-01-22 10:55:40 -05:00
Drew Bailey	630babb886	prevent double job status update (#9768 ) * Prevent Job Statuses from being calculated twice https://github.com/hashicorp/nomad/pull/8435 introduced atomic eval insertion with job (de-)registration. This change removes a now obsolete guard which checked if the index was equal to the job.CreateIndex, which would empty the status. Now that the job registration eval insertion is atomic with the registration this check is no longer necessary to set the job statuses correctly. * test to ensure only single job event for job register * periodic e2e * separate job update summary step * fix updatejobstability to use copy instead of modified reference of job * update envoygatewaybindaddresses copy to prevent job diff on null vs empty * set ConsulGatewayBindAddress to empty map instead of nil fix nil assertions for empty map rm unnecessary guard	2021-01-22 09:18:17 -05:00
Mahmood Ali	9dcdafe4cf	e2e: show command output on failure When a command fails, it's nice to have the full output, as it contains diagnostic information. The status code isn't sufficient for debugging.	2021-01-21 10:32:16 -05:00
Mahmood Ali	923725bf3d	e2e: deflake TestVolumeMounts After submitting an update, the test ought to wait until the new allocations are placed. Previously, we'd use the original to-be-stopped allocations and the test fails when attempting to exec.	2021-01-21 10:28:41 -05:00
Mahmood Ali	95b7fc80b8	e2e deflake namespaces: only check namespace jobs Deflake namespace e2e test by only asserting on jobs related to the namespace tests. During our e2e tests, some left over jobs (e.g. prometheus) are left running while being shutdown and cause the test to fail.	2021-01-21 10:26:24 -05:00
Mahmood Ali	2e8bcac261	e2e: deflake events Handle streamCh channel being closed.	2021-01-21 10:25:42 -05:00
Seth Hoenig	991884e715	consul/connect: Enable running multiple ingress gateways per Nomad agent Connect ingress gateway services were being registered into Consul without an explicit deterministic service ID. Consul would generate one automatically, but then Nomad would have no way to register a second gateway on the same agent as it would not supply 'proxy-id' during envoy bootstrap. Set the ServiceID for gateways, and supply 'proxy-id' when doing envoy bootstrap. Fixes #9834	2021-01-19 12:58:36 -06:00
Mahmood Ali	76ce6306a4	add helper for building ami	2021-01-15 10:49:13 -05:00
Mahmood Ali	e51651c34a	set sha	2021-01-15 10:49:13 -05:00
Mahmood Ali	82637715cf	change ami naming	2021-01-15 10:49:12 -05:00
Mahmood Ali	0af1509a77	move config files to terraform	2021-01-15 10:49:12 -05:00
Seth Hoenig	536747f216	e2e: use jobspec2 Parse for parsing jobfile in e2e utils We directly parse job files in e2eutil, but currently using jobspec package. Instead, use the Parse method from the jobspec2 package so we can parse job files with new features.	2021-01-13 14:00:40 -06:00
James Rasell	d6cab8aa14	Merge pull request #9767 from hashicorp/f-e2e-job-scaling-suite e2e: add job scaling test suite.	2021-01-11 18:35:07 +01:00
Seth Hoenig	64a8b795f2	Merge pull request #9766 from hashicorp/f-bump-cni-plugins-version cni: bump CNI plugins version to v0.9.0	2021-01-11 09:59:43 -06:00
Tim Gross	f97505e384	e2e: remove deprecated terraform syntax Also bumps patch versions of some TF modules	2021-01-11 08:25:22 -05:00
James Rasell	4374d99071	e2e: add job scaling test suite.	2021-01-11 11:34:19 +01:00
Seth Hoenig	fc5f48d936	cni: bump CNI version to v0.9.0 https://github.com/containernetworking/plugins/releases/tag/v0.9.0 Also make the copy-paste install instructions work with arm64 for a better OOTB experience (AWS Graviton, Pi 4's).	2021-01-10 18:03:27 -06:00
James Rasell	108fa33393	Merge pull request #9747 from hashicorp/f-e2e-scaling-policy-suite e2e: add ScalingPolicies test suite with initial test case.	2021-01-08 10:51:48 +01:00
James Rasell	b087d68736	e2e: add ScalingPolicies test suite with initial test case.	2021-01-07 14:39:55 +01:00
James Rasell	02b9d9da87	e2e: move namespace tests into OSS.	2021-01-07 09:15:43 +01:00
Seth Hoenig	7da808b43a	e2e: add terraform lockfile Terraform v0.14 is producing a lockfile after running `terraform init`. The docs suggest we should include this file in the git repository: > You should include this file in your version control repository so > that you can discuss potential changes to your external dependencies > via code review, just as you would discuss potential changes to your > configuration itself. Sounds similar to go.sum https://www.terraform.io/docs/configuration/dependency-lock.html#lock-file-location	2021-01-05 08:55:37 -06:00
Seth Hoenig	59f230714f	e2e: add e2e test for service registration	2021-01-05 08:48:12 -06:00
Chris Baker	57b70a27ec	modified e2e test so that it explicitly tested the use case in #6929	2021-01-04 22:25:39 +00:00
Chris Baker	9b125b8837	update template and artifact interpolation to use client-relative paths resolves #9839 resolves #6929 resolves #6910 e2e: template env interpolation path testing	2021-01-04 22:25:34 +00:00
Tim Gross	26f4ee7fb1	e2e: dnsmasq configuration fixes * systemd units require absolute paths * ensure directory exists for dnsmasq	2021-01-04 15:40:57 -05:00
Tim Gross	c4e57fb813	e2e: document some design goals	2020-12-17 10:33:33 -05:00
Tim Gross	88fc79c35e	e2e: bump default version of dev cluster	2020-12-17 10:33:33 -05:00
Tim Gross	00bc6a7d13	e2e: move dnsmasq config into dnsmasq service unit (#9660 ) Our dnsmasq configuration needs host-specific data that we can't configure in the AMI build. But configuring this in userdata leads to a race between userdata execution, docker.service startup, and dnsmasq.service startup. So rather than letting dnsmasq come up with incorrect configuration and then modifying it after the fact, do the configuration in the service's prestart, and have it kick off a Docker restart when we're done.	2020-12-17 10:33:19 -05:00
Kris Hicks	0cf9cae656	Apply some suggested fixes from staticcheck (#9598 )	2020-12-10 07:29:18 -08:00
Kris Hicks	0a3a748053	Add gosimple linter (#9590 )	2020-12-09 11:05:18 -08:00
Drew Bailey	2f5710a3b7	use concrete type helper instead of interface surfing (#9585 ) * use concrete type helper instead of interface surfing * wrap err	2020-12-09 09:02:37 -05:00
Kris Hicks	93155ba3da	Add gocritic to golangci-lint config (#9556 )	2020-12-08 12:47:04 -08:00
Seth Hoenig	ad5918f754	e2e: upgrade terraform consul to 1.9.0	2020-12-03 13:01:14 -06:00
Drew Bailey	17de8ebcb1	API: Event stream use full name instead of Eval/Alloc (#9509 ) * use full name for events use evaluation and allocation instead of short name * update api event stream package and shortnames * update docs * make sync; fix typo * backwards compat not from 1.0.0-beta event stream api changes * use api types instead of string * rm backwards compat note that only changed between prereleases * remove backwards incompat that only existed in prereleases	2020-12-03 11:48:18 -05:00
Jasmine Dahilig	6ea00284f1	lifecycle: update e2e test for service job with new docker signal #8932	2020-12-01 23:41:32 -08:00
Seth Hoenig	1b3d409eba	e2e: use test framework Assertions in connect tests	2020-11-30 08:48:40 -06:00
Seth Hoenig	546a8bfb95	e2e: add e2e test for consul connect ingress gateway demo Add the ingress gateway example from the nomad connect examples to the e2e Connect suite. Includes the ACLs enabled version, which means the nomad server consul acl policy will require operator=write permission.	2020-11-25 16:54:02 -06:00
Seth Hoenig	d850f17bc1	e2e: print consulacls scripts output as string The clean up in #8908 inadvertently caused the output from the scripts involved in the Consul ACL bootstrap process to be printed as a big blob of bytes, which is slightly less useful than the text version.	2020-11-25 15:03:33 -06:00
Tim Gross	481f91034c	E2E: CSI driver provisioning (#9443 ) * e2e/csi: wait longer for plugins to become healthy Plugins are Docker containers, and as such sometimes we get delays in startup due to pulling from the registry and this is a source of test flakiness. Give the plugins a little longer to start up. * e2e/csi: version bump for AWS EBS plugins	2020-11-25 09:05:22 -05:00
Seth Hoenig	74a34704c5	Merge pull request #8743 from hashicorp/f-task_network_warning Validate and document 0.12 mbits/network deprecations	2020-11-23 15:36:18 -06:00
Tim Gross	d686a51d60	e2e: prevent Ubuntu startup race conditions (#9428 ) The cloud-init configuration runs on boot, which can result in a race condition between that and service startup. This has caused provisioning failures because Nomad expects the userdata to have configured a host volume directory. Diagnosing this was also compounded by a warning being fired by systemd for the Nomad unit file. * Update the location of the `StartLimitIntervalSec` field to its post-systemd-230 location. * Ensure that the weekly AMI build is up-to-date to reduce the risk of unexpected system software changes. * Move the host volume to a directory we can set up at AMI build time rather than in userdata.	2020-11-23 12:29:08 -05:00
Nick Ethier	f1ea79f5a8	remove references to default mbits	2020-11-23 10:32:13 -06:00
Nick Ethier	e8784c919f	e2e: update jobs to use new network stanza format	2020-11-23 10:25:30 -06:00
Chris Baker	00841a8525	events: e2e test that API client honors the index flag	2020-11-21 16:38:24 +00:00
Michael Schurter	43b225b19d	e2e: test template path interpolation	2020-11-18 10:48:58 -08:00
Tim Gross	7e4fd79eee	e2e: CSI test should detect un-deregisterable volumes (#9343 ) Assert that deregistering a volume works without errors following a volume reap. Use CLI helpers where feasible to exercise CSI command line. Dump plugin allocation logs on deregistration failures for debugging purposes.	2020-11-13 09:31:21 -05:00
Jasmine Dahilig	d6110cbed4	lifecycle: add poststop hook (#8194 )	2020-11-12 08:01:42 -08:00
Drew Bailey	9a1fc720c8	enables audit log on full-cluster (#9315 )	2020-11-11 08:33:01 -05:00
Tim Gross	08ae13d3b9	e2e: Windows provisioning improvements (#9246 ) Small changes to the Windows 2016 Packer build for debuggability of provisioning: * improve verbosity of powershell error handling * remove unused "tools" installation * use ssh communicator for Packer to improve Packer build times and eliminate deprecated winrm remote access (unavailable from current macOS)	2020-11-09 13:29:40 -05:00
Drew Bailey	c181973265	append custom path to custom_config_files (#9289 ) * append custom path to custom_config_files * remove config_path variable	2020-11-06 11:16:13 -05:00
Tim Gross	dc8e20206d	E2E: switch packer build files to HCL2 (#9219 ) Build configuration files need comments, and JSON is also just the worst, isn't it? Upgrade our E2E packer configs to use the new HCL2 syntax.	2020-10-29 10:03:39 -04:00
Tim Gross	06c75460f3	e2e: provide precedence for version variables (#9216 ) The `nomad_sha`, `nomad_version`, and `nomad_local_binary` variables for the Nomad provisioning module assumed that only one would be set. By having them override each other with an explicit precedence, it makes it easier to avoid problems with Terraform's implicit variables behavior. Set the expected default values in the `terraform.full.tfvars` to avoid shadowing by any future changes to the `terraform.tfvars` file. Update the Makefile to put the `-var` and `-var-file` in the correct order.	2020-10-29 09:15:22 -04:00
Tim Gross	57f694ff2e	E2E: AMI software version bumps and cleanup (#9213 ) * remove unused vault installation from Windows AMI * match Windows and Linux Consul versions * bump AMI base Nomad to current stable	2020-10-29 08:27:50 -04:00
Tim Gross	a2710c7a31	e2e: set default version for dev cluster (#9208 )	2020-10-28 16:50:20 -04:00
Tim Gross	99c2a2df00	e2e: reduce risk of flaky Ubuntu AMI build (#9207 ) The base Ubuntu AMI modifies apt sources during cloud-init. But the Packer build can potentially start the setup script before that work is done, resulting in errors trying to install base system dependencies like `dnsmasq`. Delay the setup long enough to lose the race with cloud-init.	2020-10-28 15:13:44 -04:00
Tim Gross	7e4a35ad7e	e2e: use more specific names for OS/distros (#9204 ) We intend to expand the nightly E2E test to cover multiple distros and platforms. Change the naming structure for "Linux client" to the more precise "Ubuntu Bionic", and "Windows" to "Windows 2016" to make it easier to add new targets without additional refactoring.	2020-10-28 12:58:00 -04:00
Tim Gross	be3f54d296	e2e: make dev cluster the default Terraform vars file (#9202 ) Most of the time that a human is running the TF provisioning, they want the "dev cluster" which is going to deploy an OSS sha, with fewer targets and configuration alternatives. But the default `terraform.tfvars` is the nightly E2E run. Because the nightly run is automated, there's no reason we can't have it pick a non-default `terraform.full.tfvars` file and have the default be the dev cluster.	2020-10-28 10:01:42 -04:00
Tim Gross	4fe1edfd63	Revert "e2e: fix destination of templates in VaultSecrets test (#9146 )" (#9163 ) This reverts commit 8aed53c177aea024d4f24d1fbb4d6e0881f04eab.	2020-10-23 09:01:25 -04:00
Tim Gross	1fb1c9c5d4	artifact/template: make destination path absolute inside taskdir (#9149 ) Prior to Nomad 0.12.5, you could use `${NOMAD_SECRETS_DIR}/mysecret.txt` as the `artifact.destination` and `template.destination` because we would always append the destination to the task working directory. In the recent security patch we treated the `destination` absolute path as valid if it didn't escape the working directory, but this breaks backwards compatibility and interpolation of `destination` fields. This changeset partially reverts the behavior so that we always append the destination, but we also perform the escape check on that new destination after interpolation so the security hole is closed. Also, ConsulTemplate test should exercise interpolation	2020-10-22 15:47:49 -04:00
Tim Gross	344e821ace	e2e: fix destination of templates in VaultSecrets test (#9146 ) The `$NOMAD_SECRETS_DIR` environment variable is rendered as `/secrets`, which prior to the recent security patch would unintentionally escape the file sandbox and get dropped in a directory named `/secrets` where the Nomad client binary was running. The `VaultSecrets` test was accidentally relying on this behavior and that causes the test to fail.	2020-10-22 13:00:08 -04:00
Tim Gross	9fa38bac98	e2e: path fixes for local_binary uploads (#9137 ) When uploading a local binary for provisioning, the location that we pass into the provisioning script needs to be where we uploaded it to, not the source on our laptop. Also, the null_resource for uploading needs to read in the private key, not its path.	2020-10-21 10:20:22 -04:00
Drew Bailey	8451de99b2	adds two base event stream e2e tests (#9126 ) * adds two base event stream e2e tests test evaluation filter keys are included * Apply suggestions from code review Co-authored-by: Tim Gross <tgross@hashicorp.com> * gc aftereach Co-authored-by: Tim Gross <tgross@hashicorp.com>	2020-10-20 08:26:21 -04:00
Tim Gross	8fcdbe0592	e2e: add reporting to flaky spread test (#9115 ) The spread test is infrequently flaky and it's hard to extract what's actually happening. If the test fails, dump all the allocation metrics so that we can debug the behavior.	2020-10-16 11:01:07 -04:00
Tim Gross	54d7f57662	e2e: fix flaky TaskEventsTest (#9114 ) Assert that we get at least N task events, rather than exactly N. When a task within an allocation dies, a sibling task can get an Allocation Unhealthy event after it's also killed, even though it's not the origin of the event.	2020-10-16 10:22:40 -04:00
Tim Gross	e0ff06be2f	e2e: networking test job needs to outlast assert (#9113 ) The `e2ejob` utility asserts that a job is running for 5s, but with a sleep time of 5s, the networking job can race with that check. Sleeping for a longer period should guarantee that we're running long enough to pass the assert. Also constrains the job to Linux because our Windows test targets don't yet support Docker (LCOW), and expand the set of DCs we can safely land on.	2020-10-16 10:13:16 -04:00
Chris Baker	0a85d2bd24	Merge pull request #9089 from hashicorp/b-explicit-rune fix go 1.15 pickiness	2020-10-14 10:37:36 -05:00
Tim Gross	fe88003f29	e2e: eliminate race condition causing rescheduling test flake (#9085 ) The autorevert test checks for reverted allocations to be placed and running before checking the deployment status, but the deployment can be completed and marked "successful" before we check it for "running" status. Instead, just wait for it to be marked "successful" and assert we have the expected count of deployment statuses.	2020-10-14 11:35:30 -04:00
Tim Gross	76f1f5e5df	e2e: use AMI filter for Ubuntu packer image (#9086 ) Instead of hard-coding the base AMI for our Packer image for Ubuntu, use the latest from Canonical so that we always have their current kernel patches.	2020-10-14 11:22:33 -04:00
Chris Baker	d4bae840b2	fix go 1.15 pickiness	2020-10-14 15:19:54 +00:00
Nick Ethier	f5250499b9	e2e/networking: use correct dc (#9088 )	2020-10-14 11:14:09 -04:00
Tim Gross	115edb53a0	e2e: add flag to opt-in to creating EBS/EFS volumes (#9082 ) For everyday developer use, we don't need volumes for testing CSI. Providing a flag to opt-in speeds up deploying dev clusters and slightly reduces infra costs. Skip CSI test if missing volume specs.	2020-10-14 10:29:33 -04:00
Tim Gross	65282a7cf1	E2E: vault secrets (#9081 ) * rename vault API compatibility test for clarity * exercise vault secrets lease renewal	2020-10-14 08:43:28 -04:00
Nick Ethier	d45be0b5a6	client: add NetworkStatus to Allocation (#8657 )	2020-10-12 13:43:04 -04:00
Yoan Blanc	891accb89a	use allow/deny instead of the colored alternatives (#9019 ) Signed-off-by: Yoan Blanc <yoan@dosimple.ch>	2020-10-12 08:47:05 -04:00
Tim Gross	474c18102d	e2e: extend ConsulTemplate test and fix flakiness (#8997 ) Add service discovery integration to the existing consul-template E2E test, and verify both service and key updates force re-rendering. Fixes flakiness by using the longer default wait config we use elsewhere. Removes our last direct dependency on gomega.	2020-10-05 10:51:55 -04:00
Tim Gross	727277793b	e2e: bootstrap vault and provision Nomad with vault tokens (#9010 ) Provisions vault with the policies described in the Nomad Vault integration guide, and drops a configuration file for Nomad vault server configuration with its token. The vault root token is exposed to the E2E runner so that tests can write additional policies to vault.	2020-10-05 09:28:37 -04:00
Tim Gross	b6292528fe	e2e: tfvars.dev file must override default tfvars file (#9005 ) The `-var-file` flag for loading variables into Terraform overlays the default variables file if present. This means that variables that are set in the default variables file will take precedence if the overlay file does not have them set. Set `nomad_acls` and `nomad_enteprise` to `false` in the dev cluster.	2020-10-02 08:02:37 -04:00
Tim Gross	4bab91b81b	e2e: ensure tests are constrained to Linux (#8990 ) Until we have LCOW support in the E2E environment (which requires a Windows 2019 test target), we need to constrain E2E tests to the appropriate kernel	2020-09-30 09:43:30 -04:00
Tim Gross	e49410e97b	e2e: cleanup errors should use assert, not require (#8989 ) The E2E framework wraps testify's `require` so that by default we can stop tests on errors, but the cleanup functions should use `assert` so that we continue to try to cleanup the test environment even if there's a failure.	2020-09-30 09:00:37 -04:00
Tim Gross	fa1fa623f2	e2e: rework rescheduling progress deadline test (#8958 ) Eliminate sources of randomness in the progress deadline test and clarify the purpose of the test to check for progress deadline updates.	2020-09-29 11:02:16 -04:00
Tim Gross	6489c5f626	e2e: namespace support for CLI helpers (#8978 ) Required to support tests for namespaces and other ENT features.	2020-09-28 16:37:34 -04:00
Tim Gross	6bed4ec45b	e2e: ENT placeholder for namespace/quotas tests (#8973 )	2020-09-28 11:23:37 -04:00
Tim Gross	1311f32f1b	e2e: test for host volumes and Docker volumes (#8972 ) Exercises host volume and Docker volume functionality for the `exec` and `docker` task driver, particularly around mounting locations within the container and how this can be used with `template`.	2020-09-28 11:14:13 -04:00
Tim Gross	566dae7b19	e2e: add flag to bootstrap Nomad ACLs (#8961 ) Adds a `nomad_acls` flag to our Terraform stack that bootstraps Nomad ACLs via a `local-exec` provider. There's no way to set the `NOMAD_TOKEN` in the Nomad TF provider if we're bootstrapping in the same Terraform stack, so instead of using `resource.nomad_acl_token`, we also bootstrap a wide-open anonymous policy. The resulting management token is exported as an environment var with `$(terraform output environment)` and tests that want stricter ACLs will be able to write them using that token. This should also provide a basis to do similar work with Consul ACLs in the future.	2020-09-28 09:22:36 -04:00
Tim Gross	15d3f5ea7e	e2e: remove unused migrations test (#8955 ) The areas of the code this test exercised were merged in with the node drain tests.	2020-09-23 14:50:15 -04:00
Tim Gross	147b16243d	e2e: use more recent instance type (#8954 ) Newer EC2 instances are both cheaper and have generally better performance. The dnsmasq configuration had a hard-coded interface name, so in order to accommodate instances with more recent networking that result in so-called predictable interface names, the dnsmasq configuration needs to be replaced at runtime with userdata to select the default interface.	2020-09-23 14:27:52 -04:00
Tim Gross	1fc525ec1e	e2e: add flags for provisioning Nomad Enterprise (#8929 )	2020-09-23 10:39:04 -04:00
Tim Gross	9cbc604308	e2e: node drain tests (#8906 ) Exercise the `nomad node drain` features, driving them via the new CLI helpers.	2020-09-21 11:52:11 -04:00
Tim Gross	34093f7747	e2e: reschedule tests should check for non-zero rescheduled allocs (#8927 ) The conditional around some of the rescheduling tests was backwards, where we were waiting for allocations to be rescheduled but testing for a count of 0. The test was passing but flaky because if the check happened quickly enough before the scheduler rescheduled the allocations, it would pass.	2020-09-21 08:17:24 -04:00
Tim Gross	3da61545d5	make sure dev-cluster has the option to run windows config (#8928 )	2020-09-18 16:41:35 -04:00
Tim Gross	ea1f6408bf	e2e: remove unused framework provisioning code (#8908 )	2020-09-18 11:46:47 -04:00
Tim Gross	c413fa5e49	e2e: test script for Terraform logic (#8907 )	2020-09-18 11:46:40 -04:00
Tim Gross	9d37233eaf	e2e: provision cluster entirely through Terraform (#8748 ) Have Terraform run the target-specific `provision.sh`/`provision.ps1` script rather than the test runner code which needs to be customized for each distro. Use Terraform's detection of variable value changes so that we can re-run the provisioning without having to re-install Nomad on those specific hosts that need it changed. Allow the configuration "profile" (well-known directory) to be set by a Terraform variable. The default configurations are installed during Packer build time, and symlinked into the live configuration directory by the provision script. Detect changes in the file contents so that we only upload custom configuration files that have changed between Terraform runs	2020-09-18 11:27:24 -04:00
Tim Gross	990fcf7be4	e2e: documentation and minor tweaks to configs (#8912 ) * remove outdated references to envchain in documentation * add new host volume locations in userdata * don't exit the entire script during provisioning, just return	2020-09-17 09:20:18 -04:00
Tim Gross	d7a013b6f5	e2e: refactor CLI utils out of rescheduling test (#8905 ) The CLI helpers in the rescheduling test were intended for shared use, but until some other tests were written we didn't want to waste time making them generic. This changeset refactors them and adds some new helpers associated with the node drain tests (under separate PR).	2020-09-16 16:10:06 -04:00
Tim Gross	bd889c82aa	e2e: constrain rescheduling test workloads to Linux (#8872 ) The rescheduling test workloads were created before we had Windows targets in the E2E nightly run. When these were recently ported to the e2e framework they were missing the constraint to Linux machines. Also added a little extra time to polling to avoid some flakiness on the first run, and a minor readability adjustment to the job names.	2020-09-11 09:21:28 -04:00
Tim Gross	572ae37856	Merge pull request #8860 E2E: rescheduling tests	2020-09-10 13:43:55 -04:00
Tim Gross	294c7149a2	e2e: rescheduling tests Ports the rescheduling tests (which aren't running in CI) into the current test framework so that they're run on nightly, and exercises the new CLI helpers.	2020-09-10 13:00:37 -04:00
Tim Gross	28e9bbbbf4	e2e: helper for sending CLI commands and parsing output The E2E suite exercises the API, but not the CLI. This changeset adds a helper function to send commands via a locally-built Nomad binary (which we'll need to add to the E2E setup), and some helpers to parse the resulting structured outputs in a way that tests can consume.	2020-09-10 13:00:32 -04:00
Michael Schurter	5f3a71d0b9	docs: update scripts to 0.12.4	2020-09-09 15:22:37 -07:00
James Rasell	76b03d3a2f	e2e: fix failure in running metrics test suite jobs. When running the Fabio and Prometheus jobs for the metrics suite it seems the outer directory is required in the call when registering the job. error: "e2e/input/fabio.nomad: no such file or directory"	2020-09-09 08:40:35 +02:00
Tim Gross	f499b44101	e2e: move setup jobs for metrics test into that suite (#8842 ) The fabio and prometheus workloads are specific to the metrics test and aren't used by any other test suite.	2020-09-08 13:21:44 -04:00
Tim Gross	a47b1c1081	e2e: move configurations into profile-specific directories (#8828 ) This changeset stages upcoming E2E provisioning improvements work. It splits the existing shared configuration directory into 3 profiles: * "full-cluster": the set of configurations currently in use * "dev-cluster": a simplified set of mostly existing configurations that weren't in use. * "custom": an empty profile for developers to keep non-standard configurations during complex feature development. The tooling to switch between profiles will be in a later changeset. Also drops some unused configuration knobs from the provisioning scripts to make the next stage of work easier.	2020-09-04 11:23:32 -04:00
Tim Gross	93c1093274	e2e: remove unused EBS volumes and depends_on (#8827 ) Our provisioning process for E2E doesn't require the `depends_on` fields to be set for client instances, so dropping that field allows all instances to be started in parallel. We don't use the extra EBS volumes (they aren't even mounted), so remove them to reduce costs.	2020-09-04 10:25:59 -04:00
Tim Gross	0577b03479	e2e: minor rename and cleanup (#8824 )	2020-09-04 08:51:22 -04:00
Tim Gross	e6cdd8e0c0	e2e: consolidate cloud-specific Consul configs (#8823 ) The `-recursor` flag in the Consul service unit files is specific to a given cloud, but we already have cloud-specific configuration files. Consolidate all the cloud-specific items into the config.	2020-09-04 08:51:15 -04:00

535 commits