open-nomad

Commit Graph

Author	SHA1	Message	Date
Seth Hoenig	9b20ca5b25	e2e: setup consul ACLs a little more correctly	2020-01-31 19:06:11 -06:00
Seth Hoenig	83c717a624	e2e: remove redundant extra API call for getting allocs	2020-01-31 19:06:07 -06:00
Seth Hoenig	b212654b92	e2e: agent token was only being set for server0	2020-01-31 19:06:03 -06:00
Seth Hoenig	f7a1e9cee3	e2e: use hclfmt on consul acls policy config files	2020-01-31 19:05:59 -06:00
Seth Hoenig	e9e0d2e3fc	e2e: uncomment test case that is not broken	2020-01-31 19:05:55 -06:00
Seth Hoenig	df633ee45f	e2e: do not use eventually when waiting for allocs This test is causing panics. Unlike the other similar tests, this one is using require.Eventually which is doing something bad, and this change replaces it with a for-loop like the other tests. Failure: === RUN TestE2E/Connect === RUN TestE2E/Connect/connect.ConnectE2ETest === RUN TestE2E/Connect/connect.ConnectE2ETest/TestConnectDemo === RUN TestE2E/Connect/connect.ConnectE2ETest/TestMultiServiceConnect === RUN TestE2E/Connect/connect.ConnectClientStateE2ETest panic: Fail in goroutine after TestE2E/Connect/connect.ConnectE2ETest has completed goroutine 38 [running]: testing.(common).Fail(0xc000656500) /opt/google/go/src/testing/testing.go:565 +0x11e testing.(common).Fail(0xc000656100) /opt/google/go/src/testing/testing.go:559 +0x96 testing.(common).FailNow(0xc000656100) /opt/google/go/src/testing/testing.go:587 +0x2b testing.(common).Fatalf(0xc000656100, 0x1512f90, 0x10, 0xc000675f88, 0x1, 0x1) /opt/google/go/src/testing/testing.go:672 +0x91 github.com/hashicorp/nomad/e2e/connect.(ConnectE2ETest).TestMultiServiceConnect.func1(0x0) /home/shoenig/go/src/github.com/hashicorp/nomad/e2e/connect/multi_service.go:72 +0x296 github.com/hashicorp/nomad/vendor/github.com/stretchr/testify/assert.Eventually.func1(0xc0004962a0, 0xc0002338f0) /home/shoenig/go/src/github.com/hashicorp/nomad/vendor/github.com/stretchr/testify/assert/assertions.go:1494 +0x27 created by github.com/hashicorp/nomad/vendor/github.com/stretchr/testify/assert.Eventually /home/shoenig/go/src/github.com/hashicorp/nomad/vendor/github.com/stretchr/testify/assert/assertions.go:1493 +0x272 FAIL github.com/hashicorp/nomad/e2e 21.427s	2020-01-31 19:05:47 -06:00
Seth Hoenig	5e5fadbcdf	e2e: remove forgotten unused field from new struct	2020-01-31 19:05:41 -06:00
Seth Hoenig	fc498c2b96	e2e: e2e test for connect with consul acls Provide script for managing Consul ACLs on a TF provisioned cluster for e2e testing. Script can be used to 'enable' or 'disable' Consul ACLs, and automatically takes care of the bootstrapping process if necessary. The bootstrapping process takes a long time, so we may need to extend the overall e2e timeout (20 minutes seems fine). Introduces basic tests for Consul Connect with ACLs.	2020-01-31 19:05:36 -06:00
Seth Hoenig	93d347442f	e2e: add a -suite flag to e2e.Framework This change allows for providing the -suite=<Name> flag when running the e2e framework. If set, only the matching e2e/Framework.TestSuite.Component will be run, and all other suites will be skipped.	2020-01-29 14:57:43 -06:00
Drew Bailey	da4af9bef3	fix tests, update changelog	2020-01-29 13:55:39 -05:00
Tim Gross	7681f09ae4	e2e: packer builds should not be public (#6998 )	2020-01-27 16:28:25 -05:00
Michael Schurter	ed926a9d03	Merge pull request #6938 from hashicorp/e2e-vault test: download Vault binaries for e2e test	2020-01-27 10:26:48 -08:00
Tim Gross	457e3ad5c6	e2e: document e2e provisioning process (#6976 )	2020-01-22 16:55:17 -05:00
Tim Gross	29e1ed6b05	e2e: ensure group script check tests interpolation (#6972 ) Fixes a bug introduced in 0aa58b9 where we're writing a test file to a taskdir-interpolated location, which works when we `alloc exec` but not in the jobspec for a group script check. This changeset also makes the test safe to run multiple times by namespacing the file with the alloc ID, which has the added bonus of exercising our alloc interpolation code for group script checks.	2020-01-22 09:54:54 -05:00
Tim Gross	2edbdfc8be	e2e: update framework to allow deploying Nomad (#6969 ) The e2e framework instantiates clients for Nomad/Consul but the provisioning of the actual Nomad cluster is left to Terraform. The Terraform provisioning process uses `remote-exec` to deploy specific versions of Nomad so that we don't have to bake an AMI every time we want to test a new version. But Terraform treats the resulting instances as immutable, so we can't use the same tooling to update the version of Nomad in-place. This is a prerequisite for upgrade testing. This changeset extends the e2e framework to provide the option of deploying Nomad (and, in the future, Consul/Vault) with specific versions to running infrastructure. This initial implementation is focused on deploying to a single cluster via `ssh` (because that's our current need), but provides interfaces to hook the test run at the start of the run, the start of each suite, or the start of a given test case. Terraform work includes: * provides Terraform output that is written to JSON and used by the framework to configure provisioning via `terraform output provisioning`. * provides Terraform output that can be used by test operators to configure their shell via `$(terraform output environment)` * drops `remote-exec` provisioning steps from Terraform * makes changes to the deployment scripts to ensure they can be run multiple times w/ different versions against the same host.	2020-01-22 08:48:52 -05:00
Tim Gross	d6aac915a7	e2e: use valid jobspec for group check test (#6967 ) Group service checks cannot interpolate task fields, because the task fields are not available at the time the script check hook is created for the group service. When f31482a was merged this e2e test began failing because we are now correctly matching the script check ID to the service ID, which revealed this jobspec was invalid.	2020-01-21 15:54:46 -05:00
Tim Gross	1e600d573d	e2e: improve reusability of provisioning scripts (#6942 ) This changeset is part of the work to improve our E2E provisioning process to allow our upgrade tests: * Move more of the setup into the AMI image creation so it's a little more obvious to provisioning config authors which bits are essential to deploying a specific version of Nomad. * Make the service file update do a systemd daemon-reload so that we can update an already-running cluster with the same script we use to deploy it initially.	2020-01-16 09:29:36 -05:00
Michael Schurter	ffbfb60f40	test: restore e2e-test target and use -integration	2020-01-14 13:47:51 -08:00
Michael Schurter	da4645e9a4	test: download Vault binaries for e2e test Modernize Vault integration/e2e test a bit: - Download from releases.hashicorp.com instead of using a hardcoded list - Remove old unused make target e2e-test - Use NOMAD_E2E env var instead of -integration flag - Add a README On my machine with ~250 Mbps internet it takes ~400s to download all Vault binaries.	2020-01-14 11:02:02 -08:00
Nick Ethier	1f28633954	Merge pull request #6816 from hashicorp/b-multiple-envoy connect: configure envoy to support multiple sidecars in the same alloc	2020-01-09 23:25:39 -05:00
Tim Gross	b5bcfb533b	upgrade CNI plugins to 0.8.4 (#6921 ) When multiple Connect-enabled task groups start on the same client node, a race condition in the CNI plugins for creating iptables chains causes one of the tasks to fail. We upstreamed a patch to CNI plugins to make iptables chain creation idempotent. This changeset updates end-to-end testing, development tooling, and documentation to use 0.8.4 which includes our patch.	2020-01-09 10:57:07 -05:00
Tim Gross	c11cc60674	commit a hclfmt to eliminate diffs after 'make dev'	2020-01-09 08:18:51 -05:00
Nick Ethier	7b931522f0	e2e: add test for multiple service sidecars in the same alloc	2020-01-06 12:48:35 -05:00
Tim Gross	4ba5691656	e2e: give metrics longer to settle (#6884 ) Increase the shortened timeout after the first loop so that metrics that take longer to come in aren't failing the test unnecessarily. Move the check for empty alloc metrics into the loop so that if the first values we get are empty we don't fail the test too early.	2019-12-20 10:39:35 -05:00
Tim Gross	9b2b4da3a4	e2e: run client/allocs metrics nightly tests vs Windows (#6850 ) Adds Windows targets to the client/allocs metrics tests. Removes the `allocstats` test, which covers less than these tests and is now redundant. Adds a firewall rule to our Windows instances so that the prometheus server can scrape the Nomad HTTP API for metrics.	2019-12-16 08:34:17 -05:00
Tim Gross	e439e927ed	e2e: run client/allocs metrics tests nightly (#6842 ) Refactor the metrics end-to-end tests so they can be run with our e2e test framework. Runs fabio/prometheus and a collection of jobs that will cause metrics to be measured. We then query Prometheus to ensure we're publishing those allocation metrics and some metrics from the clients as well. Includes adding a placeholder for running the same tests on Windows.	2019-12-12 12:45:16 -05:00
Seth Hoenig	d81a091ccd	Merge pull request #6752 from hashicorp/docs-vault-token_period docs: vault integration docs should reference new token_period field	2019-12-02 16:21:17 -05:00
Seth Hoenig	953e40c8ed	docs: vault integration docs should reference new token_explicit_max_ttl field	2019-12-02 14:22:47 -06:00
Tim Gross	88cb95261b	e2e: add allocstats test for Windows (#6775 ) Extends the BasicAllocStats test to include a test for Windows clients, exercising stats via a powershell `raw_exec` job. Adds `ListLinuxClientNodes` and `ListWindowsClientNodes` utils so that we can scope tests to run only when Linux or Windows clients are available. This prevents waiting on timeouts when running a subset of the tests against a development cluster (vs our nightly test cluster).	2019-11-26 08:05:42 -05:00
Mahmood Ali	e626a145c6	Merge pull request #6713 from alrs/fix-e2e-cli-close-before-error e2e/cli/command: close after error handling	2019-11-25 14:03:25 -05:00
Lars Lehtonen	c9383ca17d	e2e/cli/command: Wait() after execution	2019-11-25 10:56:40 -08:00
Tim Gross	c9d92f845f	e2e: add a Windows client to test runner (#6735 ) * Adds a constraint to prevent tests from landing on Windows * Improve Terraform output for mixed windows/linux clients * Makes some Windows client config fixes from 0.10.2 testing	2019-11-25 13:31:00 -05:00
Tim Gross	e012c2b5bf	Infrastructure for Windows e2e testing (#6584 ) Includes: * baseline Windows AMI * initial pass at Terraform configurations * OpenSSH for Windows Using OpenSSH is a lot nicer for Nomad developers than winrm would be, plus it lets us avoid passing around the Windows password in the clear. Note that now we're copying up all the provisioning scripts and configs as a zipped bundle because TF's file provisioner dies in the middle of pushing up multiple files (whereas `scp -r` works fine). We're also running all the provisioning scripts inside the userdata by polling for the zip file to show up (gross!). This is because `remote-exec` provisioners are failing on Windows with the same symptoms as: https://github.com/hashicorp/terraform/issues/17728 If we can't fix this, it'll prevent us from having multiple Windows clients running until TF supports count interpolation in the `template_file`, which is planned for a later 0.12 release.	2019-11-19 11:06:10 -05:00
Tim Gross	1210261fe2	hclfmt nomad jobspecs (#6724 )	2019-11-19 10:36:41 -05:00
Drew Bailey	2befab6900	Merge pull request #6573 from hashicorp/update-cci-consul updates default consul version to 1.6.1	2019-11-07 11:01:22 -05:00
Drew Bailey	1c2af019c6	update vagrant & packer consul versions	2019-11-07 10:13:14 -05:00
Drew Bailey	786989dbe3	New monitor pkg for shared monitor functionality Adds new package that can be used by client and server RPC endpoints to facilitate monitoring based off of a logger clean up old code small comment about write rm old comment about minsize rename to Monitor Removes connection logic from monitor command Keep connection logic in endpoints, use a channel to send results from monitoring use new multisink logger and interfaces small test for dropped messages update go-hclogger and update sink/intercept logger interfaces	2019-11-05 09:51:49 -05:00
Tim Gross	3e9ae481ce	e2e: refactor Consul configurations (#6559 ) Ensure that we're reusing the base configuration between client and servers without the possibility of drift. Reduce the amount of `sed` mangling of the configuration file, and make recommended changes from `shellcheck` for this section of the provisioning script. Fixes some rebase errors on the Nomad config as well.	2019-10-28 09:27:40 -04:00
Tim Gross	ba7e7413ef	e2e: refactor Nomad configuration (#6560 ) Share base configuration for telemetry and consul. Have the server configurations respect the `var.server_count` config. Make changes recommended by `shellcheck` in the provisioning scripts for this section. Switch to OS/arch-tagged release bundles on S3 for compatibility with adding Windows builds in the near future.	2019-10-28 08:21:02 -04:00
Tim Gross	8be403f47b	e2e: refactor Vault configuration (#6561 ) Match the configuration directory layout we're using for Consul and other services. Make recommended changes from `shellcheck` for this section of the provisioning script.	2019-10-25 15:29:01 -04:00
Tim Gross	87b3abddd3	e2e: use sockaddr for IP address configuration (#6548 ) Update the Consul and Vault configs to take advantage of their included `go-sockaddr` library for getting the IP addresses we need in a portable way. This particularly avoids problems with "predictable" interface names provided by systemd. Also adds the `sockaddr` binary to the Packer build so we can use it in our provisioning scripts.	2019-10-25 14:08:38 -04:00
Tim Gross	efbd680d4e	e2e: split Packer build scripts from TF provisioning (#6542 ) Make a clear split between Packer and Terraform provisioning steps: the scripts in the `packer/linux` directory are run when we build the AMI whereas the stuff in shared are run at Terraform provisioning time. Merging all runtime provisioning scripts into a single script for each of server/client solves the following: * Userdata scripts can't take arguments, they can only be templated and that means we have to do TF escaping in bash/powershell scripts. * TF provisioning scripts race with userdata scripts.	2019-10-25 08:08:24 -04:00
Tim Gross	c648c4f998	e2e: upgrade terraform to 0.12.x (#6489 )	2019-10-14 11:27:08 -04:00
Tim Gross	15e912ddd6	e2e: move remote-exec inline to script (#6488 ) A failing script in a `remote-exec` provisioner's `inline` stanza won't fail the provisioning step. This lets us continue on to execute tests against potentially broken deployments, rather than letting us know the provisioning itself failed.	2019-10-14 10:23:41 -04:00
Danielle Lancashire	199d24d6bf	chore: initial hclfmt	2019-10-11 14:00:05 +02:00
Lang Martin	0648402150	Merge pull request #6373 from hashicorp/b-raft-proto-upgrade raft protocol defaults to version 2	2019-09-26 14:33:09 -04:00
Tim Gross	d965a15490	driver/networking: don't recreate existing network namespaces	2019-09-25 14:58:17 -04:00
Tim Gross	e86a476bbb	failing test for #6310	2019-09-25 14:58:17 -04:00
Lang Martin	6e0ec6302b	script e2e/upgrades: cluster upgrade scripts	2019-09-24 14:35:45 -04:00
Danielle	940bbcc639	Merge pull request #6342 from hashicorp/f-host-volume-e2e Add Host Volumes E2E test	2019-09-18 12:59:32 -07:00


212 Commits