open-nomad

Author	SHA1	Message	Date
Michael Schurter	ed3218c3dd	Fixing flaky TestOverlap test (#14780 ) * test: ensure feasible node selected in overlap test * test: warn when getting close to retry limit	2022-10-03 14:35:02 -07:00
Seth Hoenig	7235d9988b	e2e: convert chroot env unit tests into e2e tests (#14710 ) This PR translates two of our most flaky unit tests into e2e tests where they fit much more naturally.	2022-09-26 15:40:29 -05:00
Michael Schurter	6161b417f3	test: add e2e for non-overlapping placements (#14646 ) * test: add e2e for non-overlapping placements Followup to #10446 Fails (as expected) against 1.3.x at the wait for blocked eval (because the allocs are allowed to overlap). Passes against 1.4.0-beta.1 (as expected). * Update e2e/overlap/overlap_test.go Co-authored-by: James Rasell <jrasell@users.noreply.github.com>	2022-09-22 13:06:17 -07:00
Seth Hoenig	2088ca3345	cleanup more helper updates (#14638 ) * cleanup: refactor MapStringStringSliceValueSet to be cleaner * cleanup: replace SliceStringToSet with actual set * cleanup: replace SliceStringSubset with real set * cleanup: replace SliceStringContains with slices.Contains * cleanup: remove unused function SliceStringHasPrefix * cleanup: fixup StringHasPrefixInSlice doc string * cleanup: refactor SliceSetDisjoint to use real set * cleanup: replace CompareSliceSetString with SliceSetEq * cleanup: replace CompareMapStringString with maps.Equal * cleanup: replace CopyMapStringString with CopyMap * cleanup: replace CopyMapStringInterface with CopyMap * cleanup: fixup more CopyMapStringString and CopyMapStringInt * cleanup: replace CopySliceString with slices.Clone * cleanup: remove unused CopySliceInt * cleanup: refactor CopyMapStringSliceString to be generic as CopyMapOfSlice * cleanup: replace CopyMap with maps.Clone * cleanup: run go mod tidy	2022-09-21 14:53:25 -05:00
James Rasell	3f78a51fa5	e2e: use unique names for Connect ACL Consul policy names. (#14604 ) In the event a single test fails to clear up properly after itself, all other tests will fail as they attempt to create ACL policies with the same names. This change ensures they use unique ACL names, so when a single test fails, it is easy to identify it is a problem with the test rather than the suite.	2022-09-16 13:35:40 +02:00
James Rasell	90d0b9157f	e2e: rewrite spread suite to use new e2e style. (#14598 ) The rewrite refactors the suite to use the new style along with other recent testing improvements. In order to ensure the spread tests do not impact each other, there is new cleanup functionality to ensure both the job and allocations are removed from state before the test exits completely.	2022-09-15 17:12:20 +02:00
James Rasell	d65267c60c	e2e: do not assume clean cluster when checking return objects. (#14557 )	2022-09-13 14:25:19 +02:00
James Rasell	1f877bac1c	acl: fix encoding expiration time in ACL token list API. (#14542 )	2022-09-12 15:50:35 +02:00
James Rasell	6f790769bb	e2e: fixup service discovery and ACL expiration tests. (#14517 ) The NSD checks tests were racy, whereby the check may not have been triggered by the time it was queried. This change wraps the check so it can account for this. This removes the current ACL expiration GC section in order to get the tests passing and allow more time to investigate the test. I have full confidence the feature is working as expected and have tested extensively locally.	2022-09-09 14:27:40 +02:00
James Rasell	d14d6e051a	e2e: fixup token expiration test to account for longer forced GC. (#14491 )	2022-09-08 14:43:04 +02:00
James Rasell	e24de517fa	e2e: add test to exercise ACL tokens with role and policy links. (#14432 )	2022-09-02 08:56:00 +02:00
James Rasell	5d0cc93939	e2e: add acl test for token expiration. (#14418 ) In order to add an E2E test to cover token expiration, the server config has been updated to include a low minimum allowed TTL value. For ease of reading, the max value is also set.	2022-09-01 09:36:09 +02:00
James Rasell	5f3665230b	e2e: add ACL test suite with ACL Role test. (#14398 ) This adds a new ACL test suite to the e2e framework which includes an initial test for ACL roles. The ACL test includes a helper to track and clean created Nomad resources which keeps the test cluster clean no matter if the test fails early or not.	2022-08-31 10:11:28 +02:00
Seth Hoenig	38727b6ab9	e2e: add e2e tests for nomad service disco checks This PR adds 2 e2e tests for ensuring nomad service discovery checks get created and produce status results as expected.	2022-08-22 15:31:13 -05:00
Piotr Kazmierczak	b63944b5c1	cleanup: replace TypeToPtr helper methods with pointer.Of (#14151 ) Bumping compile time requirement to go 1.18 allows us to simplify our pointer helper methods.	2022-08-17 18:26:34 +02:00
Seth Hoenig	b3ea68948b	build: run gofmt on all go source files Go 1.19 will forcefully format all your doc strings. To get this out of the way, here is one big commit with all the changes gofmt wants to make.	2022-08-16 11:14:11 -05:00
Tim Gross	6c080e0b10	e2e: move namespaces test out of legacy framework (#13934 ) This PR continues work we've started on other test suites to use the native golang test runner instead of the custom framework.	2022-08-01 13:24:34 -04:00
Seth Hoenig	634d84edec	e2e: add nsd simple load balancing test	2022-07-14 15:07:19 -05:00
James Rasell	17a467020c	e2e: add terraform init commands to readme doc. (#13655 )	2022-07-08 16:52:35 +02:00
James Rasell	181b247384	core: allow pausing and un-pausing of leader broker routine (#13045 ) * core: allow pause/un-pause of eval broker on region leader. * agent: add ability to pause eval broker via scheduler config. * cli: add operator scheduler commands to interact with config. * api: add ability to pause eval broker via scheduler config * e2e: add operator scheduler test for eval broker pause. * docs: include new operator scheduler CLI and pause eval API info.	2022-07-06 16:13:48 +02:00
Derek Strickland	34dea90d7a	docker: update images to reference hashicorpdev Docker organization (#12903 ) docker: update images to reference hashicorpdev dockerhub organization generate job_init.bindata_assetfs.go Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>	2022-06-08 15:06:00 -04:00
James Rasell	c3c10d8c10	e2e: use longer wait in template update triggers to avoid flake. (#13271 )	2022-06-07 14:49:03 +02:00
Tim Gross	cc4a1f2ec4	e2e: upgrade playwright package and container image (#13080 ) The nightly playwright tests are currently failing because of a mismatch between the expected version of Chromium and what's in the container image. Unfortunately the previous specific tag we were using for the container image is no longer tagged on the registry. With some testing, I was able to find an image tag that results in a good run.	2022-05-20 08:41:07 -04:00
Derek Strickland	90daed7c1d	e2e: Wait for deployment to finish before disconnect (#12795 ) * Wait for deployment to finish * Don't reschedule disconnect or restart-node jobs	2022-04-27 12:27:03 -04:00
Tim Gross	c763c4cb96	remove pre-0.9 driver code and related E2E test (#12791 ) This test exercises upgrades between 0.8 and Nomad versions greater than 0.9. We have not supported 0.8.x in a very long time and in any case the test has been marked to skip because the downloader doesn't work.	2022-04-27 09:53:37 -04:00
Tim Gross	cfd353207f	E2E: move volume mounts test to use golang's stdlib test runner (#12788 ) Part of ongoing work to remove the old E2E framework code.	2022-04-26 14:28:20 -04:00
Tim Gross	83eb879d61	E2E: remove old CLI for driving provisioning (#12787 ) We moved off the old provisioning process for nightly E2E to one driven entirely by Terraform quite a while back now. We're in the slow process of removing the framework code for this test-by-test, but this chunk of code no longer has any callers.	2022-04-26 13:43:25 -04:00
Tim Gross	f7d6841dd2	E2E: remove platform specific realpath code from UI run script (#12750 ) We don't need the absolute path for any of the commands in this script so long as we `cd` into the source directory path. Doing this removes the need for weird platform-specific tricks we have to do with realpath vs GNU realpath.	2022-04-22 10:10:18 -04:00
Tim Gross	7dd3910e51	E2E: fix debug logging on disconnected clients test (#12621 )	2022-04-22 09:07:05 -04:00
Tim Gross	d200a66509	E2E: make UIs runnable from any working directory (#12739 ) The E2E test runner is running from the root of the Nomad repository. Make this run independent of the working directory for convenience of developers and the test runner.	2022-04-21 17:00:01 -04:00
Tim Gross	dc013b5267	E2E: set longer timeout for CSI plugin alloc start (#12732 ) The CSI plugin allocations take a while to be marked healthy, sometimes causing E2E test flakes during the setup phase of the tests. There's nothing CSI specific about marking plugin allocs healthy, as the plugin supervisor hook does all the fingerprinting in the postrun hook (the prestart hook just makes a couple of empty directories). The timeouts we're seeing may be because of where we're pulling the images from; most of our jobs pull from a CDN-backed public registry whereas these are pulling from ECR. Set a 1min timeout for these to make sure we have enough time to pull the image and start the task.	2022-04-21 11:11:43 -04:00
Tim Gross	2ad9f6bc5f	E2E: playwright configuration and smoke test (#12721 ) Scripts for running playwright tests in a Docker container that has chromium and webkit preinstalled. Includes a basic smoke test for authentication so that we can be sure the test rig is working end-to-end. Wiring this up in CI will be in an upcoming PR.	2022-04-21 09:13:10 -04:00
Tim Gross	c4d92205b4	E2E: provide options for reverse proxy for web UI (#12671 ) Our E2E test environment is deployed with mTLS, but it's impractical for us to use mTLS in headless browsers for automated testing (or even in manual testing). Provide certificates for proxying the web UI via Nginx. This proxy uses client certs for proxying to the HTTP endpoint and a self-signed cert for the browser-facing endpoint. We can accept certificate errors in the automated tests we'll be adding in the next step of this work.	2022-04-19 16:55:05 -04:00
Tim Gross	70c262eb95	E2E: terraform provisioner upgrades (#12652 ) While working on infrastructure for testing the UI in E2E, we needed to upgrade the certificate provider. Performing a provider upgrade via the TF `init -upgrade` brought in updates for the file and AWS providers as well. These updates include deprecating the use of `sensitive_content` fields, removing CA algorithm parameters that can be inferred from keys, and removing the requirement to manually specify AWS assume role parameters in the provider config if they're available in the calling environment's AWS config file (as they are via doormat or our E2E environment).	2022-04-19 14:27:14 -04:00
Tim Gross	d62dd5b3fe	E2E: add debugging outputs for disconnected clients test (#12572 ) This test has a failure that's happening only occasionally and not very reproducibly. Print out the allocation status on test failure so that we can do some post-mortem debugging of the test on nightly.	2022-04-14 17:03:57 -04:00
Derek Strickland	3f871973f9	Update E2E terraform output command (#12561 )	2022-04-13 16:46:09 -04:00
Tim Gross	4078e6ea0e	scripts: fix interpreter for bash (#12549 ) Many of our scripts have a non-portable interpreter line for bash and use bash-specific variables like `BASH_SOURCE`. Update the interpreter line to be portable between various Linuxes and macOS without complaint from posix shell users.	2022-04-12 10:08:21 -04:00
Tim Gross	31e72e93ff	E2E: fix flaky event stream test (#12548 ) This changeset fixes two sources of flakiness in the event stream test. First, the stream request gets the event closest to the index, not the exact match. Although events are written before raft entries they're written asynchronously, so it's possible to race and get a raft index from this query higher than the current head of the event buffer. Ensure the job is running before we try to get the index, so that we've given the event enough time to land in the buffer. Second, the assertion that the found index is greater than the start index is only true if the `PlanResult` event manages to land before we do the second registration. Although it should now with the first fix above, it's not a correct assertion for what we're testing.	2022-04-12 08:35:39 -04:00
Tim Gross	77ab8d92f1	E2E: oversubscription assertion needs to wait for stats (#12540 ) The oversubscription test expects an output that requires the client has polled the task for stats at least once. Wait long enough to ensure that we've polled the stats before failing the test.	2022-04-11 11:40:51 -04:00
Tim Gross	c9c3cbd878	E2E: test for nodes disconnected by netsplit (#12407 )	2022-04-11 11:34:27 -04:00
James Rasell	bc800a18d1	e2e: add initial service discovery tests. (#12512 ) Some tests may choose to deregister jobs to check Nomad cleanup logic, however, it is still possible for the test to fail and exit before this is hit. This therefore adds a cancellable cleanup func which can be deferred, using context to control whether it gets run or not.	2022-04-11 11:12:24 +02:00
James Rasell	dbf28a06c1	e2e: fix eventual consistency failure within consultemplate suite. (#12494 )	2022-04-07 17:03:10 +02:00
James Rasell	431c153cd9	client: add Nomad template service functionality to runner. (#12458 ) This change modifies the template task runner to utilise the new consul-template which includes Nomad service lookup template funcs. In order to provide security and auth to consul-template, we use a custom HTTP dialer which is passed to consul-template when setting up the runner. This method follows Vault implementation. Co-authored-by: Michael Schurter <mschurter@hashicorp.com>	2022-04-06 19:17:05 +02:00
Seth Hoenig	3ce4f52740	Merge pull request #12446 from shoenig/no-pkg-err cleanup: purge github.com/pkg/errors	2022-04-04 09:22:44 -05:00
Tim Gross	806a82dd0c	E2E: ensure that CSI EBS tests are isolated from each other (#12443 ) Tear down the volume-consuming job between subtests, rather than after all the tests are complete. For good measure, use a different ID for the volume-consuming job as well.	2022-04-04 09:44:55 -04:00
Seth Hoenig	9670adb6c6	cleanup: purge github.com/pkg/errors	2022-04-01 19:24:02 -05:00
Tim Gross	3030f954a2	E2E disconnected clients test refactor (#12402 ) * Wait longer for node to go down in disconnected clients test. The existing helper only waits 10s, but there's a jitter on heartbeats that we need to account for. Wait for 30s for node to go down to give us plenty of room * Port disconnected clients to stdlib-style test	2022-03-30 09:12:44 -04:00
Tim Gross	19703e3316	E2E: test exercising node drain behavior for CSI volumes (#12384 )	2022-03-29 11:19:23 -04:00
Tim Gross	5c7f2bad0b	E2E: namespace HCP vault and consul policies to avoid collisions (#12386 ) Concurrent E2E runs can collide when provisioning policies on HCP Consul and HCP Vault. Namespace these by the test run name, as we do for most everything else.	2022-03-25 16:05:59 -04:00
Tim Gross	3c15236fd5	E2E: move example test to use golang's stdlib test runner (#12383 ) Our E2E "framework" has a bunch of features around test discovery and standing up infra that were never completed or fully used, and we ended up building out a large test suite that ignored all that in lieu of Terraform-provided infrastructure for the last couple years. This changeset is a proposal (and demonstration) for gradually migrating our E2E tests off the framework code so that developers can write fairly ordinary golang stdlib testing tests.	2022-03-25 14:44:16 -04:00
Tim Gross	67b87e46f1	e2e: test for allocations replacement on disconnected clients (#12375 ) This test exercises the behavior of clients that become disconnected and have their allocations replaced. Future test cases will exercise the `max_client_disconnect` field on the job spec.	2022-03-25 12:26:43 -04:00
Tim Gross	e687a21da9	CSI: set plugin `CSI_ENDPOINT` env var only if unset by user (#12257 ) * Use unix:// prefix for CSI_ENDPOINT variable by default * Some plugins have strict validation over the format of the `CSI_ENDPOINT` variable, and unfortunately not all plugins agree. Allow the user to override the `CSI_ENDPOINT` to workaround those cases. * Update all demos and tests with CSI_ENDPOINT	2022-03-21 11:48:47 -04:00
Tim Gross	bd403f2f88	E2E: ensure `ConnectACLsE2ETest` has clean state before starting (#12334 ) The `ConnectACLsE2ETest` checks that the SI tokens have been properly cleaned up between tests, but following the change to use HCP the previous `Connect` test suite will often have SI tokens that haven't been cleaned up by the time this test suite runs. Wait for the SI tokens to be cleaned up at the start of the test to ensure we have a clean state.	2022-03-21 11:05:02 -04:00
Tim Gross	9f05d62338	E2E with HCP Consul/Vault (#12267 ) Use HCP Consul and HCP Vault for the Consul and Vault clusters used in E2E testing. This has the following benefits: * Without the need to support mTLS bootstrapping for Consul and Vault, we can simplify the mTLS configuration by leaning on Terraform instead of janky bash shell scripting. * Vault bootstrapping is no longer required, so we can eliminate even more janky shell scripting * Our E2E exercises HCP, which is important to us as an organization * With the reduction in configurability, we can simplify the Terraform configuration and drop the complicated `provision.sh`/`provision.ps1` scripts we were using previously. We can template Nomad configuration files and upload them with the `file` provisioner. * Packer builds for Linux and Windows become much simpler. tl;dr way less janky shell scripting!	2022-03-18 09:27:28 -04:00
Seth Hoenig	373d8f7241	ci: missing import for nomad09upgrade	2022-03-17 08:49:15 -05:00
Seth Hoenig	f87eb666c7	e2e: have e2e use ci.Parallel This is a followup to having tests run in serial in CI. The e2e package isn't in CI, but lets use the helper anyway so we can setup semgrep rules covering the entire repository.	2022-03-17 08:37:34 -05:00
Tim Gross	b94837a2b8	csi: add pagination args to `volume snapshot list` (#12193 ) The snapshot list API supports pagination as part of the CSI specification, but we didn't have it plumbed through to the command line.	2022-03-07 12:19:28 -05:00
Tim Gross	09a7612150	csi: volume snapshot list plugin option is required (#12197 ) The RPC for listing volume snapshots requires a plugin ID. Update the `volume snapshot list` command to find the specific plugin from the provided prefix.	2022-03-07 09:58:29 -05:00
Tim Gross	a07386c507	e2e: use context for executing external commands (#12185 ) If any E2E test hangs, it'll eventually timeout and panic, causing all the remaining tests to fail. External commands should use a short context whenever possible so we can fail the test quickly and move on to the next test.	2022-03-04 08:55:36 -05:00
Tim Gross	5f30279cd2	e2e: `StopJob` should tolerate progress deadline expired (#12179 ) The `TestRescheduleProgressDeadlineFail` E2E test failed during test cleanup because the error message "progress deadline expired" that it emits when we stop the job does not match the one expected from monitoring the `job stop` command. Update the `StopJob` helper to tolerate this use case as well.	2022-03-04 08:55:22 -05:00
Tim Gross	4c4895e19c	e2e: configure prometheus for mTLS for `Metrics` suite (#12181 ) The `Metrics` suite uses prometheus to scrape Nomad metrics so that we're testing the full user experience of extracting metrics from Nomad. With the addition of mTLS, we need to make sure prometheus also has mTLS configuration because the metrics endpoint is protected. Update the Nomad client configuration and prometheus job to bind-mount the client's certs into the task so that the job can use these certs to scrape the server. This is a temporary solution that gets the job passing; we should give the job its own certificates (issued by Vault?) when we've done some of the infrastructure rework we'd like.	2022-03-04 08:55:06 -05:00
Tim Gross	b8b08fb32d	e2e: use UUID for CSI idempotency token (#12183 ) The AWS EBS plugin appears to use the name field of the volume as an idempotency token that persists across the entire AWS account, not just the plugin lifespan. Also fix the regex for the volume ID, which was originally taken from the job ID regex but isn't actually the same. This hasn't failed tests for us because we've always passed in the same volume ID.	2022-03-03 17:00:00 -05:00
Tim Gross	1502af3523	e2e: use `operator api` for Networking suite validation (#12180 ) With mTLS enabled, using `curl` in a bash script for validation involves having to configure arguments to `curl` based on whether or not the test infrastructure is using mTLS, whether ACLs are enabled, etc. Use the new `operator api` command instead to pick up the client configuration from the test environment automatically.	2022-03-03 15:17:29 -05:00
Tim Gross	f2a4ad0949	CSI: implement support for topology (#12129 )	2022-03-01 10:15:46 -05:00
James Rasell	adc3c44e29	e2e: moved missed volume test stop command to util helper.	2022-02-02 08:42:58 +01:00
James Rasell	0a50d9fd2a	e2e: account for new job stop CLI exit behaviour. PR #11550 changed the job stop exit behaviour when monitoring the deployment. When stopping a job, the deployment becomes cancelled and therefore the CLI now exits with status code 1 as it sees this as an error. This change adds a new utility e2e function that accounts for this behaviour.	2022-02-01 14:16:37 +01:00
Luiz Aoqui	3c8381bf85	e2e: enable Consul HTTPS port and always restart Nomad systemd unit	2022-01-18 16:56:26 -05:00
James Rasell	45f4689f9c	chore: fixup inconsistent method receiver names. (#11704 )	2021-12-20 11:44:21 +01:00
Tim Gross	ae04e540e6	hclfmt on some config files (#11611 )	2021-12-02 15:25:46 -05:00
Derek Strickland	8a5aa0cd8a	Fix Vault E2E TLS config (#11483 ) * Update e2e/terraform configuration for Vault and default to mtls=true	2021-12-02 12:20:09 -05:00
James Rasell	751c8217d1	core: allow setting and propagation of eval priority on job de/registration (#11532 ) This change modifies the Nomad job register and deregister RPCs to accept an updated option set which includes eval priority. This param is optional and overrides the use of the job priority to set the eval priority. In order to ensure all evaluations as a result of the request use the same eval priority, the priority is shared to the allocReconciler and deploymentWatcher. This creates a new distinction between eval priority and job priority. The Nomad agent HTTP API has been modified to allow setting the eval priority on job update and delete. To keep consistency with the current v1 API, job update accepts this as a payload param; job delete accepts this as a query param. Any user supplied value is validated within the agent HTTP handler removing the need to pass invalid requests to the server. The register and deregister opts functions now allow for setting the eval priority on requests. The change includes a small change to the DeregisterOpts function which handles nil opts. This brings the function in line with the RegisterOpts.	2021-11-23 09:23:31 +01:00
Luiz Aoqui	5d204c8ced	Revert "Return SchedulerConfig instead of SchedulerConfigResponse struct (#10799 )" (#11433 )	2021-11-02 17:42:52 -04:00
Charlie Voiselle	cb8e52b5df	Return SchedulerConfig instead of SchedulerConfigResponse struct (#10799 )	2021-10-13 21:23:13 -04:00
Mahmood Ali	4d90afb425	gofmt all the files mostly to handle build directives in 1.17.	2021-10-01 10:14:28 -04:00
James Rasell	645741cd94	Merge pull request #11194 from hashicorp/b-fix-e2e-acl-tls-provision e2e: fix provisioning when ACLs and TLS enabled.	2021-09-17 08:11:10 +02:00
James Rasell	30273d9256	e2e: fix provisioning when ACLs and TLS enabled; no nightly TLS.	2021-09-16 17:15:41 +02:00
James Rasell	0e926ef3fd	allow configuration of Docker hostnames in bridge mode (#11173 ) Add a new hostname string parameter to the network block which allows operators to specify the hostname of the network namespace. Changing this causes a destructive update to the allocation and it is omitted if empty from API responses. This parameter also supports interpolation. In order to have a hostname passed as a configuration param when creating an allocation network, the CreateNetwork func of the DriverNetworkManager interface needs to be updated. In order to minimize the disruption of future changes, rather than add another string func arg, the function now accepts a request struct along with the allocID param. The struct has the hostname as a field. The in-tree implementations of DriverNetworkManager.CreateNetwork have been modified to account for the function signature change. In updating for the change, the enhancement of adding hostnames to network namespaces has also been added to the Docker driver, whilst the default Linux manager does not currently implement it.	2021-09-16 08:13:09 +02:00
Luiz Aoqui	f30c024a40	e2e: use absolute path for mTLS env vars (#11126 )	2021-09-03 12:59:21 -04:00
James Rasell	6bd2acd5b3	Merge pull request #11098 from hashicorp/b-fixup-all-incorrect-docstrings chore: fix incorrect docstring formatting.	2021-08-31 09:46:18 +02:00
Mahmood Ali	fec0adbb0e	Support mTLS clusters for e2e testing (#11092 ) This allows us to spin up e2e clusters with mTLS configured for all HashiCorp services, i.e. Nomad, Consul, and Vault. Used it for testing #11089 . mTLS is disabled by default. I have not updated Windows provisioning scripts yet - Windows also lacks ACL support from before. I intend to follow up for them in another round.	2021-08-30 10:18:16 -04:00
James Rasell	b6813f1221	chore: fix incorrect docstring formatting.	2021-08-30 11:08:12 +02:00
James Rasell	73ab63cf68	test: update e2e and dev scripts to use cni plugins v1.0.0	2021-08-27 11:14:47 +02:00
Mahmood Ali	97966c7a71	e2e: Run system jobs on all datacenters (#11060 ) Target all e2e datacenters for system and sysbatch e2e tests. They require that the system jobs run on all linux clients. However, the jobs currently only target `dc1` datacenter, but the nightly e2e cluster has 4 clients spread in `dc1` and `dc2` datacenters, causing the tests to fail. I missed this problem in e2e dev cluster because it only used a single dc1 datacenter.	2021-08-17 11:01:47 -04:00
Mahmood Ali	28bc234e84	e2e: fix tests Use basic sleeps in busybox images. busybox are very light, and ping has permissions complications, and it may fail for network related issues.	2021-08-03 11:38:35 -04:00
Seth Hoenig	3371214431	core: implement system batch scheduler This PR implements a new "System Batch" scheduler type. Jobs can make use of this new scheduler by setting their type to 'sysbatch'. Like the name implies, sysbatch can be thought of as a hybrid between system and batch jobs - it is for running short lived jobs intended to run on every compatible node in the cluster. As with batch jobs, sysbatch jobs can also be periodic and/or parameterized dispatch jobs. A sysbatch job is considered complete when it has been run on all compatible nodes until reaching a terminal state (success or failed on retries). Feasibility and preemption are governed the same as with system jobs. In this PR, the update stanza is not yet supported. The update stanza is still limited in functionality for the underlying system scheduler, and is not useful yet for sysbatch jobs. Further work in #4740 will improve support for the update stanza and deployments. Closes #2527	2021-08-03 10:30:47 -04:00
Mahmood Ali	70f541287b	e2e: wait for allocs and deployments (#10967 ) As we moved to using `-detach` for registering jobs, we should wait until allocs and deployments are created before asserting their properties. Fixing `TestNodeDrainIgnoreSystem` and `TestRescheduleProgressDeadlineFail` tests as they seem particularly flaky, failing 9 and 7 times (respectively) in the last two weeks.	2021-07-29 10:52:04 -04:00
Mahmood Ali	a9bd176742	e2e: use -detach mode when registering jobs with cli (#10877 ) Pick up 15d39f0dee but for RegisterFromJobspec: > This PR changes the e2e helper thingy to set -detach option > when registering a job with the CLI instead of the API. This is > necessary for jobs which never become healthy, as the deployment > never finishes for failing jobs and the command never returns, > causing the test to timeout after 10 minutes. This case occurs in TestVaultSecrets	2021-07-09 09:25:44 -04:00
Seth Hoenig	80f4340b77	e2e: use -detach mode when registering jobs with cli This PR changes the e2e helper thingy to set -detach option when registering a job with the CLI instead of the API. This is necessary for jobs which never become healthy, as the deployment never finishes for failing jobs and the command never returns, causing the test to timeout after 10 minutes.	2021-06-18 12:18:40 -05:00
James Rasell	939b23936a	Merge pull request #10744 from hashicorp/b-remove-duplicate-imports chore: remove duplicate import statements	2021-06-11 16:42:34 +02:00
James Rasell	2898e5d379	e2e: remove duplicate import statements.	2021-06-11 09:37:23 +02:00
Michael Schurter	319650d481	e2e: use api.ipify.org ipv4.icanhazip.com returns ipv6 addresses	2021-06-07 15:12:42 -07:00
Mahmood Ali	5258ae480b	remove unused Spark security group rules	2021-06-04 11:49:43 -04:00
Mahmood Ali	b852dc5eb8	e2e: pass nomad_url variable	2021-06-04 10:32:51 -04:00
Mahmood Ali	71936e1b27	e2e: NOMAD_VERSION is not set when installing url	2021-06-04 10:31:37 -04:00
Mahmood Ali	d0768bb999	restrict ingress ip	2021-06-04 10:31:35 -04:00
Luiz Aoqui	139c5e8df9	e2e: fix terraform output environment command instruction (#10674 )	2021-06-01 10:10:12 -04:00
Mahmood Ali	d8de4e62bb	Merge pull request #10657 from hashicorp/b-alloc-exec-closing Handle `nomad exec` termination events in order	2021-05-25 14:50:58 -04:00
Mahmood Ali	0853d48927	e2e: Spin clusters with custom url binaries (#10656 ) Ease spinning up a cluster, where binaries are fetched from arbitrary urls. These could be CircleCI `build-binaries` job artifacts, or presigned S3 urls. Co-authored-by: Tim Gross <tgross@hashicorp.com>	2021-05-25 13:47:39 -04:00
Mahmood Ali	3b7c5ff46e	e2e: stop suppressing unexpected EOF errors	2021-05-24 13:35:08 -04:00
Tim Gross	709b92c5a8	e2e: update TF lockfile	2021-05-18 09:35:57 -04:00

608 commits