* vault: configure user agent on Nomad vault clients
This PR attempts to set the User-Agent header on each Vault API client
created by Nomad. Still need to figure out a way to set the User-Agent on the
Vault client created internally by consul-template.
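As a rough sketch (not the exact wiring in Nomad), the header can be attached to the `github.com/hashicorp/vault/api` client via its `SetHeaders` method; the agent string below is a placeholder:
```go
package vaultclient

import (
	"net/http"

	vaultapi "github.com/hashicorp/vault/api"
)

// newVaultClient builds a Vault API client that sends a custom User-Agent
// header on every request it makes.
func newVaultClient(addr, userAgent string) (*vaultapi.Client, error) {
	conf := vaultapi.DefaultConfig()
	conf.Address = addr

	client, err := vaultapi.NewClient(conf)
	if err != nil {
		return nil, err
	}

	headers := client.Headers()
	if headers == nil {
		headers = http.Header{}
	}
	headers.Set("User-Agent", userAgent) // e.g. "Nomad/<version>"
	client.SetHeaders(headers)
	return client, nil
}
```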
* vault: fixup find-and-replace gone awry
This PR modifies the disconnect helper job to run as root, which is necessary
for manipulating iptables as it does. Also re-organizes the final test logic
to wait for client re-connect before looking for the replacement (3rd) allocation
in case that client was needed to run the alloc (also giving the scheduler more
time to do its thing).
Skips the other 3 tests, which fail for reasons I cannot yet figure out.
Also adds some debug log lines for this test, because it doesn't make sense
for the allocation to be complete while a task in the allocation has not yet
started, which is what the test failures imply.
Originally this test relied on Job 1 blocking Job 2 until Job 1 had a
terminal *ClientStatus.* Job 2 ensured it would get blocked using 2
mechanisms:
1. A constraint requiring it is placed on the same node as Job 1.
2. Job 2 would require all unreserved CPU on the node to ensure it would
be blocked until Job 1's resources were free.
That 2nd assertion breaks if *any previous job is still running on the
target node!* That seems very likely to happen in the flaky world of our
e2e tests. In fact there may be some jobs we intentionally want running
throughout; in hindsight it was never safe to assume my test would be
the only thing scheduled when it ran.
*Ports to the rescue!* Reserving a static port means that Job 2
will now block on Job 1 being terminal. It will only conflict with other
tests if those tests use that port *on every node.* I ensured no
existing tests were using the port I chose.
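For illustration only, a hedged sketch of how the blocking job could reserve a static port using the `github.com/hashicorp/nomad/api` package; the job name, port number, and task are made up:
```go
package overlap

import "github.com/hashicorp/nomad/api"

// blockingJob reserves a static port so a follow-up job that asks for the
// same port cannot be placed on the node until this job is terminal.
func blockingJob() *api.Job {
	job := api.NewServiceJob("overlap-block", "overlap-block", "global", 50)

	tg := api.NewTaskGroup("group", 1)
	tg.Networks = []*api.NetworkResource{{
		ReservedPorts: []api.Port{{Label: "block", Value: 23456}},
	}}

	task := api.NewTask("sleep", "raw_exec")
	task.SetConfig("command", "sleep")
	task.SetConfig("args", []string{"300"})
	tg.AddTask(task)

	job.AddTaskGroup(tg)
	return job
}
```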
Other changes:
- Gave job a bit more breathing room resource-wise.
- Tightened timings a bit since previous failure ran into the `go test`
time limit.
- Cleaned up the DumpEvals output. It's quite nice and handy now!
Keeps failing in the nightly e2e test with unhelpful output like:
```
Failed
=== RUN TestOverlap
overlap_test.go:92: Followup job overlap93ee1d2b blocked. Sleeping for the rest of overlap48c26c39's shutdown_delay (9.2/10s)
overlap_test.go:105: 1500/2000 retries reached for github.com/hashicorp/nomad/e2e/overlap.TestOverlap (err=timed out before an allocation was found for overlap93ee1d2b)
overlap_test.go:105: timeout: timed out before an allocation was found for overlap93ee1d2b
--- FAIL: TestOverlap (38.96s)
```
I have not been able to replicate it in my own e2e cluster, so I added
the EvalDump helper to add detailed eval information like:
```
=== RUN TestOverlap
1/1 Job overlap7b0e90ec Eval c38c9919-a4f0-5baf-45f7-0702383c682a
Type: service
TriggeredBy: job-register
Deployment:
Status: pending ()
NextEval:
PrevEval:
BlockedEval:
-- No placement failures --
QueuedAllocs:
SnapshotIdx: 0
CreateIndex: 96
ModifyIndex: 96
...
```
Hopefully helpful when debugging other tests as well!
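A hedged sketch of what such a dump helper might look like, assuming the `github.com/hashicorp/nomad/api` client; the exact fields printed here are illustrative:
```go
package e2eutil

import (
	"testing"

	"github.com/hashicorp/nomad/api"
)

// dumpEvals logs every evaluation for a job so a failed CI run leaves behind
// enough detail to reason about scheduling decisions after the fact.
func dumpEvals(t *testing.T, client *api.Client, jobID string) {
	t.Helper()

	evals, _, err := client.Jobs().Evaluations(jobID, nil)
	if err != nil {
		t.Logf("error listing evals for %s: %v", jobID, err)
		return
	}

	for i, e := range evals {
		t.Logf("%d/%d Job %s Eval %s", i+1, len(evals), jobID, e.ID)
		t.Logf("  Type: %s", e.Type)
		t.Logf("  TriggeredBy: %s", e.TriggeredBy)
		t.Logf("  Status: %s (%s)", e.Status, e.StatusDescription)
		t.Logf("  BlockedEval: %s", e.BlockedEval)
		t.Logf("  CreateIndex: %d  ModifyIndex: %d", e.CreateIndex, e.ModifyIndex)
	}
}
```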
Some tests may choose to deregister jobs to check Nomad cleanup
logic; however, it is still possible for the test to fail and exit
before this is hit. This therefore adds a cancellable cleanup func
which can be deferred, using context to control whether it gets
run or not.
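A minimal sketch of the pattern, assuming made-up helper names; the real helper may differ:
```go
package e2eutil

import (
	"context"
	"testing"
)

// CancellableCleanup wraps a cleanup func so it can be deferred up front but
// skipped later: once the supplied context is cancelled, the returned func
// becomes a no-op.
func CancellableCleanup(ctx context.Context, cleanup func(*testing.T)) func(*testing.T) {
	return func(t *testing.T) {
		select {
		case <-ctx.Done():
			return // test decided the cleanup is no longer wanted
		default:
			cleanup(t)
		}
	}
}
```
A test might defer the wrapped func right after registering its jobs and call the context's cancel func once it has deregistered the job itself, so the deferred cleanup does not run a second time.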
* Wait longer for node to go down in disconnected clients test.
The existing helper only waits 10s, but there's jitter on heartbeats
that we need to account for. Wait 30s for the node to go down to give
us plenty of room.
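A hedged sketch of the longer wait, using the `github.com/hashicorp/nomad/api` client directly (the real helper likely goes through the e2e utilities):
```go
package e2eutil

import (
	"fmt"
	"time"

	"github.com/hashicorp/nomad/api"
)

// waitForNodeDown polls the node until it reports "down" or 30s elapse,
// leaving headroom for heartbeat jitter that a 10s wait does not allow.
func waitForNodeDown(client *api.Client, nodeID string) error {
	deadline := time.Now().Add(30 * time.Second)
	for time.Now().Before(deadline) {
		node, _, err := client.Nodes().Info(nodeID, nil)
		if err == nil && node.Status == "down" {
			return nil
		}
		time.Sleep(time.Second)
	}
	return fmt.Errorf("node %s did not go down within 30s", nodeID)
}
```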
* Port disconnected clients to stdlib-style test
Our E2E "framework" has a bunch of features around test discovery and
standing up infra that were never completed or fully used, and we
ended up building out a large test suite that ignored all that in favor
of Terraform-provided infrastructure for the last couple of years.
This changeset is a proposal (and demonstration) for gradually
migrating our E2E tests off the framework code so that developers can
write fairly ordinary golang stdlib testing tests.
This test exercises the behavior of clients that become disconnected
and have their allocations replaced. Future test cases will exercise
the `max_client_disconnect` field on the job spec.
If any E2E test hangs, it'll eventually time out and panic, causing
all the remaining tests to fail. External commands should use a short
context whenever possible so we can fail the test quickly and move on
to the next test.
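For example, a sketch of wrapping external commands in a short-lived context (the 30s timeout is an illustrative choice):
```go
package e2eutil

import (
	"context"
	"os/exec"
	"time"
)

// runWithTimeout executes an external command with a short deadline so a hung
// command fails the current test quickly instead of hitting the suite-wide
// `go test` timeout and panicking.
func runWithTimeout(name string, args ...string) (string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()

	out, err := exec.CommandContext(ctx, name, args...).CombinedOutput()
	return string(out), err
}
```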
The `TestRescheduleProgressDeadlineFail` E2E test failed during test
cleanup because the error message "progress deadline expired" that it
emits when we stop the job does not match the one expected from
monitoring the `job stop` command. Update the `StopJob` helper to
tolerate this use case as well.
PR #11550 changed the job stop exit behaviour when monitoring the
deployment. When stopping a job, the deployment becomes cancelled
and therefore the CLI now exits with status code 1 as it sees this
as an error.
This change adds a new utility e2e function that accounts for this
behaviour.
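A hedged sketch of what such a utility could look like; the exact status strings it tolerates are illustrative:
```go
package e2eutil

import (
	"fmt"
	"os/exec"
	"strings"
)

// StopJob runs `nomad job stop` and tolerates the non-zero exit the CLI
// returns when monitoring reports the deployment as cancelled or failed
// because the job was stopped.
func StopJob(jobID string, args ...string) error {
	cli := append([]string{"job", "stop"}, append(args, jobID)...)
	out, err := exec.Command("nomad", cli...).CombinedOutput()
	if err == nil {
		return nil
	}

	got := string(out)
	if strings.Contains(got, "Cancelled because job is stopped") ||
		strings.Contains(got, "progress deadline expired") {
		return nil // expected when stopping a job mid-deployment
	}
	return fmt.Errorf("job stop failed: %v\n%s", err, got)
}
```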
This change modifies the Nomad job register and deregister RPCs to
accept an updated option set which includes eval priority. This
param is optional and override the use of the job priority to set
the eval priority.
In order to ensure all evaluations created as a result of the request
use the same eval priority, the priority is passed to the
allocReconciler and deploymentWatcher. This creates a new
distinction between eval priority and job priority.
The Nomad agent HTTP API has been modified to allow setting the
eval priority on job update and delete. To keep consistency with
the current v1 API, job update accepts this as a payload param;
job delete accepts this as a query param.
Any user supplied value is validated within the agent HTTP handler
removing the need to pass invalid requests to the server.
The register and deregister opts functions now allow for setting
the eval priority on requests.
The change includes a small fix to the DeregisterOpts function so it
handles nil opts. This brings the function in line with
RegisterOpts.
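Assuming the option structs gained an `EvalPriority` field, usage from the Go API might look roughly like this:
```go
package example

import "github.com/hashicorp/nomad/api"

// registerWithEvalPriority submits a job and asks for the resulting
// evaluations to be created at a specific priority, independent of the
// job's own priority.
func registerWithEvalPriority(client *api.Client, job *api.Job) error {
	opts := &api.RegisterOptions{EvalPriority: 90}
	_, _, err := client.Jobs().RegisterOpts(job, opts, nil)
	return err
}

// deregisterWithEvalPriority stops a job, again pinning the eval priority.
func deregisterWithEvalPriority(client *api.Client, jobID string) error {
	opts := &api.DeregisterOptions{EvalPriority: 90}
	_, _, err := client.Jobs().DeregisterOpts(jobID, opts, nil)
	return err
}
```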
This PR implements a new "System Batch" scheduler type. Jobs can
make use of this new scheduler by setting their type to 'sysbatch'.
Like the name implies, sysbatch can be thought of as a hybrid between
system and batch jobs - it is for running short lived jobs intended to
run on every compatible node in the cluster.
As with batch jobs, sysbatch jobs can also be periodic and/or parameterized
dispatch jobs. A sysbatch job is considered complete when it has been run
on all compatible nodes until reaching a terminal state (success or failed
on retries).
Feasibility and preemption are governed the same as with system jobs. In
this PR, the update stanza is not yet supported. The update stanza is still
limited in functionality for the underlying system scheduler, and is
not useful yet for sysbatch jobs. Further work in #4740 will improve
support for the update stanza and deployments.
Closes #2527
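For illustration, a hedged sketch of a sysbatch job built with the Go API; the task and image are made up:
```go
package example

import "github.com/hashicorp/nomad/api"

func strPtr(s string) *string { return &s }

// exampleSysbatchJob builds a job the sysbatch scheduler will run once on
// every compatible node, batch-style.
func exampleSysbatchJob() *api.Job {
	job := api.NewBatchJob("node-report", "node-report", "global", 50)
	job.Type = strPtr("sysbatch")

	tg := api.NewTaskGroup("report", 1)
	task := api.NewTask("uname", "docker")
	task.SetConfig("image", "busybox:1")
	task.SetConfig("command", "uname")
	task.SetConfig("args", []string{"-a"})
	tg.AddTask(task)

	job.AddTaskGroup(tg)
	return job
}
```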
Pick up 15d39f0dee but for RegisterFromJobspec:
> This PR changes the e2e helper thingy to set -detach option
> when registering a job with the CLI instead of the API. This is
> necessary for jobs which never become healthy, as the deployment
> never finishes for failing jobs and the command never returns,
> causing the test to timeout after 10 minutes.
This case occurs in TestVaultSecrets
This PR changes the e2e helper thingy to set -detach option
when registering a job with the CLI instead of the API. This is
necessary for jobs which never become healthy, as the deployment
never finishes for failing jobs and the command never returns,
causing the test to timeout after 10 minutes.
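A hedged sketch of what registering with `-detach` through the CLI can look like from a test helper:
```go
package e2eutil

import (
	"fmt"
	"os/exec"
)

// registerDetached registers a job via the CLI rather than the API, passing
// -detach so the command returns once the evaluation is created instead of
// monitoring a deployment that may never finish for an unhealthy job.
func registerDetached(jobFile string) error {
	out, err := exec.Command("nomad", "job", "run", "-detach", jobFile).CombinedOutput()
	if err != nil {
		return fmt.Errorf("could not register %s: %v\n%s", jobFile, err, out)
	}
	return nil
}
```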
This PR adds e2e tests for Consul Namespaces for Nomad Enterprise
with Consul ACLs enabled.
Needed to add support for Consul ACL tokens with `namespace` and
`namespace_prefix` blocks, which Nomad parses and validates before
tossing the token. These bits will need to be picked back to OSS.
This PR adds a set of tests to the Consul test suite for testing
Nomad OSS's behavior around setting Consul Namespace on groups,
which is to ignore the setting (as Consul Namespaces are currently
an Enterprise feature).
Tests are generally a reduced facsimile of existing tests, modified
to check behavior of when group.consul.namespace is set and not set.
Verification is oriented around what happens in Consul; the in-depth
functional correctness of these features is left to the original tests.
Nomad ENT will get its own version of these tests in `namespaces_ent.go`.
* fix periodic
* update periodic to not use template
`nomad job inspect` no longer returns an API list stub, so the required fields to query the job summary are no longer there; parse the CLI output instead
* rm tmp makefile entry
* fix typo
* revert makefile change
Prefer testutil.WaitForResultRetries, which emits more descriptive errors on
failures. `require.Eventually` fails with an opaque "Condition never satisfied"
error message.
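A sketch of the preferred pattern, assuming the usual `func() (bool, error)` / `func(error)` callback shape of the testutil helpers:
```go
package example

import (
	"fmt"
	"testing"

	"github.com/hashicorp/nomad/api"
	"github.com/hashicorp/nomad/testutil"
)

// waitForRunningAlloc waits for an allocation of the job to be running and,
// unlike require.Eventually, reports the last underlying error on timeout.
func waitForRunningAlloc(t *testing.T, client *api.Client, jobID string) {
	testutil.WaitForResult(func() (bool, error) {
		allocs, _, err := client.Jobs().Allocations(jobID, false, nil)
		if err != nil {
			return false, err
		}
		for _, a := range allocs {
			if a.ClientStatus == "running" {
				return true, nil
			}
		}
		return false, fmt.Errorf("no running allocation yet for %s", jobID)
	}, func(err error) {
		t.Fatalf("timed out waiting for %s: %v", jobID, err)
	})
}
```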
* Prevent Job Statuses from being calculated twice
https://github.com/hashicorp/nomad/pull/8435 introduced atomic eval
insertion with job (de-)registration. This change removes a now obsolete
guard which checked if the index was equal to the job.CreateIndex, which
would empty the status. Now that the job registration eval insertion is
atomic with the registration, this check is no longer necessary to set
the job statuses correctly.
* test to ensure only single job event for job register
* periodic e2e
* separate job update summary step
* fix updatejobstability to use copy instead of modified reference of job
* update envoygatewaybindaddresses copy to prevent job diff on null vs empty
* set ConsulGatewayBindAddress to empty map instead of nil
fix nil assertions for empty map
rm unnecessary guard
Deflake namespace e2e test by only asserting on jobs related to the
namespace tests. During our e2e tests, some leftover jobs (e.g.
prometheus) are still running while being shut down and cause the test to
fail.
We directly parse job files in e2eutil, but currently use the jobspec
package. Instead, use the Parse method from the jobspec2 package so
we can parse job files with new features.
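A minimal sketch of the swap, assuming `jobspec2.Parse` takes a path and a reader:
```go
package e2eutil

import (
	"os"

	"github.com/hashicorp/nomad/api"
	"github.com/hashicorp/nomad/jobspec2"
)

// parseJobFile reads a job file with the HCL2-aware parser so job files that
// use newer features still load in the e2e helpers.
func parseJobFile(path string) (*api.Job, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	return jobspec2.Parse(path, f)
}
```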
Assert that deregistering a volume works without errors following a volume
reap. Use CLI helpers where feasible to exercise CSI command line. Dump plugin
allocation logs on deregistration failures for debugging purposes.
Exercises host volume and Docker volume functionality for the `exec` and `docker`
task drivers, particularly around mounting locations within the container and
how this can be used with `template`.
The CLI helpers in the rescheduling test were intended for shared use, but
until some other tests were written we didn't want to waste time making them
generic. This changeset refactors them and adds some new helpers associated
with the node drain tests (under separate PR).
The E2E suite exercises the API, but not the CLI. This changeset adds a helper
function to send commands via a locally-built Nomad binary (which we'll need
to add to the E2E setup), and some helpers to parse the resulting structured
outputs in a way that tests can consume.
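Roughly, the helpers might look like this (names and the key/value parsing are illustrative):
```go
package e2eutil

import (
	"os/exec"
	"strings"
)

// Command runs the locally built nomad binary and returns its combined
// output so tests can exercise the CLI rather than only the API.
func Command(args ...string) (string, error) {
	out, err := exec.Command("nomad", args...).CombinedOutput()
	return string(out), err
}

// ParseFields turns a "Key = Value" style CLI block into a map that tests
// can assert against.
func ParseFields(output string) map[string]string {
	fields := map[string]string{}
	for _, line := range strings.Split(output, "\n") {
		parts := strings.SplitN(line, "=", 2)
		if len(parts) != 2 {
			continue
		}
		fields[strings.TrimSpace(parts[0])] = strings.TrimSpace(parts[1])
	}
	return fields
}
```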
Adds 2 tests around Connect Native. Both make use of the example connect native
services in https://github.com/hashicorp/nomad-connect-examples
One of them runs without Consul ACLs enabled, the other with.