Commit Graph

547 Commits

Author SHA1 Message Date
Tim Gross b8b08fb32d
e2e: use UUID for CSI idempotency token (#12183)
The AWS EBS plugin appears to use the name field of the volume as an
idempotency token that persists across the entire AWS account, not
just the plugin lifespan.

Also fix the regex for the volume ID, which was originally taken from
the job ID regex but isn't actually the same. This hasn't failed tests
for us because we've always passed in the same volume ID.
2022-03-03 17:00:00 -05:00
Tim Gross 1502af3523
e2e: use `operator api` for Networking suite validation (#12180)
With mTLS enabled, using `curl` in a bash script for validation
involves having to configure arguments to `curl` based on whether or
not the test infrastructure is using mTLS, whether ACLs are enabled,
etc. Use the new `operator api` command instead to pick up the client
configuration from the test environment automatically.
2022-03-03 15:17:29 -05:00
Tim Gross f2a4ad0949
CSI: implement support for topology (#12129) 2022-03-01 10:15:46 -05:00
James Rasell adc3c44e29
e2e: moved missed volume test stop command to util helper. 2022-02-02 08:42:58 +01:00
James Rasell 0a50d9fd2a
e2e: account for new job stop CLI exit behaviour.
PR #11550 changed the job stop exit behaviour when monitoring the
deployment. When stopping a job, the deployment becomes cancelled
and therefore the CLI now exits with status code 1 as it see this
as an error.

This change adds a new utility e2e function that accounts for this
behaviour.
2022-02-01 14:16:37 +01:00
Luiz Aoqui 3c8381bf85
e2e: enable Consul HTTPS port and always restart Nomad systemd unit 2022-01-18 16:56:26 -05:00
James Rasell 45f4689f9c
chore: fixup inconsistent method receiver names. (#11704) 2021-12-20 11:44:21 +01:00
Tim Gross ae04e540e6
hclfmt on some config files (#11611) 2021-12-02 15:25:46 -05:00
Derek Strickland 8a5aa0cd8a
Fix Vault E2E TLS config (#11483)
* Update e2e/terraform configuration for Vault and default to mtls=true
2021-12-02 12:20:09 -05:00
James Rasell 751c8217d1
core: allow setting and propagation of eval priority on job de/registration (#11532)
This change modifies the Nomad job register and deregister RPCs to
accept an updated option set which includes eval priority. This
param is optional and override the use of the job priority to set
the eval priority.

In order to ensure all evaluations as a result of the request use
the same eval priority, the priority is shared to the
allocReconciler and deploymentWatcher. This creates a new
distinction between eval priority and job priority.

The Nomad agent HTTP API has been modified to allow setting the
eval priority on job update and delete. To keep consistency with
the current v1 API, job update accepts this as a payload param;
job delete accepts this as a query param.

Any user supplied value is validated within the agent HTTP handler
removing the need to pass invalid requests to the server.

The register and deregister opts functions now all for setting
the eval priority on requests.

The change includes a small change to the DeregisterOpts function
which handles nil opts. This brings the function inline with the
RegisterOpts.
2021-11-23 09:23:31 +01:00
Luiz Aoqui 5d204c8ced
Revert "Return SchedulerConfig instead of SchedulerConfigResponse struct (#10799)" (#11433) 2021-11-02 17:42:52 -04:00
Charlie Voiselle cb8e52b5df
Return SchedulerConfig instead of SchedulerConfigResponse struct (#10799) 2021-10-13 21:23:13 -04:00
Mahmood Ali 4d90afb425 gofmt all the files
mostly to handle build directives in 1.17.
2021-10-01 10:14:28 -04:00
James Rasell 645741cd94
Merge pull request #11194 from hashicorp/b-fix-e2e-acl-tls-provision
e2e: fix provisioning when ACLs and TLS enabled.
2021-09-17 08:11:10 +02:00
James Rasell 30273d9256
e2e: fix provisioning when ACLs and TLS enabled; no nightly TLS. 2021-09-16 17:15:41 +02:00
James Rasell 0e926ef3fd
allow configuration of Docker hostnames in bridge mode (#11173)
Add a new hostname string parameter to the network block which
allows operators to specify the hostname of the network namespace.
Changing this causes a destructive update to the allocation and it
is omitted if empty from API responses. This parameter also supports
interpolation.

In order to have a hostname passed as a configuration param when
creating an allocation network, the CreateNetwork func of the
DriverNetworkManager interface needs to be updated. In order to
minimize the disruption of future changes, rather than add another
string func arg, the function now accepts a request struct along with
the allocID param. The struct has the hostname as a field.

The in-tree implementations of DriverNetworkManager.CreateNetwork
have been modified to account for the function signature change.
In updating for the change, the enhancement of adding hostnames to
network namespaces has also been added to the Docker driver, whilst
the default Linux manager does not current implement it.
2021-09-16 08:13:09 +02:00
Luiz Aoqui f30c024a40
e2e: use absolute path for mTLS env vars (#11126) 2021-09-03 12:59:21 -04:00
James Rasell 6bd2acd5b3
Merge pull request #11098 from hashicorp/b-fixup-all-incorrect-docstrings
chore: fix incorrect docstring formatting.
2021-08-31 09:46:18 +02:00
Mahmood Ali fec0adbb0e
Support mTLS clusters for e2e testing (#11092)
This allows us to spin up e2e clusters with mTLS configured for all HashiCorp services, i.e. Nomad, Consul, and Vault. Used it for testing #11089 .

mTLS is disabled by default. I have not updated Windows provisioning scripts yet - Windows also lacks ACL support from before. I intend to follow up for them in another round.
2021-08-30 10:18:16 -04:00
James Rasell b6813f1221
chore: fix incorrect docstring formatting. 2021-08-30 11:08:12 +02:00
James Rasell 73ab63cf68
test: update e2e and dev scripts to use cni plugins v1.0.0 2021-08-27 11:14:47 +02:00
Mahmood Ali 97966c7a71
e2e: Run system jobs on all datacenters (#11060)
Target all e2e datacenters for system and sysbatch e2e tests.  They
require that the system jobs run on all linux clients.

However, the jobs currenly only target `dc1` datacenter, but the nightly
e2e cluster has 4 clients spread in `dc1` and `dc2` datacenters, causing
the tests to fail.

I missed this problem in e2e dev cluster because it only used a single
dc1 datacenter.
2021-08-17 11:01:47 -04:00
Mahmood Ali 28bc234e84 e2e: fix tests
Use basic sleeps in busybox images. busybox are very light, and ping has
permissions complications, and it may fail for network related
issues.
2021-08-03 11:38:35 -04:00
Seth Hoenig 3371214431 core: implement system batch scheduler
This PR implements a new "System Batch" scheduler type. Jobs can
make use of this new scheduler by setting their type to 'sysbatch'.

Like the name implies, sysbatch can be thought of as a hybrid between
system and batch jobs - it is for running short lived jobs intended to
run on every compatible node in the cluster.

As with batch jobs, sysbatch jobs can also be periodic and/or parameterized
dispatch jobs. A sysbatch job is considered complete when it has been run
on all compatible nodes until reaching a terminal state (success or failed
on retries).

Feasibility and preemption are governed the same as with system jobs. In
this PR, the update stanza is not yet supported. The update stanza is sill
limited in functionality for the underlying system scheduler, and is
not useful yet for sysbatch jobs. Further work in #4740 will improve
support for the update stanza and deployments.

Closes #2527
2021-08-03 10:30:47 -04:00
Mahmood Ali 70f541287b
e2e: wait for allocs and deployments (#10967)
As we moved to using `-detach` for registering jobs, we should wait
until allocs and deployments are created before asserting their
properties.

Fixing `TestNodeDrainIgnoreSystem` and `TestRescheduleProgressDeadlineFail` tests as they seem particularly flaky, failing 9 and 7 times (respectively) in the last two weeks.
2021-07-29 10:52:04 -04:00
Mahmood Ali a9bd176742
e2e: use -detach mode when registering jobs with cli (#10877)
Pick up 15d39f0dee but for RegisterFromJobspec:

>  This PR changes the e2e helper thingy to set -detach option
>  when registering a job with the CLI instead of the API. This is
>  necessary for jobs which never become healthy, as the deployment
>  never finishes for failing jobs and the command never returns,
>  causing the test to timeout after 10 minutes.

This case occurs in TestVaultSecrets
2021-07-09 09:25:44 -04:00
Seth Hoenig 80f4340b77 e2e: use -detach mode when registering jobs with cli
This PR changes the e2e helper thingy to set -detach option
when registering a job with the CLI instead of the API. This is
necessary for jobs which never become healthy, as the deployment
never finishes for failing jobs and the command never returns,
causing the test to timeout after 10 minutes.
2021-06-18 12:18:40 -05:00
James Rasell 939b23936a
Merge pull request #10744 from hashicorp/b-remove-duplicate-imports
chore: remove duplicate import statements
2021-06-11 16:42:34 +02:00
James Rasell 2898e5d379
e2e: remove duplicate import statements. 2021-06-11 09:37:23 +02:00
Michael Schurter 319650d481 e2e: use api.ipify.org
ipv4.icanhazip.com returns ipv6 addresses
2021-06-07 15:12:42 -07:00
Mahmood Ali 5258ae480b remove unused Spark security group rules 2021-06-04 11:49:43 -04:00
Mahmood Ali b852dc5eb8 e2e: pass nomad_url variable 2021-06-04 10:32:51 -04:00
Mahmood Ali 71936e1b27 e2e: NOMAD_VERSION is not set when installing url 2021-06-04 10:31:37 -04:00
Mahmood Ali d0768bb999 restrict ingress ip 2021-06-04 10:31:35 -04:00
Luiz Aoqui 139c5e8df9
e2e: fix terraform output environment command instruction (#10674) 2021-06-01 10:10:12 -04:00
Mahmood Ali d8de4e62bb
Merge pull request #10657 from hashicorp/b-alloc-exec-closing
Handle `nomad exec` termination events in order
2021-05-25 14:50:58 -04:00
Mahmood Ali 0853d48927
e2e: Spin clusters with custom url binaries (#10656)
Ease spinning up a cluster, where binaries are fetched from arbitrary
urls.  These could be CircleCI `build-binaries` job artifacts, or
presigned S3 urls.

Co-authored-by: Tim Gross <tgross@hashicorp.com>
2021-05-25 13:47:39 -04:00
Mahmood Ali 3b7c5ff46e e2e: stop suppressing unexpected EOF errors 2021-05-24 13:35:08 -04:00
Tim Gross 709b92c5a8 e2e: update TF lockfile 2021-05-18 09:35:57 -04:00
Tim Gross d4465f01ac E2E: remove references to nomad_sha 2021-05-10 16:42:39 -04:00
Mahmood Ali a33ec72dd7
e2e: enable memory oversubscription (#10557)
Enable memory oversubscription for the oversubscription tests.
2021-05-10 14:33:47 -04:00
Michael Schurter 547a718ef6
Merge pull request #10248 from hashicorp/f-remotetask-2021
core: propagate remote task handles
2021-04-30 08:57:26 -07:00
Michael Schurter 982c65c0c7 comment out unused consts to make linter happy 2021-04-30 08:31:31 -07:00
Seth Hoenig d54a606819
Merge pull request #10439 from hashicorp/pick-ent-acls-changes
e2e: add e2e tests for consul namespaces on ent with acls
2021-04-28 08:30:08 -06:00
Tim Gross 79f81d617e licensing: remove raft storage and sync
This changeset is the OSS portion of the work to remove the raft storage and
sync for Nomad Enterprise.
2021-04-28 10:28:23 -04:00
Michael Schurter 0eb5d5136f e2e: use public_ip in packer 2021-04-27 15:07:03 -07:00
Michael Schurter e62795798d core: propagate remote task handles
Add a new driver capability: RemoteTasks.

When a task is run by a driver with RemoteTasks set, its TaskHandle will
be propagated to the server in its allocation's TaskState. If the task
is replaced due to a down node or draining, its TaskHandle will be
propagated to its replacement allocation.

This allows tasks to be scheduled in remote systems whose lifecycles are
disconnected from the Nomad node's lifecycle.

See https://github.com/hashicorp/nomad-driver-ecs for an example ECS
remote task driver.
2021-04-27 15:07:03 -07:00
Seth Hoenig 09cd01a5f3 e2e: add e2e tests for consul namespaces on ent with acls
This PR adds e2e tests for Consul Namespaces for Nomad Enterprise
with Consul ACLs enabled.

Needed to add support for Consul ACL tokens with `namespace` and
`namespace_prefix` blocks, which Nomad parses and validates before
tossing the token. These bits will need to be picked back to OSS.
2021-04-27 14:45:54 -06:00
Seth Hoenig f258fc8270
Merge pull request #10401 from hashicorp/cp-cns-ent-test-fixes
cherry-pick fixes from cns ent tests
2021-04-20 08:45:15 -06:00
Drew Bailey d42f204a89
remove second deploy that did not have anything to do with the test (#10400) 2021-04-20 08:44:44 -04:00
Seth Hoenig 509490e5d2 e2e: consul namespace tests from nomad ent
(cherry-picked from ent without _ent things)

This is part 2/4 of e2e tests for Consul Namespaces. Took a
first pass at what the parameterized tests can look like, but
only on the ENT side for this PR. Will continue to refactor
in the next PRs.

Also fixes 2 bugs:
 - Config Entries registered by Nomad Server on job registration
   were not getting Namespace set
 - Group level script checks were not getting Namespace set

Those changes will need to be copied back to Nomad OSS.

Nomad OSS + no ACLs (previously, needs refactor)
Nomad ENT + no ACLs (this)
Nomad OSS + ACLs (todo)
Nomad ENT + ALCs (todo)
2021-04-19 15:35:31 -06:00
Seth Hoenig 25810b4cd6 e2e: set PORT on counter-api with host networking 2021-04-16 16:28:39 -06:00
Seth Hoenig 2d693127bb e2e: minor tweaks from CR 2021-04-16 15:32:37 -06:00
Seth Hoenig 7f1191111d e2e: add tests for consul namespaces from nomad oss
This PR adds a set of tests to the Consul test suite for testing
Nomad OSS's behavior around setting Consul Namespace on groups,
which is to ignore the setting (as Consul Namespaces are currently
an Enterprise feature).

Tests are generally a reduced facsimile of existing tests, modified
to check behavior of when group.consul.namespace is set and not set.
Verification is oriented around what happens in Consul; the in-depth
functional correctness of these features is left to the original tests.

Nomad ENT will get its own version of these tests in `namespaces_ent.go`.
2021-04-16 15:32:37 -06:00
Tim Gross dcc5268862 E2E/CSI: ensure jobs are stopped before checking claims are released
During refactoring of the CSI jobs, the EBS test dropped stopping the jobs
before checking that the claims were released.
2021-04-15 11:06:11 -04:00
Seth Hoenig 198e0d9f24 e2e: get consul ent in e2e packer builds
Using Consul Enterprise is going to be necessary for testing Nomad's
Consul Namespace integration in Nomad v1.1 in e2e.
2021-04-14 12:05:55 -06:00
Tim Gross a13590fb37 e2e/csi: fix name of column used for snapshot create output parsing 2021-04-13 09:15:19 -04:00
Tim Gross a84eca0136 E2E: remove broken Move-Item call during Windows provisioning
The archive does not include the `pkg/windows_amd64` path and unpacking the
archive happens in the installation directory.
2021-04-09 09:49:42 -04:00
Tim Gross f4ccb360ef E2E: use remote-exec via TF0.14.7+
The E2E provisioning used local-exec to call ssh in a for loop in a hacky
workaround https://github.com/hashicorp/terraform/issues/25634, which
prevented remote-exec from working on Windows. Move to a newer version of
Terraform that fixes the remote-exec bug to make provisioning more reliable
and observable.

Note that Windows remote-exec needs to include the `powershell` call itself,
unlike Unix-alike remote-exec.
2021-04-08 16:03:06 -04:00
Tim Gross da89103c5c E2E: extend CSI test to cover create and snapshot workflows
Split the EBS and EFS tests out into their own test cases:
* EBS exercises the Controller RPCs, including the create/snapshot workflow.
* EFS exercises only the Node RPCs, and assumes we have an existing volume
that gets registered, rather than created.
2021-04-08 12:55:36 -04:00
Yoan Blanc ac0d5d8bd3
chore: bump golangci-lint from v1.24 to v1.39
Signed-off-by: Yoan Blanc <yoan@dosimple.ch>
2021-04-03 09:50:23 +02:00
Mahmood Ali 85502c1739 oversubscription: e2e tests! 2021-03-30 16:55:58 -04:00
Drew Bailey 7e78d4a607
e2e license smoke test (#10242) 2021-03-26 13:21:47 -04:00
Mahmood Ali dbc3850358
Merge pull request #10145 from hashicorp/b-periodic-init-status
periodic: always reset periodic children status
2021-03-26 09:19:08 -04:00
Drew Bailey 64084f3209
e2e allow setting an enterprise license environment variable (#10233)
* allow setting an enterprise license environment variable

* update comment

* address pr comments
2021-03-25 14:35:55 -04:00
Mahmood Ali e643742a38 Add a test for parameterized summary counts 2021-03-25 11:27:09 -04:00
Tim Gross 46223e190e E2E: bump AWS CSI driver versions 2021-03-24 14:17:38 -04:00
Tim Gross 0e774d40f5 E2E: CSI test should use expected unique-volume name 2021-03-23 08:34:17 -04:00
Tim Gross fa25e048b2
CSI: unique volume per allocation
Add a `PerAlloc` field to volume requests that directs the scheduler to test
feasibility for volumes with a source ID that includes the allocation index
suffix (ex. `[0]`), rather than the exact source ID.

Read the `PerAlloc` field when making the volume claim at the client to
determine if the allocation index suffix (ex. `[0]`) should be added to the
volume source ID.
2021-03-18 15:35:11 -04:00
Charlie Voiselle 0473f35003
Fixup uses of `sanity` (#10187)
* Fixup uses of `sanity`
* Remove unnecessary comments.

These checks are better explained by earlier comments about
the context of the test. Per @tgross, moved the tests together
to better reinforce the overall shared context.

* Update nomad/fsm_test.go
2021-03-16 18:05:08 -04:00
Tim Gross 2a2e36690a docs: swap master for main in Nomad repo 2021-03-08 14:26:31 -05:00
Mahmood Ali ff8d67fae2
Merge pull request #9935 from hashicorp/e2e-segment-e2e-clusters
e2e: segment e2e clusters
2021-03-01 09:23:21 -05:00
Drew Bailey 86d9e1ff90
Merge pull request #9955 from hashicorp/on-update-services
Service and Check on_update configuration option (readiness checks)
2021-02-24 10:11:05 -05:00
Seth Hoenig d2cd605995 dist: place systemd unit options correctly
This PR places StartLimitIntervalSec and StartLimitBurst in the
Unit section of systemd unit files, rather than the Service section.

https://www.freedesktop.org/software/systemd/man/systemd.unit.html

Fixes #10065
2021-02-22 19:23:00 -06:00
Drew Bailey c152757d38
E2e/fix periodic (#10047)
* fix periodic

* update periodic to not use template

nomad job inspect no longer returns an apiliststub so the required fields to query job summary are no longer there, parse cli output instead

* rm tmp makefile entry

* fix typo

* revert makefile change
2021-02-18 12:21:53 -05:00
James Rasell f95e45b80c
e2e: account for race condition in periodic dispatch test. 2021-02-11 11:08:48 +01:00
Seth Hoenig 7d6e81e9e4
Merge pull request #9990 from hashicorp/f-nsiso-task
drivers/exec+java: Add task configuration to restore previous PID/IPC isolation behavior
2021-02-09 13:29:14 -06:00
Seth Hoenig 45e0e70a50 consul/connect: enable custom sidecars to use expose checks
This PR enables jobs configured with a custom sidecar_task to make
use of the `service.expose` feature for creating checks on services
in the service mesh. Before we would check that sidecar_task had not
been set (indicating that something other than envoy may be in use,
which would not support envoy's expose feature). However Consul has
not added support for anything other than envoy and probably never
will, so having the restriction in place seems like an unnecessary
hindrance. If Consul ever does support something other than Envoy,
they will likely find a way to provide the expose feature anyway.

Fixes #9854
2021-02-09 10:49:37 -06:00
Seth Hoenig 8ee9835923 drivers/exec+java: Add task configuration to restore previous PID/IPC isolation behavior
This PR adds pid_mode and ipc_mode options to the exec and java task
driver config options. By default these will defer to the default_pid_mode
and default_ipc_mode agent plugin options created in #9969. Setting
these values to "host" mode disables isolation for the task. Doing so
is not recommended, but may be necessary to support legacy job configurations.

Closes #9970
2021-02-08 14:26:35 -06:00
Drew Bailey b5585882e4
address pr comments 2021-02-08 13:43:05 -05:00
Drew Bailey b0cf3ffa54
on_update check_restart e2e 2021-02-08 10:49:25 -05:00
Drew Bailey 8507d54e3b
e2e test for on_update service checks
check_restart not compatible with on_update=ignore

reword caveat
2021-02-08 08:32:40 -05:00
Chris Baker b1bb8a760e e2e packer build: upgrade jdk to java 14 2021-02-02 17:33:48 +00:00
Mahmood Ali 45889f9f55 e2e: segment e2e clusters
Ensure that the e2e clusters are isolated and never attempt to autojoin
with another e2e cluster.

This ensures that each cluster servers have a unique `ConsulAutoJoin`,
to be used for discovery.
2021-02-01 08:04:21 -05:00
Chris Baker ce68ee164b Version 1.0.3
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJgEuOKAAoJEFGFLYc0j/xMxF8H/3TTU6Tu+Xm0YvcsDaYDphZ/
 X7KQBV0aFiuL5VkTw4PzKEsgryIy9/sqEPyxxyKRowAmos9qhiusjNAIfqdP4TF8
 tdZmTedkfWir9uPD+hyv/LXpwbQ2T8kTwS3xHTYvaOmaCxZr710FEn+imnMk1AUn
 Xs5itkd/CYGr0nBLm+I5GutWSDPmL7Uw8J5Z30fFyoaxoCPAbCWQQNk793SCRUc5
 f/uo18V2tFInmQ+3sAdnM4gPewyStK/a5VvzWavL9fVDtYK83wlqWSchTXY5jpVz
 zNEzt/rYhbBzakPQQKb5zieblh2iGI8aHWpD5w4WduqO2Sg6B/5lAeNZIlW0UJg=
 =2g3c
 -----END PGP SIGNATURE-----

Merge tag 'v1.0.3' into post-release-1.0.3

Version 1.0.3
2021-01-29 19:30:08 +00:00
Chris Baker 2632b81124 lint some nomad HCL job specs 2021-01-28 12:03:19 +00:00
Chris Baker 2adf0f12d6 e2e: java driver isolation tests 2021-01-28 12:03:19 +00:00
Chris Baker aa55df0413 additional e2e utils for multi-task allocs 2021-01-28 12:03:19 +00:00
Kris Hicks d67b77f38e Add a little comment 2021-01-28 12:03:19 +00:00
Kris Hicks 5cf972d2e7 Add test for alloc exec 2021-01-28 12:03:19 +00:00
Kris Hicks 2db8aa2a52 Add e2e test for raw exec 2021-01-28 12:03:19 +00:00
Kris Hicks 87188f04de Add PID namespacing and e2e test 2021-01-28 12:03:19 +00:00
Mahmood Ali c92bb342e1 e2e: skip node drain deadline/force tests 2021-01-27 08:42:16 -05:00
Mahmood Ali b12e8912a9 e2e: use f.NoError instead of requires 2021-01-27 08:36:23 -05:00
Mahmood Ali 1ac8b32e08 e2e: Disable Connect tests
The connect tests are very disruptive: restart consul/nomad agents with new
tokens.  The test seems particularly flaky, failing 32 times out of 73 in my
sample.

The tests are particularly problematic because they are disruptive and affect
other tests. On failure, the nomad or consul agent on the client can get into a
wedged state, so health/deployment info in subsequent tests may be wrong. In
some cases, the node will be deemed as fail, and then the subsequent tests may
fail when the node is deemed lost and the test allocations get migrated unexpectedly.
2021-01-26 10:01:14 -05:00
Mahmood Ali 36ce1e73eb e2e: deflake nodedrain test
The nodedrain deadline test asserts that all allocations are migrated by the
deadline. However, when the deadline is short (e.g. 10s), the test may fail
because of scheduler/client-propagation delays.

In one failing test, it took ~15s from the RPC call to the moment to the moment
the scheduler issued migration update, and then 3 seconds for the alloc to be
stopped.

Here, I increase the timeouts to avoid such false positives.
2021-01-26 10:01:14 -05:00
Mahmood Ali cf8f6f07d7 e2e: vault increase timeout
Increase the timeout for vaultsecrets.  As the default  interval is 0.1s, 10
retries mean it only retries for one second, a very short time for some waiting
scenarios in the test (e.g. starting allocs, etc).
2021-01-26 10:01:14 -05:00
Mahmood Ali 94ad40907c e2e: prefer testutil.WaitForResultRetries
Prefer testutil.WaitForResultRetries that emits more descriptive errors on
failures. `require.Evatually` fails with opaque "Condition never satisfied"
error message.
2021-01-26 10:01:14 -05:00
Mahmood Ali f3f8f15b7b e2e: special case "Unexpected EOF" errors
This is an attempt at deflaking the e2e exec tests, and a way to improve
messages.

e2e occasionally fail with "unexpected EOF" even though the exec output matches
expectations. I suspect there is a race in handling EOF in server/http handling.

Here, we special case this error and ensures we get all failures,
to help debug the case better.
2021-01-26 10:01:14 -05:00
Mahmood Ali 925d9ce952 e2e: tweak failure messages
Tweak the error messages for the flakiest tests, so that on test failure, we get
more output
2021-01-26 09:16:48 -05:00