Commit Graph

476 Commits

Author SHA1 Message Date
Mahmood Ali 97966c7a71
e2e: Run system jobs on all datacenters (#11060)
Target all e2e datacenters for system and sysbatch e2e tests.  They
require that the system jobs run on all linux clients.

However, the jobs currenly only target `dc1` datacenter, but the nightly
e2e cluster has 4 clients spread in `dc1` and `dc2` datacenters, causing
the tests to fail.

I missed this problem in e2e dev cluster because it only used a single
dc1 datacenter.
2021-08-17 11:01:47 -04:00
Mahmood Ali 28bc234e84 e2e: fix tests
Use basic sleeps in busybox images. busybox are very light, and ping has
permissions complications, and it may fail for network related
issues.
2021-08-03 11:38:35 -04:00
Seth Hoenig 3371214431 core: implement system batch scheduler
This PR implements a new "System Batch" scheduler type. Jobs can
make use of this new scheduler by setting their type to 'sysbatch'.

Like the name implies, sysbatch can be thought of as a hybrid between
system and batch jobs - it is for running short lived jobs intended to
run on every compatible node in the cluster.

As with batch jobs, sysbatch jobs can also be periodic and/or parameterized
dispatch jobs. A sysbatch job is considered complete when it has been run
on all compatible nodes until reaching a terminal state (success or failed
on retries).

Feasibility and preemption are governed the same as with system jobs. In
this PR, the update stanza is not yet supported. The update stanza is sill
limited in functionality for the underlying system scheduler, and is
not useful yet for sysbatch jobs. Further work in #4740 will improve
support for the update stanza and deployments.

Closes #2527
2021-08-03 10:30:47 -04:00
Mahmood Ali 70f541287b
e2e: wait for allocs and deployments (#10967)
As we moved to using `-detach` for registering jobs, we should wait
until allocs and deployments are created before asserting their
properties.

Fixing `TestNodeDrainIgnoreSystem` and `TestRescheduleProgressDeadlineFail` tests as they seem particularly flaky, failing 9 and 7 times (respectively) in the last two weeks.
2021-07-29 10:52:04 -04:00
Mahmood Ali a9bd176742
e2e: use -detach mode when registering jobs with cli (#10877)
Pick up 15d39f0dee but for RegisterFromJobspec:

>  This PR changes the e2e helper thingy to set -detach option
>  when registering a job with the CLI instead of the API. This is
>  necessary for jobs which never become healthy, as the deployment
>  never finishes for failing jobs and the command never returns,
>  causing the test to timeout after 10 minutes.

This case occurs in TestVaultSecrets
2021-07-09 09:25:44 -04:00
Seth Hoenig 80f4340b77 e2e: use -detach mode when registering jobs with cli
This PR changes the e2e helper thingy to set -detach option
when registering a job with the CLI instead of the API. This is
necessary for jobs which never become healthy, as the deployment
never finishes for failing jobs and the command never returns,
causing the test to timeout after 10 minutes.
2021-06-18 12:18:40 -05:00
James Rasell 939b23936a
Merge pull request #10744 from hashicorp/b-remove-duplicate-imports
chore: remove duplicate import statements
2021-06-11 16:42:34 +02:00
James Rasell 2898e5d379
e2e: remove duplicate import statements. 2021-06-11 09:37:23 +02:00
Michael Schurter 319650d481 e2e: use api.ipify.org
ipv4.icanhazip.com returns ipv6 addresses
2021-06-07 15:12:42 -07:00
Mahmood Ali 5258ae480b remove unused Spark security group rules 2021-06-04 11:49:43 -04:00
Mahmood Ali b852dc5eb8 e2e: pass nomad_url variable 2021-06-04 10:32:51 -04:00
Mahmood Ali 71936e1b27 e2e: NOMAD_VERSION is not set when installing url 2021-06-04 10:31:37 -04:00
Mahmood Ali d0768bb999 restrict ingress ip 2021-06-04 10:31:35 -04:00
Luiz Aoqui 139c5e8df9
e2e: fix terraform output environment command instruction (#10674) 2021-06-01 10:10:12 -04:00
Mahmood Ali d8de4e62bb
Merge pull request #10657 from hashicorp/b-alloc-exec-closing
Handle `nomad exec` termination events in order
2021-05-25 14:50:58 -04:00
Mahmood Ali 0853d48927
e2e: Spin clusters with custom url binaries (#10656)
Ease spinning up a cluster, where binaries are fetched from arbitrary
urls.  These could be CircleCI `build-binaries` job artifacts, or
presigned S3 urls.

Co-authored-by: Tim Gross <tgross@hashicorp.com>
2021-05-25 13:47:39 -04:00
Mahmood Ali 3b7c5ff46e e2e: stop suppressing unexpected EOF errors 2021-05-24 13:35:08 -04:00
Tim Gross 709b92c5a8 e2e: update TF lockfile 2021-05-18 09:35:57 -04:00
Tim Gross d4465f01ac E2E: remove references to nomad_sha 2021-05-10 16:42:39 -04:00
Mahmood Ali a33ec72dd7
e2e: enable memory oversubscription (#10557)
Enable memory oversubscription for the oversubscription tests.
2021-05-10 14:33:47 -04:00
Michael Schurter 547a718ef6
Merge pull request #10248 from hashicorp/f-remotetask-2021
core: propagate remote task handles
2021-04-30 08:57:26 -07:00
Michael Schurter 982c65c0c7 comment out unused consts to make linter happy 2021-04-30 08:31:31 -07:00
Seth Hoenig d54a606819
Merge pull request #10439 from hashicorp/pick-ent-acls-changes
e2e: add e2e tests for consul namespaces on ent with acls
2021-04-28 08:30:08 -06:00
Tim Gross 79f81d617e licensing: remove raft storage and sync
This changeset is the OSS portion of the work to remove the raft storage and
sync for Nomad Enterprise.
2021-04-28 10:28:23 -04:00
Michael Schurter 0eb5d5136f e2e: use public_ip in packer 2021-04-27 15:07:03 -07:00
Michael Schurter e62795798d core: propagate remote task handles
Add a new driver capability: RemoteTasks.

When a task is run by a driver with RemoteTasks set, its TaskHandle will
be propagated to the server in its allocation's TaskState. If the task
is replaced due to a down node or draining, its TaskHandle will be
propagated to its replacement allocation.

This allows tasks to be scheduled in remote systems whose lifecycles are
disconnected from the Nomad node's lifecycle.

See https://github.com/hashicorp/nomad-driver-ecs for an example ECS
remote task driver.
2021-04-27 15:07:03 -07:00
Seth Hoenig 09cd01a5f3 e2e: add e2e tests for consul namespaces on ent with acls
This PR adds e2e tests for Consul Namespaces for Nomad Enterprise
with Consul ACLs enabled.

Needed to add support for Consul ACL tokens with `namespace` and
`namespace_prefix` blocks, which Nomad parses and validates before
tossing the token. These bits will need to be picked back to OSS.
2021-04-27 14:45:54 -06:00
Seth Hoenig f258fc8270
Merge pull request #10401 from hashicorp/cp-cns-ent-test-fixes
cherry-pick fixes from cns ent tests
2021-04-20 08:45:15 -06:00
Drew Bailey d42f204a89
remove second deploy that did not have anything to do with the test (#10400) 2021-04-20 08:44:44 -04:00
Seth Hoenig 509490e5d2 e2e: consul namespace tests from nomad ent
(cherry-picked from ent without _ent things)

This is part 2/4 of e2e tests for Consul Namespaces. Took a
first pass at what the parameterized tests can look like, but
only on the ENT side for this PR. Will continue to refactor
in the next PRs.

Also fixes 2 bugs:
 - Config Entries registered by Nomad Server on job registration
   were not getting Namespace set
 - Group level script checks were not getting Namespace set

Those changes will need to be copied back to Nomad OSS.

Nomad OSS + no ACLs (previously, needs refactor)
Nomad ENT + no ACLs (this)
Nomad OSS + ACLs (todo)
Nomad ENT + ALCs (todo)
2021-04-19 15:35:31 -06:00
Seth Hoenig 25810b4cd6 e2e: set PORT on counter-api with host networking 2021-04-16 16:28:39 -06:00
Seth Hoenig 2d693127bb e2e: minor tweaks from CR 2021-04-16 15:32:37 -06:00
Seth Hoenig 7f1191111d e2e: add tests for consul namespaces from nomad oss
This PR adds a set of tests to the Consul test suite for testing
Nomad OSS's behavior around setting Consul Namespace on groups,
which is to ignore the setting (as Consul Namespaces are currently
an Enterprise feature).

Tests are generally a reduced facsimile of existing tests, modified
to check behavior of when group.consul.namespace is set and not set.
Verification is oriented around what happens in Consul; the in-depth
functional correctness of these features is left to the original tests.

Nomad ENT will get its own version of these tests in `namespaces_ent.go`.
2021-04-16 15:32:37 -06:00
Tim Gross dcc5268862 E2E/CSI: ensure jobs are stopped before checking claims are released
During refactoring of the CSI jobs, the EBS test dropped stopping the jobs
before checking that the claims were released.
2021-04-15 11:06:11 -04:00
Seth Hoenig 198e0d9f24 e2e: get consul ent in e2e packer builds
Using Consul Enterprise is going to be necessary for testing Nomad's
Consul Namespace integration in Nomad v1.1 in e2e.
2021-04-14 12:05:55 -06:00
Tim Gross a13590fb37 e2e/csi: fix name of column used for snapshot create output parsing 2021-04-13 09:15:19 -04:00
Tim Gross a84eca0136 E2E: remove broken Move-Item call during Windows provisioning
The archive does not include the `pkg/windows_amd64` path and unpacking the
archive happens in the installation directory.
2021-04-09 09:49:42 -04:00
Tim Gross f4ccb360ef E2E: use remote-exec via TF0.14.7+
The E2E provisioning used local-exec to call ssh in a for loop in a hacky
workaround https://github.com/hashicorp/terraform/issues/25634, which
prevented remote-exec from working on Windows. Move to a newer version of
Terraform that fixes the remote-exec bug to make provisioning more reliable
and observable.

Note that Windows remote-exec needs to include the `powershell` call itself,
unlike Unix-alike remote-exec.
2021-04-08 16:03:06 -04:00
Tim Gross da89103c5c E2E: extend CSI test to cover create and snapshot workflows
Split the EBS and EFS tests out into their own test cases:
* EBS exercises the Controller RPCs, including the create/snapshot workflow.
* EFS exercises only the Node RPCs, and assumes we have an existing volume
that gets registered, rather than created.
2021-04-08 12:55:36 -04:00
Yoan Blanc ac0d5d8bd3
chore: bump golangci-lint from v1.24 to v1.39
Signed-off-by: Yoan Blanc <yoan@dosimple.ch>
2021-04-03 09:50:23 +02:00
Mahmood Ali 85502c1739 oversubscription: e2e tests! 2021-03-30 16:55:58 -04:00
Drew Bailey 7e78d4a607
e2e license smoke test (#10242) 2021-03-26 13:21:47 -04:00
Mahmood Ali dbc3850358
Merge pull request #10145 from hashicorp/b-periodic-init-status
periodic: always reset periodic children status
2021-03-26 09:19:08 -04:00
Drew Bailey 64084f3209
e2e allow setting an enterprise license environment variable (#10233)
* allow setting an enterprise license environment variable

* update comment

* address pr comments
2021-03-25 14:35:55 -04:00
Mahmood Ali e643742a38 Add a test for parameterized summary counts 2021-03-25 11:27:09 -04:00
Tim Gross 46223e190e E2E: bump AWS CSI driver versions 2021-03-24 14:17:38 -04:00
Tim Gross 0e774d40f5 E2E: CSI test should use expected unique-volume name 2021-03-23 08:34:17 -04:00
Tim Gross fa25e048b2
CSI: unique volume per allocation
Add a `PerAlloc` field to volume requests that directs the scheduler to test
feasibility for volumes with a source ID that includes the allocation index
suffix (ex. `[0]`), rather than the exact source ID.

Read the `PerAlloc` field when making the volume claim at the client to
determine if the allocation index suffix (ex. `[0]`) should be added to the
volume source ID.
2021-03-18 15:35:11 -04:00
Charlie Voiselle 0473f35003
Fixup uses of `sanity` (#10187)
* Fixup uses of `sanity`
* Remove unnecessary comments.

These checks are better explained by earlier comments about
the context of the test. Per @tgross, moved the tests together
to better reinforce the overall shared context.

* Update nomad/fsm_test.go
2021-03-16 18:05:08 -04:00
Tim Gross 2a2e36690a docs: swap master for main in Nomad repo 2021-03-08 14:26:31 -05:00