open-nomad/e2e
Seth Hoenig 51a2212d3d
client: sandbox go-getter subprocess with landlock (#15328)
* client: sandbox go-getter subprocess with landlock

This PR re-implements the getter package for artifact downloads as a subprocess.

Key changes include:

  • On all platforms, run getter as a child process of the Nomad agent.
  • On Linux platforms running as root, run the child process as the nobody user.
  • On supporting Linux kernels, use landlock for filesystem isolation (via go-landlock).
  • On all platforms, restrict environment variables of the child process to a static set.
      • Notably, TMP/TEMP now points within the allocation's task directory.
  • The kernel.landlock attribute is fingerprinted (version number or unavailable).

These changes make Nomad client more resilient against a faulty go-getter implementation that may panic, and more secure against bad actors attempting to use artifact downloads as a privilege escalation vector.

Adds new e2e/artifact suite for ensuring artifact downloading works.

TODO: Windows git test (need to modify the image, etc... followup PR)

* landlock: fixup items from cr

* cr: fixup tests and go.mod file
2022-12-07 16:02:25 -06:00
acl e2e: do not assume clean cluster when checking return objects. (#14557) 2022-09-13 14:25:19 +02:00
affinities
artifact client: sandbox go-getter subprocess with landlock (#15328) 2022-12-07 16:02:25 -06:00
bin
clientstate
connect e2e: use unique names for Connect ACL Consul policy names. (#14604) 2022-09-16 13:35:40 +02:00
consul cleanup more helper updates (#14638) 2022-09-21 14:53:25 -05:00
consulacls
consultemplate build: run gofmt on all go source files 2022-08-16 11:14:11 -05:00
csi
deployment
disconnectedclients e2e: fix 1 of 4 client disconnect tests (#15357) 2022-11-22 08:51:53 -06:00
e2eutil e2e: fix 1 of 4 client disconnect tests (#15357) 2022-11-22 08:51:53 -06:00
eval_priority
events api: remove mapstructure tags from Port struct (#12916) 2022-11-08 11:26:28 +01:00
example
execagent
framework build: run gofmt on all go source files 2022-08-16 11:14:11 -05:00
isolation e2e: explicitly wait on task status in chroot download exec test (#15145) 2022-11-04 09:50:11 -05:00
lifecycle
metrics
namespaces e2e: fix incorrect must function usage in namespace suite. (#14805) 2022-10-05 15:50:56 +02:00
networking
nodedrain
nomadexec
operator_scheduler core: allow pausing and un-pausing of leader broker routine (#13045) 2022-07-06 16:13:48 +02:00
overlap test: use port collision instead of cpu exhaustion (#14994) 2022-10-21 07:53:26 -07:00
oversubscription e2e: fixup oversubscription test case for jammy (#15347) 2022-11-21 12:41:55 -06:00
parameterized
periodic
podman
quotas
remotetasks
rescheduling
scaling cleanup: replace TypeToPtr helper methods with pointer.Of (#14151) 2022-08-17 18:26:34 +02:00
scalingpolicies
scheduler_sysbatch Allow specification of a custom job name/prefix for parameterized jobs (#14631) 2022-10-06 16:21:40 -04:00
scheduler_system
servicediscovery e2e: fixup service discovery and ACL expiration tests. (#14517) 2022-09-09 14:27:40 +02:00
spread e2e: fixes the ordering on greater than checks within spread test. (#14818) 2022-10-06 15:27:36 +02:00
taskevents
terraform e2e: fix 1 of 4 client disconnect tests (#15357) 2022-11-22 08:51:53 -06:00
ui e2e: upgrade playwright package and container image (#13080) 2022-05-20 08:41:07 -04:00
upgrades
vaultcompat cleanup: replace TypeToPtr helper methods with pointer.Of (#14151) 2022-08-17 18:26:34 +02:00
vaultsecrets
volumes
.gitignore
e2e_test.go e2e: move namespaces test out of legacy framework (#13934) 2022-08-01 13:24:34 -04:00
README.md

End to End Tests

This package contains integration tests. Unlike the tests that live alongside Nomad code, these tests expect a functional Nomad cluster to be accessible already (either on localhost or via the NOMAD_ADDR env var).

See framework/doc.go for how to write tests.

The NOMAD_E2E=1 environment variable must be set for these tests to run.

Provisioning Test Infrastructure on AWS

The terraform/ folder has provisioning code to spin up a Nomad cluster on AWS. You'll need both Terraform and AWS credentials to set up the AWS instances on which the e2e tests will run. See the README for details. The number of servers and clients is configurable, as is the specific build of Nomad to deploy and the configuration file for each client and server.

Provisioning Local Clusters

To run tests against a local cluster, you'll need to make sure the following environment variables are set:

  • NOMAD_ADDR should point to one of the Nomad servers
  • CONSUL_HTTP_ADDR should point to one of the Consul servers
  • NOMAD_E2E=1
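
As a minimal sketch of the list above, the variables could be exported and checked before a run. The addresses are assumptions (the default local HTTP ports for Nomad and Consul), and check_e2e_env is a hypothetical helper, not one of the scripts in ./bin:

```shell
# Assumed addresses: Nomad and Consul listening on their default local ports.
# Replace them with the addresses of your own servers.
export NOMAD_ADDR="http://127.0.0.1:4646"
export CONSUL_HTTP_ADDR="127.0.0.1:8500"
export NOMAD_E2E=1

# check_e2e_env is a hypothetical pre-flight helper (not part of this repo).
# It reports every required variable that is unset or empty on stderr and
# returns non-zero if any is missing.
check_e2e_env() {
    missing=0
    for name in NOMAD_ADDR CONSUL_HTTP_ADDR NOMAD_E2E; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing environment variable: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Running check_e2e_env before go test gives an early, readable failure when a variable was forgotten.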

TODO: the scripts in ./bin currently work only with Terraform. It would be nice to have a way to deploy Nomad to Vagrant or local clusters.

Running

After completing the provisioning step above, you can set the client environment for NOMAD_ADDR and run the tests as shown below:

# from the ./e2e/terraform directory, set your client environment
# if you haven't already
$(terraform output environment)

cd ..
go test -v ./...

If you want to run a specific suite, you can specify the -suite flag as shown below. Only the suite with a matching Framework.TestSuite.Component will be run, and all others will be skipped.

go test -v -suite=Consul .

If you want to run a specific test, you'll need to regex-escape some of the test's name so that the test runner doesn't skip over framework struct method names in the full name of the tests:

go test -v . -run 'TestE2E/Consul/\*consul\.ScriptChecksE2ETest/TestGroup'
                              ^       ^             ^               ^
                              |       |             |               |
                          Component   |             |           Test func
                                      |             |
                                  Go Package      Struct
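
The escaping shown above can be wrapped in a small shell function so the pattern does not have to be typed by hand. This is only a sketch: e2e_run_pattern is a hypothetical helper, not something the framework provides, and it assumes the four name parts contain no other regex metacharacters.

```shell
# e2e_run_pattern COMPONENT PACKAGE STRUCT TESTFUNC
# Hypothetical helper: prints a -run pattern for a framework-style test,
# escaping the literal "*" and "." that appear in the full test name.
e2e_run_pattern() {
    printf 'TestE2E/%s/\\*%s\\.%s/%s\n' "$1" "$2" "$3" "$4"
}

# Prints the same pattern used in the example above.
e2e_run_pattern Consul consul ScriptChecksE2ETest TestGroup
```

The output can be passed straight to the test runner, for example go test -v . -run "$(e2e_run_pattern Consul consul ScriptChecksE2ETest TestGroup)".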

We're also in the process of migrating to "stdlib-style" tests that use the standard Go testing package without a notion of "suite". You can run these with -run regexes the same way you would any other Go test:

go test -v . -run TestExample/TestExample_Simple

I Want To...

...SSH Into One Of The Test Machines

You can use the Terraform output to find the IP address. The keys will be in the ./terraform/keys/ directory.

ssh -i keys/nomad-e2e-*.pem ubuntu@${EC2_IP_ADDR}

Run terraform output for IP addresses and details.

...Deploy a Cluster of Mixed Nomad Versions

The variables.tf file describes the nomad_version and nomad_local_binary variables, which are sufficient for most circumstances. But if you want to deploy mixed Nomad versions, you can provide a list of versions in your terraform.tfvars file.

For example, if you want to provision 3 servers all using Nomad 0.12.1, and 2 Linux clients using 0.12.1 and 0.12.2, you can use the following variables:

# will be used for servers
nomad_version = "0.12.1"

# will override the nomad_version for Linux clients
nomad_version_client_linux = [
    "0.12.1",
    "0.12.2"
]

...Deploy Custom Configuration Files

Set the profile field to "custom" and put the configuration files in ./terraform/config/custom/ as described in the README.

...Deploy More Than 4 Linux Clients

Use the "custom" profile as described above.

...Change the Nomad Version After Provisioning

You can update the nomad_version variable, or simply rebuild the binary you have at the nomad_local_binary path so that Terraform picks up the changes. Then run terraform plan/terraform apply again. This will update Nomad in place, making the minimum number of changes necessary.