Consolidate all nomad data dir in a single root
`/tmp/nomad-dev-cluster`. Eases clean up.
Allow running script from any path - don't require devs to cd into
`dev/cluster` directory first.
Also, block while nomad processes are running and prapogate
SIGTERM/SIGINT to nomad processes to shutdown.
diffSystemAllocs -> diffSystemAllocsForNode, this function is only used
for diffing system allocations, but lacked awareness of eligible
nodes and the node ID that the allocation was going to be placed.
This change now ignores a change if its existing allocation is on an
ineligible node. For a new allocation, it also checks tainted and
ineligible nodes in the same function instead of nil-ing out the diff
after computation in diffSystemAllocs
This is from a merge conflict resolution that went the wrong direction.
I assumed the block had been added, but really it had been removed. Now,
it is removed once again.
This test is causing panics. Unlike the other similar tests, this
one is using require.Eventually which is doing something bad, and
this change replaces it with a for-loop like the other tests.
Failure:
=== RUN TestE2E/Connect
=== RUN TestE2E/Connect/*connect.ConnectE2ETest
=== RUN TestE2E/Connect/*connect.ConnectE2ETest/TestConnectDemo
=== RUN TestE2E/Connect/*connect.ConnectE2ETest/TestMultiServiceConnect
=== RUN TestE2E/Connect/*connect.ConnectClientStateE2ETest
panic: Fail in goroutine after TestE2E/Connect/*connect.ConnectE2ETest has completed
goroutine 38 [running]:
testing.(*common).Fail(0xc000656500)
/opt/google/go/src/testing/testing.go:565 +0x11e
testing.(*common).Fail(0xc000656100)
/opt/google/go/src/testing/testing.go:559 +0x96
testing.(*common).FailNow(0xc000656100)
/opt/google/go/src/testing/testing.go:587 +0x2b
testing.(*common).Fatalf(0xc000656100, 0x1512f90, 0x10, 0xc000675f88, 0x1, 0x1)
/opt/google/go/src/testing/testing.go:672 +0x91
github.com/hashicorp/nomad/e2e/connect.(*ConnectE2ETest).TestMultiServiceConnect.func1(0x0)
/home/shoenig/go/src/github.com/hashicorp/nomad/e2e/connect/multi_service.go:72 +0x296
github.com/hashicorp/nomad/vendor/github.com/stretchr/testify/assert.Eventually.func1(0xc0004962a0, 0xc0002338f0)
/home/shoenig/go/src/github.com/hashicorp/nomad/vendor/github.com/stretchr/testify/assert/assertions.go:1494 +0x27
created by github.com/hashicorp/nomad/vendor/github.com/stretchr/testify/assert.Eventually
/home/shoenig/go/src/github.com/hashicorp/nomad/vendor/github.com/stretchr/testify/assert/assertions.go:1493 +0x272
FAIL github.com/hashicorp/nomad/e2e 21.427s
Provide script for managing Consul ACLs on a TF provisioned cluster for
e2e testing. Script can be used to 'enable' or 'disable' Consul ACLs,
and automatically takes care of the bootstrapping process if necessary.
The bootstrapping process takes a long time, so we may need to
extend the overall e2e timeout (20 minutes seems fine).
Introduces basic tests for Consul Connect with ACLs.
Re-orient the management of the tr.kill to happen in the parent of
the spawned goroutine that is doing the actual token derivation. This
makes the code a little more straightforward, making it easier to
reason about not leaking the worker goroutine.
The derivation of an SI token needs to be safegaurded by a context
timeout, otherwise an unresponsive Consul could cause the siHook
to block forever on Prestart.
Apply smaller suggestions like doc strings, variable names, etc.
Co-Authored-By: Nick Ethier <nethier@hashicorp.com>
Co-Authored-By: Michael Schurter <mschurter@hashicorp.com>
The TestEnvoyBootstrapHook_maybeLoadSIToken test case only works when
running as a non-priveleged user, since it deliberately tries to read
an un-readable file to simulate a failure loading the SI token file.
Was thinking about using the testing pattern where you create executable
shell scripts as test resources which "mock" the process a bit of code
is meant to fork+exec. Turns out that wasn't really necessary in this case.
When creating the envoy bootstrap configuration, we should append
the "-token=<token>" argument in the case where the sidsHook placed
the token in the secrets directory.
Nomad jobs may be configured with a TaskGroup which contains a Service
definition that is Consul Connect enabled. These service definitions end
up establishing a Consul Connect Proxy Task (e.g. envoy, by default). In
the case where Consul ACLs are enabled, a Service Identity token is required
for these tasks to run & connect, etc. This changeset enables the Nomad Server
to recieve RPC requests for the derivation of SI tokens on behalf of instances
of Consul Connect using Tasks. Those tokens are then relayed back to the
requesting Client, which then injects the tokens in the secrets directory of
the Task.
When a job is configured with Consul Connect aware tasks (i.e. sidecar),
the Nomad Client should be able to request from Consul (through Nomad Server)
Service Identity tokens specific to those tasks.
Enable any Server to lookup the unique ClusterID. If one has not been
generated, and this node is the leader, generate a UUID and attempt to
apply it through raft.
The value is not yet used anywhere in this changeset, but is a prerequisite
for gh-6701.
This change provides an initial pass at setting up the configuration necessary to
enable use of Connect with Consul ACLs. Operators will be able to pass in a Consul
Token through `-consul-token` or `$CONSUL_TOKEN` in the `job run` and `job revert`
commands (similar to Vault tokens).
These values are not actually used yet in this changeset.