Commit graph

55 commits

Author SHA1 Message Date
freddygv a21e5799f7 Use internal server certificate for peering TLS
A previous commit introduced an internally-managed server certificate
to use for peering-related purposes.

Now the peering token has been updated to match that behavior:
- The server name matches the structure of the server cert
- The CA PEMs correspond to the Connect CA

Note that if Conect is disabled, and by extension the Connect CA, we
fall back to the previous behavior of returning the manually configured
certs and local server SNI.

Several tests were updated to use the gRPC TLS port since they enable
Connect by default. This means that the peering token will embed the
Connect CA, and the dialer will expect a TLS listener.
2022-10-07 09:05:32 -06:00
Chris S. Kim dfaf798995 Add debugging logs to freeport 2022-08-08 15:25:17 -04:00
Luke Kysow e9960dfdf3
peering: default to false (#13963)
* defaulting to false because peering will be released as beta
* Ignore peering disabled error in bundles cachetype

Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>
Co-authored-by: freddygv <freddy@hashicorp.com>
Co-authored-by: Matt Keeler <mjkeeler7@gmail.com>
2022-08-01 15:22:36 -04:00
R.B. Boyer 5b801db24b
peering: move peer replication to the external gRPC port (#13698)
Peer replication is intended to be between separate Consul installs and
effectively should be considered "external". This PR moves the peer
stream replication bidirectional RPC endpoint to the external gRPC
server and ensures that things continue to function.
2022-07-08 12:01:13 -05:00
R.B. Boyer 9ad10318cd
add general runstep test helper instead of copying it all over the place (#13013) 2022-05-10 15:25:51 -05:00
Daniel Nephin 1a9a656a7f sdk: add TestLogLevel for setting log level in tests
And default log level to WARN.
2022-02-03 13:42:28 -05:00
Ashwin Venkatesh fb04a28140
Add support for 'Partition' and 'RetryJoin' (#12126)
- Adding a 'Partition' and 'RetryJoin' command allows test cases where
  one would like to spin up a Consul Agent in a non-default partition to
test use-cases that are common when enabling Admin Partition on
Kubernetes.
2022-01-20 16:49:36 -05:00
Dan Upton 088ba2edaf
[OSS] Remove remaining references to master (#11827) 2022-01-20 12:47:50 +00:00
Daniel Nephin 59a7fb5ca9 testing: use httptest with freeport 2021-11-29 12:01:29 -05:00
FFMMM 27227c0fd2
add root_cert_ttl option for consul connect, vault ca providers (#11428)
* add root_cert_ttl option for consul connect, vault ca providers

Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Chris S. Kim <ckim@hashicorp.com>

* add changelog, pr feedback

Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>

* Update .changelog/11428.txt, more docs

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>

* Update website/content/docs/agent/options.mdx

Co-authored-by: Kyle Havlovitz <kylehav@gmail.com>

Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
Co-authored-by: Kyle Havlovitz <kylehav@gmail.com>
2021-11-02 11:02:10 -07:00
Freddy 27f40ccf51
Update error texts (#11022)
Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
2021-09-14 11:08:06 -06:00
Sergey Matyukevich 19a744191f
Allow configuring graceful stop in testutil (#10566)
* Allow configuring graceful stop in testutil

Signed-off-by: Sergey Matyukevich <s.matyukevich@gmail.com>

* add a changelog

Co-authored-by: Dhia Ayachi <dhia@hashicorp.com>
2021-09-08 11:12:54 -04:00
R.B. Boyer f408c4d387
sdk: Stop making a special /tmp/consul-test directory for testutil.TempFile and testutil.TempDir (#10494) 2021-06-24 15:46:23 -05:00
David Bariod 728165f31e
remove a race condition in sdk testutil server stop (#10342)
* remove a race condition in sdk testutil server stop

* ensure the process has actually stopped
2021-06-09 11:30:26 -04:00
Matt Keeler 58b934133d hcs-1936: Prepare for adding license auto-retrieval to auto-config in enterprise 2021-05-24 13:20:30 -04:00
Matt Keeler 82f5cb3f08 Preparation for changing where license management is done. 2021-05-24 10:19:31 -04:00
Daniel Nephin 71d6a2bf4b Fix some test flakes
- return errors in TestAgent.Start so that the retry works correctly
- remove duplicate logging, the error is returned already
- add a missing t.Helper() to retry.Run
- properly set a.Agent to nil so that subsequent retry attempts will actually try to start
2021-05-10 13:20:45 -04:00
Daniel Nephin 65da18a986 submatview: fix godoc and comment typos
Co-authored-by: Paul Banks <banks@banksco.de>
2021-04-28 12:27:38 -04:00
Daniel Nephin d0b2a4a9d3 sdk/retry: support ending the iteration early
I've found this feature to be very useful in https://pkg.go.dev/gotest.tools/v3/poll#WaitOn

I have encountered a few cases where I wanted that same support, so this commit adds it.
2021-04-27 19:03:17 -04:00
Daniel Nephin f169708bb0 sdk/retry: remove the need to pass args to NextOr 2021-04-27 19:03:17 -04:00
Daniel Nephin e456945466 sdk/retry: a few small debug improvements
On a few occasions I've had to read timeout stack traces for tests and
noticed that retry.Run runs the function in a goroutine. This makes
debuging a timeout more difficult because the gourinte of the retryable
function is disconnected from the stack of the actual test. It requires
searching through the entire stack trace to find the other goroutine.

By using panic instead of runtime.Goexit() we remove the need for a
separate goroutine.

Also a few other small improvements:
* add `R.Helper` so that an assertion function can be used with both
  testing.T and retry.R.
* Pass t to `Retryer.NextOr`, and call `t.Helper` in a number of places
  so that the line number reported by `t.Log` is the line in the test
  where `retry.Run` was called, instead of some line in `retry.go` that
  is not relevant to the failure.
* improve the implementation of `dedup` by removing the need to iterate
  twice. Instad track the lines and skip any duplicate when writing to
  the buffer.
2021-04-21 17:10:46 -04:00
John Eikenberry fe938b9849 [SDK] change all cases of *testing.T to testing.TB
Using the interface opens up the use of all methods to benchmarks as
well as tests.
2021-03-16 15:05:39 -07:00
R.B. Boyer d6dce2332a
connect: intentions are now managed as a new config entry kind "service-intentions" (#8834)
- Upgrade the ConfigEntry.ListAll RPC to be kind-aware so that older
copies of consul will not see new config entries it doesn't understand
replicate down.

- Add shim conversion code so that the old API/CLI method of interacting
with intentions will continue to work so long as none of these are
edited via config entry endpoints. Almost all of the read-only APIs will
continue to function indefinitely.

- Add new APIs that operate on individual intentions without IDs so that
the UI doesn't need to implement CAS operations.

- Add a new serf feature flag indicating support for
intentions-as-config-entries.

- The old line-item intentions way of interacting with the state store
will transparently flip between the legacy memdb table and the config
entry representations so that readers will never see a hiccup during
migration where the results are incomplete. It uses a piece of system
metadata to control the flip.

- The primary datacenter will begin migrating intentions into config
entries on startup once all servers in the datacenter are on a version
of Consul with the intentions-as-config-entries feature flag. When it is
complete the old state store representations will be cleared. We also
record a piece of system metadata indicating this has occurred. We use
this metadata to skip ALL of this code the next time the leader starts
up.

- The secondary datacenters continue to run the old intentions
replicator until all servers in the secondary DC and primary DC support
intentions-as-config-entries (via serf flag). Once this condition it met
the old intentions replicator ceases.

- The secondary datacenters replicate the new config entries as they are
migrated in the primary. When they detect that the primary has zeroed
it's old state store table it waits until all config entries up to that
point are replicated and then zeroes its own copy of the old state store
table. We also record a piece of system metadata indicating this has
occurred. We use this metadata to skip ALL of this code the next time
the leader starts up.
2020-10-06 13:24:05 -05:00
Paul Burlumi c28db3960e testutil: use TestingTB instead of testing.TB 2020-09-10 21:37:02 +01:00
R.B. Boyer a14a31ccf1
sdk: also print test agent logs in verbose mode (#8616) 2020-09-04 11:24:11 -05:00
Daniel Nephin b204e342c5
Merge pull request #8514 from hashicorp/dnephin/testing-improvements-1
testing: small improvements to TestSessionCreate and testutil.retry
2020-08-18 18:26:05 -04:00
Daniel Nephin 3e0d63a6b7 testing: use t.Cleanup in testutil.TempFile
So that it has the same behaviour as TempDir.

Also remove the now unnecessary 'defer os.Remove'
2020-08-14 20:06:01 -04:00
Daniel Nephin 5d53e7dbde Use t.Helper in testutil/retry 2020-08-14 18:55:52 -04:00
Daniel Nephin 2725513eea testutil: Add t.Cleanup to TempDir
TempDir registers a Cleanup so that the directory is always removed. To disable to cleanup, set the TEST_NOCLEANUP
env var.
2020-08-14 13:19:10 -04:00
Daniel Nephin e38d271bd1 Use SIGABRT to get a stack trace when the timeout is hit 2020-08-11 12:12:55 -04:00
Daniel Nephin 71e51263be sdk: mitigate api test timeout
Occasionally we are seeing the go-test-api job timeout at 10 minutes.
Looking at the stack trace I saw the following:

1. Lots of tests blocked on server.Stop in NewTestServerConfigT. This
   suggests that SIGINT is being sent to the server, but the server is
   not properly shutting down.

2. Over 20k goroutines that look like this:

goroutine 16355 [select, 8 minutes]:
net/http.(*persistConn).readLoop(0xc004270240)
    /usr/local/go/src/net/http/transport.go:2099 +0x99e
created by net/http.(*Transport).dialConn
    /usr/local/go/src/net/http/transport.go:1647 +0xc56

Issue 1 seems to be the main problem, but debugging that directly is not
possible because our buffered logs do not get sent when the tests
timeout. To mitigate this problem I've added a timeout to the cmd.Wait()
to force kill the process and return an error.

Unfortunately because we retry this operation, we still may not see the
cause because the next attempt will likely pass. I'm tempted to remove
the retry around NewTestServerConfigT.

Issue 2 seems to be caused by not closing the response body. Since the
request is performed many times in a loop, many goroutines are created
and are not closed until the response body is closed.
2020-08-06 17:00:20 -04:00
Daniel Nephin 80ff174880 testutil: NewLogBuffer - buffer logs until a test fails
Replaces #7559

Running tests in parallel, with background goroutines, results in test output not being associated with the correct test. `go test` does not make any guarantees about output from goroutines being attributed to the correct test case.

Attaching log output from background goroutines also cause data races.  If the goroutine outlives the test, it will race with the test being marked done. Previously this was noticed as a panic when logging, but with the race detector enabled it is shown as a data race.

The previous solution did not address the problem of correct test attribution because test output could still be hidden when it was associated with a test that did not fail. You would have to look at all of the log output to find the relevant lines. It also made debugging test failures more difficult because each log line was very long.

This commit attempts a new approach. Instead of printing all the logs, only print when a test fails. This should work well when there are a small number of failures, but may not work well when there are many test failures at the same time. In those cases the failures are unlikely a result of a specific test, and the log output is likely less useful.

All of the logs are printed from the test goroutine, so they should be associated with the correct test.

Also removes some test helpers that were not used, or only had a single caller. Packages which expose many functions with similar names can be difficult to use correctly.

Related:
https://github.com/golang/go/issues/38458 (may be fixed in go1.15)
https://github.com/golang/go/issues/38382#issuecomment-612940030
2020-07-21 12:50:40 -04:00
Iryna Shustava 5eb8ee0cac
sdk: Use /v1/status/leader endpoint when starting a test server (#8192)
Switch from /v1/agent/self to /v1/status/leader when checking if the test server has come up successfully in the waitForAPI function.

Previously, the test server was relying (probably not intentionally) on the default value of the acl_enforce_version_8 in the TestConfig, which was false. So if you create a test server and enabled ACLs, they would not be enforced and the server would be able to come up pretty quickly because /v1/agent/self would return a 200 status pretty much as soon as the agent is running and most likely before leader election is finished.

Now that we have removed acl_enforce_version_8 property (equivalent to being true by default) if you've created a test server with ACLs enabled, it will need to wait for leader election and for ACLs to be initialized before it'll get a successful response from the /v1/agent/self.

Note: With this change, waitForAPI function no longer requires a 200 response status from the v1/status/leader endpoint. This is because in some tests, namely TestAPI_AgentLeave, we are only running clients, and this endpoint returns a 500 status.
2020-07-07 14:25:17 -07:00
R.B. Boyer 16db20b1f3
acl: remove the deprecated acl_enforce_version_8 option (#7991)
Fixes #7292
2020-05-29 16:16:03 -05:00
R.B. Boyer 9faf8c42d1
sdk: extracting testutil.RequireErrorContains from various places it was duplicated (#7753) 2020-05-01 11:56:34 -05:00
Matt Keeler 111cb51fc8
Testing updates to support namespaced testing of the agent/xds… (#7185)
* Various testing updates to support namespaced testing of the agent/xds package

* agent/proxycfg package updates to support better namespace testing
2020-02-03 09:26:47 -05:00
Chris Piraino 3dd0b59793
Allow users to configure either unstructured or JSON logging (#7130)
* hclog Allow users to choose between unstructured and JSON logging
2020-01-28 17:50:41 -06:00
Matt Keeler 01d647f16a
Unflake the TestAPI_AgentConnectCALeaf test (#7142)
* Unflake the TestAPI_AgentConnectCALeaf test

* Modify the WaitForActiveCARoot to actually verify that at least one root exists
Also verify that the active root id field is set
2020-01-27 14:34:04 -05:00
Chris Piraino 61b92b92bd
Fix up formatting in sdk package (#7109) 2020-01-22 12:45:34 -06:00
Song Yihan 88dc6121d8 tests: fix zombie consul process while invoking TestServer.Stop() method in sdk/testutil in Windows (#6032)
* Fix zombie consul process in Windows 

Windows doesn't support Interrupt signal, thus while stop it on Windows platform
it would fail and left zombie consul process
2020-01-22 17:34:35 +01:00
Paul Banks 1e88978aae
Fix TestAPI_DiscoveryChain_Get flake (#7082) 2020-01-20 14:56:56 +00:00
Mike Morris ac360aff98
sdk: add NewTestServerT, deprecate NewTestServer (#6761)
* prevent nil pointer deref from t.Logf usage

* enforce non-nil *testing.T in NewTestServerT
2019-11-08 17:51:49 -05:00
R.B. Boyer 0425acf58c
sdk: ignore panics due to stray goroutines logging after a test completes (#6632)
If there is imperfect goroutine lifespan tracking if we pipe our logs
through testing.T.Logf there is a chance of a stray goroutine attempting
to log after the test that spawned it completes.

This results in a panic of:

    panic: Log in goroutine after TestLeader_SecondaryCA_Initialize has completed...

This isn't great and should be fixed, but quickly runs into situations
around externally cancelling blocking queries which isn't terribly
possible at the moment. The concession here is to ignore these specific
panics for now.

This can be triggered easily when running some tests with a high
`-count=YYY` value.
2019-10-17 11:01:11 -05:00
Matt Keeler 2040e92e4d
Nil checks in the testWriter to prevent using a bad testing.TB (#6517)
Also needed to update some funcs that were taking a *testing.T to use a testing.TB. This prevents passing a nil pointer as a non-nil interface value
and thus making it impossible to detect nil before using the interfaces functions.
2019-09-20 17:01:08 -04:00
Matt Keeler 5e460b335c
Dont crash in the testutil server creation (#6516) 2019-09-20 15:52:02 -04:00
R.B. Boyer 4a1a7d6fa6 api/watch: reduce timing dependence on tests of watch behavior
Also for debugging purposes send the stdout/stderr streams from consul
processes spawned for API tests to testing.T.Logf
2019-09-19 09:20:53 -05:00
R.B. Boyer 5c5f21088c sdk: add freelist tracking and ephemeral port range skipping to freeport
This should cut down on test flakiness.

Problems handled:

- If you had enough parallel test cases running, the former circular
approach to handling the port block could hand out the same port to
multiple cases before they each had a chance to bind them, leading to
one of the two tests to fail.

- The freeport library would allocate out of the ephemeral port range.
This has been corrected for Linux (which should cover CI).

- The library now waits until a formerly-in-use port is verified to be
free before putting it back into circulation.
2019-09-17 14:30:43 -05:00
Mike Morris 88df658243
connect: remove managed proxies (#6220)
* connect: remove managed proxies implementation and all supporting config options and structs

* connect: remove deprecated ProxyDestination

* command: remove CONNECT_PROXY_TOKEN env var

* agent: remove entire proxyprocess proxy manager

* test: remove all managed proxy tests

* test: remove irrelevant managed proxy note from TestService_ServerTLSConfig

* test: update ContentHash to reflect managed proxy removal

* test: remove deprecated ProxyDestination test

* telemetry: remove managed proxy note

* http: remove /v1/agent/connect/proxy endpoint

* ci: remove deprecated test exclusion

* website: update managed proxies deprecation page to note removal

* website: remove managed proxy configuration API docs

* website: remove managed proxy note from built-in proxy config

* website: add note on removing proxy subdirectory of data_dir
2019-08-09 15:19:30 -04:00
Alvin Huang 5b6fa58453 resolve circleci config conflicts 2019-07-23 20:18:36 -04:00
Christian Muehlhaeuser 2602f6907e Simplified code in various places (#6176)
All these changes should have no side-effects or change behavior:

- Use bytes.Buffer's String() instead of a conversion
- Use time.Since and time.Until where fitting
- Drop unnecessary returns and assignment
2019-07-20 09:37:19 -04:00