open-consul

Commit Graph

Author	SHA1	Message	Date
Chris S. Kim	05b2e290b8	Update go version to 1.19	2022-10-24 16:12:08 -04:00
cskh	163b10a562	integration test: specify image name for testing dev container (#15069 ) * integration test: specify image name for testing dev container * Add todo	2022-10-20 17:12:36 -04:00
cskh	ecf797e789	chore: fix the module path to align with the code structure (#15053 )	2022-10-19 10:36:35 -04:00
R.B. Boyer	e5b73a7b12	test: possibly fix flaky TestEnvoy/case-ingress-gateway-multiple-services test (#15034 ) The integration test TestEnvoy/case-ingress-gateway-multiple-services is flaky and this possibly reduces the flakiness by explicitly waiting for services to show up in the catalog as healthy before waiting for them to show up in envoy as healthy which gives it just a bit more time to sync.	2022-10-18 14:23:52 -05:00
freddygv	a0bcf4b941	Add integ test for peering through gateways	2022-10-13 14:58:05 -06:00
freddygv	6ef8d329d2	Require Connect and TLS to generate peering tokens By requiring Connect and a gRPC TLS listener we can automatically configure TLS for all peering control-plane traffic.	2022-10-07 09:06:29 -06:00
Eric Haberkorn	2f08fab317	Make the mesh gateway changes to allow `local` mode for cluster peering data plane traffic (#14817 ) Make the mesh gateway changes to allow `local` mode for cluster peering data plane traffic	2022-10-06 09:54:14 -04:00
Alex Oskotsky	4d9309327f	Add the ability to retry on reset connection to service-routers (#12890 )	2022-10-05 13:06:44 -04:00
John Murret	08203ace4a	Upgrade serf to v0.10.1 and memberlist to v0.5.0 to get memberlist size metrics and broadcast queue depth metric (#14873 ) * updating to serf v0.10.1 and memberlist v0.5.0 to get memberlist size metrics and memberlist broadcast queue depth metric * update changelog * update changelog * correcting changelog * adding "QueueCheckInterval" for memberlist to test * updating integration test containers to grab latest api	2022-10-04 17:51:37 -06:00
Derek Menteer	77d640c322	Add peering integration tests (#14836 ) Add peering integration tests.	2022-10-04 13:51:04 -05:00
Eric Haberkorn	2178e38204	Rename `PeerName` to `Peer` on prepared queries and exported services (#14854 )	2022-10-04 14:46:15 -04:00
Luke Kysow	12ac96c356	Remove terminal colouring from test output so it is (#14810 ) more readable in CI. ``` Running primary verification step for case-ingress-gateway-multiple-services... �[34;1mverify.bats �[0m�[1G ingress proxy admin is up on :20000�[K�[75G 1/12�[2G�[1G ✓ ingress proxy admin is up on :20000�[K �[0m�[1G s1 proxy admin is up on :19000�[K�[75G 2/12�[2G�[1G ✓ s1 proxy admin is up on :19000�[K �[0m�[1G s2 proxy admin is up on :19001�[K�[75G 3/12�[2G�[1G ✓ s2 proxy admin is up on :19001�[K �[0m�[1G s1 proxy listener should be up and have right cert�[K�[75G 4/12�[2G�[1G ✓ s1 proxy listener should be up and have right cert�[K �[0m�[1G s2 proxy listener should be up and have right cert�[K�[75G 5/12�[2G�[1G ✓ s2 proxy listener should be up and have right cert�[K �[0m�[1G ingress-gateway should have healthy endpoints for s1�[K�[75G 6/12�[2G�[31;1m�[1G ✗ ingress-gateway should have healthy endpoints for s1�[K �[0m�[31;22m (from function `assert_upstream_has_endpoints_in_status' in file /workdir/primary/bats/helpers.bash, line 385, ``` versus ``` Running primary verification step for case-ingress-gateway-multiple-services... 1..12 ok 1 ingress proxy admin is up on :20000 ok 2 s1 proxy admin is up on :19000 ok 3 s2 proxy admin is up on :19001 ok 4 s1 proxy listener should be up and have right cert ok 5 s2 proxy listener should be up and have right cert not ok 6 ingress-gateway should have healthy endpoints for s1 not ok 7 s1 proxy should have been configured with max_connections in services ok 8 ingress-gateway should have healthy endpoints for s2 ```	2022-10-04 08:35:19 -07:00
cskh	1b587e7d13	fix flaky integration test (#14843 )	2022-10-03 16:55:05 -04:00
cskh	4ece020bf1	feat(ingress gateway: support configuring limits in ingress-gateway c… (#14749 ) * feat(ingress gateway: support configuring limits in ingress-gateway config entry - a new Defaults field with max_connections, max_pending_connections, max_requests is added to ingress gateway config entry - new field max_connections, max_pending_connections, max_requests in individual services to overwrite the value in Default - added unit test and integration test - updated doc Co-authored-by: Chris S. Kim <ckim@hashicorp.com> Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com> Co-authored-by: Dan Stough <dan.stough@hashicorp.com>	2022-09-28 14:56:46 -04:00
DanStough	de58c872ff	release updates for 1.13.2, 1.12.5, and 1.11.9	2022-09-21 15:07:44 -04:00
Evan Culver	f36157c025	Add more content to integration test docs (#14613 )	2022-09-14 16:13:23 -07:00
Evan Culver	aa40adf97e	connect: Bump latest Envoy to 1.23.1 in test matrix (#14573 )	2022-09-14 13:20:16 -07:00
Luke Kysow	8e4e0c23aa	Document integration tests (#14391 )	2022-09-13 10:00:02 -07:00
Eric Haberkorn	1490eedfbc	Implement Cluster Peering Redirects (#14445 ) implement cluster peering redirects	2022-09-09 13:58:28 -04:00
Luke Kysow	095934116e	Suppress "unbound variable" error. (#14424 ) Without this change, you'd see this error: ``` ./run-tests.sh: line 49: LAMBDA_TESTS_ENABLED: unbound variable ./run-tests.sh: line 49: LAMBDA_TESTS_ENABLED: unbound variable ```	2022-08-31 13:06:35 -07:00
Eric Haberkorn	06e7f3cadb	Finish up cluster peering failover (#14396 )	2022-08-30 11:46:34 -04:00
Luke Kysow	63df49b440	Run integration tests locally using amd64 (#14365 ) Locally, always run integration tests using amd64, even if running on an arm mac. This ensures the architecture locally always matches the CI/CD environment. In addition: * Use consul:local for envoy integration and upgrade tests. Previously, consul:local was used for upgrade tests and consul-dev for integration tests. I didn't see a reason to use separate images as it's more confusing. * By default, disable the requirement that aws credentials are set. These are only needed for the lambda tests and make it so you can't run any tests locally, even if you're not running the lambda tests. Now they'll only run if the LAMBDA_TESTS_ENABLED env var is set. * Split out the building of the Docker image for integration tests into its own target from `dev-docker`. This allows us to always use an amd64 image without messing up the `dev-docker` target. * Add support for passing GO_TEST_FLAGs to `test-envoy-integ` target. * Add a wait_for_leader function because tests were failing locally without it.	2022-08-29 16:13:49 -07:00
Eric Haberkorn	11884bfb99	Refactor failover code to use Envoy's aggregate clusters (#14178 )	2022-08-12 14:30:46 -04:00
Chris S. Kim	e74db7732e	Retry docker build steps	2022-08-08 12:22:16 -04:00
Luke Kysow	e9960dfdf3	peering: default to false (#13963 ) * defaulting to false because peering will be released as beta * Ignore peering disabled error in bundles cachetype Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com> Co-authored-by: freddygv <freddy@hashicorp.com> Co-authored-by: Matt Keeler <mjkeeler7@gmail.com>	2022-08-01 15:22:36 -04:00
Chris S. Kim	c752c5bff2	Update envoy metrics label extraction for peered clusters and listeners (#13818 ) Now that peered upstreams can generate envoy resources (#13758), we need a way to disambiguate local from peered resources in our metrics. The key difference is that datacenter and partition will be replaced with peer, since in the context of peered resources partition is ambiguous (could refer to the partition in a remote cluster or one that exists locally). The partition and datacenter of the proxy will always be that of the source service. Regexes were updated to make emitting datacenter and partition labels mutually exclusive with peer labels. Listener filter names were updated to better match the existing regex. Cluster names assigned to peered upstreams were updated to be synthesized from local peer name (it previously used the externally provided primary SNI, which contained the peer name from the other side of the peering). Integration tests were updated to assert for the new peer labels.	2022-07-25 13:49:00 -04:00
Chris Thain	00f9dc2a70	Add Consul Lambda integration tests (#13770 )	2022-07-21 09:54:56 -07:00
Evan Culver	285b4cef2b	connect: Add support for Envoy 1.23, remove 1.19 (#13807 )	2022-07-19 14:51:04 -07:00
R.B. Boyer	61ebb38092	server: ensure peer replication can successfully use TLS over external gRPC (#13733 ) Ensure that the peer stream replication rpc can successfully be used with TLS activated. Also: - If key material is configured for the gRPC port but HTTPS is not enabled now TLS will still be activated for the gRPC port. - peerstream replication stream opened by the establishing-side will now ignore grpc.WithBlock so that TLS errors will bubble up instead of being awkwardly delayed or suppressed	2022-07-15 13:15:50 -05:00
Evan Culver	d9643ca499	Latest submodule versions (#13750 )	2022-07-15 09:58:21 -07:00
R.B. Boyer	5b801db24b	peering: move peer replication to the external gRPC port (#13698 ) Peer replication is intended to be between separate Consul installs and effectively should be considered "external". This PR moves the peer stream replication bidirectional RPC endpoint to the external gRPC server and ensures that things continue to function.	2022-07-08 12:01:13 -05:00
R.B. Boyer	115000144b	xds: mesh gateways now correctly load up peer-exported discovery chains using L7 protocols (#13624 ) A mesh gateway will now configure the filter chains for L7 exported services using the correct discovery chain information.	2022-06-28 14:52:25 -05:00
R.B. Boyer	f3f941f1a0	test: for upgrade compatibility tests retain assigned container ip addresses on upgrade (#13615 ) Use a synthetic pod construct to hold onto the IP address in the interim.	2022-06-28 09:50:13 -05:00
Dan Upton	883ccc2a98	test: run Envoy integration tests against both servers and clients (#13610 )	2022-06-28 13:15:45 +01:00
R.B. Boyer	988919a581	tests: ensure integration tests show logs from the containers to help debugging (#13593 )	2022-06-24 10:26:17 -05:00
Dhia Ayachi	7eedeeecfc	update github.com/containerd/containerd to 1.5.13 (#13520 )	2022-06-21 12:20:00 -04:00
cskh	340a194894	Load test, upgrade packer version, fix k6s installation (#13382 ) - fix sg: need remote access to test server - Give the load generator a name - Update loadtest hcl filename in readme - Add terraform init - Disable access to the server machine by default	2022-06-15 09:29:38 -04:00
Evan Culver	ca7acd2970	connect: Use Envoy 1.22.2 instead of 1.22.1 (#13444 )	2022-06-14 15:29:41 -07:00
Evan Culver	2adb9f7c8a	connect: Update Envoy support matrix to latest patch releases (#13431 )	2022-06-14 13:19:09 -07:00
R.B. Boyer	33b497e7c9	peering: rename initiate to establish in the context of the APIs (#13419 )	2022-06-10 11:10:46 -05:00
R.B. Boyer	d81d8468db	peering: mesh gateways are required for cross-peer service mesh communication (#13410 ) Require use of mesh gateways in order for service mesh data plane traffic to flow between peers. This also adds plumbing for envoy integration tests involving peers, and one starter peering test.	2022-06-09 11:05:18 -05:00
R.B. Boyer	db48b4f5a2	test: break dep on main consul module (#13373 ) The main consul module is not a great library and complicates some oss/ent module issues. This undoes #13371	2022-06-06 16:06:39 -05:00
R.B. Boyer	977b39cde1	test: use a go mod replace trick for the compat test dependency on the main repo (#13371 )	2022-06-06 14:12:49 -05:00
cskh	bd3a4dfeab	Add isLeader metric to track if a server is a leader (#13304 ) CTIA-21: add is_leader metric to track if a server is a leader Co-authored-by: alex <8968914+acpana@users.noreply.github.com>	2022-06-03 13:07:37 -04:00
cskh	206c991058	CTIA-16: add tags to load test resources and run test on PR commit (#13258 ) - retry destroy terraform resources	2022-05-27 14:49:39 -04:00
DanStough	2c8ca25d8a	chore(test): Update bats version	2022-05-24 11:56:08 -04:00
R.B. Boyer	6427784846	test: fix more flakes in the compatibility test (#13145 )	2022-05-19 14:05:41 -05:00
R.B. Boyer	e8bbcaca86	test: cleanup and unflake parts of the upgrade compat tests (#13126 )	2022-05-18 14:52:26 -05:00
Hui Kang	3a6547740f	change to var.vpc_cidr	2022-05-16 16:49:46 -04:00
Hui Kang	bc5f13e331	fix insecure cidr_blocks in load test	2022-05-16 16:37:45 -04:00
Dhia Ayachi	f012db0098	sync changes to healthcheck tests (#12984 )	2022-05-09 15:00:46 -04:00
Dhia Ayachi	516213a93e	Create clients with specific version for integration tests (#12978 ) * tidy code and add some doc strings * add doc strings to tests * add partitions tests, need to adapt to run in both oss and ent * split oss and enterprise versions * remove parallel tests * add error * fix queryBackend in test * revert unneeded change * fix failing tests	2022-05-09 14:36:49 -04:00
R.B. Boyer	7d20b68959	ci: upgrade bats and the circle machine executors to get integration tests to function again (#12918 ) Bonus change: send less context when building the test-sds-server to speed up the setup.	2022-05-03 11:21:32 -05:00
Dhia Ayachi	e3eab78fb3	try to read license from env and mapped to container (#12854 )	2022-04-25 11:58:29 -04:00
Dhia Ayachi	fe22a002e1	Add versions compatibility tests between Consul (#12702 ) * add a sample * Consul cluster test * add build dockerfile * add tests to cover mixed versions tests * use flag to pass docker image name * remove default config and rely on flags to inject the right image to test * add cluster abstraction * fix imports and remove old files * fix imports and remove old files * fix dockerIgnore * make a `Node interface` and encapsulate ConsulContainer * fix a test bug where we only check the leader against a single node. * add upgrade tests to CI * fix yaml alignment * fix alignment take 2 * fix flag naming * fix image to build * fix test run and go mod tidy * add a debug command * run without RYUK * fix parallel run * add skip reaper code * make tempdir in local dir * chmod the temp dir to 0777 * chmod the right dir name * change executor to use machine instead of docker * add docker layer caching * remove setup docker * add gotestsum * install go version * use variable for GO installed version * add environment * add environment in the right place * do not disable RYUK in CI * add service check to tests * assertions outside routines * add queryBackend to the api query meta. * check if we are using the right backend for those tests (streaming) * change the tested endpoint to use one that have streaming. * refactor to test multiple scenarios for streaming * Fix dockerfile Co-authored-by: FFMMM <FFMMM@users.noreply.github.com> * rename Clients to clients Co-authored-by: FFMMM <FFMMM@users.noreply.github.com> * check if cluster have 0 node * tidy code and add some doc strings * use uuid instead of random string * add doc strings to tests * add queryBackend to the api query meta. * add a changelog * fix for api backend query * add missing require * fix q.QueryBackend * Revert "fix q.QueryBackend" This reverts commit cd0e5f7b1a1730e191673d624f8e89b591871c05. * fix circle ci config * tidy go mod after merging main * rename package and fix test scenario * update go download url * address review comments * rename flag in CI * add readme to the upgrade tests * fix golang download url * fix golang arch downloaded * fix AddNodes to handle an empty cluster case * use `parseBool` * rename circle job and add comment * update testcontainer to 0.13 * fix circle ci config * remove build docker file and use `make dev-docker` instead * Apply suggestions from code review Co-authored-by: Dan Upton <daniel@floppy.co> * fix a typo Co-authored-by: FFMMM <FFMMM@users.noreply.github.com> Co-authored-by: Dan Upton <daniel@floppy.co>	2022-04-25 10:41:36 -04:00
Evan Culver	9d0b5bf8e9	connect: Add Envoy 1.22 to integration tests, remove Envoy 1.18 (#12805 ) Co-authored-by: R.B. Boyer <rb@hashicorp.com>	2022-04-18 09:36:07 -07:00
DanStough	a050aa39b9	Update go version to 1.18.1	2022-04-18 11:41:10 -04:00
Evan Culver	e62745c82c	connect: Add Envoy 1.21.1 to support matrix, remove 1.17.4 (#12777 )	2022-04-14 10:44:42 -07:00
R.B. Boyer	13ab14e60c	test: use docker buildkit backend for envoy integration tests (#12726 )	2022-04-11 10:49:44 -05:00
R.B. Boyer	12523197f9	syncing changes back from enterprise (#12701 )	2022-04-05 15:46:56 -05:00
Evan Culver	7889071385	connect: Update supported Envoy versions to include 1.19.3 and 1.18.6	2022-02-24 16:59:33 -08:00
Evan Culver	9f4d9f3f74	connect: Upgrade Envoy 1.20 to 1.20.2 (#12443 )	2022-02-24 16:19:39 -08:00
Evan Culver	ec65890f01	connect: Upgrade Envoy 1.20 to 1.20.1 (#11895 )	2022-01-18 14:35:27 -05:00
Chris S. Kim	d85d16a0c9	Fix integration test with updated file perms (#11916 )	2021-12-23 19:00:02 -05:00
freddygv	a1c1e36be7	Allow cross-partition references in disco chain * Add partition fields to targets like service route destinations * Update validation to prevent cross-DC + cross-partition references * Handle partitions when reading config entries for disco chain * Encode partition in compiled targets	2021-12-06 12:32:19 -07:00
freddygv	d32bc117d8	Fix integ test	2021-12-03 17:02:57 -07:00
R.B. Boyer	83bf7ab3ff	re-run gofmt on 1.17 (#11579 ) This should let freshly recompiled golangci-lint binaries using Go 1.17 pass 'make lint'	2021-11-16 12:04:01 -06:00
freddygv	42d9542ac3	Add cross-partition integration test	2021-11-12 14:45:50 -07:00
freddygv	2d2ff0ae39	Bump retry time for cross-DC RPC The secondary DC now takes longer to populate the MGW snapshot because it needs to wait for the secondary CA to be initialized before it can receive roots and generate xDS config. Previously MGWs could receive empty roots before the CA was initialized. This wasn't necessarily a problem since the cluster ID in the trust domain isn't verified.	2021-11-10 12:00:00 -07:00
Dhia Ayachi	4d763ef9e6	regenerate expired certs (#11462 ) * regenerate expired certs * add documentation to generate tests certificates	2021-11-01 11:40:16 -04:00
Evan Culver	b3c92f22b1	connect: Remove support for Envoy 1.16 (#11354 )	2021-10-27 18:51:35 -07:00
Evan Culver	98acbfa79c	connect: Add support for Envoy 1.20 (#11277 )	2021-10-27 18:38:10 -07:00
R.B. Boyer	63c50e58a0	test: pin the version of bats to one that works on CircleCI (#11401 )	2021-10-22 17:06:25 -05:00
R.B. Boyer	f7932b4ffa	test: remove some envoy integ test warnings (#11369 ) We launch one container as part of the test with --pid=host but apparently within that container it launches a copy of "tini" as a process supervisor that prefers to be PID 1. Because it's not PID 1 it logs a warning message about this to the envoy integration test logs that can lead to thinking somehow that a test failure is related when in fact it's completely unrelated. Adding this environment variable avoids the warning.	2021-10-20 15:50:45 -05:00
Evan Culver	e2363c13ff	Merge branch 'main' into eculver/envoy-1.19.1	2021-09-28 11:54:33 -07:00
Paul Banks	8c8cde524e	Add Envoy integration test for split-route SDS case	2021-09-23 10:17:03 +01:00
Paul Banks	626232e4cd	Minor improvements to SDS server from review	2021-09-23 10:13:41 +01:00
Paul Banks	3b2a4fc458	Allow skipping v2 compat tests for SDS as it's only the SDS server integration that doesn't support v2	2021-09-23 10:12:37 +01:00
Paul Banks	cd6491ea71	Fix integration tests in CI - serve SDS certs from the Docker image not a mounted path	2021-09-23 10:12:37 +01:00
Paul Banks	c2174260bc	Fix integration test for older Envoy versions	2021-09-23 10:12:37 +01:00
Paul Banks	1f62bca08b	Add basic integration test for Envoy ingress with SDS	2021-09-23 10:08:02 +01:00
Evan Culver	b104b7719c	add envoy 1.19.1	2021-09-21 15:39:36 -07:00
Paul Banks	46400a033f	Add Envoy integration test to show Header manip can interpolate Envoy variables	2021-09-10 21:09:24 +01:00
Paul Banks	bc1c86df96	Integration tests for all new header manip features	2021-09-10 21:09:24 +01:00
Freddy	11672defaf	connect: update envoy supported versions to latest patch release (#10961) Relevant advisory: https://github.com/envoyproxy/envoy/security/advisories/GHSA-6g4j-5vrw-2m8h	2021-08-31 10:39:18 -06:00
Kyle Havlovitz	98969c018a	oss: Rename default partition	2021-08-12 14:31:37 -07:00
Matt Keeler	58b934133d	hcs-1936: Prepare for adding license auto-retrieval to auto-config in enterprise	2021-05-24 13:20:30 -04:00
R.B. Boyer	05b52a3d63	connect: update supported envoy versions to 1.18.3, 1.17.3, 1.16.4, and 1.15.5 (#10231 )	2021-05-12 14:06:06 -05:00
Daniel Nephin	61525f9a95	fix failing integration tests The new IDs include a leading slash for the partition ID section	2021-05-06 13:30:07 -04:00
R.B. Boyer	97e57aedfb	connect: update supported envoy versions to 1.18.2, 1.17.2, 1.16.3, and 1.15.4 (#10101 ) The only thing that needed fixing up pertained to this section of the 1.18.x release notes: > grpc_stats: the default value for stats_for_all_methods is switched from true to false, in order to avoid possible memory exhaustion due to an untrusted downstream sending a large number of unique method names. The previous default value was deprecated in version 1.14.0. This only changes the behavior when the value is not set. The previous behavior can be used by setting the value to true. This behavior change by be overridden by setting runtime feature envoy.deprecated_features.grpc_stats_filter_enable_stats_for_all_methods_by_default. For now to maintain status-quo I'm explicitly setting `stats_for_all_methods=true` in all versions to avoid relying upon the default. Additionally the naming of the emitted metrics for these gRPC requests changed slightly so the integration test assertions for `case-grpc` needed adjusting.	2021-04-29 15:22:03 -05:00
R.B. Boyer	91bee6246f	Support Incremental xDS mode (#9855 ) This adds support for the Incremental xDS protocol when using xDS v3. This is best reviewed commit-by-commit and will not be squashed when merged. Union of all commit messages follows to give an overarching summary: xds: exclusively support incremental xDS when using xDS v3 Attempts to use SoTW via v3 will fail, much like attempts to use incremental via v2 will fail. Work around a strange older envoy behavior involving empty CDS responses over incremental xDS. xds: various cleanups and refactors that don't strictly concern the addition of incremental xDS support Dissolve the connectionInfo struct in favor of per-connection ResourceGenerators instead. Do a better job of ensuring the xds code uses a well configured logger that accurately describes the connected client. xds: pull out checkStreamACLs method in advance of a later commit xds: rewrite SoTW xDS protocol tests to use protobufs rather than hand-rolled json strings In the test we very lightly reuse some of the more boring protobuf construction helper code that is also technically under test. The important thing of the protocol tests is testing the protocol. The actual inputs and outputs are largely already handled by the xds golden output tests now so these protocol tests don't have to do double-duty. This also updates the SoTW protocol test to exclusively use xDS v2 which is the only variant of SoTW that will be supported in Consul 1.10. xds: default xds.Server.AuthCheckFrequency at use-time instead of construction-time	2021-04-29 13:54:05 -05:00
R.B. Boyer	6924586874	test: switch envoy integration tests to use pkill instead of ps+grep+awk+kill (#10097 )	2021-04-23 13:23:33 -05:00
Freddy	7682aa341e	Check for optionally prepended namespace in upstream assertion (#10049 )	2021-04-15 18:31:28 -06:00
Yong Wen Chua	db406b700c	Update assertion to not check for port	2021-04-06 17:10:38 +08:00
Yong Wen Chua	85aa7f2785	Merge branch 'master' of github.com:hashicorp/consul into tg-rewrite	2021-04-06 17:05:26 +08:00
R.B. Boyer	503041f216	xds: default to speaking xDS v3, but allow for v2 to be spoken upon request (#9658 ) - Also add support for envoy 1.17.0	2021-02-26 16:23:15 -06:00
Yong Wen Chua	e73781948d	Add integration test check	2021-02-24 16:24:32 +08:00
R.B. Boyer	cdc5e99184	xds: remove deprecated usages of xDS (#9602 ) Note that this does NOT upgrade to xDS v3. That will come in a future PR. Additionally: - Ignored staticcheck warnings about how github.com/golang/protobuf is deprecated. - Shuffled some agent/xds imports in advance of a later xDS v3 upgrade. - Remove support for envoy 1.13.x but don't add in 1.17.x yet. We have to wait until the xDS v3 support is added in a follow-up PR. Fixes #8425	2021-02-22 15:00:15 -06:00
R.B. Boyer	e87d2bb24f	xds: only try to create an ipv6 expose checks listener if ipv6 is supported by the kernel (#9765 ) Fixes #9311 This only fails if the kernel has ipv6 hard-disabled. It is not sufficient to merely not provide an ipv6 address for a network interface.	2021-02-19 14:38:43 -06:00
Alvin Huang	e3a4d843da	remove reference to docker/ path for old docker mirror (#9783 )	2021-02-17 18:37:31 -05:00
Michele Degges	005c48d641	Remove jfrog references (#9782 )	2021-02-17 18:21:52 -05:00
R.B. Boyer	194fb0d144	connect: update supported envoy point releases to 1.16.2, 1.15.3, 1.14.6, 1.13.7 (#9737 )	2021-02-10 13:11:15 -06:00
s-christoff	8fc4de0ead	docs: Update load test documentation and minor clean ups (#9548 )	2021-01-15 12:41:06 -06:00
Daniel Nephin	0fa754f490	Pin alpine/socat image to a version. To fix failing integration tests. The latest version (`1.7.4.0-r0`) appears to not be catting all the bytes, so the expected metrics are missing in the output.	2021-01-06 18:01:39 -05:00
s-christoff	a09737c51f	Up testing threshold for Circle (#9418 )	2020-12-17 13:25:05 -05:00
s-christoff	a907922ddb	Minor load test fixes (#9394 )	2020-12-15 17:03:44 -06:00
s-christoff	e9512fbff6	Allow consul version/consul download url to be inputted via Terraform (#9267 )	2020-12-11 13:11:14 -06:00
Freddy	2763833d32	Add DC and NS support for Envoy metrics (#9207 ) This PR updates the tags that we generate for Envoy stats. Several of these come with breaking changes, since we can't keep two stats prefixes for a filter.	2020-11-16 16:37:19 -07:00
Mike Morris	a343365da7	ci: update to Go 1.15.4 and alpine:3.12 (#9036 ) * ci: stop building darwin/386 binaries Go 1.15 drops support for 32-bit binaries on Darwin https://golang.org/doc/go1.15#darwin * tls: ConnectionState::NegotiatedProtocolIsMutual is deprecated in Go 1.15, this value is always true * correct error messages that changed slightly * Completely regenerate some TLS test data Co-authored-by: R.B. Boyer <rb@hashicorp.com>	2020-11-13 13:02:59 -05:00
R.B. Boyer	adff2316c9	test: use direct service registration in envoy integration tests (#9138 ) This has the biggest impact on enterprise test cases that use namespaced registrations, which prior to this change sometimes failed the initial registration because the namespace was not yet created.	2020-11-09 13:59:46 -06:00
R.B. Boyer	9b37ea7dcb	Revert "Add namespace support for metrics (OSS) (#9117 )" (#9124 ) This reverts commit 06b3b017d326853dbb53bc0ec08ce371265c5ce9.	2020-11-06 10:24:32 -06:00
Freddy	874efe705f	Add namespace support for metrics (OSS) (#9117 )	2020-11-05 18:24:29 -07:00
Aaron Lane	031a1a12b1	Merge pull request #9112 from hashicorp/aaron-lane-patch-2 Update loadtest AMI name, description	2020-11-05 15:36:22 -05:00
Aaron Lane	d3bb81830b	Update loadtest AMI name, description This commit updates the Packer properties `ami_name` and `ami_description` for the loadtest image to reflect the image intent.	2020-11-05 15:05:46 -05:00
Aaron Lane	72cc81d85f	Link to packer directory from terraform ReadMe	2020-11-05 15:00:56 -05:00
R.B. Boyer	5c6d322872	use the docker proxy for more envoy integration test containers (#9085 )	2020-11-02 14:52:33 -06:00
R.B. Boyer	b8a623d3d2	wait_for_namespace should take two args (#9086 )	2020-11-02 14:31:19 -06:00
Alvin Huang	102aefdb49	use hashicorp docker mirror in envoy helper (#9080 )	2020-11-02 11:37:03 -06:00
R.B. Boyer	cf5e9872ce	fix envoy integ test wait_for_namespace to actually work on CI (#9082 )	2020-11-02 11:14:48 -06:00
Alvin Huang	d6652b0bc9	use hashicorp docker mirror to prevent rate limit (#9070 )	2020-10-30 17:59:13 -04:00
R.B. Boyer	c8c87ec317	agent: introduce path allow list for requests going through the metrics proxy (#9059 ) Added a new option `ui_config.metrics_proxy.path_allowlist`. This defaults to `["/api/v1/query", "/api/v1/query_range"]` when the metrics provider is set to `prometheus`. Requests that do not use one of the allow-listed paths (via exact match) get a 403 Forbidden response instead.	2020-10-30 16:49:54 -05:00
R.B. Boyer	e3e1d687df	add namespace waiting function to envoy integration tests (#9051 )	2020-10-28 11:58:40 -05:00
R.B. Boyer	ff82fd822f	missed adding the test delay to the l7-intentions envoy integration test (#9052 )	2020-10-28 08:43:11 -05:00
R.B. Boyer	ed3a6bb59d	Fix even more test flakes in intentions related envoy integration tests (#9013 ) The key thing here is to use `curl --no-keepalive` so that envoy pre-1.15 tests will reliably use the latest listener every time. Extra: - Switched away from editing line-item intentions the legacy way. - Removed some teardown scripts, as we don't share anything between cases anyway - Removed unnecessary use of `run` in some places.	2020-10-26 17:04:35 -05:00
R.B. Boyer	846b80e8a5	fix flaky envoy integration tests involving intentions (#8996 ) There is a delay between an intentions change being made, and it being reflected in the Envoy runtime configuration. Now that the enforcement happens inside of Envoy instead of over in the agent, our tests need to explicitly wait until the xDS reconfiguration is complete before attempting to assert intentions worked. Also remove a few double retry loops.	2020-10-22 14:30:28 -05:00
R.B. Boyer	2183842f0e	connect: add support for envoy 1.16.0, drop support for 1.12.x, and bump point releases as well (#8944 ) Supported versions will be: "1.16.0", "1.15.2", "1.14.5", "1.13.6"	2020-10-22 13:46:19 -05:00
R.B. Boyer	5611d230b7	speed up envoy integration tests by removing docker-compose (#8982 ) This speeds up individual envoy integration test runs from ~23m to ~14m. It's also a pre-req for possibly switching to doing the tests entirely within Go (no shell-outs).	2020-10-22 13:20:31 -05:00
R.B. Boyer	292cad3c9d	restore the discovery of tests cases by file system existence (#8983 )	2020-10-19 16:51:38 -05:00
R.B. Boyer	e8587c58fa	speed up envoy integ tests by not politely stopping containers before destroying them (#8969 ) In local testing this sped up the stop_services call from 11s to 1s per test.	2020-10-15 11:51:37 -05:00
R.B. Boyer	35c4efd220	connect: support defining intentions using layer 7 criteria (#8839 ) Extend Consul’s intentions model to allow for request-based access control enforcement for HTTP-like protocols in addition to the existing connection-based enforcement for unspecified protocols (e.g. tcp).	2020-10-06 17:09:13 -05:00
s-christoff	e89eb9fb42	Add load testing framework (#8571 )	2020-10-05 20:16:09 -05:00
R.B. Boyer	4b525e5d16	agent: enable enable_central_service_config by default (#8746 )	2020-10-01 09:19:14 -05:00
Jack	145bcdc2bb	Add http2 and grpc support to ingress gateways (#8458 )	2020-08-27 15:34:08 -06:00
R.B. Boyer	f2b8bf109c	xds: use envoy's rbac filter to handle intentions entirely within envoy (#8569 )	2020-08-27 12:20:58 -05:00
R.B. Boyer	8ea4c482b3	xds: add support for envoy 1.15.0 and drop support for 1.11.x (#8424 ) Related changes: - hard-fail the xDS connection attempt if the envoy version is known to be too old to be supported - remove the RouterMatchSafeRegex proxy feature since all supported envoy versions have it - stop using --max-obj-name-len (due to: envoyproxy/envoy#11740)	2020-07-31 15:52:49 -05:00
Hans Hasselberg	0c39b2c820	add support for envoy 1.14.4, 1.13.4, 1.12.6 (#8216 )	2020-07-13 15:44:44 -05:00
R.B. Boyer	6e3d07c995	xds: version sniff envoy and switch regular expressions from 'regex' to 'safe_regex' on newer envoy versions (#8222 ) - cut down on extra node metadata transmission - split the golden file generation to compare all envoy version	2020-07-09 17:04:51 -05:00
Chris Piraino	9d92c42c90	Append port number to ingress host domain (#8190 ) A port can be sent in the Host header as defined in the HTTP RFC, so we take any hosts that we want to match traffic to and also add another host with the listener port added. Also fix an issue with envoy integration tests not running the case-ingress-gateway-tls test.	2020-07-07 10:43:04 -05:00
Freddy	146afbe9a2	Merge http2 integration test case into grpc case (#8164 ) http2 is covered by grpc since grpc uses http2	2020-06-22 13:09:04 -06:00
Hans Hasselberg	26494286c7	Support envoy 1.14.2, 1.13.2, 1.12.4 (#8057 )	2020-06-10 23:20:17 +02:00
Chris Piraino	5d0cb00ec3	Always require Host header values for http services (#7990 ) Previously, we did not require the 'service-name.' host header value when only a single http service was exposed. However, this allows a user to get into a situation where, if they add another service to the listener, suddenly the previous service's traffic might not be routed correctly. Thus, we always require the Host header, even if there is only 1 service. Also, we make the default domain matching more restrictive by matching "service-name.ingress." by default. This lines up better with the namespace case and more accurately matches the Consul DNS value we expect people to use in this case.	2020-06-08 13:16:24 -05:00
Freddy	f759a48726	Enable gateways to resolve hostnames to IPv4 addresses (#7999 ) The DNS resolution will be handled by Envoy and defaults to LOGICAL_DNS. This discovery type can be overridden on a per-gateway basis with the envoy_dns_discovery_type Gateway Option. If a service contains an instance with a hostname as an address we set the Envoy cluster to use DNS as the discovery type rather than EDS. Since both mesh gateways and terminating gateways route to clusters using SNI, whenever there is a mix of hostnames and IP addresses associated with a service we use the hostname + CDS rather than the IPs + EDS. Note that we detect hostnames by attempting to parse the service instance's address as an IP. If it is not a valid IP we assume it is a hostname.	2020-06-03 15:28:45 -06:00
Daniel Nephin	cd1115330f	ci: fix log capture for envoy integration tests The previous change, which moved test running to Go, appears to have broken log capturing. I am not entirely sure why, but the run_tests function seems to exit on the first error. This change moves test teardown and log capturing out of run_test, and has the go test runner call them when necessary.	2020-06-02 19:24:56 -04:00
Daniel Nephin	dcc76f8774	Make envoy integration tests a `go test` suite (#7842 ) * test/integration: only run against 1 envoy version These tests are slow enough that it seems unlikely that anyone is running multiple versions locally. If someone wants to, a for loop outside of run_test.sh should do the right thing. Remove unused vars. * Remove logic to iterate over test cases, run a single case * Add a golang runner for integration tests * Use build tags for envoy integration tests And add junit-xml report	2020-05-19 14:00:00 -04:00
Kyle Havlovitz	28b4819882	Merge pull request #7759 from hashicorp/ingress/tls-hosts Add TLS option for Ingress Gateway listeners	2020-05-11 09:18:43 -07:00
Chris Piraino	29afac01c8	Set default protocol to http in TLS integration test	2020-05-08 20:23:23 -07:00
Daniel Nephin	ce3aeb85f5	Add outlier_detection check to integration test Fix decoding of time.Duration types.	2020-05-08 14:56:57 -04:00
Chris Piraino	ad8a0544f2	Require individual services in ingress entry to match protocols (#7774 ) We require any non-wildcard services to match the protocol defined in the listener on write, so that we can maintain a consistent experience through ingress gateways. This also helps guard against accidental misconfiguration by a user. - Update tests that require an updated protocol for ingress gateways	2020-05-06 16:09:24 -05:00
Kyle Havlovitz	a198282349	Add a check for custom host to ingress TLS integration test	2020-05-06 15:12:02 -05:00
Kyle Havlovitz	c194e707e6	Add TLS integration test for ingress gateway - Pull Consul Root CA from API in order to verify certificate chain - Assert on the DNSSAN as well to ensure it is correct	2020-05-06 15:12:02 -05:00
Kyle Havlovitz	bd6bb3bf2d	Add TLS option and DNS SAN support to ingress config xds: Only set TLS context for ingress listener when requested	2020-05-06 15:12:02 -05:00
Chris Piraino	210dda5682	Allow Hosts field to be set on an ingress config entry - Validate that this cannot be set on a 'tcp' listener nor on a wildcard service. - Add Hosts field to api and test in consul config write CLI - xds: Configure envoy with user-provided hosts from ingress gateways	2020-05-06 15:06:13 -05:00
Kyle Havlovitz	b21cd112e5	Allow ingress gateways to route traffic based on Host header This commit adds the necessary changes to allow an ingress gateway to route traffic from a single defined port to multiple different upstream services in the Consul mesh. To do this, we now require all HTTP requests coming into the ingress gateway to specify a Host header that matches "<service-name>.*" in order to correctly route traffic to the correct service. - Differentiate multiple listener's route names by port - Adds a case in xds for allowing default discovery chains to create a route configuration when on an ingress gateway. This allows default services to easily use host header routing - ingress-gateways have a single route config for each listener that utilizes domain matching to route to different services.	2020-05-06 15:06:13 -05:00
freddygv	e30d64289d	PR comments	2020-04-27 11:08:41 -06:00
freddygv	7d5a02ad61	Add envoy integration tests	2020-04-27 11:08:40 -06:00
Chris Piraino	b8a5fbf1bf	Allow ingress gateways to route through mesh gateways - Adds integration test for mesh gateways local + remote modes with ingress - ingress golden files updated for mesh gateway endpoints	2020-04-24 09:31:32 -05:00
Kyle Havlovitz	d433679bd8	Add http routing support and integration test to ingress gateways	2020-04-24 09:31:32 -05:00
Kyle Havlovitz	6a5eba63ab	Ingress Gateways for TCP services (#7509 ) * Implements a simple, tcp ingress gateway workflow This adds a new type of gateway for allowing Ingress traffic into Connect from external services. Co-authored-by: Chris Piraino <cpiraino@hashicorp.com>	2020-04-16 14:00:48 -07:00
Pierre Souchay	a3da31c86a	More tolerant assert_alive_wan_member_count to fix unstable tests Example of failure (very frequent): https://circleci.com/gh/hashicorp/consul/157985	2020-04-13 16:02:45 +02:00
Hans Hasselberg	b78220981c	connect: support envoy 1.14.1 (#7624 )	2020-04-09 20:58:22 +02:00
Chris Piraino	d7a870fd32	Fix flapping of mesh gateway connect-service watches (#7575 )	2020-04-02 10:12:13 -05:00
Pierre Souchay	bf7aa070c1	tests: fixed bats warning (#7544 ) This fixes this bats warning: duplicate test name(s) in /workdir/primary/bats/verify.bats: test_s1_upstream_made_1_connection Test was already defined at line 42, rename it to avoid test name duplication	2020-03-31 22:29:27 +02:00
Hans Hasselberg	f49144fcee	connect: support for envoy 1.13.1 and 1.12.3 (#7380 ) * setup new envoy versions for CI * bump version on the website too.	2020-03-10 11:04:46 +01:00
R.B. Boyer	a7fb26f50f	wan federation via mesh gateways (#6884 ) This is like a Möbius strip of code due to the fact that low-level components (serf/memberlist) are connected to high-level components (the catalog and mesh-gateways) in a twisty maze of references which make it hard to dive into. With that in mind here's a high level summary of what you'll find in the patch: There are several distinct chunks of code that are affected: * new flags and config options for the server * retry join WAN is slightly different * retry join code is shared to discover primary mesh gateways from secondary datacenters * because retry join logic runs in the agent and the results of that operation for primary mesh gateways are needed in the server there are some methods like `RefreshPrimaryGatewayFallbackAddresses` that must occur at multiple layers of abstraction just to pass the data down to the right layer. * new cache type `FederationStateListMeshGatewaysName` for use in `proxycfg/xds` layers * the function signature for RPC dialing picked up a new required field (the node name of the destination) * several new RPCs for manipulating a FederationState object: `FederationState:{Apply,Get,List,ListMeshGateways}` * 3 read-only internal APIs for debugging use to invoke those RPCs from curl * raft and fsm changes to persist these FederationStates * replication for FederationStates as they are canonically stored in the Primary and replicated to the Secondaries. * a special derivative of anti-entropy that runs in secondaries to snapshot their local mesh gateway `CheckServiceNodes` and sync them into their upstream FederationState in the primary (this works in conjunction with the replication to distribute addresses for all mesh gateways in all DCs to all other DCs) * a "gateway locator" convenience object to make use of this data to choose the addresses of gateways to use for any given RPC or gossip operation to a remote DC. 
This gets data from the "retry join" logic in the agent and also directly calls into the FSM. * RPC (`:8300`) on the server sniffs the first byte of a new connection to determine if it's actually doing native TLS. If so it checks the ALPN header for protocol determination (just like how the existing system uses the type-byte marker). * 2 new kinds of protocols are exclusively decoded via this native TLS mechanism: one for ferrying "packet" operations (udp-like) from the gossip layer and one for "stream" operations (tcp-like). The packet operations re-use sockets (using length-prefixing) to cut down on TLS re-negotiation overhead. * the server instances specially wrap the `memberlist.NetTransport` when running with gateway federation enabled (in a `wanfed.Transport`). The general gist is that if it tries to dial a node in the SAME datacenter (deduced by looking at the suffix of the node name) there is no change. If dialing a DIFFERENT datacenter it is wrapped up in a TLS+ALPN blob and sent through some mesh gateways to eventually end up in a server's :8300 port. * a new flag when launching a mesh gateway via `consul connect envoy` to indicate that the servers are to be exposed. This sets a special service meta when registering the gateway into the catalog. * `proxycfg/xds` notice this metadata blob to activate additional watches for the FederationState objects as well as the location of all of the consul servers in that datacenter. * `xds:` if the extra metadata is in place additional clusters are defined in a DC to bulk sink all traffic to another DC's gateways. For the current datacenter we listen on a wildcard name (`server.<dc>.consul`) that load balances all servers as well as one mini-cluster per node (`<node>.server.<dc>.consul`) * the `consul tls cert create` command got a new flag (`-node`) to help create an additional SAN in certs that can be used with this flavor of federation.	2020-03-09 15:59:02 -05:00
Matt Keeler	31dde765bf	Change where the envoy snapshots get put when a test fails (#7298 ) This will allow us to capture them in CI	2020-03-05 16:01:10 -05:00
Hans Hasselberg	4ae725cab2	add envoy version 1.12.2 and 1.13.0 to the matrix (#7240 ) * add 1.12.2 * add envoy 1.13.0 * Introduce -envoy-version to get 1.10.0 passing. * update old version and fix consul-exec case * add envoy_version and fix check * Update Envoy CLI tests to account for the 1.13 compatibility changes. Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>	2020-02-10 14:53:04 -05:00
Paschalis Tsilias	1b81cccbf9	Expose Envoy's /stats for statsd agents (#7173 ) * Expose Envoy /stats for statsd agents; Add testcases * Remove merge conflict leftover * Add support for prefix instead of path; Fix docstring to mirror these changes * Add new config field to docs; Add testcases to check that /stats/prometheus is exposed as well * Parametrize matchType (prefix or path) and value * Update website/source/docs/connect/proxies/envoy.md Co-Authored-By: Paul Banks <banks@banksco.de> Co-authored-by: Paul Banks <banks@banksco.de>	2020-02-03 17:19:34 +00:00
Matt Keeler	25568626f7	Fix a couple bugs regarding intentions with namespaces (#7169 )	2020-01-29 17:30:38 -05:00
Matt Keeler	485a0a65ea	Updates to Config Entries and Connect for Namespaces (#7116 )	2020-01-24 10:04:58 -05:00
Chris Piraino	2a95701341	Allow configuration of upstream connection limits in Envoy (#6829 ) * Adds 'limits' field to the upstream configuration of a connect proxy This allows a user to configure the envoy connect proxy with 'max_connections', 'max_queued_requests', and 'max_concurrent_requests'. These values are defined in the local proxy on a per-service instance basis and should thus NOT be thought of as a global-level or even service-level value.	2019-12-03 14:13:33 -06:00
R.B. Boyer	a9343db838	xds: mesh gateway CDS requests are now allowed to receive an empty CDS reply (#6787 ) This is the rest of the fix for #6543 that was incompletely fixed in #6576.	2019-11-26 15:55:13 -06:00
Paul Banks	5f405c3277	Fix support for RSA CA keys in Connect. (#6638 ) * Allow RSA CA certs for consul and vault providers to correctly sign EC leaf certs. * Ensure key type and bits are populated from CA cert and clean up tests * Add integration test and fix error when initializing secondary CA with RSA key. * Add more tests, fix review feedback * Update docs with key type config and output * Apply suggestions from code review Co-Authored-By: R.B. Boyer <rb@hashicorp.com>	2019-11-01 13:20:26 +00:00
R.B. Boyer	1ab04a8b6a	xds: tcp services using the discovery chain should not assume RDS during LDS (#6623 ) Previously the logic for configuring RDS during LDS for L7 upstreams was overapplied to TCP proxies resulting in a cluster name of <emptystring> being used incorrectly. Fixes #6621	2019-10-17 16:44:59 -05:00
Freddy	5eace88ce2	Expose HTTP-based paths through Connect proxy (#6446 ) Fixes: #5396 This PR adds a proxy configuration stanza called expose. These flags register listeners in Connect sidecar proxies to allow requests to specific HTTP paths from outside of the node. This allows services to protect themselves by only listening on the loopback interface, while still accepting traffic from non Connect-enabled services. Under expose there is a boolean checks flag that would automatically expose all registered HTTP and gRPC check paths. This stanza also accepts a paths list to expose individual paths. The primary use case for this functionality would be to expose paths for third parties like Prometheus or the kubelet. Listeners for requests to exposed paths are configured dynamically at run time. Any time a proxy, or check can be registered, a listener can also be created. In this initial implementation requests to these paths are not authenticated/encrypted.	2019-09-25 20:55:52 -06:00
R.B. Boyer	9adc39cce0	tests: make envoy integration tests more tolerant of internal retries that may inflate counters (#6539 ) This should remove false positives that look like: cluster.s2.default.primary.*cx_total - expected count: 2, actual count: 3	2019-09-25 09:08:42 -05:00
Pierre Souchay	bfc27eb244	[BUGFIX][BUILD] When test fail in circle-ci in main, have a proper error message (#6416 ) Since FUNCNAME is not defined when running outside a function, trap does not work and display wrong error message. Example from https://circleci.com/gh/hashicorp/consul/69506 : ``` ⨯ FAIL /home/circleci/project/test/integration/connect/envoy/run-tests.sh: line 1: FUNCNAME[0]: unbound variable make: *** [GNUmakefile:363: test-envoy-integ] Error 1 ``` This fix will avoid this error message and display the real cause.	2019-08-28 10:26:05 -04:00
Matt Keeler	80b67c50da	Turned on Envoy 1.11.1 integration tests (#6347 ) I also ran this against 1.5.2 so the docs update claiming compatibility should still be accurate.	2019-08-20 10:20:13 -04:00
R.B. Boyer	91df06098c	xds: improve how envoy metrics are emitted (#6312 ) Since generated envoy clusters all are named using (mostly) SNI syntax we can have envoy read the various fields out of that structure and emit it as stats labels to the various telemetry backends. I changed the delimiter for the 'customization hash' from ':' to '~' because ':' is always reencoded by envoy as '_' when generating metrics keys.	2019-08-16 09:30:17 -05:00
R.B. Boyer	64fc002e03	connect: fix failover through a mesh gateway to a remote datacenter (#6259 ) Failover is pushed entirely down to the data plane by creating envoy clusters and putting each successive destination in a different load assignment priority band. For example this shows that normally requests go to 1.2.3.4:8080 but when that fails they go to 6.7.8.9:8080: - name: foo load_assignment: cluster_name: foo policy: overprovisioning_factor: 100000 endpoints: - priority: 0 lb_endpoints: - endpoint: address: socket_address: address: 1.2.3.4 port_value: 8080 - priority: 1 lb_endpoints: - endpoint: address: socket_address: address: 6.7.8.9 port_value: 8080 Mesh gateways route requests based solely on the SNI header tacked onto the TLS layer. Envoy currently only lets you configure the outbound SNI header at the cluster layer. If you try to failover through a mesh gateway you ideally would configure the SNI value per endpoint, but that's not possible in envoy today. This PR introduces a simpler way around the problem for now: 1. We identify any target of failover that will use mesh gateway mode local or remote and then further isolate any resolver node in the compiled discovery chain that has a failover destination set to one of those targets. 2. For each of these resolvers we will perform a small measurement of comparative healths of the endpoints that come back from the health API for the set of primary target and serial failover targets. We walk the list of targets in order and if any endpoint is healthy we return that target, otherwise we move on to the next target. 3. The CDS and EDS endpoints both perform the measurements in (2) for the affected resolver nodes. 4. For CDS this measurement selects which TLS SNI field to use for the cluster (note the cluster is always going to be named for the primary target) 5. For EDS this measurement selects which set of endpoints will populate the cluster. Priority tiered failover is ignored. 
One of the big downsides to this approach to failover is that the failover detection and correction is going to be controlled by consul rather than deferring that entirely to the data plane as with the prior version. This also means that we are bound to only failover using official health signals and cannot make use of data plane signals like outlier detection to affect failover. In this specific scenario the lack of data plane signals is ok because the effectiveness is already muted by the fact that the ultimate destination endpoints will have their data plane signals scrambled when they pass through the mesh gateway wrapper anyway so we're not losing much. Another related fix is that we now use the endpoint health from the underlying service, not the health of the gateway (regardless of failover mode).	2019-08-05 13:30:35 -05:00
R.B. Boyer	4666599e18	connect: reconcile how upstream configuration works with discovery chains (#6225 ) * connect: reconcile how upstream configuration works with discovery chains The following upstream config fields for connect sidecars sanely integrate into discovery chain resolution: - Destination Namespace/Datacenter: Compilation occurs locally but using different default values for namespaces and datacenters. The xDS clusters that are created are named as they normally would be. - Mesh Gateway Mode (single upstream): If set this value overrides any value computed for any resolver for the entire discovery chain. The xDS clusters that are created may be named differently (see below). - Mesh Gateway Mode (whole sidecar): If set this value overrides any value computed for any resolver for the entire discovery chain. If this is specifically overridden for a single upstream this value is ignored in that case. The xDS clusters that are created may be named differently (see below). - Protocol (in opaque config): If set this value overrides the value computed when evaluating the entire discovery chain. If the normal chain would be TCP or if this override is set to TCP then the result is that we explicitly disable L7 Routing and Splitting. The xDS clusters that are created may be named differently (see below). - Connect Timeout (in opaque config): If set this value overrides the value for any resolver in the entire discovery chain. The xDS clusters that are created may be named differently (see below). If any of the above overrides affect the actual result of compiling the discovery chain (i.e. "tcp" becomes "grpc" instead of being a no-op override to "tcp") then the relevant parameters are hashed and provided to the xDS layer as a prefix for use in naming the Clusters. 
This is to ensure that if one Upstream discovery chain has no overrides and tangentially needs a cluster named "api.default.XXX", and another Upstream does have overrides for "api.default.XXX" that they won't cross-pollinate against the operator's wishes. Fixes #6159	2019-08-01 22:03:34 -05:00
Matt Keeler	155cdf022f	Envoy Mesh Gateway integration tests (#6187 ) * Allow setting the mesh gateway mode for an upstream in config files * Add envoy integration test for mesh gateways This necessitated many supporting changes in most of the other test cases. Add remote mode mesh gateways integration test	2019-07-24 17:01:42 -04:00
R.B. Boyer	2bfad66efa	connect: rework how the service resolver subset OnlyPassing flag works (#6173 ) The main change is that we no longer filter service instances by health, preferring instead to render all results down into EDS endpoints in envoy and merely label the endpoints as HEALTHY or UNHEALTHY. When OnlyPassing is set to true we will force consul checks in a 'warning' state to render as UNHEALTHY in envoy. Fixes #6171	2019-07-23 20:20:24 -05:00
R.B. Boyer	e060748d3f	tests: adding new envoy integration tests for L7 service-resolvers (#6129 ) Additionally: - wait for bootstrap config entries to be applied - run the verify container in the host's PID namespace so we can kill envoys without mounting the docker socket * assert that we actually send HEALTHY and UNHEALTHY endpoints down in EDS during failover	2019-07-23 20:08:36 -05:00
R.B. Boyer	df0d932296	tests: when running envoy integration tests try to limit container bleedover between cases (#6148 )	2019-07-17 09:20:10 -05:00
R.B. Boyer	5932bcbe72	unknown fields now fail, so omit these unimplemented fields (#6125 )	2019-07-12 14:04:15 -05:00
R.B. Boyer	9e1e9aad2e	Fix bug in service-resolver redirects if the destination uses a default resolver. (#6122 ) Also: - add back an internal http endpoint to dump a compiled discovery chain for debugging purposes Before the CompiledDiscoveryChain.IsDefault() method would test: - is this chain just one resolver step? - is that resolver step just the default? But what I forgot to test: - is that resolver step for the same service that the chain represents? This last point is important because if you configured just one config entry: kind = "service-resolver" name = "web" redirect { service = "other" } and requested the chain for "web" you'd get back a default resolver for "other". In the xDS code the IsDefault() method is used to determine if this chain is "empty". If it is then we use the pre-discovery-chain logic that just uses data embedded in the Upstream object (and still lets the escape hatches function). In the example above that means certain parts of the xDS code were going to try referencing a cluster named "web..." despite the other parts of the xDS code maintaining clusters named "other...".	2019-07-12 12:21:25 -05:00
R.B. Boyer	c7df80ebf9	tests: further reduce envoy integration test flakiness (#6112 ) In addition to waiting until s2 shows up healthy in the Catalog, wait until s2 endpoints show up healthy via EDS in the s1 upstream clusters.	2019-07-12 11:12:56 -05:00
R.B. Boyer	f8033254c7	test: for envoy integration tests bump the time to wait for the upstream to be healthy (#6109 )	2019-07-10 18:07:47 -04:00
R.B. Boyer	2165e97efa	test: for envoy integration tests, wait until 's2' is healthy in consul before interrogating envoy (#6108 ) When the envoy healthy panic threshold was explicitly disabled as part of L7 traffic management it changed how envoy decided to load balance to endpoints in a cluster. This only matters when envoy is in "panic mode" aka "when you have a bunch of unhealthy endpoints". Panic mode sends traffic to unhealthy instances in certain circumstances. Note: Prior to explicitly disabling the healthy panic threshold, the default value is 50%. What was happening is that the test harness was bringing up consul, the sidecars, and the service instances all at once and sometimes the proxies wouldn't have time to be checked by consul to be labeled as 'passing' in the catalog before a round of EDS happened. The xDS server in consul effectively queries /v1/health/connect/s2 and gets 1 result, but that one result has a 'critical' check so the xDS server sends back that endpoint labeled as UNHEALTHY. Envoy sees that 100% of the endpoints in the cluster are unhealthy and would enter panic mode and still send traffic to s2. This is why the test suites PRIOR to disabling the healthy panic threshold worked. They were _incorrectly_ passing. When the healthy panic threshold is disabled, envoy never enters panic mode in this situation and thus the cluster has zero healthy endpoints so load balancing goes nowhere and the tests fail. Why does this only affect the test suites for envoy 1.8.0? My guess is that https://github.com/envoyproxy/envoy/pull/4442 was merged into the 1.9.x series and somehow that plays a role. This PR modifies the bats scripts to explicitly wait until the upstream sidecar is healthy as measured by /v1/health/connect/s2?passing BEFORE trying to interrogate envoy which should make the tests less racy.	2019-07-10 15:58:25 -05:00
Jack Pearkes	2b1761bab3	Make cluster names SNI always (#6081 ) * Make cluster names SNI always * Update some tests * Ensure we check for prepared query types * Use sni for route cluster names * Proper mesh gateway mode defaulting when the discovery chain is used * Ignore service splits from PatchSliceOfMaps * Update some xds golden files for proper test output * Allow for grpc/http listeners/cluster configs with the disco chain * Update stats expectation	2019-07-08 12:48:48 +01:00
R.B. Boyer	bccbb2b4ae	activate most discovery chain features in xDS for envoy (#6024 )	2019-07-01 22:10:51 -05:00
Hans Hasselberg	73c4e9f07c	tls: auto_encrypt enables automatic RPC cert provisioning for consul clients (#5597 )	2019-06-27 22:22:07 +02:00
Paul Banks	e33e41195c	Fix envoy 1.10 exec (#5964 ) * Make exec test assert Envoy version - it was not rebuilding before and so often ran against wrong version. This makes 1.10 fail consistently. * Switch Envoy exec to use a named pipe rather than FD magic since Envoy 1.10 doesn't support that. * Refactor to use an internal shim command for piping the bootstrap through. * Fmt. So sad that vscode golang fails so often these days. * go mod tidy * revert go mod tidy changes * Revert "ignore consul-exec tests until fixed (#5986)" This reverts commit 683262a6869033cb79e68fa1dba0f9ea83e9187d. * Review cleanups	2019-06-21 16:06:25 +01:00
Alvin Huang	e6419d4c23	ignore consul-exec tests until fixed (#5986 )	2019-06-18 15:45:32 -04:00
Paul Banks	737be347eb	Upgrade xDS (go-control-plane) API to support Envoy 1.10. (#5872 ) * Upgrade xDS (go-control-plane) API to support Envoy 1.10. This includes backwards compatibility shim to work around the ext_authz package rename in 1.10. It also adds integration test support in CI for 1.10.0. * Fix go vet complaints * go mod vendor * Update Envoy version info in docs * Update website/source/docs/connect/proxies/envoy.md	2019-06-07 07:10:43 -05:00
Paul Banks	2fb98e476a	Envoy integration test improvements (#5797 ) * Grab consul logs on integration test failures too and don't remove .gitignore * Don't wipe logs so we have some artifacts to upload at the end	2019-05-21 14:17:41 +01:00
Alvin Huang	7fd179c6dd	remove container after docker run exits (#5798 )	2019-05-07 10:13:07 -04:00
Paul Banks	c04f4a25ed	Make central conf test work when run in a suite. (#5767 ) * Make central conf test work when run in a suite. This switches integration tests to hard restart Consul each time which causes less surprise when some tests need to set configs that don't work on consul reload. This also increases the isolation and repeatability of the tests by dropping Consul's state entirely for each case run. * Remove aborted attempt to make restart optional.	2019-05-02 12:53:06 +01:00
Paul Banks	078f4cf5bb	Add integration test for central config; fix central config WIP (#5752 ) * Add integration test for central config; fix central config WIP * Add integration test for central config; fix central config WIP * Set proxy protocol correctly and begin adding upstream support * Add upstreams to service config cache key and start new notify watcher if they change. This doesn't update the tests to pass though. * Fix some merging logic get things working manually with a hack (TODO fix properly) * Simplification to not allow enabling sidecars centrally - it makes no sense without upstreams anyway * Test compile again and obvious ones pass. Lots of failures locally not debugged yet but may be flakes. Pushing up to see what CI does * Fix up service manager and API test failures * Remove the enable command since it no longer makes much sense without being able to turn on sidecar proxies centrally * Remove version.go hack - will make integration test fail until release * Remove unused code from commands and upstream merge * Re-bump version to 1.5.0	2019-05-01 16:39:31 -07:00
Paul Banks	d6c0557e86	Connect: allow configuring Envoy for L7 Observability (#5558 ) * Add support for HTTP proxy listeners * Add customizable bootstrap configuration options * Debug logging for xDS AuthZ * Add Envoy Integration test suite with basic test coverage * Add envoy command tests to cover new cases * Add tracing integration test * Add gRPC support WIP * Merged changes from master Docker. get CI integration to work with same Dockerfile now * Make docker build optional for integration * Enable integration tests again! * http2 and grpc integration tests and fixes * Fix up command config tests * Store all container logs as artifacts in circle on fail * Add retries to outer part of stats measurements as we keep missing them in CI * Only dump logs on failing cases * Fix typos from code review * Review tidying and make tests pass again * Add debug logs to exec test. * Fix legit test failure caused by upstream rename in envoy config * Attempt to reduce cases of bad TLS handshake in CI integration tests * bring up the right service * Add prometheus integration test * Add test for denied AuthZ both HTTP and TCP * Try ANSI term for Circle	2019-04-29 17:27:57 +01:00


372 Commits