more readable in CI.
```
Running primary verification step for case-ingress-gateway-multiple-services...
�[34;1mverify.bats
�[0m�[1G ingress proxy admin is up on :20000�[K�[75G 1/12�[2G�[1G ✓ ingress proxy admin is up on :20000�[K
�[0m�[1G s1 proxy admin is up on :19000�[K�[75G 2/12�[2G�[1G ✓ s1 proxy admin is up on :19000�[K
�[0m�[1G s2 proxy admin is up on :19001�[K�[75G 3/12�[2G�[1G ✓ s2 proxy admin is up on :19001�[K
�[0m�[1G s1 proxy listener should be up and have right cert�[K�[75G 4/12�[2G�[1G ✓ s1 proxy listener should be up and have right cert�[K
�[0m�[1G s2 proxy listener should be up and have right cert�[K�[75G 5/12�[2G�[1G ✓ s2 proxy listener should be up and have right cert�[K
�[0m�[1G ingress-gateway should have healthy endpoints for s1�[K�[75G 6/12�[2G�[31;1m�[1G ✗ ingress-gateway should have healthy endpoints for s1�[K
�[0m�[31;22m (from function `assert_upstream_has_endpoints_in_status' in file /workdir/primary/bats/helpers.bash, line 385,
```
versus
```
Running primary verification step for case-ingress-gateway-multiple-services...
1..12
ok 1 ingress proxy admin is up on :20000
ok 2 s1 proxy admin is up on :19000
ok 3 s2 proxy admin is up on :19001
ok 4 s1 proxy listener should be up and have right cert
ok 5 s2 proxy listener should be up and have right cert
not ok 6 ingress-gateway should have healthy endpoints for s1
not ok 7 s1 proxy should have been configured with max_connections in services
ok 8 ingress-gateway should have healthy endpoints for s2
```
Without this change, you'd see this error:
```
./run-tests.sh: line 49: LAMBDA_TESTS_ENABLED: unbound variable
./run-tests.sh: line 49: LAMBDA_TESTS_ENABLED: unbound variable
```
Locally, always run integration tests using amd64, even if running
on an arm mac. This ensures the architecture locally always matches
the CI/CD environment.
In addition:
* Use consul:local for envoy integration and upgrade tests. Previously,
consul:local was used for upgrade tests and consul-dev for integration
tests. I didn't see a reason to use separate images as it's more
confusing.
* By default, disable the requirement that aws credentials are set.
These are only needed for the lambda tests and make it so you
can't run any tests locally, even if you're not running the
lambda tests. Now they'll only run if the LAMBDA_TESTS_ENABLED
env var is set.
* Split out the building of the Docker image for integration
tests into its own target from `dev-docker`. This allows us to always
use an amd64 image without messing up the `dev-docker` target.
* Add support for passing GO_TEST_FLAGs to `test-envoy-integ` target.
* Add a wait_for_leader function because tests were failing locally
without it.
Peer replication is intended to be between separate Consul installs and
effectively should be considered "external". This PR moves the peer
stream replication bidirectional RPC endpoint to the external gRPC
server and ensures that things continue to function.
Require use of mesh gateways in order for service mesh data plane
traffic to flow between peers.
This also adds plumbing for envoy integration tests involving peers, and
one starter peering test.
* Add partition fields to targets like service route destinations
* Update validation to prevent cross-DC + cross-partition references
* Handle partitions when reading config entries for disco chain
* Encode partition in compiled targets
We launch one container as part of the test with --pid=host but
apparently within that container it launches a copy of "tini" as a
process supervisor that prefers to be PID 1.
Because it's not PID 1 it logs a warning message about this to the envoy
integration test logs that can lead to thinking somehow that a test
failure is related when in fact it's completely unrelated.
Adding this environment variable avoids the warning.
The only thing that needed fixing up pertained to this section of the 1.18.x release notes:
> grpc_stats: the default value for stats_for_all_methods is switched from true to false, in order to avoid possible memory exhaustion due to an untrusted downstream sending a large number of unique method names. The previous default value was deprecated in version 1.14.0. This only changes the behavior when the value is not set. The previous behavior can be used by setting the value to true. This behavior change by be overridden by setting runtime feature envoy.deprecated_features.grpc_stats_filter_enable_stats_for_all_methods_by_default.
For now to maintain status-quo I'm explicitly setting `stats_for_all_methods=true` in all versions to avoid relying upon the default.
Additionally the naming of the emitted metrics for these gRPC requests changed slightly so the integration test assertions for `case-grpc` needed adjusting.
This adds support for the Incremental xDS protocol when using xDS v3. This is best reviewed commit-by-commit and will not be squashed when merged.
Union of all commit messages follows to give an overarching summary:
xds: exclusively support incremental xDS when using xDS v3
Attempts to use SoTW via v3 will fail, much like attempts to use incremental via v2 will fail.
Work around a strange older envoy behavior involving empty CDS responses over incremental xDS.
xds: various cleanups and refactors that don't strictly concern the addition of incremental xDS support
Dissolve the connectionInfo struct in favor of per-connection ResourceGenerators instead.
Do a better job of ensuring the xds code uses a well configured logger that accurately describes the connected client.
xds: pull out checkStreamACLs method in advance of a later commit
xds: rewrite SoTW xDS protocol tests to use protobufs rather than hand-rolled json strings
In the test we very lightly reuse some of the more boring protobuf construction helper code that is also technically under test. The important thing of the protocol tests is testing the protocol. The actual inputs and outputs are largely already handled by the xds golden output tests now so these protocol tests don't have to do double-duty.
This also updates the SoTW protocol test to exclusively use xDS v2 which is the only variant of SoTW that will be supported in Consul 1.10.
xds: default xds.Server.AuthCheckFrequency at use-time instead of construction-time
To fix failing integration tests. The latest version (`1.7.4.0-r0`)
appears to not be catting all the bytes, so the expected metrics are
missing in the output.
This has the biggest impact on enterprise test cases that use namespaced
registrations, which prior to this change sometimes failed the initial
registration because the namespace was not yet created.
The key thing here is to use `curl --no-keepalive` so that envoy
pre-1.15 tests will reliably use the latest listener every time.
Extra:
- Switched away from editing line-item intentions the legacy way.
- Removed some teardown scripts, as we don't share anything between cases anyway
- Removed unnecessary use of `run` in some places.
This speeds up individual envoy integration test runs from ~23m to ~14m.
It's also a pre-req for possibly switching to doing the tests entirely within Go (no shell-outs).
Related changes:
- hard-fail the xDS connection attempt if the envoy version is known to be too old to be supported
- remove the RouterMatchSafeRegex proxy feature since all supported envoy versions have it
- stop using --max-obj-name-len (due to: envoyproxy/envoy#11740)