* xds: refactor ingress listener SDS configuration
* xds: update resolveListenerSDS call args in listeners_test
* ingress: add TLS min, max and cipher suites to GatewayTLSConfig
* xds: implement envoyTLSVersions and envoyTLSCipherSuites
* xds: merge TLS config
* xds: configure TLS parameters with ingress TLS context from leaf
* xds: nil check in resolveListenerTLSConfig validation
* xds: nil check in makeTLSParameters* functions
* changelog: add entry for TLS params on ingress config entries
* xds: remove indirection for TLS params in TLSConfig structs
* xds: return tlsContext, nil instead of ambiguous err
Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
* xds: switch zero checks to types.TLSVersionUnspecified
* ingress: add validation for ingress config entry TLS params
* ingress: validate listener TLS config
* xds: add basic ingress with TLS params tests
* xds: add ingress listeners mixed TLS min version defaults precedence test
* xds: add more explicit tests for ingress listeners inheriting gateway defaults
* xds: add test for single TLS listener on gateway without TLS defaults
* xds: regen golden files for TLSVersionInvalid zero value, add TLSVersionAuto listener test
* types/tls: change TLSVersion to string
* types/tls: update TLSCipherSuite to string type
* types/tls: implement validation functions for TLSVersion and TLSCipherSuites, make some maps private
* api: add TLS params to GatewayTLSConfig, add tests
* api: add TLSMinVersion to ingress gateway config entry test JSON
* xds: switch to Envoy TLS cipher suite encoding from types package
* xds: fixup validation for TLSv1_3 min version with cipher suites
* add some kitchen sink tests and add a missing struct tag
* xds: check if mergedCfg.TLSVersion is in TLSVersionsWithConfigurableCipherSuites
* xds: update connectTLSEnabled comment
* xds: remove unsued resolveGatewayServiceTLSConfig function
* xds: add makeCommonTLSContextFromLeafWithoutParams
* types/tls: add LessThan comparator function for concrete values
* types/tls: change tlsVersions validation map from string to TLSVersion keys
* types/tls: remove unused envoyTLSCipherSuites
* types/tls: enable chacha20 cipher suites for Consul agent
* types/tls: remove insecure cipher suites from allowed config
TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256 and TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256 are both explicitly listed as insecure and disabled in the Go source.
Refs https://cs.opensource.google/go/go/+/refs/tags/go1.17.3:src/crypto/tls/cipher_suites.go;l=329-330
* types/tls: add ValidateConsulAgentCipherSuites function, make direct lookup map private
* types/tls: return all unmatched cipher suites in validation errors
* xds: check that Envoy API value matching TLS version is found when building TlsParameters
* types/tls: check that value is found in map before appending to slice in MarshalEnvoyTLSCipherSuiteStrings
* types/tls: cast to string rather than fmt.Printf in TLSCihperSuite.String()
* xds: add TLSVersionUnspecified to list of configurable cipher suites
* structs: update note about config entry warning
* xds: remove TLS min version cipher suite unconfigurable test placeholder
* types/tls: update tests to remove assumption about private map values
Co-authored-by: R.B. Boyer <rb@hashicorp.com>
Previously we could get into a state where discovery chain entries were
not cleaned up after the associated watch was cancelled. These changes
add handling for that case where stray chain references are encountered.
Previously SAN validation for prepared queries was broken because we
validated against the name, namespace, and datacenter for prepared
queries.
However, prepared queries can target:
- Services with a name that isn't their own
- Services in multiple datacenters
This means that the SpiffeID to validate needs to be based on the
prepared query endpoints, and not the prepared query's upstream
definition.
This commit updates prepared query clusters to account for that.
These changes ensure that the identity of services dialed is
cryptographically verified.
For all upstreams we validate against SPIFFE IDs in the format used by
Consul's service mesh:
spiffe://<trust-domain>/ns/<namespace>/dc/<datacenter>/svc/<service>
CatalogDestinationsOnly is a passthrough that would enable dialing
addresses outside of Consul's catalog. However, when this flag is set to
true only _connect_ endpoints for services can be dialed.
This flag is being renamed to signal that non-Connect endpoints can't be
dialed by transparent proxies when the value is set to true.
This adds support for the Incremental xDS protocol when using xDS v3. This is best reviewed commit-by-commit and will not be squashed when merged.
Union of all commit messages follows to give an overarching summary:
xds: exclusively support incremental xDS when using xDS v3
Attempts to use SoTW via v3 will fail, much like attempts to use incremental via v2 will fail.
Work around a strange older envoy behavior involving empty CDS responses over incremental xDS.
xds: various cleanups and refactors that don't strictly concern the addition of incremental xDS support
Dissolve the connectionInfo struct in favor of per-connection ResourceGenerators instead.
Do a better job of ensuring the xds code uses a well configured logger that accurately describes the connected client.
xds: pull out checkStreamACLs method in advance of a later commit
xds: rewrite SoTW xDS protocol tests to use protobufs rather than hand-rolled json strings
In the test we very lightly reuse some of the more boring protobuf construction helper code that is also technically under test. The important thing of the protocol tests is testing the protocol. The actual inputs and outputs are largely already handled by the xds golden output tests now so these protocol tests don't have to do double-duty.
This also updates the SoTW protocol test to exclusively use xDS v2 which is the only variant of SoTW that will be supported in Consul 1.10.
xds: default xds.Server.AuthCheckFrequency at use-time instead of construction-time
This config entry is being renamed primarily because in k8s the name
cluster could be confusing given that the config entry applies across
federated datacenters.
Additionally, this config entry will only apply to Consul as a service
mesh, so the more generic "cluster" name is not needed.
This PR replaces the original boolean used to configure transparent
proxy mode. It was replaced with a string mode that can be set to:
- "": Empty string is the default for when the setting should be
defaulted from other configuration like config entries.
- "direct": Direct mode is how applications originally opted into the
mesh. Proxy listeners need to be dialed directly.
- "transparent": Transparent mode enables configuring Envoy as a
transparent proxy. Traffic must be captured and redirected to the
inbound and outbound listeners.
This PR also adds a struct for transparent proxy specific configuration.
Initially this is not stored as a pointer. Will revisit that decision
before GA.
This is done because after removing ID and NodeName from
ServiceConfigRequest we will no longer know whether a request coming in
is for a Consul client earlier than v1.10.
Note that this does NOT upgrade to xDS v3. That will come in a future PR.
Additionally:
- Ignored staticcheck warnings about how github.com/golang/protobuf is deprecated.
- Shuffled some agent/xds imports in advance of a later xDS v3 upgrade.
- Remove support for envoy 1.13.x but don't add in 1.17.x yet. We have to wait until the xDS v3 support is added in a follow-up PR.
Fixes#8425
Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>
Currently when passing hostname clusters to Envoy, we set each service instance registered with Consul as an LbEndpoint for the cluster.
However, Envoy can only handle one per cluster:
[2020-06-04 18:32:34.094][1][warning][config] [source/common/config/grpc_subscription_impl.cc:87] gRPC config for type.googleapis.com/envoy.api.v2.Cluster rejected: Error adding/updating cluster(s) dc2.internal.ddd90499-9b47-91c5-4616-c0cbf0fc358a.consul: LOGICAL_DNS clusters must have a single locality_lb_endpoint and a single lb_endpoint, server.dc2.consul: LOGICAL_DNS clusters must have a single locality_lb_endpoint and a single lb_endpoint
Envoy is currently handling this gracefully by only picking one of the endpoints. However, we should avoid passing multiple to avoid these warning logs.
This PR:
* Ensures we only pass one endpoint, which is tied to one service instance.
* We prefer sending an endpoint which is marked as Healthy by Consul.
* If no endpoints are healthy we emit a warning and skip the cluster.
* If multiple unique hostnames are spread across service instances we emit a warning and let the user know which will be resolved.