* Implements a simple, tcp ingress gateway workflow
This adds a new type of gateway for allowing Ingress traffic into Connect from external services.
Co-authored-by: Chris Piraino <cpiraino@hashicorp.com>
Failover is pushed entirely down to the data plane by creating envoy
clusters and putting each successive destination in a different load
assignment priority band. For example this shows that normally requests
go to 1.2.3.4:8080 but when that fails they go to 6.7.8.9:8080:
- name: foo
load_assignment:
cluster_name: foo
policy:
overprovisioning_factor: 100000
endpoints:
- priority: 0
lb_endpoints:
- endpoint:
address:
socket_address:
address: 1.2.3.4
port_value: 8080
- priority: 1
lb_endpoints:
- endpoint:
address:
socket_address:
address: 6.7.8.9
port_value: 8080
Mesh gateways route requests based solely on the SNI header tacked onto
the TLS layer. Envoy currently only lets you configure the outbound SNI
header at the cluster layer.
If you try to failover through a mesh gateway you ideally would
configure the SNI value per endpoint, but that's not possible in envoy
today.
This PR introduces a simpler way around the problem for now:
1. We identify any target of failover that will use mesh gateway mode local or
remote and then further isolate any resolver node in the compiled discovery
chain that has a failover destination set to one of those targets.
2. For each of these resolvers we will perform a small measurement of
comparative healths of the endpoints that come back from the health API for the
set of primary target and serial failover targets. We walk the list of targets
in order and if any endpoint is healthy we return that target, otherwise we
move on to the next target.
3. The CDS and EDS endpoints both perform the measurements in (2) for the
affected resolver nodes.
4. For CDS this measurement selects which TLS SNI field to use for the cluster
(note the cluster is always going to be named for the primary target)
5. For EDS this measurement selects which set of endpoints will populate the
cluster. Priority tiered failover is ignored.
One of the big downsides to this approach to failover is that the failover
detection and correction is going to be controlled by consul rather than
deferring that entirely to the data plane as with the prior version. This also
means that we are bound to only failover using official health signals and
cannot make use of data plane signals like outlier detection to affect
failover.
In this specific scenario the lack of data plane signals is ok because the
effectiveness is already muted by the fact that the ultimate destination
endpoints will have their data plane signals scrambled when they pass through
the mesh gateway wrapper anyway so we're not losing much.
Another related fix is that we now use the endpoint health from the
underlying service, not the health of the gateway (regardless of
failover mode).
In addition to exposing compilation over the API cleaned up the structures that would be exchanged to be cleaner and easier to support and understand.
Also removed ability to configure the envoy OverprovisioningFactor.
* connect: reconcile how upstream configuration works with discovery chains
The following upstream config fields for connect sidecars sanely
integrate into discovery chain resolution:
- Destination Namespace/Datacenter: Compilation occurs locally but using
different default values for namespaces and datacenters. The xDS
clusters that are created are named as they normally would be.
- Mesh Gateway Mode (single upstream): If set this value overrides any
value computed for any resolver for the entire discovery chain. The xDS
clusters that are created may be named differently (see below).
- Mesh Gateway Mode (whole sidecar): If set this value overrides any
value computed for any resolver for the entire discovery chain. If this
is specifically overridden for a single upstream this value is ignored
in that case. The xDS clusters that are created may be named differently
(see below).
- Protocol (in opaque config): If set this value overrides the value
computed when evaluating the entire discovery chain. If the normal chain
would be TCP or if this override is set to TCP then the result is that
we explicitly disable L7 Routing and Splitting. The xDS clusters that
are created may be named differently (see below).
- Connect Timeout (in opaque config): If set this value overrides the
value for any resolver in the entire discovery chain. The xDS clusters
that are created may be named differently (see below).
If any of the above overrides affect the actual result of compiling the
discovery chain (i.e. "tcp" becomes "grpc" instead of being a no-op
override to "tcp") then the relevant parameters are hashed and provided
to the xDS layer as a prefix for use in naming the Clusters. This is to
ensure that if one Upstream discovery chain has no overrides and
tangentially needs a cluster named "api.default.XXX", and another
Upstream does have overrides for "api.default.XXX" that they won't
cross-pollinate against the operator's wishes.
Fixes#6159
* Ensure the mesh gateway configuration comes back in the api within each upstream
* Add a test for the MeshGatewayConfig in the ToAPI functions
* Ensure we don’t use gateways for dc local connections
* Update the svc kind index for deletions
* Replace the proxycfg.state cache with an interface for testing
Also start implementing proxycfg state testing.
* Update the state tests to verify some gateway watches for upstream-targets of a discovery chain.
* Proxy Config Manager
This component watches for local state changes on the agent and ensures that each service registered locally with Kind == connect-proxy has it's state being actively populated in the cache.
This serves two purposes:
1. For the built-in proxy, it ensures that the state needed to accept connections is available in RAM shortly after registration and likely before the proxy actually starts accepting traffic.
2. For (future - next PR) xDS server and other possible future proxies that require _push_ based config discovery, this provides a mechanism to subscribe and be notified about updates to a proxy instance's config including upstream service discovery results.
* Address review comments
* Better comments; Better delivery of latest snapshot for slow watchers; Embed Config
* Comment typos
* Add upstream Stringer for funsies