open-consul

Commit Graph

Author	SHA1	Message	Date
freddygv	7fba7456ec	Fix race of upstreams with same passthrough ip Due to timing, a transparent proxy could have two upstreams to dial directly with the same address. For example: - The orders service can dial upstreams shipping and payment directly. - An instance of shipping at address 10.0.0.1 is deregistered. - Payments is scaled up and scheduled to have address 10.0.0.1. - The orders service receives the event for the new payments instance before seeing the deregistration for the shipping instance. At this point two upstreams have the same passthrough address and Envoy will reject the listener configuration. To disambiguate this commit considers the Raft index when storing passthrough addresses. In the example above, 10.0.0.1 would only be associated with the newer payments service instance.	2022-02-10 17:01:57 -07:00
R.B. Boyer	baf886c6f3	proxycfg: introduce explicit UpstreamID in lieu of bare string (#12125) The gist here is that now we use a value-type struct proxycfg.UpstreamID as the map key in ConfigSnapshot maps where we used to use "upstream id-ish" strings. These are internal only and used just for bidirectional trips through the agent cache keyspace (like the discovery chain target struct). For the few places where the upstream id needs to be projected into xDS, that's what (proxycfg.UpstreamID).EnvoyID() is for. This lets us ALWAYS inject the partition and namespace into these things without making stuff like the golden testdata diverge.	2022-01-20 10:12:04 -06:00
freddygv	b704d4e2dd	Validate chains are associated with upstreams Previously we could get into a state where discovery chain entries were not cleaned up after the associated watch was cancelled. These changes add handling for that case where stray chain references are encountered.	2021-12-13 18:56:13 -07:00
Daniel Upton	caa5b5a5a6	xds: prefer fed state gateway definitions if they're fresher (#11522) Fixes an issue described in #10132, where if two DCs are WAN federated over mesh gateways, and the gateway in the non-primary DC is terminated and receives a new IP address (as is commonly the case when running them on ephemeral compute instances) the primary DC is unable to re-establish its connection until the agent running on its own gateway is restarted. This was happening because we always preferred gateways discovered by the `Internal.ServiceDump` RPC (which would fail because there's no way to dial the remote DC) over those discovered in the federation state, which is replicated as long as the primary DC's gateway is reachable.	2021-11-09 16:45:36 +00:00
freddygv	ce43e8cf99	Store GatewayKey in proxycfg snapshot for re-use	2021-11-01 13:58:53 -06:00
freddygv	51c888a41a	Update locality check in xds	2021-11-01 13:58:53 -06:00
freddygv	4de3537391	Split up locality check from hostname check	2021-10-27 11:15:25 -06:00
freddygv	0391a65772	Replace default partition check	2021-10-27 11:15:25 -06:00
freddygv	ee45ac9dc5	PR comments	2021-10-27 11:15:25 -06:00
freddygv	8b5a9369eb	Account for partitions in xds gen for mesh gw This commit avoids skipping gateways in remote partitions of the local DC when generating listeners/clusters/endpoints.	2021-10-27 11:15:25 -06:00
freddygv	d1d513b1b3	Account for partition in SNI for gateways	2021-10-27 11:15:25 -06:00
freddygv	4f0432be5e	Update xds pkg to account for GatewayKey	2021-10-27 09:03:56 -06:00
freddygv	4737ad118d	Swap in structs.EqualPartitions for cmp	2021-10-26 23:36:01 -06:00
freddygv	69476221c1	Update XDS for sidecars dialing through gateways	2021-10-26 23:35:48 -06:00
Dhia Ayachi	eb19271fd7	add partition to SNI when partition is non default (#10917)	2021-09-01 10:35:39 -04:00
Mark Anderson	10963d0cbd	Add support for downstreams Enhance config by adding SocketPath and LocalSocketPath config values Supports syntax of the form: ``` services { name = "sock_forwarder" id = "sock_forwarder.1" socket_path = "/tmp/downstream_3.sock" connect { sidecar_service { proxy { local_service_socket_path = "/tmp/downstream.sock" } } } } ``` Signed-off-by: Mark Anderson <manderson@hashicorp.com>	2021-05-04 12:41:43 -07:00
R.B. Boyer	91bee6246f	Support Incremental xDS mode (#9855) This adds support for the Incremental xDS protocol when using xDS v3. This is best reviewed commit-by-commit and will not be squashed when merged. Union of all commit messages follows to give an overarching summary: xds: exclusively support incremental xDS when using xDS v3 Attempts to use SoTW via v3 will fail, much like attempts to use incremental via v2 will fail. Work around a strange older envoy behavior involving empty CDS responses over incremental xDS. xds: various cleanups and refactors that don't strictly concern the addition of incremental xDS support Dissolve the connectionInfo struct in favor of per-connection ResourceGenerators instead. Do a better job of ensuring the xds code uses a well configured logger that accurately describes the connected client. xds: pull out checkStreamACLs method in advance of a later commit xds: rewrite SoTW xDS protocol tests to use protobufs rather than hand-rolled json strings In the test we very lightly reuse some of the more boring protobuf construction helper code that is also technically under test. The important thing of the protocol tests is testing the protocol. The actual inputs and outputs are largely already handled by the xds golden output tests now so these protocol tests don't have to do double-duty. This also updates the SoTW protocol test to exclusively use xDS v2 which is the only variant of SoTW that will be supported in Consul 1.10. xds: default xds.Server.AuthCheckFrequency at use-time instead of construction-time	2021-04-29 13:54:05 -05:00
freddygv	6c43195e2a	Merge master and fix upstream config protocol defaulting	2021-03-17 21:13:40 -06:00
freddygv	9256d09255	Handle prepared queries in Upstreams loop and escape hatches in disco chain loop	2021-03-17 15:17:43 -06:00
freddygv	5b59780431	Update xds for transparent proxy	2021-03-17 13:40:49 -06:00
freddygv	d90240d367	Restore old Envoy prefix on escape hatches This is done because after removing ID and NodeName from ServiceConfigRequest we will no longer know whether a request coming in is for a Consul client earlier than v1.10.	2021-03-15 14:12:57 -06:00
freddygv	68148a1dae	finish moving UpstreamConfig and related fields to structs pkg	2021-03-10 21:04:13 -07:00
R.B. Boyer	503041f216	xds: default to speaking xDS v3, but allow for v2 to be spoken upon request (#9658) - Also add support for envoy 1.17.0	2021-02-26 16:23:15 -06:00
R.B. Boyer	cdc5e99184	xds: remove deprecated usages of xDS (#9602) Note that this does NOT upgrade to xDS v3. That will come in a future PR. Additionally: - Ignored staticcheck warnings about how github.com/golang/protobuf is deprecated. - Shuffled some agent/xds imports in advance of a later xDS v3 upgrade. - Remove support for envoy 1.13.x but don't add in 1.17.x yet. We have to wait until the xDS v3 support is added in a follow-up PR. Fixes #8425	2021-02-22 15:00:15 -06:00
R.B. Boyer	6e3d07c995	xds: version sniff envoy and switch regular expressions from 'regex' to 'safe_regex' on newer envoy versions (#8222) - cut down on extra node metadata transmission - split the golden file generation to compare all envoy versions	2020-07-09 17:04:51 -05:00
Daniel Nephin	07c1081d39	Fix a bunch of unparam lint issues	2020-06-24 13:00:14 -04:00
R.B. Boyer	ba83b52b32	connect: upgrade github.com/envoyproxy/go-control-plane to v0.9.5 (#8165)	2020-06-23 15:19:56 -05:00
freddygv	1e7e716742	Move compound service names to use ServiceName type	2020-06-12 13:47:43 -06:00
Freddy	f759a48726	Enable gateways to resolve hostnames to IPv4 addresses (#7999) The DNS resolution will be handled by Envoy and defaults to LOGICAL_DNS. This discovery type can be overridden on a per-gateway basis with the envoy_dns_discovery_type Gateway Option. If a service contains an instance with a hostname as an address we set the Envoy cluster to use DNS as the discovery type rather than EDS. Since both mesh gateways and terminating gateways route to clusters using SNI, whenever there is a mix of hostnames and IP addresses associated with a service we use the hostname + CDS rather than the IPs + EDS. Note that we detect hostnames by attempting to parse the service instance's address as an IP. If it is not a valid IP we assume it is a hostname.	2020-06-03 15:28:45 -06:00
Raphaël Rondeau	b799471e29	connect: fix endpoints clusterName when using cluster escape hatch (#7319) ```changelog * fix(connect): fix endpoints clusterName when using cluster escape hatch ```	2020-05-26 10:57:22 +02:00
Daniel Nephin	3e76ebbd49	Remove unused var The usage was removed in 8e22d80e3550592a32144f2d1c84ebf6becb957b, however it seems there may be a bug here because the cluster name is not updated when the target changes.	2020-05-19 16:50:14 -04:00
Daniel Nephin	545bd766e7	Fix a number of problems found by staticcheck Some of these problems are minor (unused vars), but others are real bugs (ignored errors). Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>	2020-05-19 16:50:14 -04:00
Kyle Havlovitz	e4268c8b7f	Support multiple listeners referencing the same service in gateway definitions	2020-05-06 15:06:13 -05:00
Kyle Havlovitz	b21cd112e5	Allow ingress gateways to route traffic based on Host header This commit adds the necessary changes to allow an ingress gateway to route traffic from a single defined port to multiple different upstream services in the Consul mesh. To do this, we now require all HTTP requests coming into the ingress gateway to specify a Host header that matches "<service-name>.*" in order to route traffic to the correct service. - Differentiate multiple listeners' route names by port - Adds a case in xds for allowing default discovery chains to create a route configuration when on an ingress gateway. This allows default services to easily use host header routing - ingress-gateways have a single route config for each listener that utilizes domain matching to route to different services.	2020-05-06 15:06:13 -05:00
freddygv	929491c979	Add subset support	2020-04-27 11:08:40 -06:00
freddygv	2e35a9bb18	Add xds cluster/listener/endpoint management	2020-04-27 11:08:40 -06:00
Chris Piraino	b8a5fbf1bf	Allow ingress gateways to route through mesh gateways - Adds integration test for mesh gateways local + remote modes with ingress - ingress golden files updated for mesh gateway endpoints	2020-04-24 09:31:32 -05:00
Kyle Havlovitz	6a5eba63ab	Ingress Gateways for TCP services (#7509) * Implements a simple, tcp ingress gateway workflow This adds a new type of gateway for allowing Ingress traffic into Connect from external services. Co-authored-by: Chris Piraino <cpiraino@hashicorp.com>	2020-04-16 14:00:48 -07:00
Daniel Nephin	6d612abbde	Remove unused token parameter	2020-03-27 17:57:16 -04:00
R.B. Boyer	a7fb26f50f	wan federation via mesh gateways (#6884) This is like a Möbius strip of code due to the fact that low-level components (serf/memberlist) are connected to high-level components (the catalog and mesh-gateways) in a twisty maze of references which make it hard to dive into. With that in mind here's a high level summary of what you'll find in the patch: There are several distinct chunks of code that are affected: * new flags and config options for the server * retry join WAN is slightly different * retry join code is shared to discover primary mesh gateways from secondary datacenters * because retry join logic runs in the agent and the results of that operation for primary mesh gateways are needed in the server there are some methods like `RefreshPrimaryGatewayFallbackAddresses` that must occur at multiple layers of abstraction just to pass the data down to the right layer. * new cache type `FederationStateListMeshGatewaysName` for use in `proxycfg/xds` layers * the function signature for RPC dialing picked up a new required field (the node name of the destination) * several new RPCs for manipulating a FederationState object: `FederationState:{Apply,Get,List,ListMeshGateways}` * 3 read-only internal APIs for debugging use to invoke those RPCs from curl * raft and fsm changes to persist these FederationStates * replication for FederationStates as they are canonically stored in the Primary and replicated to the Secondaries. * a special derivative of anti-entropy that runs in secondaries to snapshot their local mesh gateway `CheckServiceNodes` and sync them into their upstream FederationState in the primary (this works in conjunction with the replication to distribute addresses for all mesh gateways in all DCs to all other DCs) * a "gateway locator" convenience object to make use of this data to choose the addresses of gateways to use for any given RPC or gossip operation to a remote DC. This gets data from the "retry join" logic in the agent and also directly calls into the FSM. * RPC (`:8300`) on the server sniffs the first byte of a new connection to determine if it's actually doing native TLS. If so it checks the ALPN header for protocol determination (just like how the existing system uses the type-byte marker). * 2 new kinds of protocols are exclusively decoded via this native TLS mechanism: one for ferrying "packet" operations (udp-like) from the gossip layer and one for "stream" operations (tcp-like). The packet operations re-use sockets (using length-prefixing) to cut down on TLS re-negotiation overhead. * the server instances specially wrap the `memberlist.NetTransport` when running with gateway federation enabled (in a `wanfed.Transport`). The general gist is that if it tries to dial a node in the SAME datacenter (deduced by looking at the suffix of the node name) there is no change. If dialing a DIFFERENT datacenter it is wrapped up in a TLS+ALPN blob and sent through some mesh gateways to eventually end up in a server's :8300 port. * a new flag when launching a mesh gateway via `consul connect envoy` to indicate that the servers are to be exposed. This sets a special service meta when registering the gateway into the catalog. * `proxycfg/xds` notice this metadata blob to activate additional watches for the FederationState objects as well as the location of all of the consul servers in that datacenter. * `xds:` if the extra metadata is in place additional clusters are defined in a DC to bulk sink all traffic to another DC's gateways. For the current datacenter we listen on a wildcard name (`server.<dc>.consul`) that load balances all servers as well as one mini-cluster per node (`<node>.server.<dc>.consul`) * the `consul tls cert create` command got a new flag (`-node`) to help create an additional SAN in certs that can be used with this flavor of federation.	2020-03-09 15:59:02 -05:00
Matt Keeler	154eafe140	xDS Mesh Gateway Resolver Subset Fixes (#7294) * xDS Mesh Gateway Resolver Subset Fixes The first fix was that clusters were being generated for every service resolver subset regardless of there being any service instances of the associated service in that dc. The previous logic didn’t care at all but now it will omit generating those clusters unless we also have service instances that should be proxied. The second fix was to respect the DefaultSubset of a service resolver so that mesh-gateways would configure the endpoints of the unnamed subset cluster to only those endpoints matched by the default subset's filters. * Refactor the gateway endpoint generation to be a little easier to read	2020-02-19 11:57:55 -05:00
Matt Keeler	485a0a65ea	Updates to Config Entries and Connect for Namespaces (#7116)	2020-01-24 10:04:58 -05:00
R.B. Boyer	0675e0606e	connect: generate the full SNI names for discovery targets in the compiler rather than in the xds package (#6340)	2019-08-19 13:03:03 -05:00
R.B. Boyer	64fc002e03	connect: fix failover through a mesh gateway to a remote datacenter (#6259) Failover is pushed entirely down to the data plane by creating envoy clusters and putting each successive destination in a different load assignment priority band. For example this shows that normally requests go to 1.2.3.4:8080 but when that fails they go to 6.7.8.9:8080: - name: foo load_assignment: cluster_name: foo policy: overprovisioning_factor: 100000 endpoints: - priority: 0 lb_endpoints: - endpoint: address: socket_address: address: 1.2.3.4 port_value: 8080 - priority: 1 lb_endpoints: - endpoint: address: socket_address: address: 6.7.8.9 port_value: 8080 Mesh gateways route requests based solely on the SNI header tacked onto the TLS layer. Envoy currently only lets you configure the outbound SNI header at the cluster layer. If you try to failover through a mesh gateway you ideally would configure the SNI value per endpoint, but that's not possible in envoy today. This PR introduces a simpler way around the problem for now: 1. We identify any target of failover that will use mesh gateway mode local or remote and then further isolate any resolver node in the compiled discovery chain that has a failover destination set to one of those targets. 2. For each of these resolvers we will perform a small measurement of comparative healths of the endpoints that come back from the health API for the set of primary target and serial failover targets. We walk the list of targets in order and if any endpoint is healthy we return that target, otherwise we move on to the next target. 3. The CDS and EDS endpoints both perform the measurements in (2) for the affected resolver nodes. 4. For CDS this measurement selects which TLS SNI field to use for the cluster (note the cluster is always going to be named for the primary target) 5. For EDS this measurement selects which set of endpoints will populate the cluster. Priority tiered failover is ignored. One of the big downsides to this approach to failover is that the failover detection and correction is going to be controlled by consul rather than deferring that entirely to the data plane as with the prior version. This also means that we are bound to only failover using official health signals and cannot make use of data plane signals like outlier detection to affect failover. In this specific scenario the lack of data plane signals is ok because the effectiveness is already muted by the fact that the ultimate destination endpoints will have their data plane signals scrambled when they pass through the mesh gateway wrapper anyway so we're not losing much. Another related fix is that we now use the endpoint health from the underlying service, not the health of the gateway (regardless of failover mode).	2019-08-05 13:30:35 -05:00
R.B. Boyer	0165e93517	connect: expose an API endpoint to compile the discovery chain (#6248) In addition to exposing compilation over the API, cleaned up the structures that are exchanged so they are easier to support and understand. Also removed the ability to configure the envoy OverprovisioningFactor.	2019-08-02 15:34:54 -05:00
R.B. Boyer	782c647bf4	connect: simplify the compiled discovery chain data structures (#6242) This should make them better for sending over RPC or the API. Instead of a chain implemented explicitly like a linked list (nodes holding pointers to other nodes), switch to a flat map of named nodes, with nodes linking to other nodes by name. The shipped structure is just a map and a string to indicate which key to start from. Other changes: * inline the compiler option InferDefaults as true * introduce compiled target config to avoid needing to send back additional maps of Resolvers; future target-specific compiled state can go here * move compiled MeshGateway out of the Resolver and into the TargetConfig where it makes more sense.	2019-08-01 22:44:05 -05:00
R.B. Boyer	4666599e18	connect: reconcile how upstream configuration works with discovery chains (#6225) * connect: reconcile how upstream configuration works with discovery chains The following upstream config fields for connect sidecars sanely integrate into discovery chain resolution: - Destination Namespace/Datacenter: Compilation occurs locally but using different default values for namespaces and datacenters. The xDS clusters that are created are named as they normally would be. - Mesh Gateway Mode (single upstream): If set this value overrides any value computed for any resolver for the entire discovery chain. The xDS clusters that are created may be named differently (see below). - Mesh Gateway Mode (whole sidecar): If set this value overrides any value computed for any resolver for the entire discovery chain. If this is specifically overridden for a single upstream this value is ignored in that case. The xDS clusters that are created may be named differently (see below). - Protocol (in opaque config): If set this value overrides the value computed when evaluating the entire discovery chain. If the normal chain would be TCP or if this override is set to TCP then the result is that we explicitly disable L7 Routing and Splitting. The xDS clusters that are created may be named differently (see below). - Connect Timeout (in opaque config): If set this value overrides the value for any resolver in the entire discovery chain. The xDS clusters that are created may be named differently (see below). If any of the above overrides affect the actual result of compiling the discovery chain (i.e. "tcp" becomes "grpc" instead of being a no-op override to "tcp") then the relevant parameters are hashed and provided to the xDS layer as a prefix for use in naming the Clusters. This is to ensure that if one Upstream discovery chain has no overrides and tangentially needs a cluster named "api.default.XXX", and another Upstream does have overrides for "api.default.XXX" that they won't cross-pollinate against the operator's wishes. Fixes #6159	2019-08-01 22:03:34 -05:00
Matt Keeler	7e69646a77	Fix prepared query upstream endpoint generation (#6236) Use the correct SNI value for prepared query upstreams	2019-07-29 11:15:55 -04:00
R.B. Boyer	2bfad66efa	connect: rework how the service resolver subset OnlyPassing flag works (#6173) The main change is that we no longer filter service instances by health, preferring instead to render all results down into EDS endpoints in envoy and merely label the endpoints as HEALTHY or UNHEALTHY. When OnlyPassing is set to true we will force consul checks in a 'warning' state to render as UNHEALTHY in envoy. Fixes #6171	2019-07-23 20:20:24 -05:00
Jack Pearkes	2b1761bab3	Make cluster names SNI always (#6081) * Make cluster names SNI always * Update some tests * Ensure we check for prepared query types * Use sni for route cluster names * Proper mesh gateway mode defaulting when the discovery chain is used * Ignore service splits from PatchSliceOfMaps * Update some xds golden files for proper test output * Allow for grpc/http listeners/cluster configs with the disco chain * Update stats expectation	2019-07-08 12:48:48 +01:00
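
Commit 7fba7456ec describes disambiguating passthrough addresses by Raft index, so that an address claimed by two upstreams stays associated only with the newer instance. The following is a minimal Go sketch of that idea, not the code from the commit: the `passthroughEntry` type and `recordPassthrough` helper are hypothetical names, and keeping the existing entry on an index tie is an assumption.

```go
package main

import "fmt"

// passthroughEntry is a hypothetical record of which upstream owns an
// address and the Raft index at which that association was observed.
type passthroughEntry struct {
	Upstream string
	Index    uint64
}

// recordPassthrough associates addr with upstream only when idx is newer
// than the index already stored for addr. On a tie the existing entry is
// kept (an assumption, not taken from the commit message).
func recordPassthrough(m map[string]passthroughEntry, addr, upstream string, idx uint64) {
	if cur, ok := m[addr]; ok && cur.Index >= idx {
		return
	}
	m[addr] = passthroughEntry{Upstream: upstream, Index: idx}
}

func main() {
	m := map[string]passthroughEntry{}
	// The event for the new payments instance arrives first, with the higher index.
	recordPassthrough(m, "10.0.0.1", "payments", 12)
	// The older shipping association for the same address is ignored.
	recordPassthrough(m, "10.0.0.1", "shipping", 7)
	fmt.Println(m["10.0.0.1"].Upstream)
}
```

With the example events from the commit message, `10.0.0.1` ends up mapped to `payments` regardless of the order in which the two events arrive.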

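Commit f759a48726 states that hostnames are detected by attempting to parse a service instance's address as an IP, and that a service with any hostname instance uses DNS discovery rather than EDS. A minimal Go sketch of that rule follows; `isHostname` and `discoveryType` are hypothetical helpers for illustration, not Consul functions, and the sketch ignores the per-gateway `envoy_dns_discovery_type` override.

```go
package main

import (
	"fmt"
	"net"
)

// isHostname reports whether addr fails to parse as an IP address.
// Hypothetical helper: anything that is not a valid IP is assumed to be
// a hostname, as the commit message describes.
func isHostname(addr string) bool {
	return net.ParseIP(addr) == nil
}

// discoveryType returns "LOGICAL_DNS" if any instance address is a
// hostname and "EDS" otherwise. The per-gateway override mentioned in
// the commit message is not modeled here.
func discoveryType(addrs []string) string {
	for _, a := range addrs {
		if isHostname(a) {
			return "LOGICAL_DNS"
		}
	}
	return "EDS"
}

func main() {
	fmt.Println(discoveryType([]string{"10.0.0.1", "10.0.0.2"}))
	fmt.Println(discoveryType([]string{"10.0.0.1", "db.example.com"}))
}
```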

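Step 2 of commit 64fc002e03 walks the primary target and its failover targets in order and returns the first one that has any healthy endpoint. A minimal Go sketch of that walk is below. The `endpoint` and `target` types and the `firstHealthyTarget` function are hypothetical, and because the commit message does not say what happens when no target is healthy, the sketch simply reports that case to the caller.

```go
package main

import "fmt"

// endpoint and target are hypothetical stand-ins for the health API results.
type endpoint struct {
	Addr    string
	Healthy bool
}

type target struct {
	Name      string
	Endpoints []endpoint
}

// firstHealthyTarget walks targets in order (primary first, then failover
// targets) and returns the first one with at least one healthy endpoint.
// The boolean is false when no target qualifies.
func firstHealthyTarget(targets []target) (string, bool) {
	for _, t := range targets {
		for _, e := range t.Endpoints {
			if e.Healthy {
				return t.Name, true
			}
		}
	}
	return "", false
}

func main() {
	targets := []target{
		{Name: "primary", Endpoints: []endpoint{{Addr: "1.2.3.4:8080", Healthy: false}}},
		{Name: "failover-1", Endpoints: []endpoint{{Addr: "6.7.8.9:8080", Healthy: true}}},
	}
	name, ok := firstHealthyTarget(targets)
	fmt.Println(name, ok)
}
```

Per the commit message, the selected target would then determine the TLS SNI value for CDS and the endpoint set for EDS.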
57 Commits