open-consul

Commit Graph

Author	SHA1	Message	Date
Derek Menteer	8efe862b76	Merge branch 'main' of github.com:hashicorp/consul into derekm/split-grpc-ports	2022-09-08 14:53:08 -05:00
Derek Menteer	6aaf1c6035	Various cleanups.	2022-09-08 10:51:50 -05:00
Chris S. Kim	9b5c5c5062	Merge pull request #14285 from hashicorp/NET-638-push-server-address-updates-to-the-peer peering: Subscribe to server address changes and push updates to peers	2022-09-07 09:30:45 -04:00
Freddy	a7f38384ae	Add SpiffeID for Consul server agents (#14485 ) Co-authored-by: Eric Haberkorn <erichaberkorn@gmail.com> By adding a SpiffeID for server agents, servers can now request a leaf certificate from the Connect CA. This new Spiffe ID has a key property: servers are identified by their datacenter name and trust domain. All servers that share these attributes will share a ServerURI. The aim is to use these certificates to verify the server name of ANY server in a Consul datacenter.	2022-09-06 17:58:13 -06:00
Daniel Upton	8cd6c9f95e	proxycfg-glue: server-local implementation of ResolvedServiceConfig This is the OSS portion of enterprise PR 2460. Introduces a server-local implementation of the proxycfg.ResolvedServiceConfig interface that sources data from a blocking query against the server's state store. It moves the service config resolution logic into the agent/configentry package so that it can be used in both the RPC handler and data source. I've also done a little re-arranging and adding comments to call out data sources for which there is to be no server-local equivalent.	2022-09-06 23:27:25 +01:00
Derek Menteer	b50bc443f3	Merge branch 'main' of github.com:hashicorp/consul into derekm/split-grpc-ports	2022-09-06 10:51:04 -05:00
Derek Menteer	d771725a14	Add kv txn get-not-exists operation.	2022-09-06 10:28:59 -05:00
Chris S. Kim	9ad8bf67a5	Add testcase for parsing grpc_port	2022-09-06 10:17:44 -04:00
Kyle Havlovitz	a484a759c8	Merge pull request #14429 from hashicorp/ca-prune-intermediates Prune old expired intermediate certs when appending a new one	2022-09-02 15:34:33 -07:00
Derek Menteer	cb478b0e61	Address PR comments.	2022-09-01 16:54:24 -05:00
Kyle Havlovitz	90fa16c8b5	Prune intermediates before appending new one	2022-09-01 14:24:30 -07:00
malizz	ef5f697121	Add additional parameters to envoy passive health check config (#14238 ) * draft commit * add changelog, update test * remove extra param * fix test * update type to account for nil value * add test for custom passive health check * update comments and tests * update description in docs * fix missing commas	2022-09-01 09:59:11 -07:00
Chris S. Kim	e70ba97e45	Add Internal.ServiceDump support for querying by PeerName	2022-09-01 10:32:59 -04:00
Derek Menteer	ab9d421ba2	Change serf-tag references to field references.	2022-08-31 16:38:42 -05:00
Kyle Havlovitz	c5370d52e9	Prune old expired intermediate certs when appending a new one	2022-08-31 11:41:58 -07:00
Chris S. Kim	9c157e40a3	Merge branch 'main' into NET-638-push-server-address-updates-to-the-peer # Conflicts: # agent/grpc-external/services/peerstream/stream_test.go	2022-08-30 11:09:25 -04:00
Freddy	f27a9effca	Merge pull request #13496 from maxb/fix-kv_entries-metric	2022-08-29 15:35:11 -06:00
Freddy	69d99aa8c0	Merge pull request #14364 from hashicorp/peering/term-delete	2022-08-29 15:33:18 -06:00
Max Bowsher	3aefc4123f	Merge branch 'main' into fix-kv_entries-metric	2022-08-29 22:22:10 +01:00
Chris S. Kim	7b267f5c01	Merge pull request #14371 from hashicorp/kisunji/peering-metrics-update Adjust metrics reporting for peering tracker	2022-08-29 17:16:19 -04:00
Chris S. Kim	e4a154c88e	Add heartbeat timeout grace period when accounting for peering health	2022-08-29 16:32:26 -04:00
Derek Menteer	b641dcf03d	Expose `grpc_tls` via serf for cluster peering.	2022-08-29 13:43:49 -05:00
Derek Menteer	4a01d75cf8	Add separate grpc_tls port. To ease the transition for users, the original gRPC port can still operate in a deprecated mode as either plain-text or TLS mode. This behavior should be removed in a future release whenever we no longer support this. The resulting behavior from this commit is: `ports.grpc > 0 && ports.grpc_tls > 0` spawns both plain-text and tls ports. `ports.grpc > 0 && grpc.tls == undefined` spawns a single plain-text port. `ports.grpc > 0 && grpc.tls != undefined` spawns a single tls port (backwards compat mode).	2022-08-29 13:43:43 -05:00
freddygv	f790d84c04	Add validation to prevent switching dialing mode This prevents unexpected changes to the output of ShouldDial, which should never change unless a peering is deleted and recreated.	2022-08-29 12:31:13 -06:00
Eric Haberkorn	2a370d456b	Update the structs and discovery chain for service resolver redirects to cluster peers. (#14366 )	2022-08-29 09:51:32 -04:00
Chris S. Kim	b1025f2dd9	Adjust metrics reporting for peering tracker	2022-08-26 17:34:17 -04:00
freddygv	19f25fc3a5	Allow terminated peerings to be deleted Peerings are terminated when a peer decides to delete the peering from their end. Deleting a peering sends a termination message to the peer and triggers them to mark the peering as terminated but does NOT delete the peering itself. This is to prevent peerings from disappearing from both sides just because one side deleted them. Previously the Delete endpoint was skipping the deletion if the peering was not marked as active. However, terminated peerings are also inactive. This PR makes some updates so that peerings marked as terminated can be deleted by users.	2022-08-26 10:52:47 -06:00
Chris S. Kim	516a6daefa	Merge branch 'main' into catalog-service-list-filter	2022-08-26 11:16:06 -04:00
Chris S. Kim	a2c857df40	Fix tests for enterprise	2022-08-26 11:14:02 -04:00
Chris S. Kim	a5e9ea6d96	Merge branch 'main' into NET-638-push-server-address-updates-to-the-peer # Conflicts: # agent/grpc-external/services/peerstream/stream_test.go	2022-08-26 10:43:56 -04:00
Chris S. Kim	a8090268d4	Replace ring buffer with async version (#14314 ) We need to watch for changes to peerings and update the server addresses which get served by the ring buffer. Also, if there is an active connection for a peer, we are getting up-to-date server addresses from the replication stream and can safely ignore the token's addresses which may be stale.	2022-08-26 10:27:13 -04:00
alex	f64af3be24	peering: add peer health metric (#14004 ) Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>	2022-08-25 16:32:59 -07:00
Chris S. Kim	2e75833133	Exit loop when context is cancelled	2022-08-25 11:48:25 -04:00
skpratt	c039028401	no-op: refactor usagemetrics tests for clarity and DRY cases (#14313 )	2022-08-24 12:00:09 -05:00
Dan Upton	20c87d235f	dataplane: update envoy bootstrap params for consul-dataplane (#14017 ) Contains 2 changes to the GetEnvoyBootstrapParams response to support consul-dataplane. Exposing node_name and node_id: consul-dataplane will support providing either the node_id or node_name in its configuration. Unfortunately, supporting both in the xDS meta adds a fair amount of complexity (partly because most tables are currently indexed on node_name) so for now we're going to return them both from the bootstrap params endpoint, allowing consul-dataplane to exchange a node_id for a node_name (which it will supply in the xDS meta). Properly setting service for gateways: To avoid the need to special case gateways in consul-dataplane, service will now either be the destination service name for connect proxies, or the gateway service name. This means it can be used as-is in Envoy configuration (i.e. as a cluster name or in metric tags).	2022-08-24 12:03:15 +01:00
Chris S. Kim	1e7a3b8d8d	PR feedback to specify Node name in test mock	2022-08-23 11:51:04 -04:00
Eric Haberkorn	3d45306e1b	Cluster peering failover disco chain changes (#14296 )	2022-08-23 09:13:43 -04:00
Chris S. Kim	0ae3462e61	Add missing mock assertions	2022-08-22 13:55:01 -04:00
cskh	e30d6bfc40	Fix: add missing ent meta for test (#14289 )	2022-08-22 13:51:04 -04:00
Chris S. Kim	9f96f98ab6	Expose external gRPC port in autopilot The grpc_port was added to a NodeService's meta in ea58f235f5da416224ba615405269661ba1f4d8d	2022-08-22 10:07:00 -04:00
cskh	a87d8f48be	fix: missing MaxInboundConnections field in service-defaults config entry (#14072 ) * fix: missing max_inbound_connections field in merge config	2022-08-19 14:11:21 -04:00
cskh	7f66dfc780	Fix: upgrade pkg imdario/mergo to prevent merge config panic (#14237) * upgrade imdario/mergo to prevent merge config panic * test: service definition takes precedence over service-defaults in merged results	2022-08-17 21:14:04 -04:00
James Hartig	a5a200e0e9	Use the maximum jitter when calculating the timeout The timeout should include the maximum possible jitter since the server will randomly add jitter to its timeout. If the server's timeout is less than the client's timeout then the client will return an i/o deadline reached error. Before: ``` time curl 'http://localhost:8500/v1/catalog/service/service?dc=other-dc&stale=&wait=600s&index=15820644' rpc error making call: i/o deadline reached real 10m11.469s user 0m0.018s sys 0m0.023s ``` After: ``` time curl 'http://localhost:8500/v1/catalog/service/service?dc=other-dc&stale=&wait=600s&index=15820644' [...] real 10m35.835s user 0m0.021s sys 0m0.021s ```	2022-08-17 10:24:09 -04:00
cskh	c20d016f62	fix: missing segment and partition (#14194 )	2022-08-12 15:21:39 -04:00
cskh	e7b5baa3cc	feat(telemetry): add labels to serf and memberlist metrics (#14161 ) * feat(telemetry): add labels to serf and memberlist metrics * changelog * doc update Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>	2022-08-11 22:09:56 -04:00
Chris S. Kim	182399255b	Handle breaking change for ServiceVirtualIP restore (#14149 ) Consul 1.13.0 changed ServiceVirtualIP to use PeeredServiceName instead of ServiceName which was a breaking change for those using service mesh and wanted to restore their snapshot after upgrading to 1.13.0. This commit handles existing data with older ServiceName and converts it during restore so that there are no issues when restoring from older snapshots.	2022-08-11 14:47:10 -04:00
Chris S. Kim	55945a8231	Add test to verify forwarding	2022-08-11 11:16:02 -04:00
Chris S. Kim	fbbb54fdc2	Register peerStreamServer internally to enable RPC forwarding	2022-08-11 11:16:02 -04:00
Chris S. Kim	534096a6ac	Handle wrapped errors in isFailedPreconditionErr	2022-08-11 11:16:02 -04:00
Daniel Kimsey	4243e1e05f	Add support for filtering the 'List Services' API 1. Create a bexpr filter for performing the filtering 2. Change the state store functions to return the raw (not aggregated) list of ServiceNodes. 3. Move the aggregate service tags by name logic out of the state store functions into a new function called from the RPC endpoint 4. Perform the filtering in the endpoint before aggregation.	2022-08-10 16:52:32 -05:00
Kyle Havlovitz	57afbb58ac	Merge pull request #13958 from hashicorp/gateway-wildcard-fix Fix wildcard picking up services it shouldn't for ingress/terminating gateways	2022-08-08 12:54:40 -07:00
Kyle Havlovitz	2a0ab31ca4	Add some extra handling for destination deletes	2022-08-08 11:38:13 -07:00
freddygv	1e48b4f665	Update snapshot test	2022-08-08 09:17:15 -06:00
freddygv	65bcd3d84f	Re-validate existing secrets at state store Previously establishment and pending secrets were only checked at the RPC layer. However, given that these are Check-and-Set transactions we should ensure that the given secrets are still valid when persisting a secret exchange or promotion. Otherwise it would be possible for concurrent requests to overwrite each other.	2022-08-08 09:06:07 -06:00
freddygv	67aa7ed15c	Test fixes	2022-08-08 08:31:47 -06:00
freddygv	01b0cbcbd7	Use proto message for each secrets write op Previously there was a field indicating the operation that triggered a secrets write. Now there is a message for each operation and it contains the secret ID being persisted.	2022-08-08 01:41:00 -06:00
Kyle Havlovitz	3f435f31ac	Update ingress/terminating wildcard logic and handle destinations	2022-08-05 07:56:10 -07:00
freddygv	3a623f2e9d	Inherit active secret when exchanging	2022-08-03 17:32:53 -05:00
freddygv	b089472a12	Pass explicit signal with op for secrets write Previously the updates to the peering secrets UUID table relied on inferring what action triggered the update based on a reconciliation against the existing secrets. Instead we now explicitly require the operation to be given so that the inference isn't necessary. This makes the UUID table logic easier to reason about and fixes some related bugs. There is also an update so that the peering secrets get handled on snapshots/restores.	2022-08-03 17:25:12 -05:00
freddygv	544b3603e9	Avoid deleting peering secret UUIDs at dialers Dialers do not keep track of peering secret UUIDs, so they should not attempt to clean up data from that table when their peering is deleted. We also now keep peer server addresses when marking peerings for deletion. Peer server addresses are used by the ShouldDial() helper when determining whether the peering is for a dialer or an acceptor. We need to keep this data so that peering secrets can be cleaned up accordingly.	2022-08-03 16:34:57 -05:00
Kyle Havlovitz	fce49a1ec0	Fix wildcard picking up services it shouldn't for ingress/terminating gateways	2022-08-02 09:41:31 -07:00
Freddy	56144cf5f7	Various peering fixes (#13979 ) * Avoid logging StreamSecretID * Wrap additional errors in stream handler * Fix flakiness in leader test and rename servers for clarity. There was a race condition where the peering was being deleted in the test before the stream was active. Now the test waits for the stream to be connected on both sides before deleting the associated peering. * Run flaky test serially	2022-08-01 15:06:18 -06:00
Luke Kysow	e9960dfdf3	peering: default to false (#13963 ) * defaulting to false because peering will be released as beta * Ignore peering disabled error in bundles cachetype Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com> Co-authored-by: freddygv <freddy@hashicorp.com> Co-authored-by: Matt Keeler <mjkeeler7@gmail.com>	2022-08-01 15:22:36 -04:00
Freddy	a54903b0f4	Merge branch 'main' into fix-kv_entries-metric	2022-08-01 13:19:27 -06:00
Matt Keeler	795e5830c6	Implement/Utilize secrets for Peering Replication Stream (#13977 )	2022-08-01 10:33:18 -04:00
alex	0f6354685b	block PeerName register requests (#13887 ) Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>	2022-07-29 14:36:22 -07:00
Luke Kysow	17594a123e	peering: retry establishing connection more quickly on certain errors (#13938 ) When we receive a FailedPrecondition error, retry that more quickly because we expect it will resolve shortly. This is particularly important in the context of Consul servers behind a load balancer because when establishing a connection we have to retry until we randomly land on a leader node. The default retry backoff goes from 2s, 4s, 8s, etc. which can result in very long delays quite quickly. Instead, this backoff retries in 8ms five times, then goes exponentially from there: 16ms, 32ms, ... up to a max of 8152ms.	2022-07-29 13:04:32 -07:00
acpana	70e052f35f	sync more acl enforcement sync w ent at 32756f7 Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>	2022-07-28 12:01:52 -07:00
Ashwin Venkatesh	e4aaf467eb	Add peer counts to emitted metrics. (#13930 )	2022-07-27 18:34:04 -04:00
Luke Kysow	a2290791b2	Merge pull request #13924 from hashicorp/lkysow/util-metric-peering peering: don't track imported services/nodes in usage	2022-07-27 14:49:55 -07:00
Chris S. Kim	213e985d17	Reduce arm64 flakes for TestConnectCA_ConfigurationSet_ChangeKeyConfig_Primary There were 16 combinations of tests but 4 of them were duplicates since the default key type and bits were "ec" and 256. That entry was commented out to reduce the subtest count to 12. testrpc.WaitForLeader was failing on arm64 environments; the cause is unknown but it might be due to the environment being flooded with parallel tests making RPC calls. The RPC polling+retry was replaced with a simpler check for leadership based on raft.	2022-07-27 13:54:34 -04:00
Chris S. Kim	c80ab10527	Retry checks for virtual IP metadata	2022-07-27 13:54:34 -04:00
Chris S. Kim	146dd93775	Sort slice of ServiceNames deterministically	2022-07-27 13:54:34 -04:00
Luke Kysow	92c1f30359	peering: don't track imported services/nodes in usage Services/nodes that are imported from other peers are stored in state. We don't want to count those as part of our own cluster's usage.	2022-07-27 09:08:51 -07:00
cskh	f7858a1bda	chore: clarify the error message: service.service must not be empty (#13907) - When registering a service using the catalog endpoint, the key for the service name should be "service". Adding this information to the error message helps users quickly fix the request.	2022-07-27 10:16:46 -04:00
Chris S. Kim	1f8ae56951	Preserve PeeringState on upsert (#13666 ) Fixes a bug where if the generate token is called twice, the second call upserts the zero-value (undefined) of PeeringState.	2022-07-25 14:37:56 -04:00
DanStough	f690d299c9	feat: convert destination address to slice	2022-07-25 12:31:58 -04:00
freddygv	5bbc0cc615	Add ACL enforcement to peering endpoints	2022-07-25 09:34:29 -06:00
Luke Kysow	d21f793b74	peering: add config to enable/disable peering (#13867 ) * peering: add config to enable/disable peering Add config: ``` peering { enabled = true } ``` Defaults to true. When disabled: 1. All peering RPC endpoints will return an error 2. Leader won't start its peering establishment goroutines 3. Leader won't start its peering deletion goroutines	2022-07-22 15:20:21 -07:00
alex	7bd55578cc	peering: emit exported services count metric (#13811 ) Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>	2022-07-22 12:05:08 -07:00
Eric Haberkorn	e044343105	Add Cluster Peering Failover Support to Prepared Queries (#13835 ) Add peering failover support to prepared queries	2022-07-22 09:14:43 -04:00
acpana	b847f656a8	Rename peering internal to ~ sync ENT to 5679392c81 Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>	2022-07-21 10:51:05 -07:00
Daniel Upton	e3bff8fb39	proxycfg-glue: server-local implementation of `PeeredUpstreams` This is the OSS portion of enterprise PR 2352. It adds a server-local implementation of the proxycfg.PeeredUpstreams interface based on a blocking query against the server's state store. It also fixes an omission in the Virtual IP freeing logic where we were never updating the max index (and therefore blocking queries against VirtualIPsForAllImportedServices would not return on service deletion).	2022-07-21 13:51:59 +01:00
Paul Glass	a9f17c0f99	Extract AWS auth implementation out of Consul (#13760 )	2022-07-19 16:26:44 -05:00
alex	64b3705a31	peering: refactor reconcile, cleanup (#13795 ) Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>	2022-07-19 11:43:29 -07:00
alex	4ff097c4cf	peering: track exported services (#13784 ) Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>	2022-07-18 10:20:04 -07:00
Luke Kysow	3968f21339	Add docs for peerStreamServer vs peeringServer. (#13781 )	2022-07-15 12:23:05 -07:00
Luke Kysow	a8721c33c5	peerstream: dialer should reconnect when stream closes (#13745 ) * peerstream: dialer should reconnect when stream closes If the stream is closed unexpectedly (i.e. when we haven't received a terminated message), the dialer should attempt to re-establish the stream. Previously, the `HandleStream` would return `nil` when the stream was closed. The caller then assumed the stream was terminated on purpose and so didn't reconnect when instead it was stopped unexpectedly and the dialer should have attempted to reconnect.	2022-07-15 11:58:33 -07:00
R.B. Boyer	61ebb38092	server: ensure peer replication can successfully use TLS over external gRPC (#13733 ) Ensure that the peer stream replication rpc can successfully be used with TLS activated. Also: - If key material is configured for the gRPC port but HTTPS is not enabled now TLS will still be activated for the gRPC port. - peerstream replication stream opened by the establishing-side will now ignore grpc.WithBlock so that TLS errors will bubble up instead of being awkwardly delayed or suppressed	2022-07-15 13:15:50 -05:00
alex	70ad4804b6	peering: track imported services (#13718 )	2022-07-15 10:20:43 -07:00
Dan Stough	084f9d7084	feat: connect proxy xDS for destinations Signed-off-by: Dhia Ayachi <dhia@hashicorp.com>	2022-07-14 15:27:02 -04:00
Daniel Upton	688dfe3138	proxycfg-glue: server-local implementation of `ServiceList` This is the OSS portion of enterprise PR 2242. This PR introduces a server-local implementation of the proxycfg.ServiceList interface, backed by streaming events and a local materializer.	2022-07-14 18:22:12 +01:00
Dan Upton	34140ff3e0	grpc: rename public/private directories to external/internal (#13721 ) Previously, public referred to gRPC services that are both exposed on the dedicated gRPC port and have their definitions in the proto-public directory (so were considered usable by 3rd parties). Whereas private referred to services on the multiplexed server port that are only usable by agents and other servers. Now, we're splitting these definitions, such that external/internal refers to the port and public/private refers to whether they can be used by 3rd parties. This is necessary because the peering replication API needs to be exposed on the dedicated port, but is not (yet) suitable for use by 3rd parties.	2022-07-13 16:33:48 +01:00
R.B. Boyer	c880728ab4	peerstream: some cosmetic refactors to make this easier to follow (#13732 ) - Use some protobuf construction helper methods for brevity. - Rename a local variable to avoid later shadowing. - Rename the Nonce field to be more like xDS's naming. - Be more explicit about which PeerID fields are empty.	2022-07-13 10:00:35 -05:00
R.B. Boyer	ee5eb5a960	state: prohibit changing an exported tcp discovery chain in a way that would break SAN validation (#13727) For L4/tcp exported services the mesh gateways will not be terminating TLS. A caller in one peer will be directly establishing TLS connections to the ultimate exported service in the other peer. The caller will be doing SAN validation using the replicated SpiffeID values shipped from the exporting side. There is a class of discovery chain edits that could be done on the exporting side that would cause the introduction of a new SpiffeID value. In between the time of the config entry update on the exporting side and the importing side getting updated peer stream data, requests to the exported service would fail due to SAN validation errors. This is unacceptable so instead prohibit the exporting peer from making changes that would break peering in this way.	2022-07-12 11:17:33 -05:00
R.B. Boyer	2c329475ce	state: prohibit exported discovery chains to have cross-datacenter or cross-partition references (#13726 ) Because peerings are pairwise, between two tuples of (datacenter, partition) having any exported reference via a discovery chain that crosses out of the peered datacenter or partition will ultimately not be able to work for various reasons. The biggest one is that there is no way in the ultimate destination to configure an intention that can allow an external SpiffeID to access a service. This PR ensures that a user simply cannot do this, so they won't run into weird situations like this.	2022-07-12 11:03:41 -05:00
Chris S. Kim	9f5ab3ec10	Return error if ServerAddresses is empty (#13714 )	2022-07-12 11:09:00 -04:00
R.B. Boyer	5b801db24b	peering: move peer replication to the external gRPC port (#13698 ) Peer replication is intended to be between separate Consul installs and effectively should be considered "external". This PR moves the peer stream replication bidirectional RPC endpoint to the external gRPC server and ensures that things continue to function.	2022-07-08 12:01:13 -05:00
R.B. Boyer	40c5c7eee2	server: broadcast the public grpc port using lan serf and update the consul service in the catalog with the same data (#13687 ) Currently servers exchange information about their WAN serf port and RPC port with serf tags, so that they all learn of each other's addressing information. We intend to make larger use of the new public-facing gRPC port exposed on all of the servers, so this PR addresses that by passing around the gRPC port via serf tags and then ensuring the generated consul service in the catalog has metadata about that new port as well for ease of non-serf-based lookup.	2022-07-07 13:55:41 -05:00
R.B. Boyer	4ce9651421	test: update mockery use to put mocks into test files (#13656 ) --testonly doesn't do anything anymore so switch to --filename instead	2022-07-05 16:57:15 -05:00
Chris S. Kim	0910c41d95	Revise possible states for a peering. (#13661 ) These changes are primarily for Consul's UI, where we want to be more specific about the state a peering is in. - The "initial" state was renamed to pending, and no longer applies to peerings being established from a peering token. - Upon request to establish a peering from a peering token, peerings will be set as "establishing". This will help distinguish between the two roles: the cluster that generates the peering token and the cluster that establishes the peering. - When marked for deletion, peering state will be set to "deleting". This way the UI determines the deletion via the state rather than the "DeletedAt" field. Co-authored-by: freddygv <freddy@hashicorp.com>	2022-07-04 10:47:58 -04:00
Daniel Upton	e1d0aff462	proxycfg: server-local intention upstreams data source This is the OSS portion of enterprise PR 2157. It builds on the local blocking query work in #13438 to implement the proxycfg.IntentionUpstreams interface using server-local data. Also moves the ACL filtering logic from agent/consul into the acl/filter package so that it can be reused here.	2022-07-04 10:48:36 +01:00
Daniel Upton	21ea217b1d	proxycfg: server-local intentions data source This is the OSS portion of enterprise PR 2141. This commit provides a server-local implementation of the `proxycfg.Intentions` interface that sources data from streaming events. It adds events for the `service-intentions` config entry type, and then consumes event streams (via materialized views) for the service's explicit intentions and any applicable wildcard intentions, merging them into a single list of intentions. An alternative approach I considered was to consume _all_ intention events (via `SubjectWildcard`) and filter out the irrelevant ones. This would admittedly remove some complexity in the `agent/proxycfg-glue` package but at the expense of considerable overhead from waking potentially many thousands of connect proxies every time any intention is updated.	2022-07-04 10:48:36 +01:00
Daniel Upton	497df1ca3b	proxycfg: server-local config entry data sources This is the OSS portion of enterprise PR 2056. This commit provides server-local implementations of the proxycfg.ConfigEntry and proxycfg.ConfigEntryList interfaces, that source data from streaming events. It makes use of the LocalMaterializer type introduced for peering replication, adding the necessary support for authorization. It also adds support for "wildcard" subscriptions (within a topic) to the event publisher, as this is needed to fetch service-resolvers for all services when configuring mesh gateways. Currently, events will be emitted for just the ingress-gateway, service-resolver, and mesh config entry types, as these are the only entries required by proxycfg — the events will be emitted on topics named IngressGateway, ServiceResolver, and MeshConfig topics respectively. Though these events will only be consumed "locally" for now, they can also be consumed via the gRPC endpoint (confirmed using grpcurl) so using them from client agents should be a case of swapping the LocalMaterializer for an RPCMaterializer.	2022-07-04 10:48:36 +01:00
Chris S. Kim	0dc1b11e42	Fix ENT drift in files (#13647 )	2022-06-29 16:53:22 -04:00
Chris S. Kim	25aec40e74	Add internal endpoint to fetch peered upstream candidates from VirtualIP table (#13642 ) For initial cluster peering TProxy support we consider all imported services of a partition to be potential upstreams. We leverage the VirtualIP table because it stores plain service names (e.g. "api", not "api-sidecar-proxy").	2022-06-29 16:34:58 -04:00
R.B. Boyer	2dba16be52	peering: replicate all SpiffeID values necessary for the importing side to do SAN validation (#13612 ) When traversing an exported peered service, the discovery chain evaluation at the other side may re-route the request to a variety of endpoints. Furthermore we intend to terminate mTLS at the mesh gateway for arriving peered traffic that is http-like (L7), so the caller needs to know the mesh gateway's SpiffeID in that case as well. The following new SpiffeID values will be shipped back in the peerstream replication: - tcp: all possible SpiffeIDs resulting from the service-resolver component of the exported discovery chain - http-like: the SpiffeID of the mesh gateway	2022-06-27 14:37:18 -05:00
Max Bowsher	728cd03e24	Merge branch 'main' into fix-kv_entries-metric	2022-06-27 18:57:03 +01:00
alex	4333312be9	peering, internal: support UIServices, UINodes, UINodeInfo (#13577 )	2022-06-24 15:17:35 -07:00
Chris S. Kim	a5f9994128	Add new index for PeeredServiceName and ServiceVirtualIP (#13582 ) For TProxy we will be leveraging the VirtualIP table, which needs to become peer-aware	2022-06-24 14:38:39 -04:00
Will Jordan	25f4c44268	Add per-node max indexes (#12399 ) Adds fine-grained node.[node] entries to the index table, allowing blocking queries to return fine-grained indexes that prevent them from returning immediately when unrelated nodes/services are updated. Co-authored-by: kisunji <ckim@hashicorp.com>	2022-06-23 11:13:25 -04:00
Chris S. Kim	aaf3c051f2	Make memdb indexers generic (#13558 ) We have many indexer functions in Consul which take interface{} and type assert before building the index. We can use generics to get rid of the initial plumbing and pass around functions with better defined signatures. This has two benefits: 1) Less verbosity; 2) Developers can parse the argument types to memdb schemas without having to introspect the function for the type assertion.	2022-06-23 11:07:19 -04:00
Matt Keeler	dc19b9f46f	Port over the index 0 -> 1 code that lived in the old rpc setQueryMeta function. (#13561 )	2022-06-23 09:34:47 -04:00
R.B. Boyer	e7a7232a6b	state: peering ID assignment cannot happen inside of the state store (#13525) Move peering ID assignment outside of the FSM, so that the ID is written to the raft log and the same ID is used by all voters, and after restarts.	2022-06-21 13:04:08 -05:00
Matt Keeler	c2421825f3	Add server local blocking queries and watches (#13438 ) Co-authored-by: Dan Upton <daniel@floppy.co>	2022-06-21 13:36:49 -04:00
Max Bowsher	3f93b3cc88	Fix incorrect name and doc for kv_entries metric The name of the metric as registered with the metrics library to provide the help string, was incorrect compared with the actual code that sets the metric value - bring them into sync. Also, the help message was incorrect. Rather than copy the help message from telemetry.mdx, which was correct, but felt a bit unnatural in the way it was worded, update both of them to a new wording.	2022-06-19 11:58:23 +01:00
Dan Upton	989b22425c	Move ACLResolveResult into acl/resolver package (#13467 ) Having this type live in the agent/consul package makes it difficult to put anything that relies on token resolution (e.g. the new gRPC services) in separate packages without introducing import cycles. For example, if package foo imports agent/consul for the ACLResolveResult type it means that agent/consul cannot import foo to register its service. We've previously worked around this by wrapping the ACLResolver to "downgrade" its return type to an acl.Authorizer - aside from the added complexity, this also loses the resolved identity information. In the future, we may want to move the whole ACLResolver into the acl/resolver package. For now, putting the result type there at least, fixes the immediate import cycle issues.	2022-06-17 10:24:43 +01:00
DanStough	37694eefb5	feat: tgtwy xDS generation for destinations Signed-off-by: Dhia Ayachi <dhia@hashicorp.com>	2022-06-16 16:17:49 -04:00
alex	d73adfef81	peering: block Intention.Apply ops (#13451 ) Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>	2022-06-16 12:07:28 -07:00
alex	ba1f235d70	peering, state: account for peer intentions (#13443 ) Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>	2022-06-16 10:27:31 -07:00
R.B. Boyer	93611819e2	xds: mesh gateways now have their own leaf certificate when involved in a peering (#13460 ) This is only configured in xDS when a service with an L7 protocol is exported. They also load any relevant trust bundles for the peered services to eventually use for L7 SPIFFE validation during mTLS termination.	2022-06-15 14:36:18 -05:00
Riddhi Shah	414bb7e34e	[OSS] Support merge-central-config option in node services list API (#13450 ) Adds the merge-central-config query param option to the /catalog/node-services/:node-name API, to get a service definition in the response that is merged with central defaults (proxy-defaults/service-defaults). Updated the consul connect envoy command to use this option when retrieving the proxy service details so as to render the bootstrap configuration correctly.	2022-06-15 08:30:31 -07:00
freddygv	a288d0c388	Avoid deleting peerings marked as terminated. When our peer deletes the peering it is locally marked as terminated. This termination should kick off deleting all imported data, but should not delete the peering object itself. Keeping peerings marked as terminated acts as a signal that the action took place.	2022-06-14 15:37:09 -06:00
freddygv	a5283e4361	Add leader routine to clean up peerings Once a peering is marked for deletion a new leader routine will now clean up all imported resources and then the peering itself. A lot of the logic was grabbed from the namespace/partitions deferred deletions but with a handful of simplifications: - The rate limiting is not configurable. - Deleting imported nodes/services/checks is done by deleting nodes with the Txn API. The services and checks are deleted as a side-effect. - There is no "round rate limiter" like with namespaces and partitions. This is because peerings are purely local, and deleting a peering in the datacenter does not depend on deleting data from other DCs like with WAN-federated namespaces. All rate limiting is handled by the Raft rate limiter.	2022-06-14 15:36:50 -06:00
freddygv	dbcbf3978f	Fixup stream tear-down steps. 1. Fix a bug where the peering leader routine would not track all active peerings in the "stored" reconciliation map. This could lead to tearing down streams where the token was generated, since the ConnectedStreams() method used for reconciliation returns all streams and not just the ones initiated by this leader routine. 2. Fix a race where stream contexts were being canceled before termination messages were being processed by a peer. Previously the leader routine would tear down streams by canceling their context right after the termination message was sent. This context cancelation could be propagated to the server side faster than the termination message. Now there is a change where the dialing peer uses CloseSend() to signal when no more messages will be sent. Eventually the server peer will read an EOF after receiving and processing the preceding termination message. Using CloseSend() is actually not enough to address the issue mentioned, since it doesn't wait for the server peer to finish processing messages. Because of this now the dialing peer also reads from the stream until an error signals that there are no more messages. Receiving an EOF from our peer indicates that they processed the termination message and have no additional work to do. Given that the stream is being closed, all the messages received by Recv are discarded. We only check for errors to avoid importing new data.	2022-06-13 12:10:42 -06:00
freddygv	6d368b5eed	Update peering state and RPC for deferred deletion When deleting a peering we do not want to delete the peering and all imported data in a single operation, since deleting a large amount of data at once could overload Consul. Instead we defer deletion of peerings so that: 1. When a peering deletion request is received via gRPC the peering is marked for deletion by setting the DeletedAt field. 2. A leader routine will monitor for peerings that are marked for deletion and kick off a throttled deletion of all imported resources before deleting the peering itself. This commit mostly addresses point #1 by modifying the peering service to mark peerings for deletion. Another key change is to add a PeeringListDeleted state store function which can return all peerings marked for deletion. This function is what will be watched by the deferred deletion leader routine.	2022-06-13 12:10:32 -06:00
Freddy	9eeb9e4ee3	Clean up imported nodes/services/checks as needed (#13367 ) Previously, imported data would never be deleted. As nodes/services/checks were registered and deregistered, resources deleted from the exporting cluster would accumulate in the imported cluster. This commit makes updates to replication so that whenever an update is received for a service name we reconcile what was present in the catalog against what was received. This handleUpdateService method can handle both updates and deletions.	2022-06-13 11:52:28 -06:00
R.B. Boyer	33b497e7c9	peering: rename initiate to establish in the context of the APIs (#13419 )	2022-06-10 11:10:46 -05:00
kisunji	1eff652620	Add missing index for read	2022-06-08 13:53:31 -04:00
kisunji	86bf8b8fb6	Add IntentionMatch tests for source peers	2022-06-08 13:53:31 -04:00
kisunji	7d9f413949	Update ServiceIntentionSourceIndex to handle peer	2022-06-08 13:53:31 -04:00
R.B. Boyer	c1f20d17ee	peering: allow protobuf requests to populate the default partition or namespace (#13398 )	2022-06-08 11:55:18 -05:00
Dhia Ayachi	80556c9ffc	Fix intentions wildcard dest (#13397 ) * when enterprise meta are wildcard assume it's a service intention * fix partition and namespace * move kind outside the loops * get the kind check outside the loop and add a comment Co-authored-by: github-team-consul-core <github-team-consul-core@hashicorp.com>	2022-06-08 10:38:55 -04:00
R.B. Boyer	bf647bc9d2	peering: avoid a race between peering establishment and termination (#13389 )	2022-06-07 16:29:09 -05:00
Dhia Ayachi	3deaf767f2	Egress gtw/intention rpc endpoint (#13354 ) * update gateway-services table with endpoints * fix failing test * remove unneeded config in test * rename "endpoint" to "destination" * more endpoint renaming to destination in tests * update isDestination based on service-defaults config entry creation * use a 3 state kind to be able to set the kind to unknown (when neither a service or a destination exist) * set unknown state to empty to avoid modifying alot of tests * fix logic to set the kind correctly on CRUD * fix failing tests * add missing tests and fix service delete * fix failing test * Apply suggestions from code review Co-authored-by: Dan Stough <dan.stough@hashicorp.com> * fix a bug with kind and add relevant test * fix compile error * fix failing tests * add kind to clone * fix failing tests * fix failing tests in catalog endpoint * fix service dump test * Apply suggestions from code review Co-authored-by: Dan Stough <dan.stough@hashicorp.com> * remove duplicate tests * first draft of destinations intention in connect proxy * remove ServiceDestinationList * fix failing tests * fix agent/consul failing tests * change to filter intentions in the state store instead of adding a field. * fix failing tests * fix comment * fix comments * store service kind destination and add relevant tests * changes based on review * filter on destinations when querying source match * change state store API to get an IntentionTarget parameter * add intentions tests * add destination upstream endpoint * fix failing test * fix failing test and a bug with wildcard intentions * fix failing test * Apply suggestions from code review Co-authored-by: alex <8968914+acpana@users.noreply.github.com> * add missing test and clarify doc * fix style * gofmt intention.go * fix merge introduced issue Co-authored-by: Dan Stough <dan.stough@hashicorp.com> Co-authored-by: alex <8968914+acpana@users.noreply.github.com> Co-authored-by: github-team-consul-core <github-team-consul-core@hashicorp.com>	2022-06-07 15:55:02 -04:00
Dhia Ayachi	7602b6ebf2	Egress gtw/connect destination intentions (#13341 ) * update gateway-services table with endpoints * fix failing test * remove unneeded config in test * rename "endpoint" to "destination" * more endpoint renaming to destination in tests * update isDestination based on service-defaults config entry creation * use a 3 state kind to be able to set the kind to unknown (when neither a service or a destination exist) * set unknown state to empty to avoid modifying alot of tests * fix logic to set the kind correctly on CRUD * fix failing tests * add missing tests and fix service delete * fix failing test * Apply suggestions from code review Co-authored-by: Dan Stough <dan.stough@hashicorp.com> * fix a bug with kind and add relevant test * fix compile error * fix failing tests * add kind to clone * fix failing tests * fix failing tests in catalog endpoint * fix service dump test * Apply suggestions from code review Co-authored-by: Dan Stough <dan.stough@hashicorp.com> * remove duplicate tests * first draft of destinations intention in connect proxy * remove ServiceDestinationList * fix failing tests * fix agent/consul failing tests * change to filter intentions in the state store instead of adding a field. * fix failing tests * fix comment * fix comments * store service kind destination and add relevant tests * changes based on review * filter on destinations when querying source match * Apply suggestions from code review Co-authored-by: alex <8968914+acpana@users.noreply.github.com> * fix style * Apply suggestions from code review Co-authored-by: Dan Stough <dan.stough@hashicorp.com> * rename destinationType to targetType. Co-authored-by: Dan Stough <dan.stough@hashicorp.com> Co-authored-by: alex <8968914+acpana@users.noreply.github.com> Co-authored-by: github-team-consul-core <github-team-consul-core@hashicorp.com>	2022-06-07 15:03:59 -04:00
R.B. Boyer	0681f3571d	peering: allow mesh gateways to proxy L4 peered traffic (#13339 ) Mesh gateways will now enable tcp connections with SNI names including peering information so that those connections may be proxied. Note: this does not change the callers to use these mesh gateways.	2022-06-06 14:20:41 -05:00
alex	ff2ad3ba0c	peering: send leader addr (#13342 ) Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>	2022-06-06 10:00:38 -07:00
cskh	bd3a4dfeab	Add isLeader metric to track if a server is a leader (#13304 ) CTIA-21: add is_leader metric to track if a server is a leader Co-authored-by: alex <8968914+acpana@users.noreply.github.com>	2022-06-03 13:07:37 -04:00
freddygv	ad6dbe081a	Add agent cache-type for TrustBundleListByService There are a handful of changes in this commit: * When querying trust bundles for a service we need to be able to specify the namespace of the service. * The endpoint needs to track the index because the cache watches use it. * Extracted bulk of the endpoint's logic to a state store function so that index tracking could be tested more easily. * Removed check for service existence, deferring that sort of work to ACL authz * Added the cache type	2022-06-01 17:05:10 -06:00
freddygv	073c9e3a91	Update assumptions around exported-service config Given that the exported-services config entry can use wildcards, the precedence for wildcards is handled as with intentions. The most exact match is the match that applies for any given service. We do not take the union of all that apply. Another update that was made was to reflect that only one exported-services config entry applies to any given service in a partition. This is a pre-existing constraint that gets enforced by the Normalize() method on that config entry type.	2022-06-01 17:03:51 -06:00
Freddy	6ef38eaea7	Configure upstream TLS context with peer root certs (#13321 ) For mTLS to work between two proxies in peered clusters with different root CAs, proxies need to configure their outbound listener to use different root certificates for validation. Up until peering was introduced proxies would only ever use one set of root certificates to validate all mesh traffic, both inbound and outbound. Now an upstream proxy may have a leaf certificate signed by a CA that's different from the dialing proxy's. This PR makes changes to proxycfg and xds so that the upstream TLS validation uses different root certificates depending on which cluster is being dialed.	2022-06-01 15:53:52 -06:00
Dhia Ayachi	d4a04457e1	update gateway-services table with endpoints (#13217 ) * update gateway-services table with endpoints * fix failing test * remove unneeded config in test * rename "endpoint" to "destination" * more endpoint renaming to destination in tests * update isDestination based on service-defaults config entry creation * use a 3 state kind to be able to set the kind to unknown (when neither a service or a destination exist) * set unknown state to empty to avoid modifying alot of tests * fix logic to set the kind correctly on CRUD * fix failing tests * add missing tests and fix service delete * fix failing test * Apply suggestions from code review Co-authored-by: Dan Stough <dan.stough@hashicorp.com> * fix a bug with kind and add relevant test * fix compile error * fix failing tests * add kind to clone * fix failing tests * fix failing tests in catalog endpoint * fix service dump test * Apply suggestions from code review Co-authored-by: Dan Stough <dan.stough@hashicorp.com> * remove duplicate tests * rename consts and fix kind when no destination is defined in the service-defaults. * rename Kind to ServiceKind and change switch to use .(type) Co-authored-by: Dan Stough <dan.stough@hashicorp.com>	2022-05-31 16:20:12 -04:00
Dan Upton	a6a6d5a8ee	Enable servers to configure arbitrary proxies from the catalog (#13244 ) OSS port of enterprise PR 1822 Includes the necessary changes to the `proxycfg` and `xds` packages to enable Consul servers to configure arbitrary proxies using catalog data. Broadly, `proxycfg.Manager` now has public methods for registering, deregistering, and listing registered proxies — the existing local agent state-sync behavior has been moved into a separate component that makes use of these methods. When an xDS session is started for a proxy service in the catalog, a goroutine will be spawned to watch the service in the server's state store and re-register it with the `proxycfg.Manager` whenever it is updated (and clean it up when the client goes away).	2022-05-27 12:38:52 +01:00
alex	2d8664d384	monitor leadership in peering service (#13257 ) Signed-off-by: acpana <8968914+acpana@users.noreply.github.com> Co-authored-by: Chris S. Kim <ckim@hashicorp.com> Co-authored-by: Freddy <freddygv@users.noreply.github.com>	2022-05-26 17:55:16 -07:00
Chris S. Kim	d73a9522cb	Add support for streaming CA roots to peers (#13260 ) Sender watches for changes to CA roots and sends them through the replication stream. Receiver saves CA roots to tablePeeringTrustBundle	2022-05-26 15:24:09 -04:00
Riddhi Shah	e5f1d8dce4	Add support for merge-central-config query param (#13001 ) Adds a new query param merge-central-config for use with the below endpoints: /catalog/service/:service /catalog/connect/:service /health/service/:service /health/connect/:service If set on the request, the response will include a fully resolved service definition which is merged with the proxy-defaults/global and service-defaults/:service config entries (on-demand style). This is useful to view the full service definition for a mesh service (connect-proxy kind or gateway kind) which might not be merged before being written into the catalog (example: in case of services in the agentless model).	2022-05-25 13:20:17 -07:00
R.B. Boyer	bc10055edc	peering: replicate expected SNI, SPIFFE, and service protocol to peers (#13218 ) The importing peer will need to know what SNI and SPIFFE name corresponds to each exported service. Additionally it will need to know at a high level the protocol in use (L4/L7) to generate the appropriate connection pool and local metrics. For replicated connect synthetic entities we edit the `Connect{}` part of a `NodeService` to have a new section: { "PeerMeta": { "SNI": [ "web.default.default.owt.external.183150d5-1033-3672-c426-c29205a576b8.consul" ], "SpiffeID": [ "spiffe://183150d5-1033-3672-c426-c29205a576b8.consul/ns/default/dc/dc1/svc/web" ], "Protocol": "tcp" } } This data is then replicated and saved as-is at the importing side. Both SNI and SpiffeID are slices for now until I can be sure we don't need them for how mesh gateways will ultimately work.	2022-05-25 12:37:44 -05:00
alex	451dc50f4f	peering: expose IsLeader, hung up on dialer if follower (#13164 ) Signed-off-by: acpana <8968914+acpana@users.noreply.github.com> Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>	2022-05-23 11:30:58 -07:00
cskh	39cb731988	Upgrade golangci-lint for go v1.18 (#13176 )	2022-05-23 10:26:45 -04:00

2026 Commits