Commit Graph

1876 Commits

Author SHA1 Message Date
Chris S. Kim 55945a8231 Add test to verify forwarding 2022-08-11 11:16:02 -04:00
Chris S. Kim fbbb54fdc2 Register peerStreamServer internally to enable RPC forwarding 2022-08-11 11:16:02 -04:00
Chris S. Kim 534096a6ac Handle wrapped errors in isFailedPreconditionErr 2022-08-11 11:16:02 -04:00
Kyle Havlovitz 57afbb58ac
Merge pull request #13958 from hashicorp/gateway-wildcard-fix
Fix wildcard picking up services it shouldn't for ingress/terminating gateways
2022-08-08 12:54:40 -07:00
Kyle Havlovitz 2a0ab31ca4 Add some extra handling for destination deletes 2022-08-08 11:38:13 -07:00
freddygv 1e48b4f665 Update snapshot test 2022-08-08 09:17:15 -06:00
freddygv 65bcd3d84f Re-validate existing secrets at state store
Previously establishment and pending secrets were only checked at the
RPC layer. However, given that these are Check-and-Set transactions we
should ensure that the given secrets are still valid when persisting a
secret exchange or promotion.

Otherwise it would be possible for concurrent requests to overwrite each
other.
2022-08-08 09:06:07 -06:00
freddygv 67aa7ed15c Test fixes 2022-08-08 08:31:47 -06:00
freddygv 01b0cbcbd7 Use proto message for each secrets write op
Previously there was a field indicating the operation that triggered a
secrets write. Now there is a message for each operation and it contains
the secret ID being persisted.
2022-08-08 01:41:00 -06:00
Kyle Havlovitz 3f435f31ac Update ingress/terminating wildcard logic and handle destinations 2022-08-05 07:56:10 -07:00
freddygv 3a623f2e9d Inherit active secret when exchanging 2022-08-03 17:32:53 -05:00
freddygv b089472a12 Pass explicit signal with op for secrets write
Previously the updates to the peering secrets UUID table relied on
inferring what action triggered the update based on a reconciliation
against the existing secrets.

Instead we now explicitly require the operation to be given so that the
inference isn't necessary. This makes the UUID table logic easier to
reason about and fixes some related bugs.

There is also an update so that the peering secrets get handled on
snapshots/restores.
2022-08-03 17:25:12 -05:00
freddygv 544b3603e9 Avoid deleting peering secret UUIDs at dialers
Dialers do not keep track of peering secret UUIDs, so they should not
attempt to clean up data from that table when their peering is deleted.

We also now keep peer server addresses when marking peerings for
deletion. Peer server addresses are used by the ShouldDial() helper
when determining whether the peering is for a dialer or an acceptor.
We need to keep this data so that peering secrets can be cleaned up
accordingly.
2022-08-03 16:34:57 -05:00
Kyle Havlovitz fce49a1ec0 Fix wildcard picking up services it shouldn't for ingress/terminating gateways 2022-08-02 09:41:31 -07:00
Freddy 56144cf5f7
Various peering fixes (#13979)
* Avoid logging StreamSecretID
* Wrap additional errors in stream handler
* Fix flakiness in leader test and rename servers for clarity. There was
  a race condition where the peering was being deleted in the test
  before the stream was active. Now the test waits for the stream to be
  connected on both sides before deleting the associated peering.
* Run flaky test serially
2022-08-01 15:06:18 -06:00
Luke Kysow e9960dfdf3
peering: default to false (#13963)
* defaulting to false because peering will be released as beta
* Ignore peering disabled error in bundles cachetype

Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>
Co-authored-by: freddygv <freddy@hashicorp.com>
Co-authored-by: Matt Keeler <mjkeeler7@gmail.com>
2022-08-01 15:22:36 -04:00
Matt Keeler 795e5830c6
Implement/Utilize secrets for Peering Replication Stream (#13977) 2022-08-01 10:33:18 -04:00
alex 0f6354685b
block PeerName register requests (#13887)
Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>
2022-07-29 14:36:22 -07:00
Luke Kysow 17594a123e
peering: retry establishing connection more quickly on certain errors (#13938)
When we receive a FailedPrecondition error, retry that more quickly
because we expect it will resolve shortly. This is particularly
important in the context of Consul servers behind a load balancer
because when establishing a connection we have to retry until we
randomly land on a leader node.

The default retry backoff goes from 2s, 4s, 8s, etc. which can result in
very long delays quite quickly. Instead, this backoff retries in 8ms
five times, then goes exponentially from there: 16ms, 32ms, ... up to a
max of 8152ms.
2022-07-29 13:04:32 -07:00
acpana 70e052f35f
sync more acl enforcement
sync w ent at 32756f7

Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>
2022-07-28 12:01:52 -07:00
Ashwin Venkatesh e4aaf467eb
Add peer counts to emitted metrics. (#13930) 2022-07-27 18:34:04 -04:00
Luke Kysow a2290791b2
Merge pull request #13924 from hashicorp/lkysow/util-metric-peering
peering: don't track imported services/nodes in usage
2022-07-27 14:49:55 -07:00
Chris S. Kim 213e985d17 Reduce arm64 flakes for TestConnectCA_ConfigurationSet_ChangeKeyConfig_Primary
There were 16 combinations of tests but 4 of them were duplicates since the default key type and bits were "ec" and 256. That entry was commented out to reduce the subtest count to 12.

testrpc.WaitForLeader was failing on arm64 environments; the cause is unknown but it might be due to the environment being flooded with parallel tests making RPC calls. The RPC polling+retry was replaced with a simpler check for leadership based on raft.
2022-07-27 13:54:34 -04:00
Chris S. Kim c80ab10527 Retry checks for virtual IP metadata 2022-07-27 13:54:34 -04:00
Chris S. Kim 146dd93775 Sort slice of ServiceNames deterministically 2022-07-27 13:54:34 -04:00
Luke Kysow 92c1f30359 peering: don't track imported services/nodes in usage
Services/nodes that are imported from other peers are stored in
state. We don't want to count those as part of our own cluster's usage.
2022-07-27 09:08:51 -07:00
cskh f7858a1bda
chore: clarify the error message: service.service must not be empty (#13907)
- when register service using catalog endpoint, the key of service
  name actually should be "service". Add this information to the
  error message will help user to quickly fix in the request.
2022-07-27 10:16:46 -04:00
Chris S. Kim 1f8ae56951
Preserve PeeringState on upsert (#13666)
Fixes a bug where if the generate token is called twice, the second call upserts the zero-value (undefined) of PeeringState.
2022-07-25 14:37:56 -04:00
DanStough f690d299c9 feat: convert destination address to slice 2022-07-25 12:31:58 -04:00
freddygv 5bbc0cc615 Add ACL enforcement to peering endpoints 2022-07-25 09:34:29 -06:00
Luke Kysow d21f793b74
peering: add config to enable/disable peering (#13867)
* peering: add config to enable/disable peering

Add config:

```
peering {
  enabled = true
}
```

Defaults to true. When disabled:
1. All peering RPC endpoints will return an error
2. Leader won't start its peering establishment goroutines
3. Leader won't start its peering deletion goroutines
2022-07-22 15:20:21 -07:00
alex 7bd55578cc
peering: emit exported services count metric (#13811)
Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>
2022-07-22 12:05:08 -07:00
Eric Haberkorn e044343105
Add Cluster Peering Failover Support to Prepared Queries (#13835)
Add peering failover support to prepared queries
2022-07-22 09:14:43 -04:00
acpana b847f656a8
Rename peering internal to ~
sync ENT to 5679392c81

Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>
2022-07-21 10:51:05 -07:00
Daniel Upton e3bff8fb39 proxycfg-glue: server-local implementation of `PeeredUpstreams`
This is the OSS portion of enterprise PR 2352.

It adds a server-local implementation of the proxycfg.PeeredUpstreams interface
based on a blocking query against the server's state store.

It also fixes an omission in the Virtual IP freeing logic where we were never
updating the max index (and therefore blocking queries against
VirtualIPsForAllImportedServices would not return on service deletion).
2022-07-21 13:51:59 +01:00
Paul Glass a9f17c0f99
Extract AWS auth implementation out of Consul (#13760) 2022-07-19 16:26:44 -05:00
alex 64b3705a31
peering: refactor reconcile, cleanup (#13795)
Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>
2022-07-19 11:43:29 -07:00
alex 4ff097c4cf
peering: track exported services (#13784)
Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>
2022-07-18 10:20:04 -07:00
Luke Kysow 3968f21339
Add docs for peerStreamServer vs peeringServer. (#13781) 2022-07-15 12:23:05 -07:00
Luke Kysow a8721c33c5
peerstream: dialer should reconnect when stream closes (#13745)
* peerstream: dialer should reconnect when stream closes

If the stream is closed unexpectedly (i.e. when we haven't received
a terminated message), the dialer should attempt to re-establish the
stream.

Previously, the `HandleStream` would return `nil` when the stream
was closed. The caller then assumed the stream was terminated on purpose
and so didn't reconnect when instead it was stopped unexpectedly and
the dialer should have attempted to reconnect.
2022-07-15 11:58:33 -07:00
R.B. Boyer 61ebb38092
server: ensure peer replication can successfully use TLS over external gRPC (#13733)
Ensure that the peer stream replication rpc can successfully be used with TLS activated.

Also:

- If key material is configured for the gRPC port but HTTPS is not
  enabled now TLS will still be activated for the gRPC port.

- peerstream replication stream opened by the establishing-side will now
  ignore grpc.WithBlock so that TLS errors will bubble up instead of
  being awkwardly delayed or suppressed
2022-07-15 13:15:50 -05:00
alex 70ad4804b6
peering: track imported services (#13718) 2022-07-15 10:20:43 -07:00
Dan Stough 084f9d7084 feat: connect proxy xDS for destinations
Signed-off-by: Dhia Ayachi <dhia@hashicorp.com>
2022-07-14 15:27:02 -04:00
Daniel Upton 688dfe3138 proxycfg-glue: server-local implementation of `ServiceList`
This is the OSS portion of enterprise PR 2242.

This PR introduces a server-local implementation of the proxycfg.ServiceList
interface, backed by streaming events and a local materializer.
2022-07-14 18:22:12 +01:00
Dan Upton 34140ff3e0
grpc: rename public/private directories to external/internal (#13721)
Previously, public referred to gRPC services that are both exposed on
the dedicated gRPC port and have their definitions in the proto-public
directory (so were considered usable by 3rd parties). Whereas private
referred to services on the multiplexed server port that are only usable
by agents and other servers.

Now, we're splitting these definitions, such that external/internal
refers to the port and public/private refers to whether they can be used
by 3rd parties.

This is necessary because the peering replication API needs to be
exposed on the dedicated port, but is not (yet) suitable for use by 3rd
parties.
2022-07-13 16:33:48 +01:00
R.B. Boyer c880728ab4
peerstream: some cosmetic refactors to make this easier to follow (#13732)
- Use some protobuf construction helper methods for brevity.
- Rename a local variable to avoid later shadowing.
- Rename the Nonce field to be more like xDS's naming.
- Be more explicit about which PeerID fields are empty.
2022-07-13 10:00:35 -05:00
R.B. Boyer ee5eb5a960
state: prohibit changing an exported tcp discovery chain in a way that would break SAN validation (#13727)
For L4/tcp exported services the mesh gateways will not be terminating
TLS. A caller in one peer will be directly establishing TLS connections
to the ultimate exported service in the other peer.

The caller will be doing SAN validation using the replicated SpiffeID
values shipped from the exporting side. There are a class of discovery
chain edits that could be done on the exporting side that would cause
the introduction of a new SpiffeID value. In between the time of the
config entry update on the exporting side and the importing side getting
updated peer stream data requests to the exported service would fail due
to SAN validation errors.

This is unacceptable so instead prohibit the exporting peer from making
changes that would break peering in this way.
2022-07-12 11:17:33 -05:00
R.B. Boyer 2c329475ce
state: prohibit exported discovery chains to have cross-datacenter or cross-partition references (#13726)
Because peerings are pairwise, between two tuples of (datacenter,
partition) having any exported reference via a discovery chain that
crosses out of the peered datacenter or partition will ultimately not be
able to work for various reasons. The biggest one is that there is no
way in the ultimate destination to configure an intention that can allow
an external SpiffeID to access a service.

This PR ensures that a user simply cannot do this, so they won't run
into weird situations like this.
2022-07-12 11:03:41 -05:00
Chris S. Kim 9f5ab3ec10
Return error if ServerAddresses is empty (#13714) 2022-07-12 11:09:00 -04:00
R.B. Boyer 5b801db24b
peering: move peer replication to the external gRPC port (#13698)
Peer replication is intended to be between separate Consul installs and
effectively should be considered "external". This PR moves the peer
stream replication bidirectional RPC endpoint to the external gRPC
server and ensures that things continue to function.
2022-07-08 12:01:13 -05:00