Commit Graph

4533 Commits

Author SHA1 Message Date
Chris S. Kim 7e95b35102 Fix casing 2022-08-26 11:56:26 -04:00
Chris S. Kim a5e9ea6d96 Merge branch 'main' into NET-638-push-server-address-updates-to-the-peer
# Conflicts:
#	agent/grpc-external/services/peerstream/stream_test.go
2022-08-26 10:43:56 -04:00
Chris S. Kim a8090268d4
Replace ring buffer with async version (#14314)
We need to watch for changes to peerings and update the server addresses which get served by the ring buffer.

Also, if there is an active connection for a peer, we are getting up-to-date server addresses from the replication stream and can safely ignore the token's addresses which may be stale.
2022-08-26 10:27:13 -04:00
alex f64af3be24
peering: add peer health metric (#14004)
Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>
2022-08-25 16:32:59 -07:00
Chris S. Kim 2e75833133 Exit loop when context is cancelled 2022-08-25 11:48:25 -04:00
cskh 7ee1c857c3
Fix: the inboundconnection limit filter should be placed in front of http co… (#14325)
* fix: the inboundconnection limit should be placed in front of http connection manager

Co-authored-by: Freddy <freddygv@users.noreply.github.com>
2022-08-24 14:13:10 -04:00
Chris S. Kim eac63fea1f Update test comment 2022-08-24 13:50:24 -04:00
Chris S. Kim 6f98c853b8 Add check for zero-length server addresses 2022-08-24 13:30:52 -04:00
skpratt c039028401
no-op: refactor usagemetrics tests for clarity and DRY cases (#14313) 2022-08-24 12:00:09 -05:00
Pablo Ruiz García 4188769c32
Added new auto_encrypt.grpc_server_tls config option to control AutoTLS enabling of GRPC Server's TLS usage
Fix for #14253

Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>
2022-08-24 12:31:38 -04:00
Dan Upton 20c87d235f
dataplane: update envoy bootstrap params for consul-dataplane (#14017)
Contains 2 changes to the GetEnvoyBootstrapParams response to support
consul-dataplane.

Exposing node_name and node_id:

consul-dataplane will support providing either the node_id or node_name in its
configuration. Unfortunately, supporting both in the xDS meta adds a fair amount
of complexity (partly because most tables are currently indexed on node_name)
so for now we're going to return them both from the bootstrap params endpoint,
allowing consul-dataplane to exchange a node_id for a node_name (which it will
supply in the xDS meta).

Properly setting service for gateways:

To avoid the need to special case gateways in consul-dataplane, service will now
either be the destination service name for connect proxies, or the gateway
service name. This means it can be used as-is in Envoy configuration (i.e. as a
cluster name or in metric tags).
2022-08-24 12:03:15 +01:00
Daniel Upton 1cd7ec0543 proxycfg: terminate stream on irrecoverable errors
This is the OSS portion of enterprise PR 2339.

It improves our handling of "irrecoverable" errors in proxycfg data sources.

The canonical example of this is what happens when the ACL token presented by
Envoy is deleted/revoked. Previously, the stream would get "stuck" until the
xDS server re-checked the token (after 5 minutes) and terminated the stream.

Materializers would also sit burning resources retrying something that could
never succeed.

Now, it is possible for data sources to mark errors as "terminal" which causes
the xDS stream to be closed immediately. Similarly, the submatview.Store will
evict materializers when it observes they have encountered such an error.
2022-08-23 20:17:49 +01:00
Chris S. Kim 1e7a3b8d8d PR feedback to specify Node name in test mock 2022-08-23 11:51:04 -04:00
Eric Haberkorn 3d45306e1b
Cluster peering failover disco chain changes (#14296) 2022-08-23 09:13:43 -04:00
Chris S. Kim c14b166b80 Fix flakes 2022-08-22 14:45:31 -04:00
Chris S. Kim 587c57d3f4 Increase heartbeat rate to reduce test flakes 2022-08-22 14:24:05 -04:00
Chris S. Kim c68e589f26 Remove check for ResponseNonce 2022-08-22 13:55:01 -04:00
Chris S. Kim 0ae3462e61 Add missing mock assertions 2022-08-22 13:55:01 -04:00
Chris S. Kim 575a56062f Fix data race
newMockSnapshotHandler has an assertion on t.Cleanup which gets called before the event publisher is cancelled. This commit reorders the context.WithCancel so it properly gets cancelled before the assertion is made.
2022-08-22 13:55:01 -04:00
cskh e30d6bfc40
Fix: add missing ent meta for test (#14289) 2022-08-22 13:51:04 -04:00
Chris S. Kim 98d102326f Handle server addresses update as client 2022-08-22 13:42:12 -04:00
Chris S. Kim 205e873689 Send server addresses on update from server 2022-08-22 13:41:44 -04:00
Chris S. Kim 4cf54bef4e Add new subscription for server addresses 2022-08-22 13:40:25 -04:00
Chris S. Kim e1a7456a69 Cleanup unused logger 2022-08-22 13:40:23 -04:00
Chris S. Kim 9f96f98ab6 Expose external gRPC port in autopilot
The grpc_port was added to a NodeService's meta in ea58f235f5da416224ba615405269661ba1f4d8d
2022-08-22 10:07:00 -04:00
cskh a87d8f48be
fix: missing MaxInboundConnections field in service-defaults config entry (#14072)
* fix:  missing max_inbound_connections field in merge config
2022-08-19 14:11:21 -04:00
cskh 7f66dfc780
Fix: upgrade pkg imdario/merg to prevent merge config panic (#14237)
* upgrade imdario/merg to prevent merge config panic

* test: service definition takes precedence over service-defaults in merged results
2022-08-17 21:14:04 -04:00
James Hartig a5a200e0e9 Use the maximum jitter when calculating the timeout
The timeout should include the maximum possible
jitter since the server will randomly add to it's
timeout a jitter. If the server's timeout is less
than the client's timeout then the client will
return an i/o deadline reached error.

Before:
```
time curl 'http://localhost:8500/v1/catalog/service/service?dc=other-dc&stale=&wait=600s&index=15820644'
rpc error making call: i/o deadline reached
real    10m11.469s
user    0m0.018s
sys     0m0.023s
```

After:
```
time curl 'http://localhost:8500/v1/catalog/service/service?dc=other-dc&stale=&wait=600s&index=15820644'
[...]
real    10m35.835s
user    0m0.021s
sys     0m0.021s
```
2022-08-17 10:24:09 -04:00
Eric Haberkorn 40ce1c8288
Add `Targets` field to service resolver failovers. (#14162)
This field will be used for cluster peering failover.
2022-08-15 09:20:25 -04:00
cskh c20d016f62
fix: missing segment and partition (#14194) 2022-08-12 15:21:39 -04:00
Eric Haberkorn 11884bfb99
Refactor failover code to use Envoy's aggregate clusters (#14178) 2022-08-12 14:30:46 -04:00
cskh e7b5baa3cc
feat(telemetry): add labels to serf and memberlist metrics (#14161)
* feat(telemetry): add labels to serf and memberlist metrics
* changelog
* doc update

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
2022-08-11 22:09:56 -04:00
Chris S. Kim 182399255b
Handle breaking change for ServiceVirtualIP restore (#14149)
Consul 1.13.0 changed ServiceVirtualIP to use PeeredServiceName instead of ServiceName which was a breaking change for those using service mesh and wanted to restore their snapshot after upgrading to 1.13.0.

This commit handles existing data with older ServiceName and converts it during restore so that there are no issues when restoring from older snapshots.
2022-08-11 14:47:10 -04:00
Chris S. Kim 55945a8231 Add test to verify forwarding 2022-08-11 11:16:02 -04:00
Chris S. Kim fbbb54fdc2 Register peerStreamServer internally to enable RPC forwarding 2022-08-11 11:16:02 -04:00
Chris S. Kim 534096a6ac Handle wrapped errors in isFailedPreconditionErr 2022-08-11 11:16:02 -04:00
cskh 647f9787f8
fix: shadowed err in retryJoin() (#14112)
- err value will be used later to surface the error message
  if r.join() returns any err.
2022-08-10 10:53:57 -04:00
skpratt 070ed3738d
Merge pull request #14056 from hashicorp/proxy-register-port-race
Refactor sidecar_service method to separate port assignment
2022-08-10 09:46:29 -05:00
skpratt 7f1f095b2f Merge branch 'main' into proxy-register-port-race 2022-08-10 08:40:45 -05:00
Chris S. Kim 79d00f59cd Close active listeners on error
If startListeners successfully created listeners for some of its input addresses but eventually failed, the function would return an error and existing listeners would not be cleaned up.
2022-08-09 12:22:39 -04:00
Chris S. Kim 4de96a1f3c Add retry in TestAgentConnectCALeafCert_good 2022-08-09 11:20:37 -04:00
Kyle Havlovitz 57afbb58ac
Merge pull request #13958 from hashicorp/gateway-wildcard-fix
Fix wildcard picking up services it shouldn't for ingress/terminating gateways
2022-08-08 12:54:40 -07:00
Kyle Havlovitz 2a0ab31ca4 Add some extra handling for destination deletes 2022-08-08 11:38:13 -07:00
freddygv 1e48b4f665 Update snapshot test 2022-08-08 09:17:15 -06:00
freddygv 65bcd3d84f Re-validate existing secrets at state store
Previously establishment and pending secrets were only checked at the
RPC layer. However, given that these are Check-and-Set transactions we
should ensure that the given secrets are still valid when persisting a
secret exchange or promotion.

Otherwise it would be possible for concurrent requests to overwrite each
other.
2022-08-08 09:06:07 -06:00
freddygv 67aa7ed15c Test fixes 2022-08-08 08:31:47 -06:00
freddygv 01b0cbcbd7 Use proto message for each secrets write op
Previously there was a field indicating the operation that triggered a
secrets write. Now there is a message for each operation and it contains
the secret ID being persisted.
2022-08-08 01:41:00 -06:00
Kyle Havlovitz 3f435f31ac Update ingress/terminating wildcard logic and handle destinations 2022-08-05 07:56:10 -07:00
freddygv 3a623f2e9d Inherit active secret when exchanging 2022-08-03 17:32:53 -05:00
freddygv b089472a12 Pass explicit signal with op for secrets write
Previously the updates to the peering secrets UUID table relied on
inferring what action triggered the update based on a reconciliation
against the existing secrets.

Instead we now explicitly require the operation to be given so that the
inference isn't necessary. This makes the UUID table logic easier to
reason about and fixes some related bugs.

There is also an update so that the peering secrets get handled on
snapshots/restores.
2022-08-03 17:25:12 -05:00