Commit Graph

80 Commits

Author SHA1 Message Date
Eric Haberkorn d645fa5ea1
sidecar-proxy refactor (#17328) 2023-05-12 16:49:42 -04:00
Derek Menteer 73b65228f5
Fix issue with peer stream node cleanup. (#17235)
Fix issue with peer stream node cleanup.

This commit encompasses a few problems that are closely related due to their
proximity in the code.

1. The peerstream utilizes node IDs in several locations to determine which
nodes / services / checks should be cleaned up or created. While VM deployments
with agents will likely always have a node ID, agentless uses synthetic nodes
and does not populate the field. This means that for consul-k8s deployments, all
services were likely bundled together into the same synthetic node in some code
paths (but not all), resulting in strange behavior. The Node.Node field should
be used instead as a unique identifier, as it should always be populated.

2. The peerstream cleanup process for unused nodes uses an incorrect query for
node deregistration. This query is NOT namespace aware and results in the node
(and corresponding services) being deregistered prematurely whenever it has zero
default-namespace services and 1+ non-default-namespace services registered on
it. This issue is tricky to find due to the incorrect logic mentioned in #1,
combined with the fact that the affected services must be co-located on the same
node as the currently deregistering service for this to be encountered.

3. The stream tracker did not understand differences between services in
different namespaces and could therefore report incorrect numbers. It was
updated to utilize the full service name to avoid conflicts and return proper
results.
2023-05-08 13:13:25 -05:00
Dan Upton 3466c85cc4
server: wire up in-process Resource Service (#16978) 2023-04-18 10:03:23 +01:00
Ronald dd0e8eec14
copyright headers for agent folder (#16704)
* copyright headers for agent folder

* Ignore test data files

* fix proto files and remove headers in agent/uiserver folder

* ignore deep-copy files
2023-03-28 14:39:22 -04:00
Chris S. Kim 652b74dd37
Fix various flaky tests (#16396) 2023-02-23 14:52:18 -05:00
Derek Menteer 1c4640f0df
Fix issue with peer services incorrectly appearing as connect-enabled. (#16339)
Prior to this commit, all peer services were transmitted as connect-enabled
as long as a one or more mesh-gateways were healthy. With this change, there
is now a difference between typical services and connect services transmitted
via peering.

A service will be reported as "connect-enabled" as long as any of these
conditions are met:

1. a connect-proxy sidecar is registered for the service name.
2. a connect-native instance of the service is registered.
3. a service resolver / splitter / router is registered for the service name.
4. a terminating gateway has registered the service.
2023-02-21 13:59:36 -06:00
Matt Keeler f3c80c4eef
Protobuf Refactoring for Multi-Module Cleanliness (#16302)
Protobuf Refactoring for Multi-Module Cleanliness

This commit includes the following:

Moves all packages that were within proto/ to proto/private
Rewrites imports to account for the packages being moved
Adds in buf.work.yaml to enable buf workspaces
Names the proto-public buf module so that we can override the Go package imports within proto/buf.yaml
Bumps the buf version dependency to 1.14.0 (I was trying out the version to see if it would get around an issue - it didn't but it also doesn't break things and it seemed best to keep up with the toolchain changes)

Why:

In the future we will need to consume other protobuf dependencies such as the Google HTTP annotations for openapi generation or grpc-gateway usage.
There were some recent changes to have our own ratelimiting annotations.
The two combined were not working when I was trying to use them together (attempting to rebase another branch)
Buf workspaces should be the solution to the problem
Buf workspaces means that each module will have generated Go code that embeds proto file names relative to the proto dir and not the top level repo root.
This resulted in proto file name conflicts in the Go global protobuf type registry.
The solution to that was to add in a private/ directory into the path within the proto/ directory.
That then required rewriting all the imports.

Is this safe?

AFAICT yes
The gRPC wire protocol doesn't seem to care about the proto file names (although the Go grpc code does tack on the proto file name as Metadata in the ServiceDesc)
Other than imports, there were no changes to any generated code as a result of this.
2023-02-17 16:14:46 -05:00
Dan Upton 618deae657
xds: don't attempt to load-balance sessions for local proxies (#15789)
Previously, we'd begin a session with the xDS concurrency limiter
regardless of whether the proxy was registered in the catalog or in
the server's local agent state.

This caused problems for users who run `consul connect envoy` directly
against a server rather than a client agent, as the server's locally
registered proxies wouldn't be included in the limiter's capacity.

Now, the `ConfigSource` is responsible for beginning the session and we
only do so for services in the catalog.

Fixes: https://github.com/hashicorp/consul/issues/15753
2023-01-18 12:33:21 -06:00
Matt Keeler 554f1e6fee
Protobuf Modernization (#15949)
* Protobuf Modernization

Remove direct usage of golang/protobuf in favor of google.golang.org/protobuf

Marshallers (protobuf and json) needed some changes to account for different APIs.

Moved to using the google.golang.org/protobuf/types/known/* for the well known types including replacing some custom Struct manipulation with whats available in the structpb well known type package.

This also updates our devtools script to install protoc-gen-go from the right location so that files it generates conform to the correct interfaces.

* Fix go-mod-tidy make target to work on all modules
2023-01-11 09:39:10 -05:00
R.B. Boyer a8411976a8
peering: better represent non-passing states during peer check flattening (#15615)
During peer stream replication we flatten checks from the source cluster and build one thin overall check to hide the irrelevant details from the consuming cluster. This flattening logic did correctly flip to non-passing if there were any non-passing checks, but WHICH status it got during that was random (warn/error).

Also it didn't represent "maintenance" operations. There is an api package call AggregatedStatus which more correctly flattened check statuses.

This PR replicated the more complete logic into the peer stream package.
2022-11-30 11:29:21 -06:00
Derek Menteer a8eb047ee6
Bring back parameter ServerExternalAddresses in GenerateToken endpoint (#15267)
Re-add ServerExternalAddresses parameter in GenerateToken endpoint

This reverts commit 5e156772f6a7fba5324eb6804ae4e93c091229a6
and adds extra functionality to support newer peering behaviors.
2022-11-08 14:55:18 -06:00
Derek Menteer 7bcded133e
Backport tests from ent. (#15260)
* Backport agent tests.

Original commit: 0710b2d12fb51a29cedd1119b5fb086e5c71f632
Original commit: aaedb3c28bfe247266f21013d500147d8decb7cd (partial)

* Backport test fix and reduce flaky failures.
2022-11-04 10:19:24 -05:00
Derek Menteer 261ba1e65d
Backport various fixes from ENT. (#15254)
* Regenerate golden files.

* Backport from ENT: "Avoid race"

Original commit: 5006c8c858b0e332be95271ef9ba35122453315b
Original author: freddygv

* Backport from ENT: "chore: fix flake peerstream test"

Original commit: b74097e7135eca48cc289798c5739f9ef72c0cc8
Original author: DanStough
2022-11-03 16:34:57 -05:00
malizz 24ddeac74b
convert stream status time fields to pointers (#15252) 2022-11-03 11:51:22 -07:00
Derek Menteer f704e72f3e
Prevent peering acceptor from subscribing to addr updates. (#15214) 2022-11-02 07:55:41 -05:00
cskh 57380ea752
fix(peering): nil pointer in calling handleUpdateService (#15160)
* fix(peering): nil pointer in calling handleUpdateService

* changelog
2022-10-26 11:50:34 -04:00
cskh a5acb987fa
fix(peering): replicating wan address (#15108)
* fix(peering): replicating wan address

* add changelog

* unit test
2022-10-24 15:44:57 -04:00
R.B. Boyer 87432a8dd4
chore: update golangci-lint to v1.50.1 (#15022) 2022-10-24 11:48:02 -05:00
cskh e18434bcb1
peering: skip registering duplicate node and check from the peer (#14994)
* peering: skip register duplicate node and check from the peer

* Prebuilt the nodes map and checks map to avoid repeated for loop

* use key type to struct: node id, service id, and check id
2022-10-18 16:19:24 -04:00
Freddy bbf6b17e44
Merge pull request #14981 from hashicorp/peering/dial-through-gateways 2022-10-14 09:44:56 -06:00
Derek Menteer 092e5fd074 Reset wait on ensureServerAddrSubscription 2022-10-13 15:58:26 -05:00
freddygv 437a513d9b Fix CA init error code 2022-10-13 14:58:11 -06:00
malizz 27d0181806
increase protobuf size limit for cluster peering (#14976) 2022-10-13 13:46:51 -07:00
Derek Menteer d47c9b446c Prevent consul peer-exports by discovery chain. 2022-10-13 12:45:09 -05:00
Derek Menteer ee49db9a2f Prevent the "consul" service from being exported. 2022-10-13 12:45:09 -05:00
Derek Menteer bfa4adbfce Add remote peer partition and datacenter info. 2022-10-13 10:37:41 -05:00
Freddy 4cf0bf4865
Merge pull request #14958 from hashicorp/peering/nonce 2022-10-12 08:18:15 -06:00
freddygv 4d1e7c4cbb Actually track nonce in test 2022-10-12 07:50:17 -06:00
Derek Menteer 00312bcf57 Fix incorrect backoff-wait logic. 2022-10-12 08:01:10 -05:00
freddygv c9d171c031 Add basic nonce management
This commit adds a monotonically increasing nonce to include in peering
replication response messages. Every ack/nack from the peer handling a
response will include this nonce, allowing to correlate the ack/nack
with a specific resource.

At the moment nothing is done with the nonce when it is received. In the
future we may want to add functionality such as retries on NACKs,
depending on the class of error.
2022-10-11 19:02:04 -06:00
freddygv 55b5c1a073 Fixup test 2022-10-10 13:20:14 -06:00
Chris S. Kim 9d4fb0445a Include stream-related information in peering endpoints 2022-10-10 13:20:14 -06:00
freddygv 6ef8d329d2 Require Connect and TLS to generate peering tokens
By requiring Connect and a gRPC TLS listener we can automatically
configure TLS for all peering control-plane traffic.
2022-10-07 09:06:29 -06:00
Eric Haberkorn 2178e38204
Rename `PeerName` to `Peer` on prepared queries and exported services (#14854) 2022-10-04 14:46:15 -04:00
freddygv 2c5caec97c Share mgw addrs in peering stream if needed
This commit adds handling so that the replication stream considers
whether the user intends to peer through mesh gateways.

The subscription will return server or mesh gateway addresses depending
on the mesh configuration setting. These watches can be updated at
runtime by modifying the mesh config entry.
2022-10-03 11:42:20 -06:00
Eric Haberkorn 5fd1e6daea
Add exported services event to cluster peering replication. (#14797) 2022-09-29 15:37:19 -04:00
Chris S. Kim 0148263780 PR feedback on terminated state checking 2022-09-06 10:28:20 -04:00
Chris S. Kim cd51b2f400 Properly assert for ServerAddresses replication request 2022-09-02 11:44:54 -04:00
Chris S. Kim 258c0a1bc1 Fix terminate not returning early 2022-09-02 11:44:38 -04:00
Chris S. Kim 9c157e40a3 Merge branch 'main' into NET-638-push-server-address-updates-to-the-peer
# Conflicts:
#	agent/grpc-external/services/peerstream/stream_test.go
2022-08-30 11:09:25 -04:00
Chris S. Kim e4a154c88e Add heartbeat timeout grace period when accounting for peering health 2022-08-29 16:32:26 -04:00
Chris S. Kim a58e943502 Rename test 2022-08-29 10:34:50 -04:00
Chris S. Kim 78bf8437d8 Fix test 2022-08-29 10:20:30 -04:00
Chris S. Kim b1025f2dd9 Adjust metrics reporting for peering tracker 2022-08-26 17:34:17 -04:00
Chris S. Kim 7e95b35102 Fix casing 2022-08-26 11:56:26 -04:00
Chris S. Kim a5e9ea6d96 Merge branch 'main' into NET-638-push-server-address-updates-to-the-peer
# Conflicts:
#	agent/grpc-external/services/peerstream/stream_test.go
2022-08-26 10:43:56 -04:00
alex f64af3be24
peering: add peer health metric (#14004)
Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>
2022-08-25 16:32:59 -07:00
Chris S. Kim eac63fea1f Update test comment 2022-08-24 13:50:24 -04:00
Chris S. Kim 6f98c853b8 Add check for zero-length server addresses 2022-08-24 13:30:52 -04:00
Chris S. Kim c14b166b80 Fix flakes 2022-08-22 14:45:31 -04:00