Commit graph

541 commits

Author SHA1 Message Date
Chris S. Kim e4c20ec190
Refactor client RPC timeouts (#14965)
Fix an issue where rpc_hold_timeout was being used as the timeout for non-blocking queries. Users should be able to tune read timeouts without fiddling with rpc_hold_timeout. A new configuration `rpc_read_timeout` is created.

Refactor some implementation from the original PR 11500 to remove the misleading linkage between RPCInfo's timeout (used to retry in case of certain modes of failures) and the client RPC timeouts.
2022-10-18 15:05:09 -04:00
Chris S. Kim 58c041eb6e
Merge pull request #13388 from deblasis/feature/health-checks_windows_service
Feature: Health checks windows service
2022-10-17 09:26:19 -04:00
Dan Upton 3b9297f95a
proxycfg: rate-limit delivery of config snapshots (#14960)
Adds a user-configurable rate limiter to proxycfg snapshot delivery,
with a default limit of 250 updates per second.

This addresses a problem observed in our load testing of Consul
Dataplane where updating a "global" resource such as a wildcard
intention or the proxy-defaults config entry could starve the Raft or
Memberlist goroutines of CPU time, causing general cluster instability.
2022-10-14 15:52:00 +01:00
Freddy 909fc33271
Merge pull request #14935 from hashicorp/fix/alias-leak 2022-10-13 16:31:15 -06:00
Riddhi Shah 474d9cfcdc
Service http checks data source for agentless proxies (#14924)
Adds another datasource for proxycfg.HTTPChecks, for use on server agents. Typically these checks are performed by local client agents and there is no equivalent of this in agentless (where servers configure consul-dataplane proxies).
Hence, the data source is mostly a no-op on servers but in the case where the service is present within the local state, it delegates to the cache data source.
2022-10-12 07:49:56 -07:00
Paul Glass 8cf430140a
gRPC server metrics (#14922)
* Move stats.go from grpc-internal to grpc-middleware
* Update grpc server metrics with server type label
* Add stats test to grpc-external
* Remove global metrics instance from grpc server tests
2022-10-11 17:00:32 -05:00
freddygv 9f0ab69aef Fix alias check leak
Preivously when alias check was removed it would not be stopped nor
cleaned up from the associated aliasChecks map.

This means that any time an alias check was deregistered we would
leak a goroutine for CheckAlias.run() because the stopCh would never
be closed.

This issue mostly affects service mesh deployments on platforms where
the client agent is mostly static but proxy services come and go
regularly, since by default sidecars are registered with an alias check.
2022-10-10 16:42:29 -06:00
DanStough df94470e76 feat: xDS updates for peerings control plane through mesh gw 2022-10-07 08:46:42 -06:00
Chris Chapman 399fafb679
Making suggested changes 2022-09-30 14:51:12 -07:00
Chris Chapman c4c5f900e0
Bind a dns mux handler to gRPC proxy 2022-09-29 21:44:45 -07:00
Nick Ethier 5e4b3ef5d4
add HCP integration component (#14723)
* add HCP integration

* lint: use non-deprecated logging interface
2022-09-26 14:58:15 -04:00
Alessandro De Blasis 3b9061ab5b fix(agent): uninitialized map panic error 2022-09-21 09:25:54 +01:00
freddygv 0c3853a2d0 Add server certificate manager
This certificate manager will request a leaf certificate for server
agents and then keep them up to date.
2022-09-16 17:57:10 -06:00
Daniel Graña 13ac6356a8
[BUGFIX] Do not use interval as timeout (#14619)
Do not use interval as timeout
2022-09-15 12:39:48 -04:00
skpratt 1ae31a520a
PR #14057 follow up fix: service id parsing from sidecar id (#14541)
* fix service id parsing from sidecar id

* simplify suffix trimming
2022-09-09 09:47:10 -05:00
Dan Upton 9fe6c33c0d
xDS Load Balancing (#14397)
Prior to #13244, connect proxies and gateways could only be configured by an
xDS session served by the local client agent.

In an upcoming release, it will be possible to deploy a Consul service mesh
without client agents. In this model, xDS sessions will be handled by the
servers themselves, which necessitates load-balancing to prevent a single
server from receiving a disproportionate amount of load and becoming
overwhelmed.

This introduces a simple form of load-balancing where Consul will attempt to
achieve an even spread of load (xDS sessions) between all healthy servers.
It does so by implementing a concurrent session limiter (limiter.SessionLimiter)
and adjusting the limit according to autopilot state and proxy service
registrations in the catalog.

If a server is already over capacity (i.e. the session limit is lowered),
Consul will begin draining sessions to rebalance the load. This will result
in the client receiving a `RESOURCE_EXHAUSTED` status code. It is the client's
responsibility to observe this response and reconnect to a different server.

Users of the gRPC client connection brokered by the
consul-server-connection-manager library will get this for free.

The rate at which Consul will drain sessions to rebalance load is scaled
dynamically based on the number of proxies in the catalog.
2022-09-09 15:02:01 +01:00
Derek Menteer 8efe862b76 Merge branch 'main' of github.com:hashicorp/consul into derekm/split-grpc-ports 2022-09-08 14:53:08 -05:00
Derek Menteer 75dec4c31d Remove rebuilding grpc server. 2022-09-08 13:45:44 -05:00
Derek Menteer 6aaf1c6035 Various cleanups. 2022-09-08 10:51:50 -05:00
Chris S. Kim 331c756471
Reuse http.DefaultTransport in UIMetricsProxy (#14521)
http.Transport keeps a pool of connections and should be reused when possible. We instantiate a new http.DefaultTransport for every metrics request, making large numbers of concurrent requests inefficiently spin up new connections instead of reusing open ones.
2022-09-08 11:02:05 -04:00
skpratt 02559085ad
move port and default check logic to locked step (#14057) 2022-09-06 19:35:31 -05:00
Daniel Upton 128055c44c proxycfg-glue: server-local implementation of IntentionUpstreamsDestination
This is the OSS portion of enterprise PR 2463.

Generalises the serverIntentionUpstreams type to support matching on a
service or destination.
2022-09-06 23:27:25 +01:00
Daniel Upton 4b76d8a8ff proxycfg-glue: server-local implementation of InternalServiceDump
This is the OSS portion of enterprise PR 2489.

This PR introduces a server-local implementation of the
proxycfg.InternalServiceDump interface that sources data from a blocking query
against the server's state store.

For simplicity, it only implements the subset of the Internal.ServiceDump RPC
handler actually used by proxycfg - as such the result type has been changed
to IndexedCheckServiceNodes to avoid confusion.
2022-09-06 23:27:25 +01:00
Daniel Upton 8cd6c9f95e proxycfg-glue: server-local implementation of ResolvedServiceConfig
This is the OSS portion of enterprise PR 2460.

Introduces a server-local implementation of the proxycfg.ResolvedServiceConfig
interface that sources data from a blocking query against the server's state
store.

It moves the service config resolution logic into the agent/configentry package
so that it can be used in both the RPC handler and data source.

I've also done a little re-arranging and adding comments to call out data
sources for which there is to be no server-local equivalent.
2022-09-06 23:27:25 +01:00
Derek Menteer cb478b0e61 Address PR comments. 2022-09-01 16:54:24 -05:00
Derek Menteer b641dcf03d Expose grpc_tls via serf for cluster peering. 2022-08-29 13:43:49 -05:00
Derek Menteer 4a01d75cf8 Add separate grpc_tls port.
To ease the transition for users, the original gRPC
port can still operate in a deprecated mode as either
plain-text or TLS mode. This behavior should be removed
in a future release whenever we no longer support this.

The resulting behavior from this commit is:
  `ports.grpc > 0 && ports.grpc_tls > 0` spawns both plain-text and tls ports.
  `ports.grpc > 0 && grpc.tls == undefined` spawns a single plain-text port.
  `ports.grpc > 0 && grpc.tls != undefined` spawns a single tls port (backwards compat mode).
2022-08-29 13:43:43 -05:00
Alessandro De Blasis 37bd70ca59 fix(agent): removed redundant code in docker check as well 2022-08-29 18:15:59 +01:00
Alessandro De Blasis b5cf0524b7 fix(agent): removed redundant check on prev. running check 2022-08-29 17:53:39 +01:00
Alessandro De Blasis 78f2ee28b8 Merge remote-tracking branch 'hashicorp/main' into feature/health-checks_windows_service
Signed-off-by: Alessandro De Blasis <alex@deblasis.net>
2022-08-15 08:26:55 +01:00
Alessandro De Blasis a131741ada Merge remote-tracking branch 'hashicorp/main' into feature/health-checks_windows_service
Signed-off-by: Alessandro De Blasis <alex@deblasis.net>
2022-08-15 08:09:56 +01:00
Chris S. Kim 79d00f59cd Close active listeners on error
If startListeners successfully created listeners for some of its input addresses but eventually failed, the function would return an error and existing listeners would not be cleaned up.
2022-08-09 12:22:39 -04:00
alex 0f6354685b
block PeerName register requests (#13887)
Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>
2022-07-29 14:36:22 -07:00
Luke Kysow d21f793b74
peering: add config to enable/disable peering (#13867)
* peering: add config to enable/disable peering

Add config:

```
peering {
  enabled = true
}
```

Defaults to true. When disabled:
1. All peering RPC endpoints will return an error
2. Leader won't start its peering establishment goroutines
3. Leader won't start its peering deletion goroutines
2022-07-22 15:20:21 -07:00
Daniel Upton f018bd6e09 proxycfg-glue: server-local implementation of ExportedPeeredServices
This is the OSS portion of enterprise PR 2377.

Adds a server-local implementation of the proxycfg.ExportedPeeredServices
interface that sources data from a blocking query against the server's
state store.
2022-07-22 15:23:23 +01:00
Daniel Upton e3bff8fb39 proxycfg-glue: server-local implementation of PeeredUpstreams
This is the OSS portion of enterprise PR 2352.

It adds a server-local implementation of the proxycfg.PeeredUpstreams interface
based on a blocking query against the server's state store.

It also fixes an omission in the Virtual IP freeing logic where we were never
updating the max index (and therefore blocking queries against
VirtualIPsForAllImportedServices would not return on service deletion).
2022-07-21 13:51:59 +01:00
R.B. Boyer 61ebb38092
server: ensure peer replication can successfully use TLS over external gRPC (#13733)
Ensure that the peer stream replication rpc can successfully be used with TLS activated.

Also:

- If key material is configured for the gRPC port but HTTPS is not
  enabled now TLS will still be activated for the gRPC port.

- peerstream replication stream opened by the establishing-side will now
  ignore grpc.WithBlock so that TLS errors will bubble up instead of
  being awkwardly delayed or suppressed
2022-07-15 13:15:50 -05:00
Dan Stough 084f9d7084 feat: connect proxy xDS for destinations
Signed-off-by: Dhia Ayachi <dhia@hashicorp.com>
2022-07-14 15:27:02 -04:00
Daniel Upton 7f69e27926 proxycfg-glue: server-local implementation of FederationStateListMeshGateways
This is the OSS portion of enterprise PR 2265.

This PR provides a server-local implementation of the
proxycfg.FederationStateListMeshGateways interface based on blocking queries.
2022-07-14 18:22:12 +01:00
Daniel Upton a5a6102a3b proxycfg-glue: server-local implementation of GatewayServices
This is the OSS portion of enterprise PR 2259.

This PR provides a server-local implementation of the proxycfg.GatewayServices
interface based on blocking queries.
2022-07-14 18:22:12 +01:00
Daniel Upton a280c9a10b proxycfg-glue: server-local implementation of TrustBundle and TrustBundleList
This is the OSS portion of enterprise PR 2250.

This PR provides server-local implementations of the proxycfg.TrustBundle and
proxycfg.TrustBundleList interfaces, based on local blocking queries.
2022-07-14 18:22:12 +01:00
Daniel Upton 70f29942f4 proxycfg-glue: server-local implementation of the Health interface
This is the OSS portion of enterprise PR 2249.

This PR introduces an implementation of the proxycfg.Health interface based on a
local materialized view of the health events.

It reuses the view and request machinery from agent/rpcclient/health, which made
it super straightforward.
2022-07-14 18:22:12 +01:00
Daniel Upton 688dfe3138 proxycfg-glue: server-local implementation of ServiceList
This is the OSS portion of enterprise PR 2242.

This PR introduces a server-local implementation of the proxycfg.ServiceList
interface, backed by streaming events and a local materializer.
2022-07-14 18:22:12 +01:00
Daniel Upton 599f5e2207 proxycfg-glue: server-local compiled discovery chain data source
This is the OSS portion of enterprise PR 2236.

Adds a local blocking query-based implementation of the proxycfg.CompiledDiscoveryChain interface.
2022-07-14 18:22:12 +01:00
Chris S. Kim 5d890cdbb2 Use new maps for proxycfg peered data 2022-07-13 16:05:10 -04:00
Dan Upton 34140ff3e0
grpc: rename public/private directories to external/internal (#13721)
Previously, public referred to gRPC services that are both exposed on
the dedicated gRPC port and have their definitions in the proto-public
directory (so were considered usable by 3rd parties). Whereas private
referred to services on the multiplexed server port that are only usable
by agents and other servers.

Now, we're splitting these definitions, such that external/internal
refers to the port and public/private refers to whether they can be used
by 3rd parties.

This is necessary because the peering replication API needs to be
exposed on the dedicated port, but is not (yet) suitable for use by 3rd
parties.
2022-07-13 16:33:48 +01:00
R.B. Boyer 40c5c7eee2
server: broadcast the public grpc port using lan serf and update the consul service in the catalog with the same data (#13687)
Currently servers exchange information about their WAN serf port
and RPC port with serf tags, so that they all learn of each other's
addressing information. We intend to make larger use of the new
public-facing gRPC port exposed on all of the servers, so this PR
addresses that by passing around the gRPC port via serf tags and
then ensuring the generated consul service in the catalog has
metadata about that new port as well for ease of non-serf-based lookup.
2022-07-07 13:55:41 -05:00
Daniel Upton e1d0aff462 proxycfg: server-local intention upstreams data source
This is the OSS portion of enterprise PR 2157.

It builds on the local blocking query work in #13438 to implement the
proxycfg.IntentionUpstreams interface using server-local data.

Also moves the ACL filtering logic from agent/consul into the acl/filter
package so that it can be reused here.
2022-07-04 10:48:36 +01:00
Daniel Upton 21ea217b1d proxycfg: server-local intentions data source
This is the OSS portion of enterprise PR 2141.

This commit provides a server-local implementation of the `proxycfg.Intentions`
interface that sources data from streaming events.

It adds events for the `service-intentions` config entry type, and then consumes
event streams (via materialized views) for the service's explicit intentions and
any applicable wildcard intentions, merging them into a single list of intentions.

An alternative approach I considered was to consume _all_ intention events (via
`SubjectWildcard`) and filter out the irrelevant ones. This would admittedly
remove some complexity in the `agent/proxycfg-glue` package but at the expense
of considerable overhead from waking potentially many thousands of connect
proxies every time any intention is updated.
2022-07-04 10:48:36 +01:00
Daniel Upton 497df1ca3b proxycfg: server-local config entry data sources
This is the OSS portion of enterprise PR 2056.

This commit provides server-local implementations of the proxycfg.ConfigEntry
and proxycfg.ConfigEntryList interfaces, that source data from streaming events.

It makes use of the LocalMaterializer type introduced for peering replication,
adding the necessary support for authorization.

It also adds support for "wildcard" subscriptions (within a topic) to the event
publisher, as this is needed to fetch service-resolvers for all services when
configuring mesh gateways.

Currently, events will be emitted for just the ingress-gateway, service-resolver,
and mesh config entry types, as these are the only entries required by proxycfg
— the events will be emitted on topics named IngressGateway, ServiceResolver,
and MeshConfig topics respectively.

Though these events will only be consumed "locally" for now, they can also be
consumed via the gRPC endpoint (confirmed using grpcurl) so using them from
client agents should be a case of swapping the LocalMaterializer for an
RPCMaterializer.
2022-07-04 10:48:36 +01:00