Implements the gRPC middleware for rate-limiting as a tap.ServerInHandle
function (executed before the request is unmarshaled).
Mappings between gRPC methods and their operation type are generated by
a protoc plugin introduced by #15564.
Adds a no-op implementation of the rate-limit handler and exposes
it on the consul.Server struct.
It allows us to start working on the net/rpc and gRPC interceptors
and config (re)loading logic, without having to implement the full
handler up-front.
Co-authored-by: John Murret <john.murret@hashicorp.com>
Co-authored-by: Dhia Ayachi <dhia@hashicorp.com>
* add multilimiter and tests
* exporting LimitedEntity
* go mod tidy
* Apply suggestions from code review
Co-authored-by: John Murret <john.murret@hashicorp.com>
* add config update and rename config params
* add doc string and split config
* Apply suggestions from code review
Co-authored-by: Dan Upton <daniel@floppy.co>
* use timer to avoid go routine leak and change the interface
* add comments to tests
* fix failing test
* add prefix with config edge, refactor tests
* Apply suggestions from code review
Co-authored-by: Dan Upton <daniel@floppy.co>
* refactor to apply configs for limiters under a prefix
* add fuzz tests and fix bugs found. Refactor reconcile loop to have a simpler logic
* make KeyType an exported type
* split the config and limiter trees to fix race conditions in config update
* rename variables
* fix race in test and remove dead code
* fix reconcile loop to not create a timer on each loop
* add extra benchmark tests and fix tests
* fix benchmark test to pass value to func
* use a separate go routine to write limiters (#15643)
* use a separate go routine to write limiters
* Add updating limiter when another limiter is created
* fix waiter to be a ticker, so we commit more than once.
* fix tests and add tests for coverage
* unexport members and add tests
* make UpdateConfig thread safe and multi call to Run safe
* replace swith with if
* fix review comments
* replace time.sleep with retries
* fix flaky test and remove unnecessary init
* fix test races
* remove unnecessary negative test case
* remove fixed todo
Co-authored-by: John Murret <john.murret@hashicorp.com>
Co-authored-by: Dan Upton <daniel@floppy.co>
All of the current integration tests where Vault is the Connect CA now use non-root tokens for the test. This helps us detect privilege changes in the vault model so we can keep our guides up to date.
One larger change was that the RenewIntermediate function got refactored slightly so it could be used from a test, rather than the large duplicated function we were testing in a test which seemed error prone.
The fix outlined and merged in #15253 fixed the issue as it occurs in the primary
DC. There is a similar issue that arises when vault is used as the Connect CA in a
secondary datacenter that is fixed by this PR.
Additionally: this PR adds support to run the existing suite of vault related integration
tests against the last 4 versions of vault (1.9, 1.10, 1.11, 1.12)
* Remove log line about server mgmt token init
Currently the server management token is only being bootstrapped in the
primary datacenter. That means that servers on the secondary datacenter
will never have this token available, and would log this line any time a
token is resolved.
Bootstrapping the token in secondary datacenters will be done in a
follow-up.
* Add changelog entry
* auto-config: relax node name validation for JWT authorization
This changes the JWT authorization logic to allow all non-whitespace,
non-quote characters when validating node names. Consul had previously
allowed these characters in node names, until this validation was added
to fix a security vulnerability with whitespace/quotes being passed to
the `bexpr` library. This unintentionally broke node names with
characters like `.` which aren't related to this vulnerability.
* Update website/content/docs/agent/config/cli-flags.mdx
Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>
Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>
* add leadership transfer command
* add RPC call test (flaky)
* add missing import
* add changelog
* add command registration
* Apply suggestions from code review
Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>
* add the possibility of providing an id to raft leadership transfer. Add few tests.
* delete old file from cherry pick
* rename changelog filename to PR #
* rename changelog and fix import
* fix failing test
* check for OperatorWrite
Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>
* rename from leader-transfer to transfer-leader
* remove version check and add test for operator read
* move struct to operator.go
* first pass
* add code for leader transfer in the grpc backend and tests
* wire the http endpoint to the new grpc endpoint
* remove the RPC endpoint
* remove non needed struct
* fix naming
* add mog glue to API
* fix comment
* remove dead code
* fix linter error
* change package name for proto file
* remove error wrapping
* fix failing test
* add command registration
* add grpc service mock tests
* fix receiver to be pointer
* use defined values
Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>
* reuse MockAclAuthorizer
* add documentation
* remove usage of external.TokenFromContext
* fix failing tests
* fix proto generation
* Apply suggestions from code review
Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com>
* Apply suggestions from code review
* add more context in doc for the reason
* Apply suggestions from docs code review
Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>
* regenerate proto
* fix linter errors
Co-authored-by: github-team-consul-core <github-team-consul-core@hashicorp.com>
Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>
Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com>
Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>
There are a few changes that needed to be made to to handle authorizing
reads for imported data:
- If the data was imported from a peer we should not attempt to read the
data using the traditional authz rules. This is because the name of
services/nodes in a peer cluster are not equivalent to those of the
importing cluster.
- If the data was imported from a peer we need to check whether the
token corresponds to a service, meaning that it has service:write
permissions, or to a local read only token that can read all
nodes/services in a namespace.
This required changes at the policyAuthorizer level, since that is the
only view available to OSS Consul, and at the enterprise
partition/namespace level.
* update go version to 1.18 for api and sdk, go mod tidy
* removes ioutil usage everywhere which was deprecated in go1.16 in favour of io and os packages. Also introduces a lint rule which forbids use of ioutil going forward.
Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
* Fix mesh gateway proxy-defaults not affecting upstreams.
* Clarify distinction with upstream settings
Top-level mesh gateway mode in proxy-defaults and service-defaults gets
merged into NodeService.Proxy.MeshGateway, and only gets merged with
the mode attached to an an upstream in proxycfg/xds.
* Fix mgw mode usage for peered upstreams
There were a couple issues with how mgw mode was being handled for
peered upstreams.
For starters, mesh gateway mode from proxy-defaults
and the top-level of service-defaults gets stored in
NodeService.Proxy.MeshGateway, but the upstream watch for peered data
was only considering the mesh gateway config attached in
NodeService.Proxy.Upstreams[i]. This means that applying a mesh gateway
mode via global proxy-defaults or service-defaults on the downstream
would not have an effect.
Separately, transparent proxy watches for peered upstreams didn't
consider mesh gateway mode at all.
This commit addresses the first issue by ensuring that we overlay the
upstream config for peered upstreams as we do for non-peered. The second
issue is addressed by re-using setupWatchesForPeeredUpstream when
handling transparent proxy updates.
Note that for transparent proxies we do not yet support mesh gateway
mode per upstream, so the NodeService.Proxy.MeshGateway mode is used.
* Fix upstream mesh gateway mode handling in xds
This commit ensures that when determining the mesh gateway mode for
peered upstreams we consider the NodeService.Proxy.MeshGateway config as
a baseline.
In absense of this change, setting a mesh gateway mode via
proxy-defaults or the top-level of service-defaults will not have an
effect for peered upstreams.
* Merge service/proxy defaults in cfg resolver
Previously the mesh gateway mode for connect proxies would be
merged at three points:
1. On servers, in ComputeResolvedServiceConfig.
2. On clients, in MergeServiceConfig.
3. On clients, in proxycfg/xds.
The first merge returns a ServiceConfigResponse where there is a
top-level MeshGateway config from proxy/service-defaults, along with
per-upstream config.
The second merge combines per-upstream config specified at the service
instance with per-upstream config specified centrally.
The third merge combines the NodeService.Proxy.MeshGateway
config containing proxy/service-defaults data with the per-upstream
mode. This third merge is easy to miss, which led to peered upstreams
not considering the mesh gateway mode from proxy-defaults.
This commit removes the third merge, and ensures that all mesh gateway
config is available at the upstream. This way proxycfg/xds do not need
to do additional overlays.
* Ensure that proxy-defaults is considered in wc
Upstream defaults become a synthetic Upstream definition under a
wildcard key "*". Now that proxycfg/xds expect Upstream definitions to
have the final MeshGateway values, this commit ensures that values from
proxy-defaults/service-defaults are the default for this synthetic
upstream.
* Add changelog.
Co-authored-by: freddygv <freddy@hashicorp.com>
Previously, the MergeNodeServiceWithCentralConfig method accepted a
ServiceSpecificRequest argument, of which only the Datacenter and
QueryOptions fields were used.
Digging a little deeper, it turns out these fields were only passed
down to the ComputeResolvedServiceConfig method (through the
ServiceConfigRequest struct) which didn't actually use them.
As such, not all call-sites passed a valid ServiceSpecificRequest
so it's safer to remove the argument altogether to prevent future
changes from depending on it.
Re-add ServerExternalAddresses parameter in GenerateToken endpoint
This reverts commit 5e156772f6a7fba5324eb6804ae4e93c091229a6
and adds extra functionality to support newer peering behaviors.
Allow for some message duplication in subscription events during assertions.
I'm pretty sure the subscriptions machinery allows for messages to occasionally
be duplicated instead of dropping them, as a once-and-only-once queue is a pipe
dream and you have to pick one of the other two options.
* autoencrypt: helpful error for clients with wrong dc
If clients have set a different datacenter than the servers they're
connecting with for autoencrypt, give a helpful error message.
To support Destinations on the service-defaults (for tproxy with terminating gateway), we need to now also make servers watch service-defaults config entries.
* peering: skip register duplicate node and check from the peer
* Prebuilt the nodes map and checks map to avoid repeated for loop
* use key type to struct: node id, service id, and check id
Fix an issue where rpc_hold_timeout was being used as the timeout for non-blocking queries. Users should be able to tune read timeouts without fiddling with rpc_hold_timeout. A new configuration `rpc_read_timeout` is created.
Refactor some implementation from the original PR 11500 to remove the misleading linkage between RPCInfo's timeout (used to retry in case of certain modes of failures) and the client RPC timeouts.
When peering through mesh gateways we expect outbound dials to peer
servers to flow through the local mesh gateway addresses.
Now when establishing a peering we get a list of dial addresses as a
ring buffer that includes local mesh gateway addresses if the local DC
is configured to peer through mesh gateways. The ring buffer includes
the mesh gateway addresses first, but also includes the remote server
addresses as a fallback.
This fallback is present because it's possible that direct egress from
the servers may be allowed. If not allowed then the leader will cycle
back to a mesh gateway address through the ring.
When attempting to dial the remote servers we retry up to a fixed
timeout. If using mesh gateways we also have an initial wait in
order to allow for the mesh gateways to configure themselves.
Note that if we encounter a permission denied error we do not retry
since that error indicates that the secret in the peering token is
invalid.
memdb's `WatchCh` method creates a goroutine that will publish to the
returned channel when the watchset is triggered or the given context
is canceled. Although this is called out in its godoc comment, it's
not obvious that this method creates a goroutine who's lifecycle you
need to manage.
In the xDS capacity controller, we were calling `WatchCh` on each
iteration of the control loop, meaning the number of goroutines would
grow on each autopilot event until there was catalog churn.
In the catalog config source, we were calling `WatchCh` with the
background context, meaning that the goroutine would keep running after
the sync loop had terminated.
* Move stats.go from grpc-internal to grpc-middleware
* Update grpc server metrics with server type label
* Add stats test to grpc-external
* Remove global metrics instance from grpc server tests
A previous commit introduced an internally-managed server certificate
to use for peering-related purposes.
Now the peering token has been updated to match that behavior:
- The server name matches the structure of the server cert
- The CA PEMs correspond to the Connect CA
Note that if Conect is disabled, and by extension the Connect CA, we
fall back to the previous behavior of returning the manually configured
certs and local server SNI.
Several tests were updated to use the gRPC TLS port since they enable
Connect by default. This means that the peering token will embed the
Connect CA, and the dialer will expect a TLS listener.
* updating to serf v0.10.1 and memberlist v0.5.0 to get memberlist size metrics and memberlist broadcast queue depth metric
* update changelog
* update changelog
* correcting changelog
* adding "QueueCheckInterval" for memberlist to test
* updating integration test containers to grab latest api
This commit adds handling so that the replication stream considers
whether the user intends to peer through mesh gateways.
The subscription will return server or mesh gateway addresses depending
on the mesh configuration setting. These watches can be updated at
runtime by modifying the mesh config entry.
This commit introduces a new ACL token used for internal server
management purposes.
It has a few key properties:
- It has unlimited permissions.
- It is persisted through Raft as System Metadata rather than in the
ACL tokens table. This is to avoid users seeing or modifying it.
- It is re-generated on leadership establishment.
Prior to #13244, connect proxies and gateways could only be configured by an
xDS session served by the local client agent.
In an upcoming release, it will be possible to deploy a Consul service mesh
without client agents. In this model, xDS sessions will be handled by the
servers themselves, which necessitates load-balancing to prevent a single
server from receiving a disproportionate amount of load and becoming
overwhelmed.
This introduces a simple form of load-balancing where Consul will attempt to
achieve an even spread of load (xDS sessions) between all healthy servers.
It does so by implementing a concurrent session limiter (limiter.SessionLimiter)
and adjusting the limit according to autopilot state and proxy service
registrations in the catalog.
If a server is already over capacity (i.e. the session limit is lowered),
Consul will begin draining sessions to rebalance the load. This will result
in the client receiving a `RESOURCE_EXHAUSTED` status code. It is the client's
responsibility to observe this response and reconnect to a different server.
Users of the gRPC client connection brokered by the
consul-server-connection-manager library will get this for free.
The rate at which Consul will drain sessions to rebalance load is scaled
dynamically based on the number of proxies in the catalog.
Co-authored-by: Eric Haberkorn <erichaberkorn@gmail.com>
By adding a SpiffeID for server agents, servers can now request a leaf
certificate from the Connect CA.
This new Spiffe ID has a key property: servers are identified by their
datacenter name and trust domain. All servers that share these
attributes will share a ServerURI.
The aim is to use these certificates to verify the server name of ANY
server in a Consul datacenter.
This is the OSS portion of enterprise PR 2460.
Introduces a server-local implementation of the proxycfg.ResolvedServiceConfig
interface that sources data from a blocking query against the server's state
store.
It moves the service config resolution logic into the agent/configentry package
so that it can be used in both the RPC handler and data source.
I've also done a little re-arranging and adding comments to call out data
sources for which there is to be no server-local equivalent.
* draft commit
* add changelog, update test
* remove extra param
* fix test
* update type to account for nil value
* add test for custom passive health check
* update comments and tests
* update description in docs
* fix missing commas
To ease the transition for users, the original gRPC
port can still operate in a deprecated mode as either
plain-text or TLS mode. This behavior should be removed
in a future release whenever we no longer support this.
The resulting behavior from this commit is:
`ports.grpc > 0 && ports.grpc_tls > 0` spawns both plain-text and tls ports.
`ports.grpc > 0 && grpc.tls == undefined` spawns a single plain-text port.
`ports.grpc > 0 && grpc.tls != undefined` spawns a single tls port (backwards compat mode).
Peerings are terminated when a peer decides to delete the peering from
their end. Deleting a peering sends a termination message to the peer
and triggers them to mark the peering as terminated but does NOT delete
the peering itself. This is to prevent peerings from disappearing from
both sides just because one side deleted them.
Previously the Delete endpoint was skipping the deletion if the peering
was not marked as active. However, terminated peerings are also
inactive.
This PR makes some updates so that peerings marked as terminated can be
deleted by users.
We need to watch for changes to peerings and update the server addresses which get served by the ring buffer.
Also, if there is an active connection for a peer, we are getting up-to-date server addresses from the replication stream and can safely ignore the token's addresses which may be stale.
Contains 2 changes to the GetEnvoyBootstrapParams response to support
consul-dataplane.
Exposing node_name and node_id:
consul-dataplane will support providing either the node_id or node_name in its
configuration. Unfortunately, supporting both in the xDS meta adds a fair amount
of complexity (partly because most tables are currently indexed on node_name)
so for now we're going to return them both from the bootstrap params endpoint,
allowing consul-dataplane to exchange a node_id for a node_name (which it will
supply in the xDS meta).
Properly setting service for gateways:
To avoid the need to special case gateways in consul-dataplane, service will now
either be the destination service name for connect proxies, or the gateway
service name. This means it can be used as-is in Envoy configuration (i.e. as a
cluster name or in metric tags).
Consul 1.13.0 changed ServiceVirtualIP to use PeeredServiceName instead of ServiceName which was a breaking change for those using service mesh and wanted to restore their snapshot after upgrading to 1.13.0.
This commit handles existing data with older ServiceName and converts it during restore so that there are no issues when restoring from older snapshots.
1. Create a bexpr filter for performing the filtering
2. Change the state store functions to return the raw (not aggregated)
list of ServiceNodes.
3. Move the aggregate service tags by name logic out of the state store
functions into a new function called from the RPC endpoint
4. Perform the filtering in the endpoint before aggregation.
Previously establishment and pending secrets were only checked at the
RPC layer. However, given that these are Check-and-Set transactions we
should ensure that the given secrets are still valid when persisting a
secret exchange or promotion.
Otherwise it would be possible for concurrent requests to overwrite each
other.
Previously there was a field indicating the operation that triggered a
secrets write. Now there is a message for each operation and it contains
the secret ID being persisted.
Previously the updates to the peering secrets UUID table relied on
inferring what action triggered the update based on a reconciliation
against the existing secrets.
Instead we now explicitly require the operation to be given so that the
inference isn't necessary. This makes the UUID table logic easier to
reason about and fixes some related bugs.
There is also an update so that the peering secrets get handled on
snapshots/restores.
Dialers do not keep track of peering secret UUIDs, so they should not
attempt to clean up data from that table when their peering is deleted.
We also now keep peer server addresses when marking peerings for
deletion. Peer server addresses are used by the ShouldDial() helper
when determining whether the peering is for a dialer or an acceptor.
We need to keep this data so that peering secrets can be cleaned up
accordingly.
* Avoid logging StreamSecretID
* Wrap additional errors in stream handler
* Fix flakiness in leader test and rename servers for clarity. There was
a race condition where the peering was being deleted in the test
before the stream was active. Now the test waits for the stream to be
connected on both sides before deleting the associated peering.
* Run flaky test serially
* defaulting to false because peering will be released as beta
* Ignore peering disabled error in bundles cachetype
Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>
Co-authored-by: freddygv <freddy@hashicorp.com>
Co-authored-by: Matt Keeler <mjkeeler7@gmail.com>
When we receive a FailedPrecondition error, retry that more quickly
because we expect it will resolve shortly. This is particularly
important in the context of Consul servers behind a load balancer
because when establishing a connection we have to retry until we
randomly land on a leader node.
The default retry backoff goes from 2s, 4s, 8s, etc. which can result in
very long delays quite quickly. Instead, this backoff retries in 8ms
five times, then goes exponentially from there: 16ms, 32ms, ... up to a
max of 8152ms.
There were 16 combinations of tests but 4 of them were duplicates since the default key type and bits were "ec" and 256. That entry was commented out to reduce the subtest count to 12.
testrpc.WaitForLeader was failing on arm64 environments; the cause is unknown but it might be due to the environment being flooded with parallel tests making RPC calls. The RPC polling+retry was replaced with a simpler check for leadership based on raft.
- when register service using catalog endpoint, the key of service
name actually should be "service". Add this information to the
error message will help user to quickly fix in the request.