Commit Graph

2014 Commits

Author SHA1 Message Date
Dan Upton 15c7c03fa5
grpc: switch servers and retry on error (#15892)
This is the OSS portion of enterprise PR 3822.

Adds a custom gRPC balancer that replicates the router's server cycling
behavior. Also enables automatic retries for RESOURCE_EXHAUSTED errors,
which we now get for free.
2023-01-05 10:21:27 +00:00
Florian Apolloner cb5389cc89
Allow Operator Generated bootstrap token (#14437)
Add support to provide an initial token via the bootstrap HTTP API, similar to hashicorp/nomad#12520
2023-01-04 20:19:33 +00:00
Semir Patel 8242459c66
Wire up the rate limiter to net/rpc calls (#15879) 2023-01-04 13:38:44 -06:00
Dan Upton 006138beb4
Wire in rate limiter to handle internal and external gRPC calls (#15857) 2022-12-23 13:42:16 -06:00
John Murret 2a0aeb2349
Rate Limit Handler - ensure rate limiting is not in the code path when not configured (#15819)
* Rate limiting handler - ensure configuration has changed before modifying limiters

* Updating test to validate arguments to UpdateConfig

* Removing duplicate test.  Updating mock.

* Renaming NullRateLimiter to NullRequestLimitsHandler

* Rate Limit Handler - ensure rate limiting is not in the code path when not configured

* Update agent/consul/rate/handler.go

Co-authored-by: Dhia Ayachi <dhia@hashicorp.com>

* formatting handler.go

* Rate limiting handler - ensure configuration has changed before modifying limiters

* Updating test to validate arguments to UpdateConfig

* Removing duplicate test.  Updating mock.

* adding logging for when UpdateConfig is called but the config has not changed.

* Update agent/consul/rate/handler.go

Co-authored-by: Dhia Ayachi <dhia@hashicorp.com>

* Update agent/consul/rate/handler_test.go

Co-authored-by: Dan Upton <daniel@floppy.co>

* modifying existing variable name based on pr feedback

* updating a broken merge conflict;

Co-authored-by: Dhia Ayachi <dhia@hashicorp.com>
Co-authored-by: Dan Upton <daniel@floppy.co>
2022-12-20 15:00:22 -07:00
John Murret 8c33d7cc0e
Rate limiting handler - ensure configuration has changed before modifying limiters (#15805)
* Rate limiting handler - ensure configuration has changed before modifying limiters

* Updating test to validate arguments to UpdateConfig

* Removing duplicate test.  Updating mock.

* adding logging for when UpdateConfig is called but the config has not changed.

* Update agent/consul/rate/handler.go

Co-authored-by: Dhia Ayachi <dhia@hashicorp.com>

Co-authored-by: Dhia Ayachi <dhia@hashicorp.com>
2022-12-20 14:12:03 -07:00
Derek Menteer e25f7313e4
Fix incorrect protocol check on discovery chains with peer targets. (#15833) 2022-12-20 10:15:03 -06:00
Semir Patel 971089482c
Map net/rpc endpoints to a read/write/exempt op for rate-limiting (#15825)
Also fixed TestRequestRecorder flaky tests due to loss of precision in elapsed time in the test.
2022-12-19 16:04:52 -06:00
Nitya Dhanushkodi 8386bf19bf
extensions: refactor serverless plugin to use extensions from config entry fields (#15817)
docs: update config entry docs and the Lambda manual registration docs

Co-authored-by: Nitya Dhanushkodi <nitya@hashicorp.com>
Co-authored-by: Eric <eric@haberkorn.co>
2022-12-19 12:19:37 -08:00
Andrew Stucki 1ff0906a3e
Add async reconciliation controller subpackage (#15534)
* Add async reconciliation controller subpackage

* Address initial feedback

* Add tests for panic assertions

* Fix comment
2022-12-16 16:49:26 -05:00
Dhia Ayachi a1ceeff461
add missing code and fix enterprise specific code (#15375)
* add missing code and fix enterprise specific code

* fix retry

* fix flaky tests

* fix linter error in test
2022-12-16 16:31:05 -05:00
Dhia Ayachi 11f245f24f
Server side rate limiter: handle the race condition for limiters tree write in multilimiter (#15767)
* change to perform all tree writes in the same go routine to avoid race condition.

* rename runStoreOnce to reconcile

* Apply suggestions from code review

Co-authored-by: Dan Upton <daniel@floppy.co>

* reduce nesting

Co-authored-by: Dan Upton <daniel@floppy.co>
2022-12-14 17:32:11 +00:00
Semir Patel 1f82e82e04
Pass remote addr of incoming HTTP requests through to RPC(..) calls (#15700) 2022-12-14 09:24:22 -06:00
John Murret 700c693b33
adding config for request_limits (#15531)
* server: add placeholder glue for rate limit handler

This commit adds a no-op implementation of the rate-limit handler and
adds it to the `consul.Server` struct and setup code.

This allows us to start working on the net/rpc and gRPC interceptors and
config logic.

* Add handler errors

* Set the global read and write limits

* fixing multilimiter moving packages

* Fix typo

* Simplify globalLimit usage

* add multilimiter and tests

* exporting LimitedEntity

* Apply suggestions from code review

Co-authored-by: John Murret <john.murret@hashicorp.com>

* add config update and rename config params

* add doc string and split config

* Apply suggestions from code review

Co-authored-by: Dan Upton <daniel@floppy.co>

* use timer to avoid go routine leak and change the interface

* add comments to tests

* fix failing test

* add prefix with config edge, refactor tests

* Apply suggestions from code review

Co-authored-by: Dan Upton <daniel@floppy.co>

* refactor to apply configs for limiters under a prefix

* add fuzz tests and fix bugs found. Refactor reconcile loop to have a simpler logic

* make KeyType an exported type

* split the config and limiter trees to fix race conditions in config update

* rename variables

* fix race in test and remove dead code

* fix reconcile loop to not create a timer on each loop

* add extra benchmark tests and fix tests

* fix benchmark test to pass value to func

* server: add placeholder glue for rate limit handler

This commit adds a no-op implementation of the rate-limit handler and
adds it to the `consul.Server` struct and setup code.

This allows us to start working on the net/rpc and gRPC interceptors and
config logic.

* Set the global read and write limits

* fixing multilimiter moving packages

* add server configuration for global rate limiting.

* remove agent test

* remove added stuff from handler

* remove added stuff from multilimiter

* removing unnecessary TODOs

* Removing TODO comment from handler

* adding in defaulting to infinite

* add disabled status in there

* adding in documentation for disabled mode.

* make disabled the default.

* Add mock and agent test

* addig documentation and missing mock file.

* Fixing test TestLoad_IntegrationWithFlags

* updating docs based on PR feedback.

* Updating Request Limits mode to use int based on PR feedback.

* Adding RequestLimits struct so we have a nested struct in ReloadableConfig.

* fixing linting references

* Update agent/consul/rate/handler.go

Co-authored-by: Dan Upton <daniel@floppy.co>

* Update agent/consul/config.go

Co-authored-by: Dan Upton <daniel@floppy.co>

* removing the ignore of the request limits in JSON.  addingbuilder logic to convert any read rate or write rate less than 0 to rate.Inf

* added conversion function to convert request limits object to handler config.

* Updating docs to reflect gRPC and RPC are rate limit and as a result, HTTP requests are as well.

* Updating values for TestLoad_FullConfig() so that they were different and discernable.

* Updating TestRuntimeConfig_Sanitize

* Fixing TestLoad_IntegrationWithFlags test

* putting nil check in place

* fixing rebase

* removing change for missing error checks.  will put in another PR

* Rebasing after default multilimiter config change

* resolving rebase issues

* updating reference for incomingRPCLimiter to use interface

* updating interface

* Updating interfaces

* Fixing mock reference

Co-authored-by: Daniel Upton <daniel@floppy.co>
Co-authored-by: Dhia Ayachi <dhia@hashicorp.com>
2022-12-13 13:09:55 -07:00
Derek Menteer 50a5549f8a
Fix DialedDirectly configuration for Consul dataplane. (#15760)
Fix DialedDirectly configuration for Consul dataplane.
2022-12-13 09:16:31 -06:00
Dan Upton c73707ca3c
grpc: add rate-limiting middleware (#15550)
Implements the gRPC middleware for rate-limiting as a tap.ServerInHandle
function (executed before the request is unmarshaled).

Mappings between gRPC methods and their operation type are generated by
a protoc plugin introduced by #15564.
2022-12-13 15:01:56 +00:00
Dan Upton 4894848993
server: add placeholder glue for rate limit handler (#15539)
Adds a no-op implementation of the rate-limit handler and exposes
it on the consul.Server struct.

It allows us to start working on the net/rpc and gRPC interceptors
and config (re)loading logic, without having to implement the full
handler up-front.

Co-authored-by: John Murret <john.murret@hashicorp.com>
Co-authored-by: Dhia Ayachi <dhia@hashicorp.com>
2022-12-13 11:41:54 +00:00
Dhia Ayachi b459d58e8d
add multilimiter and tests (#15467)
* add multilimiter and tests

* exporting LimitedEntity

* go mod tidy

* Apply suggestions from code review

Co-authored-by: John Murret <john.murret@hashicorp.com>

* add config update and rename config params

* add doc string and split config

* Apply suggestions from code review

Co-authored-by: Dan Upton <daniel@floppy.co>

* use timer to avoid go routine leak and change the interface

* add comments to tests

* fix failing test

* add prefix with config edge, refactor tests

* Apply suggestions from code review

Co-authored-by: Dan Upton <daniel@floppy.co>

* refactor to apply configs for limiters under a prefix

* add fuzz tests and fix bugs found. Refactor reconcile loop to have a simpler logic

* make KeyType an exported type

* split the config and limiter trees to fix race conditions in config update

* rename variables

* fix race in test and remove dead code

* fix reconcile loop to not create a timer on each loop

* add extra benchmark tests and fix tests

* fix benchmark test to pass value to func

* use a separate go routine to write limiters (#15643)

* use a separate go routine to write limiters

* Add updating limiter when another limiter is created

* fix waiter to be a ticker, so we commit more than once.

* fix tests and add tests for coverage

* unexport members and add tests

* make UpdateConfig thread safe and multi call to Run safe

* replace swith with if

* fix review comments

* replace time.sleep with retries

* fix flaky test and remove unnecessary init

* fix test races

* remove unnecessary negative test case

* remove fixed todo

Co-authored-by: John Murret <john.murret@hashicorp.com>
Co-authored-by: Dan Upton <daniel@floppy.co>
2022-12-08 14:42:07 -05:00
R.B. Boyer ba6b24babf
connect: ensure all vault connect CA tests use limited privilege tokens (#15669)
All of the current integration tests where Vault is the Connect CA now use non-root tokens for the test. This helps us detect privilege changes in the vault model so we can keep our guides up to date.

One larger change was that the RenewIntermediate function got refactored slightly so it could be used from a test, rather than the large duplicated function we were testing in a test which seemed error prone.
2022-12-06 10:06:36 -06:00
R.B. Boyer a88d1239e3
Detect Vault 1.11+ import in secondary datacenters and update default issuer (#15661)
The fix outlined and merged in #15253 fixed the issue as it occurs in the primary
DC. There is a similar issue that arises when vault is used as the Connect CA in a
secondary datacenter that is fixed by this PR.

Additionally: this PR adds support to run the existing suite of vault related integration
tests against the last 4 versions of vault (1.9, 1.10, 1.11, 1.12)
2022-12-05 15:39:21 -06:00
Chris S. Kim 5d06668248
Add warn log when all ACL policies are filtered out (#15632) 2022-12-05 11:26:10 -05:00
Freddy 7641d10184
Remove log line about server mgmt token init (#15610)
* Remove log line about server mgmt token init

Currently the server management token is only being bootstrapped in the
primary datacenter. That means that servers on the secondary datacenter
will never have this token available, and would log this line any time a
token is resolved.

Bootstrapping the token in secondary datacenters will be done in a
follow-up.

* Add changelog entry
2022-11-29 17:56:03 -05:00
cskh 92e71318c1
fix(peering): increase the gRPC limit to 8MB (#15503)
* fix(peering): increase the gRPC limit to 50MB

* changelog

* update gRPC limit to 8MB
2022-11-28 17:48:43 -05:00
Chris S. Kim 4ad4cb1183
Use backport-compatible assertion (#15546)
* Use backport-compatible assertion

* Add workaround for broken apt-get
2022-11-24 11:44:20 -05:00
Chris S. Kim d146a3d542
Use rpcHoldTimeout to calculate blocking timeout (#15541)
Adds buffer to clients so that servers have time to respond to blocking queries.
2022-11-24 10:13:02 -05:00
Kyle Havlovitz f5c5d2f5c6
auto-config: relax node name validation for JWT authorization (#15370)
* auto-config: relax node name validation for JWT authorization

This changes the JWT authorization logic to allow all non-whitespace,
non-quote characters when validating node names. Consul had previously
allowed these characters in node names, until this validation was added
to fix a security vulnerability with whitespace/quotes being passed to
the `bexpr` library. This unintentionally broke node names with
characters like `.` which aren't related to this vulnerability.

* Update website/content/docs/agent/config/cli-flags.mdx

Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>

Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>
2022-11-14 18:24:40 -06:00
Dhia Ayachi 219a3c5bd3
Leadership transfer cmd (#14132)
* add leadership transfer command

* add RPC call test (flaky)

* add missing import

* add changelog

* add command registration

* Apply suggestions from code review

Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>

* add the possibility of providing an id to raft leadership transfer. Add few tests.

* delete old file from cherry pick

* rename changelog filename to PR #

* rename changelog and fix import

* fix failing test

* check for OperatorWrite

Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>

* rename from leader-transfer to transfer-leader

* remove version check and add test for operator read

* move struct to operator.go

* first pass

* add code for leader transfer in the grpc backend and tests

* wire the http endpoint to the new grpc endpoint

* remove the RPC endpoint

* remove non needed struct

* fix naming

* add mog glue to API

* fix comment

* remove dead code

* fix linter error

* change package name for proto file

* remove error wrapping

* fix failing test

* add command registration

* add grpc service mock tests

* fix receiver to be pointer

* use defined values

Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>

* reuse MockAclAuthorizer

* add documentation

* remove usage of external.TokenFromContext

* fix failing tests

* fix proto generation

* Apply suggestions from code review

Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com>

* Apply suggestions from code review

* add more context in doc for the reason

* Apply suggestions from docs code review

Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>

* regenerate proto

* fix linter errors

Co-authored-by: github-team-consul-core <github-team-consul-core@hashicorp.com>
Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>
Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com>
Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>
2022-11-14 15:35:12 -05:00
Freddy e96c0e1dad
Fixup authz for data imported from peers (#15347)
There are a few changes that needed to be made to to handle authorizing
reads for imported data:

- If the data was imported from a peer we should not attempt to read the
  data using the traditional authz rules. This is because the name of
  services/nodes in a peer cluster are not equivalent to those of the
  importing cluster.

- If the data was imported from a peer we need to check whether the
  token corresponds to a service, meaning that it has service:write
  permissions, or to a local read only token that can read all
  nodes/services in a namespace.

This required changes at the policyAuthorizer level, since that is the
only view available to OSS Consul, and at the enterprise
partition/namespace level.
2022-11-14 11:36:27 -07:00
Dan Stough ee56e06f22
[OSS] fix: wait and try longer to peer through mesh gw (#15328) 2022-11-10 13:54:00 -05:00
Kyle Schochenmaier 2b1e5f69e2
removes ioutil usage everywhere which was deprecated in go1.16 (#15297)
* update go version to 1.18 for api and sdk, go mod tidy
* removes ioutil usage everywhere which was deprecated in go1.16 in favour of io and os packages. Also introduces a lint rule which forbids use of ioutil going forward.
Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
2022-11-10 10:26:01 -06:00
malizz 8d2ed1999d
update ACLs for cluster peering (#15317)
* update ACLs for cluster peering

* add changelog

* Update .changelog/15317.txt

Co-authored-by: Eric Haberkorn <erichaberkorn@gmail.com>

Co-authored-by: Eric Haberkorn <erichaberkorn@gmail.com>
2022-11-09 13:02:58 -08:00
Derek Menteer 9e76d274ec
Fix mesh gateway configuration with proxy-defaults (#15186)
* Fix mesh gateway proxy-defaults not affecting upstreams.

* Clarify distinction with upstream settings

Top-level mesh gateway mode in proxy-defaults and service-defaults gets
merged into NodeService.Proxy.MeshGateway, and only gets merged with
the mode attached to an an upstream in proxycfg/xds.

* Fix mgw mode usage for peered upstreams

There were a couple issues with how mgw mode was being handled for
peered upstreams.

For starters, mesh gateway mode from proxy-defaults
and the top-level of service-defaults gets stored in
NodeService.Proxy.MeshGateway, but the upstream watch for peered data
was only considering the mesh gateway config attached in
NodeService.Proxy.Upstreams[i]. This means that applying a mesh gateway
mode via global proxy-defaults or service-defaults on the downstream
would not have an effect.

Separately, transparent proxy watches for peered upstreams didn't
consider mesh gateway mode at all.

This commit addresses the first issue by ensuring that we overlay the
upstream config for peered upstreams as we do for non-peered. The second
issue is addressed by re-using setupWatchesForPeeredUpstream when
handling transparent proxy updates.

Note that for transparent proxies we do not yet support mesh gateway
mode per upstream, so the NodeService.Proxy.MeshGateway mode is used.

* Fix upstream mesh gateway mode handling in xds

This commit ensures that when determining the mesh gateway mode for
peered upstreams we consider the NodeService.Proxy.MeshGateway config as
a baseline.

In absense of this change, setting a mesh gateway mode via
proxy-defaults or the top-level of service-defaults will not have an
effect for peered upstreams.

* Merge service/proxy defaults in cfg resolver

Previously the mesh gateway mode for connect proxies would be
merged at three points:

1. On servers, in ComputeResolvedServiceConfig.
2. On clients, in MergeServiceConfig.
3. On clients, in proxycfg/xds.

The first merge returns a ServiceConfigResponse where there is a
top-level MeshGateway config from proxy/service-defaults, along with
per-upstream config.

The second merge combines per-upstream config specified at the service
instance with per-upstream config specified centrally.

The third merge combines the NodeService.Proxy.MeshGateway
config containing proxy/service-defaults data with the per-upstream
mode. This third merge is easy to miss, which led to peered upstreams
not considering the mesh gateway mode from proxy-defaults.

This commit removes the third merge, and ensures that all mesh gateway
config is available at the upstream. This way proxycfg/xds do not need
to do additional overlays.

* Ensure that proxy-defaults is considered in wc

Upstream defaults become a synthetic Upstream definition under a
wildcard key "*". Now that proxycfg/xds expect Upstream definitions to
have the final MeshGateway values, this commit ensures that values from
proxy-defaults/service-defaults are the default for this synthetic
upstream.

* Add changelog.

Co-authored-by: freddygv <freddy@hashicorp.com>
2022-11-09 10:14:29 -06:00
Dan Upton acfdbb23a9
chore: remove unused argument from MergeNodeServiceWithCentralConfig (#15024)
Previously, the MergeNodeServiceWithCentralConfig method accepted a
ServiceSpecificRequest argument, of which only the Datacenter and
QueryOptions fields were used.

Digging a little deeper, it turns out these fields were only passed
down to the ComputeResolvedServiceConfig method (through the
ServiceConfigRequest struct) which didn't actually use them.

As such, not all call-sites passed a valid ServiceSpecificRequest
so it's safer to remove the argument altogether to prevent future
changes from depending on it.
2022-11-09 14:54:57 +00:00
Derek Menteer a8eb047ee6
Bring back parameter ServerExternalAddresses in GenerateToken endpoint (#15267)
Re-add ServerExternalAddresses parameter in GenerateToken endpoint

This reverts commit 5e156772f6a7fba5324eb6804ae4e93c091229a6
and adds extra functionality to support newer peering behaviors.
2022-11-08 14:55:18 -06:00
Chris S. Kim dbe3dc96f3
Update hcp-scada-provider to fix diamond dependency problem with go-msgpack (#15185) 2022-11-07 11:34:30 -05:00
Dan Stough 3eb3cf3b0d
fix: persist peering CA updates to dialing clusters (#15243)
fix: persist peering CA updates to dialing clusters
2022-11-04 12:53:20 -04:00
Derek Menteer cad89029dd Decrease retry time for failed peering connections. 2022-10-31 14:30:27 -05:00
R.B. Boyer 879584a773
test: fix flaky TestSubscribeBackend_IntegrationWithServer_DeliversAllMessages test (#15195)
Allow for some message duplication in subscription events during assertions.

I'm pretty sure the subscriptions machinery allows for messages to occasionally
be duplicated instead of dropping them, as a once-and-only-once queue is a pipe
dream and you have to pick one of the other two options.
2022-10-31 12:10:43 -05:00
Derek Menteer 58f15db4c4 Allow peering endpoints to bypass verify_incoming. 2022-10-31 09:56:30 -05:00
Eric Haberkorn 57fb729547
Fix peering metrics bug (#15178)
This bug was caused by the peering health metric being set to NaN.
2022-10-28 10:51:12 -04:00
Luke Kysow 6b1ec05470
autoencrypt: helpful error for clients with wrong dc (#14832)
* autoencrypt: helpful error for clients with wrong dc

If clients have set a different datacenter than the servers they're
connecting with for autoencrypt, give a helpful error message.
2022-10-25 10:13:41 -07:00
Chris S. Kim ae1646706f Regenerate files according to 1.19.2 formatter 2022-10-24 16:12:08 -04:00
Iryna Shustava a3a6743e0a
proxycfg: watch service-defaults config entries (#15025)
To support Destinations on the service-defaults (for tproxy with terminating gateway), we need to now also make servers watch service-defaults config entries.
2022-10-24 12:50:28 -06:00
Chris S. Kim 06f583a7c2 Move oss-only test to its own file 2022-10-24 14:17:43 -04:00
R.B. Boyer 87432a8dd4
chore: update golangci-lint to v1.50.1 (#15022) 2022-10-24 11:48:02 -05:00
Venu Yanamandra 3dd12a2960
Update error message when restoring ENT snapshot in OSS (#15066) 2022-10-24 11:40:26 -04:00
Chris S. Kim 569c3bce88 Update expected encoding in test
go-memdb was updated in v1.3.3 to make integers in indexes sortable, which changed how integers were encoded.
2022-10-20 14:32:42 -04:00
freddygv f3548167fc Use plain TaggedAddressWAN 2022-10-19 16:32:44 -06:00
freddygv 1b589ba964 Add unit test 2022-10-19 16:26:15 -06:00
cskh c0dc93e5b8 fix: wan address isn't used by peering token 2022-10-19 16:33:25 -04:00