Commit Graph

274 Commits

Author SHA1 Message Date
Paul Banks 50c600f93b
Adding experimental support for a more efficient LogStore implementation (#16176)
* Adding experimental support for a more efficient LogStore implementation

* Adding changelog entry

* Fix go mod tidy issues
2023-02-08 16:50:22 +00:00
skpratt 77d805ceb6
Remove legacy acl tokens (#15947)
* remove legacy tokens

* Update test comment

Co-authored-by: Paul Glass <pglass@hashicorp.com>

* fix imports

* update docs for additional CLI changes

* add test case for anonymous token

* set deprecated api fields to json ignore and fix patch errors

* update changelog to breaking-change

* fix import

* update api docs to remove legacy reference

* fix docs nav data

---------

Co-authored-by: Paul Glass <pglass@hashicorp.com>
2023-01-27 09:17:07 -06:00
Poonam Jadhav c50bf92b84
feat: panic handler in rpc rate limit interceptor (#16022)
* feat: handle panic in rpc rate limit interceptor

* test: additional test cases to rpc rate limiting interceptor

* refactor: remove unused listener
2023-01-25 14:13:38 -05:00
Nathan Coleman 3e5e03aa95
Run config entry controller routines on leader (#16054) 2023-01-25 12:21:46 -06:00
Dhia Ayachi f17bc5ed73
inject logger and create logdrop sink (#15822)
* inject logger and create logdrop sink

* init sink with an empty struct instead of nil

* wrap a logger instead of a sink and add a discard logger to avoid double logging

* fix compile errors

* fix linter errors

* Fix bug where log arguments aren't properly formatted

* Move log sink construction outside of handler

* Add prometheus definition and docs for log drop counter

Co-authored-by: Daniel Upton <daniel@floppy.co>
2023-01-06 11:33:53 -07:00
Semir Patel 8242459c66
Wire up the rate limiter to net/rpc calls (#15879) 2023-01-04 13:38:44 -06:00
Dan Upton 006138beb4
Wire in rate limiter to handle internal and external gRPC calls (#15857) 2022-12-23 13:42:16 -06:00
John Murret 8c33d7cc0e
Rate limiting handler - ensure configuration has changed before modifying limiters (#15805)
* Rate limiting handler - ensure configuration has changed before modifying limiters

* Updating test to validate arguments to UpdateConfig

* Removing duplicate test.  Updating mock.

* adding logging for when UpdateConfig is called but the config has not changed.

* Update agent/consul/rate/handler.go

Co-authored-by: Dhia Ayachi <dhia@hashicorp.com>

Co-authored-by: Dhia Ayachi <dhia@hashicorp.com>
2022-12-20 14:12:03 -07:00
Semir Patel 1f82e82e04
Pass remote addr of incoming HTTP requests through to RPC(..) calls (#15700) 2022-12-14 09:24:22 -06:00
John Murret 700c693b33
adding config for request_limits (#15531)
* server: add placeholder glue for rate limit handler

This commit adds a no-op implementation of the rate-limit handler and
adds it to the `consul.Server` struct and setup code.

This allows us to start working on the net/rpc and gRPC interceptors and
config logic.

* Add handler errors

* Set the global read and write limits

* fixing multilimiter moving packages

* Fix typo

* Simplify globalLimit usage

* add multilimiter and tests

* exporting LimitedEntity

* Apply suggestions from code review

Co-authored-by: John Murret <john.murret@hashicorp.com>

* add config update and rename config params

* add doc string and split config

* Apply suggestions from code review

Co-authored-by: Dan Upton <daniel@floppy.co>

* use timer to avoid go routine leak and change the interface

* add comments to tests

* fix failing test

* add prefix with config edge, refactor tests

* Apply suggestions from code review

Co-authored-by: Dan Upton <daniel@floppy.co>

* refactor to apply configs for limiters under a prefix

* add fuzz tests and fix bugs found. Refactor reconcile loop to have a simpler logic

* make KeyType an exported type

* split the config and limiter trees to fix race conditions in config update

* rename variables

* fix race in test and remove dead code

* fix reconcile loop to not create a timer on each loop

* add extra benchmark tests and fix tests

* fix benchmark test to pass value to func

* server: add placeholder glue for rate limit handler

This commit adds a no-op implementation of the rate-limit handler and
adds it to the `consul.Server` struct and setup code.

This allows us to start working on the net/rpc and gRPC interceptors and
config logic.

* Set the global read and write limits

* fixing multilimiter moving packages

* add server configuration for global rate limiting.

* remove agent test

* remove added stuff from handler

* remove added stuff from multilimiter

* removing unnecessary TODOs

* Removing TODO comment from handler

* adding in defaulting to infinite

* add disabled status in there

* adding in documentation for disabled mode.

* make disabled the default.

* Add mock and agent test

* addig documentation and missing mock file.

* Fixing test TestLoad_IntegrationWithFlags

* updating docs based on PR feedback.

* Updating Request Limits mode to use int based on PR feedback.

* Adding RequestLimits struct so we have a nested struct in ReloadableConfig.

* fixing linting references

* Update agent/consul/rate/handler.go

Co-authored-by: Dan Upton <daniel@floppy.co>

* Update agent/consul/config.go

Co-authored-by: Dan Upton <daniel@floppy.co>

* removing the ignore of the request limits in JSON.  addingbuilder logic to convert any read rate or write rate less than 0 to rate.Inf

* added conversion function to convert request limits object to handler config.

* Updating docs to reflect gRPC and RPC are rate limit and as a result, HTTP requests are as well.

* Updating values for TestLoad_FullConfig() so that they were different and discernable.

* Updating TestRuntimeConfig_Sanitize

* Fixing TestLoad_IntegrationWithFlags test

* putting nil check in place

* fixing rebase

* removing change for missing error checks.  will put in another PR

* Rebasing after default multilimiter config change

* resolving rebase issues

* updating reference for incomingRPCLimiter to use interface

* updating interface

* Updating interfaces

* Fixing mock reference

Co-authored-by: Daniel Upton <daniel@floppy.co>
Co-authored-by: Dhia Ayachi <dhia@hashicorp.com>
2022-12-13 13:09:55 -07:00
Dan Upton c73707ca3c
grpc: add rate-limiting middleware (#15550)
Implements the gRPC middleware for rate-limiting as a tap.ServerInHandle
function (executed before the request is unmarshaled).

Mappings between gRPC methods and their operation type are generated by
a protoc plugin introduced by #15564.
2022-12-13 15:01:56 +00:00
Dan Upton 4894848993
server: add placeholder glue for rate limit handler (#15539)
Adds a no-op implementation of the rate-limit handler and exposes
it on the consul.Server struct.

It allows us to start working on the net/rpc and gRPC interceptors
and config (re)loading logic, without having to implement the full
handler up-front.

Co-authored-by: John Murret <john.murret@hashicorp.com>
Co-authored-by: Dhia Ayachi <dhia@hashicorp.com>
2022-12-13 11:41:54 +00:00
Dhia Ayachi 219a3c5bd3
Leadership transfer cmd (#14132)
* add leadership transfer command

* add RPC call test (flaky)

* add missing import

* add changelog

* add command registration

* Apply suggestions from code review

Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>

* add the possibility of providing an id to raft leadership transfer. Add few tests.

* delete old file from cherry pick

* rename changelog filename to PR #

* rename changelog and fix import

* fix failing test

* check for OperatorWrite

Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>

* rename from leader-transfer to transfer-leader

* remove version check and add test for operator read

* move struct to operator.go

* first pass

* add code for leader transfer in the grpc backend and tests

* wire the http endpoint to the new grpc endpoint

* remove the RPC endpoint

* remove non needed struct

* fix naming

* add mog glue to API

* fix comment

* remove dead code

* fix linter error

* change package name for proto file

* remove error wrapping

* fix failing test

* add command registration

* add grpc service mock tests

* fix receiver to be pointer

* use defined values

Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>

* reuse MockAclAuthorizer

* add documentation

* remove usage of external.TokenFromContext

* fix failing tests

* fix proto generation

* Apply suggestions from code review

Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com>

* Apply suggestions from code review

* add more context in doc for the reason

* Apply suggestions from docs code review

Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>

* regenerate proto

* fix linter errors

Co-authored-by: github-team-consul-core <github-team-consul-core@hashicorp.com>
Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>
Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com>
Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>
2022-11-14 15:35:12 -05:00
Kyle Schochenmaier 2b1e5f69e2
removes ioutil usage everywhere which was deprecated in go1.16 (#15297)
* update go version to 1.18 for api and sdk, go mod tidy
* removes ioutil usage everywhere which was deprecated in go1.16 in favour of io and os packages. Also introduces a lint rule which forbids use of ioutil going forward.
Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
2022-11-10 10:26:01 -06:00
Chris S. Kim e4c20ec190
Refactor client RPC timeouts (#14965)
Fix an issue where rpc_hold_timeout was being used as the timeout for non-blocking queries. Users should be able to tune read timeouts without fiddling with rpc_hold_timeout. A new configuration `rpc_read_timeout` is created.

Refactor some implementation from the original PR 11500 to remove the misleading linkage between RPCInfo's timeout (used to retry in case of certain modes of failures) and the client RPC timeouts.
2022-10-18 15:05:09 -04:00
Paul Glass 8cf430140a
gRPC server metrics (#14922)
* Move stats.go from grpc-internal to grpc-middleware
* Update grpc server metrics with server type label
* Add stats test to grpc-external
* Remove global metrics instance from grpc server tests
2022-10-11 17:00:32 -05:00
Nick Ethier 5e4b3ef5d4
add HCP integration component (#14723)
* add HCP integration

* lint: use non-deprecated logging interface
2022-09-26 14:58:15 -04:00
DanStough b37a2ba889 feat(peering): validate server name conflicts on establish 2022-09-14 11:37:30 -04:00
Dan Upton 9fe6c33c0d
xDS Load Balancing (#14397)
Prior to #13244, connect proxies and gateways could only be configured by an
xDS session served by the local client agent.

In an upcoming release, it will be possible to deploy a Consul service mesh
without client agents. In this model, xDS sessions will be handled by the
servers themselves, which necessitates load-balancing to prevent a single
server from receiving a disproportionate amount of load and becoming
overwhelmed.

This introduces a simple form of load-balancing where Consul will attempt to
achieve an even spread of load (xDS sessions) between all healthy servers.
It does so by implementing a concurrent session limiter (limiter.SessionLimiter)
and adjusting the limit according to autopilot state and proxy service
registrations in the catalog.

If a server is already over capacity (i.e. the session limit is lowered),
Consul will begin draining sessions to rebalance the load. This will result
in the client receiving a `RESOURCE_EXHAUSTED` status code. It is the client's
responsibility to observe this response and reconnect to a different server.

Users of the gRPC client connection brokered by the
consul-server-connection-manager library will get this for free.

The rate at which Consul will drain sessions to rebalance load is scaled
dynamically based on the number of proxies in the catalog.
2022-09-09 15:02:01 +01:00
Derek Menteer 6aaf1c6035 Various cleanups. 2022-09-08 10:51:50 -05:00
Derek Menteer b50bc443f3 Merge branch 'main' of github.com:hashicorp/consul into derekm/split-grpc-ports 2022-09-06 10:51:04 -05:00
Derek Menteer cb478b0e61 Address PR comments. 2022-09-01 16:54:24 -05:00
Chris S. Kim e4a154c88e Add heartbeat timeout grace period when accounting for peering health 2022-08-29 16:32:26 -04:00
Derek Menteer 4a01d75cf8 Add separate grpc_tls port.
To ease the transition for users, the original gRPC
port can still operate in a deprecated mode as either
plain-text or TLS mode. This behavior should be removed
in a future release whenever we no longer support this.

The resulting behavior from this commit is:
  `ports.grpc > 0 && ports.grpc_tls > 0` spawns both plain-text and tls ports.
  `ports.grpc > 0 && grpc.tls == undefined` spawns a single plain-text port.
  `ports.grpc > 0 && grpc.tls != undefined` spawns a single tls port (backwards compat mode).
2022-08-29 13:43:43 -05:00
Chris S. Kim b1025f2dd9 Adjust metrics reporting for peering tracker 2022-08-26 17:34:17 -04:00
alex f64af3be24
peering: add peer health metric (#14004)
Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>
2022-08-25 16:32:59 -07:00
Chris S. Kim fbbb54fdc2 Register peerStreamServer internally to enable RPC forwarding 2022-08-11 11:16:02 -04:00
Matt Keeler 795e5830c6
Implement/Utilize secrets for Peering Replication Stream (#13977) 2022-08-01 10:33:18 -04:00
Luke Kysow d21f793b74
peering: add config to enable/disable peering (#13867)
* peering: add config to enable/disable peering

Add config:

```
peering {
  enabled = true
}
```

Defaults to true. When disabled:
1. All peering RPC endpoints will return an error
2. Leader won't start its peering establishment goroutines
3. Leader won't start its peering deletion goroutines
2022-07-22 15:20:21 -07:00
alex 7bd55578cc
peering: emit exported services count metric (#13811)
Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>
2022-07-22 12:05:08 -07:00
Luke Kysow 3968f21339
Add docs for peerStreamServer vs peeringServer. (#13781) 2022-07-15 12:23:05 -07:00
Dan Upton 34140ff3e0
grpc: rename public/private directories to external/internal (#13721)
Previously, public referred to gRPC services that are both exposed on
the dedicated gRPC port and have their definitions in the proto-public
directory (so were considered usable by 3rd parties). Whereas private
referred to services on the multiplexed server port that are only usable
by agents and other servers.

Now, we're splitting these definitions, such that external/internal
refers to the port and public/private refers to whether they can be used
by 3rd parties.

This is necessary because the peering replication API needs to be
exposed on the dedicated port, but is not (yet) suitable for use by 3rd
parties.
2022-07-13 16:33:48 +01:00
R.B. Boyer 5b801db24b
peering: move peer replication to the external gRPC port (#13698)
Peer replication is intended to be between separate Consul installs and
effectively should be considered "external". This PR moves the peer
stream replication bidirectional RPC endpoint to the external gRPC
server and ensures that things continue to function.
2022-07-08 12:01:13 -05:00
Daniel Upton 497df1ca3b proxycfg: server-local config entry data sources
This is the OSS portion of enterprise PR 2056.

This commit provides server-local implementations of the proxycfg.ConfigEntry
and proxycfg.ConfigEntryList interfaces, that source data from streaming events.

It makes use of the LocalMaterializer type introduced for peering replication,
adding the necessary support for authorization.

It also adds support for "wildcard" subscriptions (within a topic) to the event
publisher, as this is needed to fetch service-resolvers for all services when
configuring mesh gateways.

Currently, events will be emitted for just the ingress-gateway, service-resolver,
and mesh config entry types, as these are the only entries required by proxycfg
— the events will be emitted on topics named IngressGateway, ServiceResolver,
and MeshConfig topics respectively.

Though these events will only be consumed "locally" for now, they can also be
consumed via the gRPC endpoint (confirmed using grpcurl) so using them from
client agents should be a case of swapping the LocalMaterializer for an
RPCMaterializer.
2022-07-04 10:48:36 +01:00
Dan Upton 989b22425c
Move ACLResolveResult into acl/resolver package (#13467)
Having this type live in the agent/consul package makes it difficult to
put anything that relies on token resolution (e.g. the new gRPC services)
in separate packages without introducing import cycles.

For example, if package foo imports agent/consul for the ACLResolveResult
type it means that agent/consul cannot import foo to register its service.

We've previously worked around this by wrapping the ACLResolver to
"downgrade" its return type to an acl.Authorizer - aside from the
added complexity, this also loses the resolved identity information.

In the future, we may want to move the whole ACLResolver into the
acl/resolver package. For now, putting the result type there at least,
fixes the immediate import cycle issues.
2022-06-17 10:24:43 +01:00
freddygv a5283e4361 Add leader routine to clean up peerings
Once a peering is marked for deletion a new leader routine will now
clean up all imported resources and then the peering itself.

A lot of the logic was grabbed from the namespace/partitions deferred
deletions but with a handful of simplifications:
- The rate limiting is not configurable.

- Deleting imported nodes/services/checks is done by deleting nodes with
  the Txn API. The services and checks are deleted as a side-effect.

- There is no "round rate limiter" like with namespaces and partitions.
  This is because peerings are purely local, and deleting a peering in
  the datacenter does not depend on deleting data from other DCs like
  with WAN-federated namespaces. All rate limiting is handled by the
  Raft rate limiter.
2022-06-14 15:36:50 -06:00
alex ff2ad3ba0c
peering: send leader addr (#13342)
Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>
2022-06-06 10:00:38 -07:00
alex 2d8664d384
monitor leadership in peering service (#13257)
Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>

Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
Co-authored-by: Freddy <freddygv@users.noreply.github.com>
2022-05-26 17:55:16 -07:00
R.B. Boyer bc10055edc
peering: replicate expected SNI, SPIFFE, and service protocol to peers (#13218)
The importing peer will need to know what SNI and SPIFFE name
corresponds to each exported service. Additionally it will need to know
at a high level the protocol in use (L4/L7) to generate the appropriate
connection pool and local metrics.

For replicated connect synthetic entities we edit the `Connect{}` part
of a `NodeService` to have a new section:

    {
      "PeerMeta": {
        "SNI": [
          "web.default.default.owt.external.183150d5-1033-3672-c426-c29205a576b8.consul"
        ],
        "SpiffeID": [
          "spiffe://183150d5-1033-3672-c426-c29205a576b8.consul/ns/default/dc/dc1/svc/web"
        ],
        "Protocol": "tcp"
      }
    }

This data is then replicated and saved as-is at the importing side. Both
SNI and SpiffeID are slices for now until I can be sure we don't need
them for how mesh gateways will ultimately work.
2022-05-25 12:37:44 -05:00
Dan Upton 6bfdb48560
acl: gRPC login and logout endpoints (#12935)
Introduces two new public gRPC endpoints (`Login` and `Logout`) and
includes refactoring of the equivalent net/rpc endpoints to enable the
majority of logic to be reused (i.e. by extracting the `Binder` and
`TokenWriter` types).

This contains the OSS portions of the following enterprise commits:

- 75fcdbfcfa6af21d7128cb2544829ead0b1df603
- bce14b714151af74a7f0110843d640204082630a
- cc508b70fbf58eda144d9af3d71bd0f483985893
2022-05-04 17:38:45 +01:00
Dhia Ayachi 9dc5200155
update raft to v1.3.8 (#12844)
* update raft to v1.3.7

* add changelog

* fix compilation error

* fix HeartbeatTimeout

* fix ElectionTimeout to reload only if value is valid

* fix default values for `ElectionTimeout` and `HeartbeatTimeout`

* fix test defaults

* bump raft to v1.3.8
2022-04-25 10:19:26 -04:00
R.B. Boyer 809344a6f5
peering: initial sync (#12842)
- Add endpoints related to peering: read, list, generate token, initiate peering
- Update node/service/check table indexing to account for peers
- Foundational changes for pushing service updates to a peer
- Plumb peer name through Health.ServiceNodes path

see: ENT-1765, ENT-1280, ENT-1283, ENT-1283, ENT-1756, ENT-1739, ENT-1750, ENT-1679,
     ENT-1709, ENT-1704, ENT-1690, ENT-1689, ENT-1702, ENT-1701, ENT-1683, ENT-1663,
     ENT-1650, ENT-1678, ENT-1628, ENT-1658, ENT-1640, ENT-1637, ENT-1597, ENT-1634,
     ENT-1613, ENT-1616, ENT-1617, ENT-1591, ENT-1588, ENT-1596, ENT-1572, ENT-1555

Co-authored-by: R.B. Boyer <rb@hashicorp.com>
Co-authored-by: freddygv <freddy@hashicorp.com>
Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
Co-authored-by: Evan Culver <eculver@hashicorp.com>
Co-authored-by: Nitya Dhanushkodi <nitya@hashicorp.com>
2022-04-21 17:34:40 -05:00
Matt Keeler f49adfaaf0
Implement the ServerDiscovery.WatchServers gRPC endpoint (#12819)
* Implement the ServerDiscovery.WatchServers gRPC endpoint
* Fix the ConnectCA.Sign gRPC endpoints metadata forwarding.
* Unify public gRPC endpoints around the public.TraceID function for request_id logging
2022-04-21 12:56:18 -04:00
Riddhi Shah 1d49f5c84e
[OSS] gRPC call to get envoy bootstrap params (#12825)
Adds a new gRPC endpoint to get envoy bootstrap params. The new consul-dataplane service will use this
endpoint to generate an envoy bootstrap configuration.
2022-04-19 17:24:21 -07:00
Dan Upton 769d1d6e8e
ConnectCA.Sign gRPC Endpoint (#12787)
Introduces a gRPC endpoint for signing Connect leaf certificates. It's also
the first of the public gRPC endpoints to perform leader-forwarding, so
establishes the pattern of forwarding over the multiplexed internal RPC port.
2022-04-14 14:26:14 +01:00
Karl Cardenas b0b197964c
Merge pull request #12562 from hashicorp/docs/blake-agent-config
docs: Agent configuration hierarchy reorganization
2022-04-12 12:33:42 -07:00
FFMMM cf7e6484aa
add more labels to RequestRecorder (#12727)
Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>
2022-04-12 10:50:25 -07:00
Matt Keeler 2a4ca71d3f
Move to using a shared EventPublisher (#12673)
Previously we had 1 EventPublisher per state.Store. When a state store was closed/abandoned such as during a consul snapshot restore, this had the behavior of force closing subscriptions for that topic and evicting event snapshots from the cache.

The intention of this commit is to keep all that behavior. To that end, the shared EventPublisher now supports the ability to refresh a topic. That will perform the force close + eviction. The FSM upon abandoning the previous state.Store will call RefreshTopic for all the topics with events generated by the state.Store.
2022-04-12 09:47:42 -04:00
Blake Covarrubias 3175bf6b1b Remove .html extensions from docs URLs 2022-04-11 17:38:49 -07:00
Natalie Smith cd17e98800 docs: fix yet more references to agent/options 2022-04-11 17:38:49 -07:00