Adds a user-configurable rate limiter to proxycfg snapshot delivery,
with a default limit of 250 updates per second.
This addresses a problem observed in our load testing of Consul
Dataplane where updating a "global" resource such as a wildcard
intention or the proxy-defaults config entry could starve the Raft or
Memberlist goroutines of CPU time, causing general cluster instability.
* Add metrics for consul.memberlist.node.instances, consul.memberlist.queue.broadcast, consul.memberlist.size.local, and consul.memberlist.size.remote
* Fixing last table on page that does not render properly
* fixing queue name
* Typos
* Test failing
* Convert values <1ms to decimal
* Fix test
* Update docs and test error msg
* Applied suggested changes to test case
* Changelog file and suggested changes
* Update .changelog/12905.txt
Co-authored-by: Chris S. Kim <kisunji92@gmail.com>
* suggested change - start duration with microseconds instead of nanoseconds
* fix error
* suggested change - floats
Co-authored-by: alex <8968914+acpana@users.noreply.github.com>
Co-authored-by: Chris S. Kim <kisunji92@gmail.com>
Co-authored-by: Ashlee Boyer <ashlee.boyer@hashicorp.com>
Co-authored-by: Ashlee M Boyer <43934258+ashleemboyer@users.noreply.github.com>
Co-authored-by: Tu Nguyen <im2nguyen@gmail.com>
Co-authored-by: Tu Nguyen <im2nguyen@users.noreply.github.com>
Co-authored-by: HashiBot <62622282+hashibot-web@users.noreply.github.com>
Co-authored-by: Kevin Wang <kwangsan@gmail.com>
Prior to #13244, connect proxies and gateways could only be configured by an
xDS session served by the local client agent.
In an upcoming release, it will be possible to deploy a Consul service mesh
without client agents. In this model, xDS sessions will be handled by the
servers themselves, which necessitates load-balancing to prevent a single
server from receiving a disproportionate amount of load and becoming
overwhelmed.
This introduces a simple form of load-balancing where Consul will attempt to
achieve an even spread of load (xDS sessions) between all healthy servers.
It does so by implementing a concurrent session limiter (limiter.SessionLimiter)
and adjusting the limit according to autopilot state and proxy service
registrations in the catalog.
If a server is already over capacity (i.e. the session limit is lowered),
Consul will begin draining sessions to rebalance the load. This will result
in the client receiving a `RESOURCE_EXHAUSTED` status code. It is the client's
responsibility to observe this response and reconnect to a different server.
Users of the gRPC client connection brokered by the
consul-server-connection-manager library will get this for free.
The rate at which Consul will drain sessions to rebalance load is scaled
dynamically based on the number of proxies in the catalog.
Adds guidance when upgrading a Consul service mesh deployment to 1.13 and:
- using auto-encrypt or auto-config; or
- the HTTPS port is not enabled on Consul agents
* defaulting to false because peering will be released as beta
* Ignore peering disabled error in bundles cachetype
Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>
Co-authored-by: freddygv <freddy@hashicorp.com>
Co-authored-by: Matt Keeler <mjkeeler7@gmail.com>
Ensure that the peer stream replication rpc can successfully be used with TLS activated.
Also:
- If key material is configured for the gRPC port but HTTPS is not
enabled now TLS will still be activated for the gRPC port.
- peerstream replication stream opened by the establishing-side will now
ignore grpc.WithBlock so that TLS errors will bubble up instead of
being awkwardly delayed or suppressed
In the metric documentation, some metrics are written with an extra
trailing dot character. Often this is just spurious. In some cases, it
is an allusion to the metric name having various variable dot-separated
parts (which really ought to be labels, but that's another issue).
1) Trim all the dots which are without meaning.
2) Explicitly show with placeholders in angle brackets, where there are
variable parts.
3) Remove a `json` type indicator from a code block which was not JSON
(rather Prometheus metrics) format, since my IDE was reporting the
non-JSON syntax as an error.
4) Reformat each Markdown table which I had touched with other changes
to have consistent column spacing in the source representation. As
a result, this PR has a lot of whitespace changes, so please make use
of GitHub's "ignore whitespace" checkbox in the PR diff viewer.
Let me know if you would prefer:
- to not accept the table spacing reformat in the same PR as other
changes
- to not accept the table spacing reformat ever
- to have all tables in the file reformatted, not just ones touched with
other changes