Co-authored-by: Ashlee Boyer <ashlee.boyer@hashicorp.com>
Co-authored-by: Ashlee M Boyer <43934258+ashleemboyer@users.noreply.github.com>
Co-authored-by: Tu Nguyen <im2nguyen@gmail.com>
Co-authored-by: Tu Nguyen <im2nguyen@users.noreply.github.com>
Co-authored-by: HashiBot <62622282+hashibot-web@users.noreply.github.com>
Co-authored-by: Kevin Wang <kwangsan@gmail.com>
Prior to #13244, connect proxies and gateways could only be configured by an
xDS session served by the local client agent.
In an upcoming release, it will be possible to deploy a Consul service mesh
without client agents. In this model, xDS sessions will be handled by the
servers themselves, which necessitates load-balancing to prevent a single
server from receiving a disproportionate amount of load and becoming
overwhelmed.
This introduces a simple form of load-balancing where Consul will attempt to
achieve an even spread of load (xDS sessions) between all healthy servers.
It does so by implementing a concurrent session limiter (limiter.SessionLimiter)
and adjusting the limit according to autopilot state and proxy service
registrations in the catalog.
If a server is already over capacity (i.e. the session limit is lowered),
Consul will begin draining sessions to rebalance the load. This will result
in the client receiving a `RESOURCE_EXHAUSTED` status code. It is the client's
responsibility to observe this response and reconnect to a different server.
Users of the gRPC client connection brokered by the
consul-server-connection-manager library will get this for free.
The rate at which Consul will drain sessions to rebalance load is scaled
dynamically based on the number of proxies in the catalog.
In the metric documentation, some metrics are written with an extra
trailing dot character. Often this is just spurious. In some cases, it
is an allusion to the metric name having various variable dot-separated
parts (which really ought to be labels, but that's another issue).
1) Trim all the dots which are without meaning.
2) Explicitly show with placeholders in angle brackets, where there are
variable parts.
3) Remove a `json` type indicator from a code block which was not JSON
(rather Prometheus metrics) format, since my IDE was reporting the
non-JSON syntax as an error.
4) Reformat each Markdown table which I had touched with other changes
to have consistent column spacing in the source representation. As
a result, this PR has a lot of whitespace changes, so please make use
of GitHub's "ignore whitespace" checkbox in the PR diff viewer.
Let me know if you would prefer:
- to not accept the table spacing reformat in the same PR as other
changes
- to not accept the table spacing reformat ever
- to have all tables in the file reformatted, not just ones touched with
other changes
The name of the metric as registered with the metrics library to provide
the help string, was incorrect compared with the actual code that sets
the metric value - bring them into sync.
Also, the help message was incorrect. Rather than copy the help message
from telemetry.mdx, which was correct, but felt a bit unnatural in the
way it was worded, update both of them to a new wording.
* Fixes a lint warning about t.Errorf not supporting %w
* Enable running autopilot on all servers
On the non-leader servers all they do is update the state and do not attempt any modifications.
* Fix the RPC conn limiting tests
Technically they were relying on racey behavior before. Now they should be reliable.
* Documenting the new raft_boltdb configuration options
* Add documentation around new boltdb metrics.
* Correct documentation for the consul.raft.fsm.apply metric
Add changelog to document what changed.
Add entry to telemetry section of the website to document what changed
Add docs to the usagemetric endpoint to help document the metrics in code
Replace it with an implementation that returns an error, and rename some symbols
to use a Deprecated suffix to make it clear.
Also remove the ACLRequest struct, which is no longer referenced.
* Use CodeTabs for examples in multiple formats.
* Ensure correct language on code fences.
* Use CodeBlockConfig for examples with filenames, or which need
highlighted content.
* add intermediate ca metric routine
* add Gauge config for intermediate cert
* Stop metrics routine when stopping leader
* add changelog entry
* updage changelog
Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
* use variables instead of a map
* go imports sort
* Add metrics for primary and secondary ca
* start metrics routine in the right DC
* add telemetry documentation
* update docs
* extract expiry fetching in a func
* merge metrics for primary and secondary into signing ca metric
Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
The query metrics are actually reported for all read queries, not only
ones that use a MinIndex to block for updates.
Also clarify the raft.apply metric is only on the leader.