Upgrade Raft to v1.3.9 for saturation metrics (#12865)
This commit is contained in:
parent
642b75b60b
commit
e4008124bd
|
@ -0,0 +1,3 @@
|
|||
```release-note:improvement
|
||||
telemetry: Added `consul.raft.thread.main.saturation` and `consul.raft.thread.fsm.saturation` metrics to measure approximate saturation of the Raft goroutines
|
||||
```
|
2
go.mod
2
go.mod
|
@ -53,7 +53,7 @@ require (
|
|||
github.com/hashicorp/hcl v1.0.0
|
||||
github.com/hashicorp/hil v0.0.0-20200423225030-a18a1cd20038
|
||||
github.com/hashicorp/memberlist v0.3.1
|
||||
github.com/hashicorp/raft v1.3.8
|
||||
github.com/hashicorp/raft v1.3.9
|
||||
github.com/hashicorp/raft-autopilot v0.1.6
|
||||
github.com/hashicorp/raft-boltdb v0.0.0-20211202195631-7d34b9fb3f42 // indirect
|
||||
github.com/hashicorp/raft-boltdb/v2 v2.2.2
|
||||
|
|
4
go.sum
4
go.sum
|
@ -368,8 +368,8 @@ github.com/hashicorp/memberlist v0.3.1/go.mod h1:MS2lj3INKhZjWNqd3N0m3J+Jxf3DAOn
|
|||
github.com/hashicorp/raft v1.1.0/go.mod h1:4Ak7FSPnuvmb0GV6vgIAJ4vYT4bek9bb6Q+7HVbyzqM=
|
||||
github.com/hashicorp/raft v1.1.1/go.mod h1:vPAJM8Asw6u8LxC3eJCUZmRP/E4QmUGE1R7g7k8sG/8=
|
||||
github.com/hashicorp/raft v1.2.0/go.mod h1:vPAJM8Asw6u8LxC3eJCUZmRP/E4QmUGE1R7g7k8sG/8=
|
||||
github.com/hashicorp/raft v1.3.8 h1:lrhx4wesQLOSv3ERX/pK4cwfzQ0J2RgzsvAkBxHe1bA=
|
||||
github.com/hashicorp/raft v1.3.8/go.mod h1:4Ak7FSPnuvmb0GV6vgIAJ4vYT4bek9bb6Q+7HVbyzqM=
|
||||
github.com/hashicorp/raft v1.3.9 h1:9yuo1aR0bFTr1cw7pj3S2Bk6MhJCsnr2NAxvIBrP2x4=
|
||||
github.com/hashicorp/raft v1.3.9/go.mod h1:4Ak7FSPnuvmb0GV6vgIAJ4vYT4bek9bb6Q+7HVbyzqM=
|
||||
github.com/hashicorp/raft-autopilot v0.1.6 h1:C1q3RNF2FfXNZfHWbvVAu0QixaQK8K5pX4O5lh+9z4I=
|
||||
github.com/hashicorp/raft-autopilot v0.1.6/go.mod h1:Af4jZBwaNOI+tXfIqIdbcAnh/UyyqIMj/pOISIfhArw=
|
||||
github.com/hashicorp/raft-boltdb v0.0.0-20171010151810-6e5ba93211ea/go.mod h1:pNv7Wc3ycL6F5oOWn+tPGo2gWD4a5X+yp/ntwdKLjRk=
|
||||
|
|
|
@ -149,6 +149,28 @@ you will need to apply a function such as InfluxDB's [`non_negative_difference()
|
|||
Sudden large changes to the `consul.client.rpc` metrics (greater than 50% deviation from baseline).
|
||||
`consul.client.rpc.exceeded` or `consul.client.rpc.failed` count > 0, as it implies that an agent is being rate-limited or fails to make an RPC request to a Consul server
|
||||
|
||||
### Raft Thread Saturation
|
||||
|
||||
| Metric Name | Description | Unit | Type |
|
||||
| :----------------------------------- | :----------------------------------------------------------------------------------------------------------------------- | :--------- | :----- |
|
||||
| `consul.raft.thread.main.saturation` | An approximate measurement of the proportion of time the main Raft goroutine is busy and unavailable to accept new work. | percentage | sample |
|
||||
| `consul.raft.thread.fsm.saturation` | An approximate measurement of the proportion of time the Raft FSM goroutine is busy and unavailable to accept new work. | percentage | sample |
|
||||
|
||||
**Why they're important:** These measurements are a useful proxy for how much
|
||||
capacity a Consul server has to accept additional write load. High saturation
|
||||
of the Raft goroutines can lead to elevated latency in the rest of the system
|
||||
and cause cluster instability.
|
||||
|
||||
**What to look for:** Generally, a server's steady-state saturation should be
|
||||
less than 50%.
|
||||
|
||||
**NOTE:** These metrics are approximate and under extremely heavy load won't
|
||||
give a perfect fine-grained view of how much headroom a server has available.
|
||||
Instead, treat them as an early warning sign.
|
||||
|
||||
** Requirements: **
|
||||
* Consul 1.13.0+
|
||||
|
||||
### Raft Replication Capacity Issues
|
||||
|
||||
| Metric Name | Description | Unit | Type |
|
||||
|
|
Loading…
Reference in New Issue