Kyle Havlovitz
35556a0eaf
Remove stale raft servers differently depending on minRaftVersion
2017-03-29 13:38:40 -07:00
Kyle Havlovitz
18c95b504a
Clean up raft servers without a corresponding serf entry
2017-03-29 12:52:00 -07:00
Kyle Havlovitz
37ea20cb44
Add advanced autopilot features
2017-03-22 15:25:16 -07:00
James Phillips
5ee1256137
Converts the stats fetch from serial to parallel and snaps the last index.
2017-03-19 20:48:42 -07:00
James Phillips
cfc01419c8
Adds a stats fetcher to make sure we don't block the autopilot loop.
2017-03-17 18:42:28 -07:00
Kyle Havlovitz
f50ce2084a
Only count healthy voters for FailureTolerance
2017-03-16 12:19:16 -07:00
Kyle Havlovitz
9c7bba7903
Tweak last_contact health logic for leader
2017-03-15 19:57:54 -07:00
Kyle Havlovitz
bc0494e396
Reorganized cluster health check loop and logic
2017-03-15 18:27:17 -07:00
Kyle Havlovitz
c40279e012
Fix an issue with changing server IDs and add a few UX enhancements around autopilot features
2017-03-15 16:09:55 -07:00
Kyle Havlovitz
3c6bece859
Add autopilot guide to the docs
2017-03-10 14:55:18 -08:00
Kyle Havlovitz
b15d67bfac
Use defers for WaitGroup and Ticker stop
2017-03-10 12:29:03 -08:00
Kyle Havlovitz
8130f9b1c1
Cleaned up and reorganized some autopilot-related code
2017-03-09 18:21:40 -08:00
Kyle Havlovitz
a5cbee0e99
Add AutopilotPolicy interface and BasicAutopilot
2017-03-08 12:26:58 -08:00
Kyle Havlovitz
a7de1e2a3b
Move RaftStats to Status endpoint
2017-03-07 13:58:06 -08:00
Kyle Havlovitz
8bcab6c6d7
Add autopilot server health tracking
This adds two goroutines that perform autopilot tasks on the leader: one
monitors the health of servers, and the other periodically cleans up
dead servers, with a limit on the removal count. It also adds a new HTTP
endpoint, `/v1/operator/autopilot/health`, for querying this information
through an operator RPC endpoint.
2017-03-06 16:00:10 -08:00