Commit graph

15 commits

Author SHA1 Message Date
Kyle Havlovitz 35556a0eaf
Remove stale raft servers differently depending on minRaftVersion 2017-03-29 13:38:40 -07:00
Kyle Havlovitz 18c95b504a
Clean up raft servers without a corresponding serf entry 2017-03-29 12:52:00 -07:00
Kyle Havlovitz 37ea20cb44
Add advanced autopilot features 2017-03-22 15:25:16 -07:00
James Phillips 5ee1256137
Converts the stats fetch from serial to parallel and snaps the last index. 2017-03-19 20:48:42 -07:00
James Phillips cfc01419c8
Adds a stats fetcher to make sure we don't block the autopilot loop. 2017-03-17 18:42:28 -07:00
Kyle Havlovitz f50ce2084a
Only count healthy voters for FailureTolerance 2017-03-16 12:19:16 -07:00
Kyle Havlovitz 9c7bba7903
Tweak last_contact health logic for leader 2017-03-15 19:57:54 -07:00
Kyle Havlovitz bc0494e396
Reorganized cluster health check loop and logic 2017-03-15 18:27:17 -07:00
Kyle Havlovitz c40279e012
Fix an issue with changing server IDs and add a few UX enhancements around autopilot features 2017-03-15 16:09:55 -07:00
Kyle Havlovitz 3c6bece859
Add autopilot guide to the docs 2017-03-10 14:55:18 -08:00
Kyle Havlovitz b15d67bfac
Use defers for WaitGroup and Ticker stop 2017-03-10 12:29:03 -08:00
Kyle Havlovitz 8130f9b1c1
Cleaned up and reorganized some autopilot-related code 2017-03-09 18:21:40 -08:00
Kyle Havlovitz a5cbee0e99
Add AutopilotPolicy interface and BasicAutopilot 2017-03-08 12:26:58 -08:00
Kyle Havlovitz a7de1e2a3b
Move RaftStats to Status endpoint 2017-03-07 13:58:06 -08:00
Kyle Havlovitz 8bcab6c6d7
Add autopilot server health tracking
This adds two goroutines to perform autopilot tasks on the leader - one
to monitor the health of servers and another to periodically clean up
dead servers with a limit on removal count. Also adds a new HTTP endpoint,
`/v1/operator/autopilot/health`, for querying this information through an
operator RPC endpoint.
2017-03-06 16:00:10 -08:00
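
The two-goroutine arrangement described in the oldest commit (8bcab6c6d7) can be sketched roughly as below. This is an illustrative sketch only, not the code from these commits: the `server`, `cluster`, `pruneDead`, and `runLoop` names are hypothetical, the health check is a placeholder, and the "limit on removal count" is interpreted here as a cap per cleanup pass, which is an assumption. The deferred `wg.Done()` and `ticker.Stop()` calls mirror the pattern named in commit b15d67bfac.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// server is a hypothetical record of one server's last observed health.
type server struct {
	ID      string
	Healthy bool
}

// pruneDead drops at most maxRemovals unhealthy servers, preserving order.
// Unhealthy servers beyond the cap are kept for a later pass.
func pruneDead(servers []server, maxRemovals int) (kept, removed []server) {
	for _, s := range servers {
		if !s.Healthy && len(removed) < maxRemovals {
			removed = append(removed, s)
			continue
		}
		kept = append(kept, s)
	}
	return kept, removed
}

// cluster guards the server list shared by both loops.
type cluster struct {
	mu      sync.Mutex
	servers []server
}

// runLoop calls fn on every tick until stop is closed.
func runLoop(wg *sync.WaitGroup, interval time.Duration, stop <-chan struct{}, fn func()) {
	defer wg.Done()
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-stop:
			return
		case <-ticker.C:
			fn()
		}
	}
}

func main() {
	c := &cluster{servers: []server{{"a", true}, {"b", true}, {"c", true}}}
	stop := make(chan struct{})
	var wg sync.WaitGroup
	wg.Add(2)

	// Health loop: placeholder check that marks server "c" as unhealthy.
	go runLoop(&wg, 10*time.Millisecond, stop, func() {
		c.mu.Lock()
		defer c.mu.Unlock()
		for i := range c.servers {
			if c.servers[i].ID == "c" {
				c.servers[i].Healthy = false
			}
		}
	})

	// Cleanup loop: removes at most one dead server per pass.
	go runLoop(&wg, 25*time.Millisecond, stop, func() {
		c.mu.Lock()
		defer c.mu.Unlock()
		var removed []server
		c.servers, removed = pruneDead(c.servers, 1)
		for _, s := range removed {
			fmt.Println("removed dead server", s.ID)
		}
	})

	time.Sleep(100 * time.Millisecond)
	close(stop)
	wg.Wait()
}
```

Keeping the pruning step as a pure function separate from the ticker loops makes the removal cap easy to test without timers; a real leader would also need to start and stop these loops as leadership changes, which this sketch does not attempt.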