77 Commits

Author SHA1 Message Date
James Phillips 6055a7c0bd
Switches to reliable Raft leader notifications.
This fixes #2896 by switching to the `notifyCh` instead of the `leaderCh`,
so we get all up/down events from Raft regarding leadership. We also wait
for the old leader loop to shut down before we ever consider starting a
new one, which keeps that single-threaded and fixes the panic in that issue.
2017-04-13 14:17:32 -07:00
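
The approach described in 6055a7c0bd (consume every leadership up/down event, and wait for the old leader loop to exit before starting a new one) can be sketched roughly as follows. This is an illustrative sketch only, not the code from the commit: `monitorLeadership`, `leaderLoop`, and `runDemo` are hypothetical names, and the `notifyCh` here is a plain `chan bool` standing in for the Raft notification channel.

```go
package main

import (
	"fmt"
	"sync"
)

// monitorLeadership (hypothetical) consumes leadership events from notifyCh
// and ensures at most one leaderLoop goroutine is alive at any time.
func monitorLeadership(notifyCh <-chan bool, leaderLoop func(stopCh <-chan struct{})) {
	var stopCh chan struct{}
	var wg sync.WaitGroup
	for isLeader := range notifyCh {
		switch {
		case isLeader && stopCh == nil:
			// Gained leadership: start exactly one leader loop.
			stopCh = make(chan struct{})
			wg.Add(1)
			go func(ch <-chan struct{}) {
				defer wg.Done()
				leaderLoop(ch)
			}(stopCh)
		case !isLeader && stopCh != nil:
			// Lost leadership: signal the loop and wait for it to exit
			// before any new loop can be started.
			close(stopCh)
			wg.Wait()
			stopCh = nil
		}
		// Duplicate up or down events fall through and are ignored.
	}
	if stopCh != nil {
		// Event channel closed while leading: shut the loop down.
		close(stopCh)
		wg.Wait()
	}
}

// runDemo feeds a fixed event sequence through monitorLeadership and reports
// how many loops were started and the peak number running concurrently.
func runDemo(events []bool) (starts, maxActive int) {
	var mu sync.Mutex
	active := 0
	loop := func(stopCh <-chan struct{}) {
		mu.Lock()
		starts++
		active++
		if active > maxActive {
			maxActive = active
		}
		mu.Unlock()
		<-stopCh // run until told to stop
		mu.Lock()
		active--
		mu.Unlock()
	}
	notifyCh := make(chan bool, len(events))
	for _, e := range events {
		notifyCh <- e
	}
	close(notifyCh)
	monitorLeadership(notifyCh, loop)
	return starts, maxActive
}

func main() {
	starts, maxActive := runDemo([]bool{true, true, false, true, false, false})
	fmt.Printf("starts=%d maxActive=%d\n", starts, maxActive) // prints: starts=2 maxActive=1
}
```

Blocking on `wg.Wait()` before clearing `stopCh` is what keeps the leader loop effectively single-threaded in this sketch.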
Kyle Havlovitz 4584d4f4e7
Add nil check to operator autopilot endpoint 2017-04-13 10:43:07 -07:00
Kyle Havlovitz 31a249a42e
Add formatting to autopilot init messages 2017-04-12 18:39:03 -07:00
Kyle Havlovitz 0924d5a2de
Reorganize version check logic for autopilot 2017-04-12 17:09:57 -07:00
Kyle Havlovitz 22620d5e73
Wait to initialize autopilot until all servers are >= 0.8.0 2017-04-12 16:05:13 -07:00
James Phillips 582e6c5938
Fixes an issue where servers would delete catalog information set by
the node when they were trying to reconcile a member.
2017-03-23 15:01:46 -07:00
James Phillips 16a9f6d671
Modifies server reconcile path to not use the server's token for internal operations. 2017-03-23 13:34:30 -07:00
Kyle Havlovitz 9c7bba7903
Tweak last_contact health logic for leader 2017-03-15 19:57:54 -07:00
Kyle Havlovitz bb98e39dd4
Add tests for servers changing address/ID 2017-03-15 16:50:42 -07:00
Kyle Havlovitz c40279e012
Fix an issue with changing server IDs and add a few UX enhancements around autopilot features 2017-03-15 16:09:55 -07:00
Kyle Havlovitz 4807215fa1 Merge pull request #2788 from hashicorp/f-autopilot-2
Autopilot server health monitoring
2017-03-10 12:29:45 -08:00
James Phillips 1e313e339c
Adds token to deregister request when reconciling.
Fixes #2792.
2017-03-09 09:25:42 -08:00
Kyle Havlovitz a7de1e2a3b
Move RaftStats to Status endpoint 2017-03-07 13:58:06 -08:00
Kyle Havlovitz 8bcab6c6d7
Add autopilot server health tracking
This adds two goroutines to perform autopilot tasks on the leader - one
to monitor the health of servers and another to periodically clean up
dead servers with a limit on removal count. Also adds a new http endpoint,
`/v1/operator/autopilot/health`, for querying this information through an
operator RPC endpoint.
2017-03-06 16:00:10 -08:00
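
As a rough illustration of the "limit on removal count" idea mentioned in 8bcab6c6d7, a single cleanup pass might look like the sketch below. The `server` type, the `pruneDead` function, and the specific limit semantics (a fixed cap per pass) are assumptions made for illustration; the commit message does not say how the limit is computed.

```go
package main

import "fmt"

// server is a hypothetical, minimal view of a cluster member.
type server struct {
	ID      string
	Healthy bool
}

// pruneDead (hypothetical) removes at most limit unhealthy servers in one
// pass. Unhealthy servers beyond the limit are kept for a later pass.
func pruneDead(servers []server, limit int) (kept []server, removed []string) {
	for _, s := range servers {
		if !s.Healthy && len(removed) < limit {
			removed = append(removed, s.ID)
			continue
		}
		kept = append(kept, s)
	}
	return kept, removed
}

func main() {
	servers := []server{
		{ID: "a", Healthy: true},
		{ID: "b", Healthy: false},
		{ID: "c", Healthy: false},
		{ID: "d", Healthy: false},
	}
	kept, removed := pruneDead(servers, 2)
	fmt.Println(len(kept), removed)
}
```

Capping removals per pass means a burst of apparent failures cannot shrink the server set all at once in this sketch.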
Kyle Havlovitz 23c492a74e
Rename DeadServerCleanup and make wording adjustments 2017-02-28 14:45:21 -08:00
Kyle Havlovitz c9ddee1a79
Add CAS capability to autopilot config endpoint 2017-02-24 13:08:49 -08:00
Kyle Havlovitz 56e22a719f
Add state store table and endpoints for autopilot 2017-02-23 20:32:13 -08:00
Kyle Havlovitz ae9fce0ae0
Move raft_protocol out of autopilot config 2017-02-23 13:08:40 -08:00
Kyle Havlovitz f9588b8d7f
Add raft version 2/3 compatibility 2017-02-22 12:53:32 -08:00
Kyle Havlovitz 2c9001a389
Add configurable cleanup of dead servers when a new server joins 2017-02-17 10:49:16 -08:00
James Phillips 2cbf45301e
Adds fine-grained watch support to ACL endpoints. 2017-01-25 09:58:22 -08:00
James Phillips 635afd1924
Adds fine-grained watches to health endpoints. 2017-01-25 09:58:22 -08:00
James Phillips 0891b6c74d
Adds fine-grained watches to catalog endpoints. 2017-01-25 09:58:22 -08:00
James Phillips 55cd1d975c
Adds catalog support for node IDs. 2017-01-18 14:26:42 -08:00
James Phillips 67f1fe8d0b
Adds support for a new "acl_agent_token" which is used for internal
catalog operations.
2016-12-12 14:52:27 -08:00
James Phillips fcd8bb157a
Clarifies a comment about no-op peer operations. 2016-08-08 19:19:18 -07:00
James Phillips 6c8e8271e2
Integrates Consul with new version of Raft library. 2016-08-08 19:19:17 -07:00
Sean Chittenden 7482a9207d
Chase casting types.CheckID to a string into the state_store.
It turns out the indexer can only use strings as arguments when
creating a query.  Cast `types.CheckID` to a `string` before calling
into `memdb`.

Ideally the indexer would be smart enough to do this at compile-time,
but I need to look into how to do this without reflection and the
runtime package.  For the time being statically cast `types.CheckID`
to a `string` at the call sites.
2016-06-07 16:59:02 -04:00
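
The problem described in 7482a9207d can be reproduced in isolation: a type assertion to `string` fails for a named string type, so the value has to be converted at the call site. In the sketch below, `CheckID` mirrors `types.CheckID` and `indexKey` is a hypothetical stand-in for an indexer that only accepts `string` arguments; it is not the `memdb` API.

```go
package main

import (
	"errors"
	"fmt"
)

// CheckID mirrors a named string type such as types.CheckID.
type CheckID string

// indexKey (hypothetical) builds a lookup key and, like the indexer described
// in the commit message, only accepts a plain string argument.
func indexKey(args ...interface{}) ([]byte, error) {
	if len(args) != 1 {
		return nil, errors.New("must provide exactly one argument")
	}
	s, ok := args[0].(string) // fails for CheckID: the dynamic type differs
	if !ok {
		return nil, fmt.Errorf("argument must be a string: %#v", args[0])
	}
	return []byte(s + "\x00"), nil
}

func main() {
	id := CheckID("web")

	// Passing the named type directly is rejected at run time.
	if _, err := indexKey(id); err != nil {
		fmt.Println("error:", err)
	}

	// Converting at the call site satisfies the indexer.
	key, err := indexKey(string(id))
	fmt.Printf("%q %v\n", key, err)
}
```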
Sean Chittenden 5ae7835988 Rename server_details package to agent 2016-03-29 17:39:19 -07:00
Sean Chittenden f3a69c939d Refactor consul.serverParts into server_details.ServerDetails
This may be short-lived, but it also seems like this is going to lead us down a path where ServerDetails is going to evolve into a more powerful package that will encapsulate more behavior behind a coherent API.
2016-03-23 16:15:47 -07:00
James Phillips 54535d45d7 Gets rid of error prefixing in leader.go. 2015-10-20 13:37:11 -07:00
James Phillips 25d7746f38 Fixes remaining non-KV index calculations and adds a general getWatchTables thing. 2015-10-15 14:59:09 -07:00
James Phillips 3fedffbffb Gets rid of the transitional "New" suffix on state store and RPC. 2015-10-15 14:59:09 -07:00
James Phillips 1463e6100a Integrates new state store into leader and catalog/health endpoints. 2015-10-15 14:59:09 -07:00
James Phillips edae626f36 Integrates new state store for ACLs. 2015-10-15 14:59:09 -07:00
Dale Wijnand c5168e1263 Fix a bunch of typos. 2015-09-15 13:22:08 +01:00
Anthony Scalisi 8d733b7fca remove various typos 2015-09-11 12:29:54 -07:00
Armon Dadgar 2ca41b15df consul: updating for Raft API changes 2015-05-08 11:35:52 -07:00
Armon Dadgar cbc29d1cde consul: fixing overly redundant logging 2015-05-05 13:00:47 -07:00
Armon Dadgar ea7a0134c8 consul: Do not remove ourself as raft peer in Serf event handler 2015-01-20 16:13:54 -08:00
Armon Dadgar 200b348f69 consul: Disable tombstones as follower 2015-01-05 14:58:59 -08:00
Armon Dadgar a350ec9379 consul: Measure time for reapTombstones 2015-01-05 14:43:56 -08:00
Armon Dadgar 9152fae109 consul: First pass at tombstone reaping 2015-01-05 14:43:55 -08:00
Armon Dadgar 8681d913ba consul: Generate a raft operation to reap tombstones 2015-01-05 14:43:55 -08:00
Armon Dadgar 9f30ffbf9a consul: Leader should reset the tombstone GC clock 2015-01-05 14:43:55 -08:00
Armon Dadgar 5b6ce2ca4a consul: Setup ACLs and timers after initial barrier 2014-12-12 21:42:24 -08:00
Armon Dadgar 990ad02f83 consul: Minor cleanups 2014-12-12 15:43:34 -08:00
Atin Malaviya b623af776b Consul Session TTLs
The design of the session TTLs is based on the Google Chubby approach
(http://research.google.com/archive/chubby-osdi06.pdf). The Session
struct has an additional TTL field now. This attaches an implicit
heartbeat based failure detector. Tracking of heartbeats is done by
the current leader and not persisted via the Raft log. The implication
of this is during a leader failover, we do not retain the last
heartbeat times.

Similar to Chubby, the TTL represents a lower-bound. Consul promises
not to terminate a session before the TTL has expired, but is allowed
to extend the expiration past it. This enables us to reset the TTL on
a leader failover. The TTL is also extended when the client does a
heartbeat. Like Chubby, this means a TTL is extended on creation,
heartbeat or failover.

Additionally, because we must account for time requests are in transit
and the relative rates of clocks on the clients and servers, Consul
will take the conservative approach of internally multiplying the TTL
by 2x. This helps to compensate for network latency and clock skew
without violating the contract.

Reference: https://docs.google.com/document/d/1Y5-pahLkUaA7Kz4SBU_mehKiyt9yaaUGcBTMZR7lToY/edit?usp=sharing
2014-12-07 12:38:22 -05:00
Armon Dadgar d74f79b3fa consul: Enforce service registration ACLs 2014-11-30 21:05:15 -07:00
Armon Dadgar 5c46544e7e consul: Improve variable name 2014-10-14 11:04:43 -07:00