The design of the session TTLs is based on the Google Chubby approach
(http://research.google.com/archive/chubby-osdi06.pdf). The Session
struct has an additional TTL field now. This attaches an implicit
heartbeat based failure detector. Tracking of heartbeats is done by
the current leader and not persisted via the Raft log. The implication
of this is during a leader failover, we do not retain the last
heartbeat times.
Similar to Chubby, the TTL represents a lower-bound. Consul promises
not to terminate a session before the TTL has expired, but is allowed
to extend the expiration past it. This enables us to reset the TTL on
a leader failover. The TTL is also extended when the client does a
heartbeat. Like Chubby, this means a TTL is extended on creation,
heartbeat or failover.
Additionally, because we must account for time requests are in transit
and the relative rates of clocks on the clients and servers, Consul
will take the conservative approach of internally multiplying the TTL
by 2x. This helps to compensate for network latency and clock skew
without violating the contract.
Reference: https://docs.google.com/document/d/1Y5-pahLkUaA7Kz4SBU_mehKiyt9yaaUGcBTMZR7lToY/edit?usp=sharing
This change consolidates loading services and checks from both config
and persisted state into methods on the agent. As part of this, we
introduce optional persistence when calling RemoveCheck/RemoveService.
Fixes a bug where config reloads would kill persisted services/checks.
Also fixes an edge case:
1. A service or check is registered via the HTTP API
2. A new service or check definition with the same ID is added to config
3. Config is reloaded
The desired behavior (which this implements) is:
1. All services and checks deregistered in memory
2. All services and checks in config are registered first
3. All persisted checks are restored using the same logic as the agent
start sequence, which prioritizes config over persisted, and removes
any persistence files if new config counterparts are present.
* In connection pool, there is no guarantee that .reap() cannot execute the same time as .Shutdown() is called. It also did not benefit to eval shutdown when a select is run on the shutdown channel.
* In server, same principle applies to handleConsulConn. Since we also have a shutdown channel, it makes more to use this than to loop on a bool variable.