This adds two goroutines to perform autopilot tasks on the leader - one
to monitor the health of servers and another to periodically clean up
dead servers with a limit on removal count. Also adds a new http endpoint,
`/v1/operator/autopilot/health`, for querying this information through an
operator RPC endpoint.
I'm torn on this. It's useful from a UX perspective for an operator to
be able to type in something that's short. At the same time, by
enforcing an `8` character length, we reduced the probability of a user
depending on the behavior and having it suddenly stop working in the
future when a duplicate prefix is injected into the environment.
lookup returned nil.
Add a TODO to note where a future point of logging should occur once a
logger is present and a few additional comments to explain the program
flow.
Assuming the following output from a consul agent:
```
==> Consul agent running!
Version: 'v0.7.3-43-gc5e140c-dev (c5e140c+CHANGES)'
Node ID: '40e4a748-2192-161a-0510-9bf59fe950b5'
Node name: 'myhost'
```
it is now possible to lookup nodes by their Node Name or Node ID, or a
prefix match of the Node ID, with the following caveats re: the prefix
match:
1) first eight digits of the Node ID are a required minimum (eight was
chosen as an arbitrary number)
2) the length of the Node ID must be an even number or no result will be
returned.
```
% dig @127.0.0.1 -p 8600 myhost.node.dc1.consul.
myhost.node.dc1.consul. 0 IN A 127.0.0.1
% dig @127.0.0.1 -p 8600 40e4a748-2192-161a-0510-9bf59fe950b5.node.dc1.consul.
40e4a748-2192-161a-0510-9bf59fe950b5.node.dc1.consul. 0 IN A 127.0.0.1
% dig @127.0.0.1 -p 8600 40e4a748.node.dc1.consul.
40e4a748.node.dc1.consul. 0 IN A 127.0.0.1
% dig @127.0.0.1 -p 8600 40e4a74821.node.dc1.consul.
40e4a74821.node.dc1.consul. 0 IN A 127.0.0.1
% dig @127.0.0.1 -p 8600 40e4a748-21.node.dc1.consul.
40e4a748-21.node.dc1.consul. 0 IN A 127.0.0.1
```
Previously the blocking functions all closed over the state store from
their first query, with would not have worked properly when a restore
occurred. This makes sure they get a frest state store pointer each time,
and that pointer is synchronized with the abandon watch.