Now that serf node join events are decoupled from rebalancing activities completely, remove the complixity of draining the channel and ensuring only one go routine was rebalancing the server list.
Now that we're no longer initializing a notification channel, we can remove the config load/save from `Start()`
In FindServer this is a useful warning hinting why its call failed. RPC returns error and leaves it to the higher level caller to do whatever it wants. As an operator, I'd have the detail necessary to know why the RPC call(s) failed.
In an earlier version there was a channel to notify when a new server was added, however this has long since been removed. Just default to the sane value of 2min before the first rebalance calc takes place.
Pointed out by: slackpad
Fantastic observation by slackpad. This was left over from when there was a boolean for health in the server struct (vs current strategy where we use server position in the list and rely on serf to cleanup the stale members).
Pointed out by: slackpad
Change the signature so it returns a value so that this can be tested externally with mock data. See the sample table in TestServerManagerInternal_refreshServerRebalanceTimer() for the rate at which it will back off. This function is mostly used to not cripple large clusters in the event of a partition.
Rely on Serf for liveliness. In the event of a failure, simply cycle the server to the end of the list. If the server is unhealthy, Serf will reap the dead server.
Additional simplifications:
*) Only rebalance servers based on timers, not when a new server is readded to the cluster.
*) Back out the failure count in server_details.ServerDetails
Instead of blocking the RPC call path and performing a potentially expensive calculation (including a call to `c.LANMembers()`), introduce a channel to request a rebalance. Some events don't force a reshuffle, instead the extend the duration of the current rebalance window because the environment thrashed enough to redistribute a client's load.
Relocated to its own package, server_manager. This now greatly simplifies the RPC() call path and appropriately hides the locking behind the package boundary. More work is needed to be done here
Move the management of c.consulServers (fka c.consuls) into consul/server_manager.go.
This commit brings in a background task that proactively manages the server list and:
*) reshuffles the list
*) manages the timer out of the RPC() path
*) uses atomics to detect a server has failed
This is a WIP, more work in testing needs to be completed.
Relocated to its own package, server_manager. This now greatly simplifies the RPC() call path and appropriately hides the locking behind the package boundary. More work is needed to be done here
Move the management of c.consulServers (fka c.consuls) into consul/server_manager.go.
This commit brings in a background task that proactively manages the server list and:
*) reshuffles the list
*) manages the timer out of the RPC() path
*) uses atomics to detect a server has failed
This is a WIP, more work in testing needs to be completed.
Relocated to its own package, server_manager. This now greatly simplifies the RPC() call path and appropriately hides the locking behind the package boundary. More work is needed to be done here
Move the management of c.consulServers (fka c.consuls) into consul/server_manager.go.
This commit brings in a background task that proactively manages the server list and:
*) reshuffles the list
*) manages the timer out of the RPC() path
*) uses atomics to detect a server has failed
This is a WIP, more work in testing needs to be completed.
Relocated to its own package, server_manager. This now greatly simplifies the RPC() call path and appropriately hides the locking behind the package boundary. More work is needed to be done here
This may be short-lived, but it also seems like this is going to lead us down a path where ServerDetails is going to evolve into a more powerful package that will encapsulate more behavior behind a coherent API.
Move the management of c.consulServers (fka c.consuls) into consul/server_manager.go.
This commit brings in a background task that proactively manages the server list and:
*) reshuffles the list
*) manages the timer out of the RPC() path
*) uses atomics to detect a server has failed
This is a WIP, more work in testing needs to be completed.
A server is not normally disabled, but in the event of an RPC error, we want to mark a server as down to allow for fast failover to a different server. This value must be an int in order to support atomic operations.
Additionally, this is the preliminary work required to bring up a server in a disabled state. RPC health checks in the future could mark the server as alive, thereby creating an organic "slow start" feature for Consul.
Expanding the domain of lastServer beyond RPC() changes the meaning of this variable. Rename accordingly to match the intent coming in a subsequent commit: a background thread will be in charge of rotating preferredServer.
It is theoretically possible that the number of queued serf events can back up. If this happens, emit a warning message if there are more than 200 events in queue.
Most notably, this can happen if `c.consulServerLock` is held for an "extended period of time". The probability of anyone ever seeing this log message is hopefully low to nonexistent, but if it happens, the warning message indicating a large number of serf events fired while a lock was held is likely to be helpful (vs serf mysteriously blocking when attempting to add an event to a channel).
Prior to this change, prepared queries had the following behavior for
ACLs, which will need to change to support templates:
1. A management token, or a token with read access to the service being
queried needed to be provided in order to create a prepared query.
2. The token used to create the prepared query was stored with the query
in the state store and used to execute the query.
3. A management token, or the token used to create the query needed to be
supplied to perform and CRUD operations on an existing prepared query.
This was pretty subtle and complicated behavior, and won't work for
templates since the service name is computed at execution time. To solve
this, we introduce a new "prepared-query" ACL type, where the prefix
applies to the query name for static prepared query types and to the
prefix for template prepared query types.
With this change, the new behavior is:
1. A management token, or a token with "prepared-query" write access to
the query name or (soon) the given template prefix is required to do
any CRUD operations on a prepared query, or to list prepared queries
(the list is filtered by this ACL).
2. You will no longer need a management token to list prepared queries,
but you will only be able to see prepared queries that you have access
to (you get an empty list instead of permission denied).
3. When listing or getting a query, because it was easy to capture
management tokens given the past behavior, this will always blank out
the "Token" field (replacing the contents as <hidden>) for all tokens
unless a management token is supplied. Going forward, we should
discourage people from binding tokens for execution unless strictly
necessary.
4. No token will be captured by default when a prepared query is created.
If the user wishes to supply an execution token then can pass it in via
the "Token" field in the prepared query definition. Otherwise, this
field will default to empty.
5. At execution time, we will use the captured token if it exists with the
prepared query definition, otherwise we will use the token that's passed
in with the request, just like we do for other RPCs (or you can use the
agent's configured token for DNS).
6. Prepared queries with no name (accessible only by ID) will not require
ACLs to create or modify (execution time will depend on the service ACL
configuration). Our argument here is that these are designed to be
ephemeral and the IDs are as good as an ACL. Management tokens will be
able to list all of these.
These changes enable templates, but also enable delegation of authority to
manage the prepared query namespace.
Introduce a low-level background connection expiration mechanism wherein connections will be recycled periodically based on the size and health of the cluster.
For the vast majority of consul users, this will mean an average connection age of 150s. For 10K node clusters it will take ~3min for clusters to rebalance their connections. In the pathological case for a 100K cluster where 99K clients are in the minority talking to 1x server it will take ~26min to rebalance all connections.
It's possibe for clients recovering from a parititon to become fixated on a single server until the server or agent is restarted. This is of particular interest to long-running environments with stable agents, where `allow_stale` is true, and partitions occur periodically.
Increase the max idle time for agents talking to servers from 30s to 127s in order to allow for the reuse of connections that are being initiated by cron.
127s was chosen as the first prime above 120s (arbitrarily chose to use a prime) with the intent of reusing connections who are used by once-a-minute cron(8) jobs *and* who use a 60s jitter window (e.g. in vixie cron job execution can drift by up to 59s per job, or 119s for a once-a-minute cron job).
Consolidate code duplication and tests into a single lib package. Most of these functions were from various **/util.go functions that couldn't be imported due to cyclic imports. The consul/lib package is intended to be a terminal node in an import DAG and a place to stash various consul-only helper functions. Pulled in hashicorp/go-uuid instead of consolidating UUID access.
* A batch of updates is done all in a single transaction.
* We no longer need to get an update to kick things, there's a periodic flush.
* If incoming updates overwhelm the configured flush rate they will be dumped with an error.