Sean Chittenden
b3a8e2f115
Factor out the shuffle server
2016-03-26 19:19:04 -07:00
Sean Chittenden
766ddae165
Revise comments re: cycleServer
...
Improve the comments to discuss what happens presently. Add a note to consider possibly calling to TestConsulServer proactively.
2016-03-26 18:53:13 -07:00
Sean Chittenden
ac1d42e9d8
Comment why the interface is needed: cyclic import
2016-03-26 18:38:35 -07:00
Sean Chittenden
a9b3dba05f
Add a struct key type for server_details
2016-03-26 17:58:12 -07:00
Sean Chittenden
22aa08d407
Merge pull request #1873 from hashicorp/f-rebalance-worker-0.7
...
Periodically rebalance the servers that agents talk to
2016-03-25 15:03:18 -07:00
Sean Chittenden
496f05b561
Add additional checks
2016-03-25 14:40:46 -07:00
Sean Chittenden
c18158aac3
Delete the right tag
...
"role" != "consul"
2016-03-25 14:31:48 -07:00
Sean Chittenden
b44554f882
Don't pass in sm, server manager is already in scope
...
Go closures are implicitly capturing lambdas.
2016-03-25 14:10:09 -07:00
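The point about closures can be illustrated with a minimal sketch. The `serverManager` type and the `describeLater` helper below are hypothetical stand-ins, not code from this repository; they only show that a Go function literal reads `sm` from the enclosing scope without it being passed as an argument.

```go
package main

import "fmt"

// serverManager is a hypothetical stand-in type used only for illustration.
type serverManager struct {
	numServers int
}

// describeLater builds a closure that reads sm from the enclosing scope
// rather than receiving it as a parameter, then calls it after sm changes.
func describeLater() string {
	sm := &serverManager{numServers: 1}

	// The function literal captures the variable sm itself, so no
	// argument is needed and later updates are visible inside it.
	describe := func() string {
		return fmt.Sprintf("servers=%d", sm.numServers)
	}

	sm.numServers = 3
	return describe()
}

func main() {
	fmt.Println(describeLater()) // prints "servers=3"
}
```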
Sean Chittenden
2713899a5b
Trim residual complexity from server join notifications
...
Now that serf node join events are completely decoupled from rebalancing activities, remove the complexity of draining the channel and ensuring only one goroutine was rebalancing the server list.
Now that we're no longer initializing a notification channel, we can remove the config load/save from `Start()`.
2016-03-25 14:06:35 -07:00
Sean Chittenden
b3298ce4c3
Only log in FindServer
...
In FindServer this is a useful warning hinting why its call failed. RPC returns error and leaves it to the higher level caller to do whatever it wants. As an operator, I'd have the detail necessary to know why the RPC call(s) failed.
2016-03-25 13:58:50 -07:00
Sean Chittenden
f024272ab2
Initialize the rebalance to clientRPCMinReuseDuration
...
In an earlier version there was a channel to notify when a new server was added; however, it has long since been removed. Just default to the sane value of 2min before the first rebalance calculation takes place.
Pointed out by: slackpad
2016-03-25 13:46:18 -07:00
Sean Chittenden
485b153db2
Guard against very small or negative rates
...
Pointed out by: slackpad
2016-03-25 13:31:55 -07:00
Sean Chittenden
89311a5859
Use range vs for
...
Returning a new array vs mutating an array in place so we can use range now.
2016-03-25 13:08:08 -07:00
Sean Chittenden
643997623e
Comment updates
2016-03-25 13:06:59 -07:00
Sean Chittenden
072f34cf02
Only rotate server list with more than one server
...
Fantastic observation by slackpad. This was left over from when there was a boolean for health in the server struct (vs. the current strategy, where we use the server's position in the list and rely on serf to clean up the stale members).
Pointed out by: slackpad
2016-03-25 12:54:36 -07:00
Sean Chittenden
aadd274a13
Relocate saveServerConfig next to getServerConfig
...
Requested by: slackpad
2016-03-25 12:41:22 -07:00
Sean Chittenden
cf271e7f65
Clarify that ConsulClusterInfo is an interface over serf
...
An interface was used to break a cyclic import dependency.
2016-03-25 12:38:40 -07:00
Sean Chittenden
973d924ab4
Reword comment after moving code into new packages
2016-03-25 12:34:46 -07:00
Sean Chittenden
78ec9f241d
Change initialRebalanceTimeout to a time.Duration
...
Pointed out by: @slackpad
2016-03-25 12:34:12 -07:00
Sean Chittenden
328728c88a
Negative check: test an invalid condition
2016-03-25 12:22:33 -07:00
Sean Chittenden
22e546ff32
Test to make sure bootstrap is missing
2016-03-25 12:20:12 -07:00
Sean Chittenden
5f035da4f1
Be more Go idiomatic w/ variable names: s/valid/ok/g
...
Cargo culting is bad, m'kay?
Pointy Hat: sean-
2016-03-25 12:14:24 -07:00
Sean Chittenden
e041c3905d
Fix stale comment
...
Pointed out by: @slackpad
2016-03-25 12:00:40 -07:00
Sean Chittenden
45fc7c362e
Add a comment for Client serverMgr
2016-03-25 11:59:27 -07:00
James Phillips
98140b4cf0
Merge pull request #1876 from hashicorp/f-tls-helper
...
Adds TLS config helper to API client.
2016-03-24 11:34:24 -07:00
James Phillips
c64f9c13d9
Improves the comment for the Address field.
2016-03-24 11:33:44 -07:00
James Phillips
a7a7bee668
Merge pull request #1877 from hashicorp/api-constants
...
Added some constants in the api for check health statuses
2016-03-24 11:29:11 -07:00
Diptanu Choudhury
ad085de7b7
Added some constants in the api for check health statuses
2016-03-24 11:26:07 -07:00
James Phillips
512cb6ebf7
Adds TLS config helper to API client.
2016-03-24 11:24:18 -07:00
Sean Chittenden
cdffb38035
Add CHANGELOG entry re: agent rebalancing
2016-03-23 22:36:12 -07:00
Sean Chittenden
5873b7e28e
Correct a bogus goimport rewrite for tests
2016-03-23 22:35:49 -07:00
Sean Chittenden
dcc64d91c6
Test ServerManager.refreshServerRebalanceTimer
...
Change the signature so it returns a value so that this can be tested externally with mock data. See the sample table in TestServerManagerInternal_refreshServerRebalanceTimer() for the rate at which it will back off. This function is mostly used to not cripple large clusters in the event of a partition.
2016-03-23 22:10:50 -07:00
Sean Chittenden
8e3b3d766d
Add a handful more unit tests to the public interface
2016-03-23 22:10:50 -07:00
Sean Chittenden
d5f72e8c07
Rename GetNumServers to NumServers()
...
Matches the style of the rest of the repo
2016-03-23 22:10:50 -07:00
Sean Chittenden
9de9cf90f1
Rename NewServerManager to just New
...
Follow go style recommendations now that this has been refactored out of the consul package and doesn't need the qualifier in the name.
2016-03-23 22:10:50 -07:00
Sean Chittenden
7faea986a0
Rename FindHealthyServer() to FindServer()
...
There is no guarantee the server coming back is healthy. It's apt to be healthy by virtue of its place in the server list, but it's not guaranteed.
2016-03-23 22:10:50 -07:00
Sean Chittenden
18885e3214
cycleServer is a pure function, save the result
2016-03-23 22:10:50 -07:00
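A minimal sketch of what a pure `cycleServer` could look like, assuming the server list is a plain slice of names. The signature and the `[]string` element type are assumptions for illustration; the real function operates on the repository's own server type. The input slice is left untouched, so the caller has to keep the returned value.

```go
package main

import "fmt"

// cycleServer returns a new slice with the first server moved to the end.
// It never modifies its argument. The []string element type is an
// assumption made for this sketch.
func cycleServer(servers []string) []string {
	n := len(servers)
	if n < 2 {
		// Nothing to rotate; still hand back a copy so ownership is clear.
		return append([]string(nil), servers...)
	}

	rotated := make([]string, 0, n)
	rotated = append(rotated, servers[1:]...)
	rotated = append(rotated, servers[0])
	return rotated
}

func main() {
	servers := []string{"s1", "s2", "s3"}

	// Because the function is pure, the result must be saved by the caller.
	servers = cycleServer(servers)
	fmt.Println(servers) // prints "[s2 s3 s1]"
}
```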
Sean Chittenden
4ec9ed4de2
Missed unit test cruft
2016-03-23 22:10:50 -07:00
Sean Chittenden
b906e40811
Update comments to reflect reality
2016-03-23 22:10:50 -07:00
Sean Chittenden
1a09a5b2cf
Remove additional cruft from ServerManager's channels
...
No longer needed code.
2016-03-23 22:10:50 -07:00
Sean Chittenden
c980d492c6
Emulate a TryLock using atomic.CompareAndSwap
...
Prevent possible queueing behind serverConfigLock in the event that a server fails on a busy host.
2016-03-23 22:10:50 -07:00
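One way to emulate a TryLock with a compare-and-swap is sketched below. The `tryLock` type and its method names are hypothetical and not taken from the commit; the idea is only that a caller which loses the swap returns immediately instead of queueing behind a mutex.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// tryLock is a hypothetical helper: 0 means free, 1 means held.
type tryLock struct {
	held int32
}

// TryLock attempts to flip the flag from 0 to 1. It reports whether this
// caller won; a losing caller returns false right away and never blocks.
func (l *tryLock) TryLock() bool {
	return atomic.CompareAndSwapInt32(&l.held, 0, 1)
}

// Unlock releases the flag so a later TryLock can succeed.
func (l *tryLock) Unlock() {
	atomic.StoreInt32(&l.held, 0)
}

func main() {
	var l tryLock
	fmt.Println(l.TryLock()) // true: first caller acquires the flag
	fmt.Println(l.TryLock()) // false: second caller is turned away
	l.Unlock()
	fmt.Println(l.TryLock()) // true: the flag is free again
}
```

A caller that gets `false` can simply skip the work, on the assumption that whoever holds the flag is already doing it.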
Sean Chittenden
102dcafe76
Make use of interfaces
...
Use an interface instead of serf.Serf as arg to NewServerManager. Bonus points for improved testability.
Pointed out by: @slackpad
2016-03-23 22:10:50 -07:00
Sean Chittenden
231768faea
Simplify error handling
...
Rely on Serf for liveness. In the event of a failure, simply cycle the server to the end of the list. If the server is unhealthy, Serf will reap the dead server.
Additional simplifications:
*) Only rebalance servers based on timers, not when a new server is re-added to the cluster.
*) Back out the failure count in server_details.ServerDetails
2016-03-23 22:10:50 -07:00
Sean Chittenden
0c519aa90d
Unbreak client tests by reverting to original test
...
Debugging code crept into the actual test and hung out for much longer than it should have.
2016-03-23 22:10:50 -07:00
Sean Chittenden
26e51376d9
Introduce asynchronous management of consul server lists
...
Instead of blocking the RPC call path and performing a potentially expensive calculation (including a call to `c.LANMembers()`), introduce a channel to request a rebalance. Some events don't force a reshuffle; instead they extend the duration of the current rebalance window because the environment thrashed enough to redistribute a client's load.
2016-03-23 22:10:50 -07:00
Sean Chittenden
6ed37d1d8d
Comment nits
2016-03-23 22:10:50 -07:00
Sean Chittenden
32c24b5447
Update Serf to include `serf.NumNodes()`
2016-03-23 22:10:50 -07:00
Sean Chittenden
c8ab3ae4cb
Use saveServerConfig vs atomic.Value.Store(config)
2016-03-23 22:10:50 -07:00
Sean Chittenden
12377e80e6
Commit a handful of refactoring && copy/paste-o fixes
2016-03-23 22:10:50 -07:00
Sean Chittenden
c1c17f158b
Mutate copies of serverCfg.servers, not original
...
Removing any ambiguity re: ownership of the mutated server lists is a win for maintenance and debugging.
2016-03-23 22:10:50 -07:00
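The copy-then-store pattern described above can be sketched as follows. The names `getServerConfig` and `saveServerConfig` appear in this log, but their bodies, the `manager` type, the `serverConfig` layout, and the `addServer` helper are assumptions made for illustration only.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// serverConfig is an assumed shape: an immutable snapshot of the server list.
type serverConfig struct {
	servers []string
}

// manager is a hypothetical holder for the current snapshot.
type manager struct {
	cfg atomic.Value // always holds a serverConfig
}

func newManager() *manager {
	m := &manager{}
	m.cfg.Store(serverConfig{})
	return m
}

func (m *manager) getServerConfig() serverConfig {
	return m.cfg.Load().(serverConfig)
}

func (m *manager) saveServerConfig(c serverConfig) {
	m.cfg.Store(c)
}

// addServer copies the current list, changes only the copy, and publishes
// it. Readers holding the old snapshot never see it change. Concurrent
// writers would still need their own serialization, which is omitted here.
func (m *manager) addServer(name string) {
	old := m.getServerConfig()
	updated := make([]string, len(old.servers), len(old.servers)+1)
	copy(updated, old.servers)
	updated = append(updated, name)
	m.saveServerConfig(serverConfig{servers: updated})
}

func main() {
	m := newManager()
	m.addServer("s1")
	before := m.getServerConfig()
	m.addServer("s2")
	fmt.Println(before.servers, m.getServerConfig().servers) // prints "[s1] [s1 s2]"
}
```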