Hrishikesh Barua
b7d1582bd9
Get latest updates
2016-03-29 17:21:32 +05:30
Sean Chittenden
177f64134e
Refactor out reconcileServerList anon function
...
Add testing to reconcileServerList and test various server sizes.
Test that a percentage of nodes fail their Ping (50% in testing atm)
2016-03-29 02:45:38 -07:00
Sean Chittenden
6609ee5d51
Teach fauxConnPool to fail a pct of the time
...
50% failure rate seems legit as a starting point w/ 100 servers.
2016-03-28 14:53:29 -07:00
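
The idea of a connection pool test double that fails a percentage of pings can be sketched roughly as follows. This is a minimal sketch, not the fauxConnPool from the commit: the names fauxPinger, newFauxPinger, and countFailures are hypothetical, and a seeded random source is assumed so that test runs are repeatable.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
)

// errPingFailed is the sentinel error returned by a simulated failure.
var errPingFailed = errors.New("faux ping failed")

// fauxPinger is a hypothetical test double (an assumption for illustration,
// not the fauxConnPool type from the commit).
type fauxPinger struct {
	failPct float64    // probability in [0, 1] that a ping fails
	rng     *rand.Rand // seeded source so runs are repeatable
}

// newFauxPinger builds a pinger with the given failure probability and seed.
func newFauxPinger(failPct float64, seed int64) *fauxPinger {
	return &fauxPinger{failPct: failPct, rng: rand.New(rand.NewSource(seed))}
}

// Ping fails with probability failPct and reports which server "failed".
func (p *fauxPinger) Ping(server string) error {
	if p.rng.Float64() < p.failPct {
		return fmt.Errorf("%s: %w", server, errPingFailed)
	}
	return nil
}

// countFailures pings n hypothetical servers and returns how many failed.
func countFailures(p *fauxPinger, n int) int {
	failed := 0
	for i := 0; i < n; i++ {
		if err := p.Ping(fmt.Sprintf("s%03d", i)); err != nil {
			failed++
		}
	}
	return failed
}

func main() {
	p := newFauxPinger(0.5, 1)
	fmt.Printf("%d of 100 pings failed\n", countFailures(p, 100))
}
```

With a 0.5 failure probability, roughly half of the 100 simulated servers fail their ping on any given run; the exact count depends on the seed.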
Sean Chittenden
7d26f7bfa7
Call NotifyFailedServers to rotate the server list
2016-03-28 14:12:41 -07:00
Sean Chittenden
6a987062b9
Add log line re: server manager backing off and sleeping
...
This is useful in situations where the RPC rotate duration is greater than 1µs. WTB exponential backoff of logging so we don't spam forever.
2016-03-28 14:04:04 -07:00
Sean Chittenden
689b79aef3
Remove old debugging lines of questionable future value
2016-03-28 14:02:53 -07:00
Sean Chittenden
0b0a07a280
Shuffle in place
...
Don't create a copy and save the copy, not necessary any more.
2016-03-28 14:02:27 -07:00
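
Shuffling in place, as opposed to shuffling a copy and saving the copy back, might look like the Fisher-Yates sketch below. The function names and the string slice standing in for the server list are assumptions for illustration; the commit does not show which shuffle algorithm it uses.

```go
package main

import (
	"fmt"
	"math/rand"
)

// seeded returns a deterministic random source (hypothetical helper).
func seeded(seed int64) *rand.Rand {
	return rand.New(rand.NewSource(seed))
}

// shuffleInPlace performs a Fisher-Yates shuffle directly on the slice it is
// given, so no copy of the server list is allocated or stored.
func shuffleInPlace(servers []string, rng *rand.Rand) {
	for i := len(servers) - 1; i > 0; i-- {
		j := rng.Intn(i + 1) // pick an index in [0, i]
		servers[i], servers[j] = servers[j], servers[i]
	}
}

func main() {
	servers := []string{"s1", "s2", "s3", "s4", "s5"}
	shuffleInPlace(servers, seeded(42))
	fmt.Println(servers)
}
```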
Sean Chittenden
e230b3a3b7
Nuke unnecessary comment
...
See above function comments for details
2016-03-28 13:57:36 -07:00
Sean Chittenden
34a29a2107
Move FIXME comment to the right call site
2016-03-28 13:49:55 -07:00
Sean Chittenden
b38d3d71c8
Rename the ConnPoolPinger interface to Pinger
2016-03-28 13:46:01 -07:00
Sean Chittenden
d6b4345375
Return error from PingConsulServer
...
To report why a Ping failed, change the signature of PingConsulServer to also return an error.
2016-03-28 13:38:58 -07:00
Sean Chittenden
6c9fb06511
Change the definition of the ServerDetails struct key
...
Use only the serf Name for now. Leaving the plumbing for now.
2016-03-28 12:53:19 -07:00
Sean Chittenden
2bcff6bac4
Correct the comment to match reality
2016-03-28 12:32:30 -07:00
Sean Chittenden
fc1edea1ef
Rename serverCfg to sc for consistency
2016-03-28 12:06:26 -07:00
Sean Chittenden
988b05700d
Add a quick length check
...
Verify that AddServer behaved as expected
2016-03-28 11:38:12 -07:00
Sean Chittenden
7181e42ba8
Switch the order of ServerDetails.String()
...
It's more natural to have the network first. I think I flipped the order accidentally.
2016-03-28 11:37:25 -07:00
Hrishikesh Barua
52359c2dc3
Added help text for -dev option #1804 for zsh completion.
2016-03-28 19:32:02 +05:30
Hrishikesh Barua
abd23a60d2
Added help text for -dev option #1804 for zsh completion.
2016-03-28 19:07:55 +05:30
Sean Chittenden
dca8fd2643
Move rebalance log statement from INFO to DEBUG
2016-03-27 01:32:04 -07:00
Sean Chittenden
180edd8e7b
Chase the API bump re: refreshServerRebalanceTimer
...
If it works in prod, why shouldn't it work in the tests?
2016-03-27 00:04:52 -07:00
Sean Chittenden
9b5dd7a785
Move initialization of the rebalanceTimer to New()
2016-03-27 00:03:48 -07:00
Sean Chittenden
86d1bad541
Add a test for ConnPool.PingConsulServer
...
Spin up 5x servers, join and ping each server
2016-03-26 23:52:06 -07:00
Sean Chittenden
f903005080
Expose ServerManager.ResetRebalanceTimer
...
Move the rebalance timer from ServerManager.Start's stack to struct ServerManager. This makes it possible to shuffle during tests without actually waiting >120s.
2016-03-26 23:41:01 -07:00
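
Why moving the timer onto the struct helps testing can be illustrated with a small sketch. The manager type and its methods below are hypothetical stand-ins, not the ServerManager API: once the timer is a field, a test can reset it to a few milliseconds instead of waiting out the full interval.

```go
package main

import (
	"fmt"
	"time"
)

// manager is a hypothetical stand-in that owns its rebalance timer as a
// field rather than as a local variable inside a long-running loop.
type manager struct {
	rebalanceTimer *time.Timer
}

// newManager creates the timer up front with the normal, long interval.
func newManager(initial time.Duration) *manager {
	return &manager{rebalanceTimer: time.NewTimer(initial)}
}

// resetRebalanceTimer stops the timer, drains a pending tick if one exists,
// and re-arms the timer with the new duration.
func (m *manager) resetRebalanceTimer(d time.Duration) {
	if !m.rebalanceTimer.Stop() {
		select {
		case <-m.rebalanceTimer.C:
		default:
		}
	}
	m.rebalanceTimer.Reset(d)
}

// firedWithin reports whether the timer fires before the timeout elapses.
func (m *manager) firedWithin(timeout time.Duration) bool {
	select {
	case <-m.rebalanceTimer.C:
		return true
	case <-time.After(timeout):
		return false
	}
}

// rebalancesEarly shows the test-side usage: shorten a 2-minute timer to
// 10ms and confirm it fires well within a second.
func rebalancesEarly() bool {
	m := newManager(2 * time.Minute)
	m.resetRebalanceTimer(10 * time.Millisecond)
	return m.firedWithin(time.Second)
}

func main() {
	fmt.Println("timer fired early:", rebalancesEarly())
}
```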
Sean Chittenden
2ba281bc5a
Logging improvements
...
Comment out noisy loggers for the time being.
Improve the final logging statement to be useful and hint what the next active server for the client is going to be.
2016-03-26 22:41:08 -07:00
Sean Chittenden
fab3981b1d
Standardize the log message based on the package
...
This log statement used to belong in the consul package but has since moved to the server manager package.
2016-03-26 22:29:00 -07:00
Sean Chittenden
c6d9c42d9f
Reduce the error level from Fatal when unit testing
2016-03-26 22:07:09 -07:00
Sean Chittenden
4747cf3cab
Start server rebalance task after init'ing Serf
...
Now that there is no longer an event loop driven directly by Serf, start the ServerManager task after Serf has been setup. When testing and adjusting timers and timeouts to unreasonably low values, it's possible to tickle a race condition where Serf's NumNodes() would fail because Serf had not been initialized.
2016-03-26 22:04:41 -07:00
Sean Chittenden
2ddf82d9d8
Catch up to a few renames
2016-03-26 19:32:11 -07:00
Sean Chittenden
640ced7c11
Use empty string for addr in ServerDetails.String()
2016-03-26 19:30:04 -07:00
Sean Chittenden
e0f29c17cd
Guard against a nil ServerDetails.Addr
...
It's not clear how or why this would ever be nil, but some of the unit tests produce a nil addr. Be defensive.
2016-03-26 19:29:31 -07:00
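
A defensive String() method of the kind described might look like the sketch below. The serverDetails struct, its fields, and the output format are assumptions made for illustration; only the nil check on Addr and the empty-string fallback come from the commit messages.

```go
package main

import (
	"fmt"
	"net"
)

// serverDetails is a hypothetical stand-in for the struct named in the log.
type serverDetails struct {
	Name string
	Addr net.Addr
}

// String guards against a nil Addr and falls back to empty strings for the
// network and address instead of dereferencing a nil interface.
func (s *serverDetails) String() string {
	var network, addr string
	if s.Addr != nil {
		network = s.Addr.Network()
		addr = s.Addr.String()
	}
	return fmt.Sprintf("%s (Addr: %s/%s)", s.Name, network, addr)
}

// tcpServer builds a serverDetails with a TCP address (hypothetical helper).
func tcpServer(name, ip string, port int) *serverDetails {
	return &serverDetails{Name: name, Addr: &net.TCPAddr{IP: net.ParseIP(ip), Port: port}}
}

func main() {
	fmt.Println(tcpServer("s1", "127.0.0.1", 8300))
	fmt.Println(&serverDetails{Name: "s2"}) // nil Addr, no panic
}
```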
Sean Chittenden
2d9982eb27
Proactively ping server before rotation
...
Before shuffling the server list, proactively ping the next server in the list to establish the connection and verify the remote endpoint is healthy.
2016-03-26 19:28:13 -07:00
Sean Chittenden
b3a8e2f115
Factor out the shuffle server
2016-03-26 19:19:04 -07:00
Sean Chittenden
766ddae165
Revise comments re: cycleServer
...
Improve the comments to discuss what happens presently. Add a note to consider possibly calling to TestConsulServer proactively.
2016-03-26 18:53:13 -07:00
Sean Chittenden
ac1d42e9d8
Comment why the interface is needed: cyclic import
2016-03-26 18:38:35 -07:00
Sean Chittenden
a9b3dba05f
Add a struct key type for server_details
2016-03-26 17:58:12 -07:00
Sean Chittenden
22aa08d407
Merge pull request #1873 from hashicorp/f-rebalance-worker-0.7
...
Periodically rebalance the servers that agents talk to
2016-03-25 15:03:18 -07:00
Sean Chittenden
496f05b561
Add additional checks
2016-03-25 14:40:46 -07:00
Sean Chittenden
c18158aac3
Delete the right tag
...
"role" != "consul"
2016-03-25 14:31:48 -07:00
Sean Chittenden
b44554f882
Don't pass in sm, server manager is already in scope
...
Go closures are implicitly capturing lambdas.
2016-03-25 14:10:09 -07:00
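
The point about closures can be shown with a short sketch: a function literal in Go captures variables from its enclosing scope, so a value such as sm does not need to be passed to it as an argument. The serverManager type and startWorker function here are hypothetical.

```go
package main

import "fmt"

// serverManager is a hypothetical stand-in with a single counter.
type serverManager struct {
	rebalances int
}

// startWorker returns a closure. The closure takes no parameters; it reads
// and mutates sm because sm is captured from the enclosing scope.
func startWorker(sm *serverManager) func() {
	return func() {
		sm.rebalances++
	}
}

func main() {
	sm := &serverManager{}
	work := startWorker(sm)
	work()
	work()
	fmt.Println("rebalances:", sm.rebalances)
}
```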
Sean Chittenden
2713899a5b
Trim residual complexity from server join notifications
...
Now that serf node join events are decoupled from rebalancing activities completely, remove the complexity of draining the channel and ensuring only one goroutine was rebalancing the server list.
Now that we're no longer initializing a notification channel, we can remove the config load/save from `Start()`
2016-03-25 14:06:35 -07:00
Sean Chittenden
b3298ce4c3
Only log in FindServers
...
In FindServer this is a useful warning hinting why its call failed. RPC returns error and leaves it to the higher level caller to do whatever it wants. As an operator, I'd have the detail necessary to know why the RPC call(s) failed.
2016-03-25 13:58:50 -07:00
Sean Chittenden
f024272ab2
Initialize the rebalance to clientRPCMinReuseDuration
...
In an earlier version there was a channel to notify when a new server was added, however this has long since been removed. Just default to the sane value of 2min before the first rebalance calc takes place.
Pointed out by: slackpad
2016-03-25 13:46:18 -07:00
Sean Chittenden
485b153db2
Guard against very small or negative rates
...
Pointed out by: slackpad
2016-03-25 13:31:55 -07:00
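
A guard of this kind could be sketched as follows. The commit does not say how the rate is used, so this assumes a requests-per-second rate that sizes an interval for n nodes; scaledInterval, minRate, and minInterval are hypothetical names and values.

```go
package main

import (
	"fmt"
	"time"
)

const (
	// minRate is an assumed floor (one request per day); rates at or below
	// it, including zero and negative values, are treated as invalid.
	minRate = 1.0 / 86400
	// minInterval is an assumed lower bound on the returned interval.
	minInterval = 2 * time.Minute
)

// scaledInterval returns how long n nodes should wait between requests so
// that their combined request rate stays near rate (requests per second).
// Very small or negative rates return min instead of dividing by them.
func scaledInterval(rate float64, min time.Duration, n int) time.Duration {
	if rate <= minRate {
		return min
	}
	interval := time.Duration(float64(time.Second) * float64(n) / rate)
	if interval < min {
		return min
	}
	return interval
}

func main() {
	fmt.Println(scaledInterval(-1, minInterval, 100)) // guarded: falls back to min
	fmt.Println(scaledInterval(1, minInterval, 1000)) // scaled above min
}
```

Without the guard, a zero rate would divide by zero and a negative rate would yield a negative duration.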
Sean Chittenden
89311a5859
Use range vs for
...
Returning a new array vs mutating an array in place so we can use range now.
2016-03-25 13:08:08 -07:00
Sean Chittenden
643997623e
Comment updates
2016-03-25 13:06:59 -07:00
Sean Chittenden
072f34cf02
Only rotate server list with more than one server
...
Fantastic observation by slackpad. This was left over from when there was a boolean for health in the server struct (vs current strategy where we use server position in the list and rely on serf to cleanup the stale members).
Pointed out by: slackpad
2016-03-25 12:54:36 -07:00
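
The length check described above can be sketched as follows, assuming rotation means moving the first server to the end of the list. The cycle function and the string slice are hypothetical simplifications of the server list.

```go
package main

import "fmt"

// cycle moves the first server to the end of the list. With zero or one
// server there is nothing to rotate, so the list is returned unchanged.
func cycle(servers []string) []string {
	if len(servers) < 2 {
		return servers
	}
	rotated := make([]string, 0, len(servers))
	rotated = append(rotated, servers[1:]...)
	return append(rotated, servers[0])
}

func main() {
	fmt.Println(cycle([]string{"s1", "s2", "s3"}))
	fmt.Println(cycle([]string{"s1"}))
}
```

Because cycle returns a new slice instead of mutating its argument, callers can iterate over the result with range.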
Sean Chittenden
aadd274a13
Relocate saveServerConfig next to getServerConfig
...
Requested by: slackpad
2016-03-25 12:41:22 -07:00
Sean Chittenden
cf271e7f65
Clarify that ConsulClusterInfo is an interface over serf
...
An interface was used to break a cyclic import dependency.
2016-03-25 12:38:40 -07:00
Sean Chittenden
973d924ab4
Reword comment after moving code into new packages
2016-03-25 12:34:46 -07:00
Sean Chittenden
78ec9f241d
Change initialRebalanceTimeout to a time.Duration
...
Pointed out by: @slackpad
2016-03-25 12:34:12 -07:00