Commit graph

171 commits

Author SHA1 Message Date
Alex Dadgar 4d99696f02 Improve autopilot shutdown to be idempotent 2018-02-20 15:51:59 -08:00
Veselkov Konstantin 05666113a4 remove golint warnings 2018-01-28 22:40:13 +04:00
Kyle Havlovitz 0e76d62846
Reset clusterHealth when autopilot starts 2018-01-23 12:52:28 -08:00
Kyle Havlovitz 6d1dbe6cc4
Move autopilot health loop into leader operations 2018-01-23 11:17:41 -08:00
James Phillips 62e97a6602
Fixes a go fmt cleanup. 2017-12-20 13:43:38 -08:00
Kyle Havlovitz 74b0c58831
Fix vet error 2017-12-18 18:04:42 -08:00
Kyle Havlovitz dfc165a47b
Move autopilot initializing to oss file 2017-12-18 18:02:44 -08:00
Kyle Havlovitz 044c38aa7b
Move autopilot setup to a separate file 2017-12-18 16:55:51 -08:00
Kyle Havlovitz 9e1ba6fb4e
Make some final tweaks to autopilot package 2017-12-18 12:26:47 -08:00
Kyle Havlovitz 6b58df5898
Merge pull request #3737 from hashicorp/autopilot-refactor
Move autopilot to a standalone package
2017-12-15 14:09:40 -08:00
James Phillips 262cbbd9ca
Merge pull request #3728 from weiwei04/fix_globalRPC_goroutine_leak
fix globalRPC goroutine leak
2017-12-14 17:54:19 -08:00
Kyle Havlovitz 798aca92c5
Expose IsPotentialVoter for advanced autopilot logic 2017-12-13 17:53:51 -08:00
Kyle Havlovitz a4ac148077
Merge branch 'master' into autopilot-refactor 2017-12-13 11:54:32 -08:00
Kyle Havlovitz 6c985132de
A few last autopilot adjustments 2017-12-13 11:19:17 -08:00
Kyle Havlovitz 77d92bf15c
More autopilot reorganizing 2017-12-13 10:57:37 -08:00
James Phillips 984de6e2e0
Adds TODOs referencing #3744. 2017-12-13 10:52:06 -08:00
Kyle Havlovitz f347c8a531
More refactoring to make autopilot consul-agnostic 2017-12-12 17:46:28 -08:00
Kyle Havlovitz 8546a1d3c6
Move autopilot to a standalone package 2017-12-11 16:45:33 -08:00
James Phillips 32b64575d1
Moves Serf helper into lib to fix import cycle in consul-enterprise. 2017-12-07 16:57:58 -08:00
James Phillips c16cce80bb
Turns off intent queue warnings and enables dynamic queue sizing. 2017-12-07 16:27:06 -08:00
Wei Wei 04531ff0fb fix globalRPC goroutine leak
Signed-off-by: Wei Wei <weiwei.inf@gmail.com>
2017-12-05 11:53:30 +08:00
James Phillips c4bc89a187
Creates a registration mechanism for snapshot and restore. 2017-11-29 18:36:53 -08:00
James Phillips 8571555703
Begins split out of snapshots from the main FSM class. 2017-11-29 18:36:53 -08:00
James Phillips 4eaee8e0ba
Creates a registration mechanism for FSM commands. 2017-11-29 18:36:53 -08:00
James Phillips 3e7ea1931c
Moves the FSM into its own package.
This will help make it clearer what happens when we add some registration
plumbing for the different operations and snapshots.
2017-11-29 18:36:53 -08:00
James Phillips 7f3783f4be
Resolves an FSM snapshot TODO.
This adds checks for sink write calls before we continue the refactor, which
will resolve the other TODO comment we deleted as part of this change.
2017-11-29 18:36:53 -08:00
James Phillips 5a24d37ac0
Creates a registration mechanism for schemas.
This also splits out the registration into the table-specific source
files.
2017-11-29 18:36:52 -08:00
James Phillips 36bb30e67a
Creates a registration mechanism for RPC endpoints. 2017-11-29 18:36:52 -08:00
James Phillips ba56669ea8
Renames stubs to be more consistent. 2017-11-29 18:36:52 -08:00
James Phillips 56552095c9
Sheds monotonic time info so tombstone GC bins work properly. 2017-11-29 10:34:24 -08:00
James Phillips 8656b7a3e9
Gives back the lock before writing to the expire channel.
The lock isn't needed after we clean up the expire bin, and as seen
in #3700 we can get into a deadlock waiting to place the expire index
into the channel while holding this lock.

Fixes #3700
2017-11-19 16:24:16 -08:00
James Phillips 8210523b1b
Moves the LAN event handler after the router is created.
Fixes #3680
2017-11-10 12:26:48 -08:00
James Phillips bfbbfb62ca
Revert "Adds a small sleep to make sure we are in the next GC bucket." 2017-11-08 22:18:37 -08:00
James Phillips d6328a5bf8
Adds a sleep to make sure we are in the next GC bucket, ups time.
Fixes #3670
2017-11-08 22:02:40 -08:00
James Phillips 91824375be
Skips the tombstone GC test in Travis for now.
Related to #3670
2017-11-08 20:14:20 -08:00
James Phillips b94ba8aeb4
Removes bogus getPort() in favor of freeport. 2017-11-08 19:55:50 -08:00
James Phillips 444a345a3a
Tightens timing up and reorders GC test to be less flaky. 2017-11-08 15:09:29 -08:00
James Phillips e00624425b
Doubles the GC timing. 2017-11-08 15:01:11 -08:00
James Phillips 8eb91777d9
Opens up test timing a little more. 2017-11-08 14:01:19 -08:00
James Phillips d45c2a01f1
Shifts off a gran boundary to help make test less flaky. 2017-11-08 13:57:17 -08:00
James Phillips 757e353334
Opens up the tombstone GC test timing. 2017-11-08 13:43:39 -08:00
Kyle Havlovitz 068ca11eb8
Move check definition to a sub-struct 2017-11-01 14:54:46 -07:00
Kyle Havlovitz bc3ba5f873
Merge branch 'master' into esm-changes 2017-11-01 11:37:48 -07:00
Kyle Havlovitz 83524f44c4
Merge pull request #3622 from hashicorp/coordinate-node-endpoint
agent: add /v1/coordinate/node/:node endpoint
2017-11-01 11:35:50 -07:00
Kyle Havlovitz 9909b661ac
Fill out the tests around coordinate/node functionality 2017-10-31 15:36:44 -07:00
Kyle Havlovitz fd4d9f1c16
Factor out registerNodes function 2017-10-31 13:34:49 -07:00
James Phillips c6e0366c02
Relaxes Autopilot promotion logic. (#3623)
* Relaxes Autopilot promotion logic.

When we defaulted the Raft protocol version to 3 in #3477 we made
the numPeers() routine more strict to only count voters (this is
more conservative and more correct). This had the side effect of
breaking rolling updates because it's at odds with the Autopilot
non-voter promotion logic.

That logic used to wait to only promote to maintain an odd quorum
of servers. During a rolling update (add one new server, wait, and
then kill an old server) the dead server cleanup would still count
the old server as a peer, which is conservative and the right thing
to do, and no longer count the non-voter. This would wait to promote,
so you could get into a stalemate. It is safer to promote early than
remove early, so by promoting as soon as possible we have chosen
that as the solution here.

Fixes #3611

* Gets rid of unnecessary extra not-a-voter check.
2017-10-31 15:16:56 -05:00
Kyle Havlovitz 496dd7ab5b
Merge branch 'coordinate-node-endpoint' of github.com:hashicorp/consul into esm-changes 2017-10-26 19:20:24 -07:00
Kyle Havlovitz f80e70271d
Added Coordinate.Node rpc endpoint and client api method 2017-10-26 19:16:40 -07:00
Kyle Havlovitz 84a07ea113
Expose SkipNodeUpdate field and some health check info in the http api 2017-10-25 19:37:30 +02:00