Commit Graph

36 Commits

Author SHA1 Message Date
James Phillips bc29610124 Adds support for snapshots and restores. (#2396)
* Updates Raft library to get new snapshot/restore API.

* Basic backup and restore working, but need some cleanup.

* Breaks out a snapshot module and adds a SHA256 integrity check.

* Adds snapshot ACL and fills in some missing comments.

* Require a consistent read for snapshots.

* Make sure snapshot works if ACLs aren't enabled.

* Adds a bit of package documentation.

* Returns an empty response from restore to avoid EOF errors.

* Adds API client support for snapshots.

* Makes internal file names match on-disk file snapshots.

* Adds DC and token coverage for snapshot API test.

* Adds missing documentation.

* Adds a unit test for the snapshot client endpoint.

* Moves the connection pool out of the client for easier testing.

* Fixes an incidental issue in the prepared query unit test.

I realized I had two servers in bootstrap mode so this wasn't a good setup.

* Adds a half close to the TCP stream and fixes panic on error.

* Adds client and endpoint tests for snapshots.

* Moves the pool back into the snapshot RPC client.

* Adds a TLS test and fixes half-closes for TLS connections.

* Tweaks some comments.

* Adds a low-level snapshot test.

This is independent of Consul so we can pull this out into a library
later if we want to.

* Cleans up snapshot and archive and completes archive tests.

* Sends a clear error for snapshot operations in dev mode.

Snapshots require the Raft snapshots to be readable, which isn't supported
in dev mode. Send a clear error instead of a deep-down Raft one.

* Adds docs for the snapshot endpoint.

* Adds a stale mode and index feedback for snapshot saves.

This gives folks a way to extract data even if the cluster has no
leader.

* Changes the internal format of a snapshot from zip to tgz.

* Pulls in Raft fix to cancel inflight before a restore.

* Pulls in new Raft restore interface.

* Adds metadata to snapshot saves and a verify function.

* Adds basic save and restore snapshot CLI commands.

* Gets rid of tarball extensions and adds restore message.

* Fixes an incidental bad link in the KV docs.

* Adds documentation for the snapshot CLI commands.

* Scuttle any request body when a snapshot is saved.

* Fixes archive unit test error message check.

* Allows for nil output writers in snapshot RPC handlers.

* Renames hash list Decode to DecodeAndVerify.

* Closes the client connection for snapshot ops.

* Lowers timeout for restore ops.

* Updates Raft vendor to get new Restore signature and integrates with Consul.

* Bounces the leader's internal state when we do a restore.
2016-10-25 19:20:24 -07:00
Kyle Havlovitz f0aa65754b Wait for agent joins to finish in TestClient_RPC 2016-10-25 17:48:11 -07:00
Kyle Havlovitz 2379b822ad Add wait logic to TestClient_RPC_Pool 2016-10-25 17:48:11 -07:00
Chavez c9602c561c Add description to rpc test client pool member failure message 2016-04-01 19:17:38 -07:00
Sean Chittenden 4984b6111d Gratuitous rename 1/2
Reduce cognative load and perform an overdue rename.  No functional change.

Rename the `server_manager` package to `servers`.  Rename the `ServerManager` package to `Manager`.  In `client`, rename `serverMgr` to `servers`.
2016-03-29 16:12:00 -07:00
Sean Chittenden 177f64134e Refactor out recocileServerList anon function
Add testing to reconcileServerList and test various server sizes.

Test that a percentage of nodes fail their Ping (50% in testing atm)
2016-03-29 02:45:38 -07:00
Sean Chittenden 7d26f7bfa7 Call NotifyFailedServers to rotate the server list 2016-03-28 14:12:41 -07:00
Sean Chittenden d6b4345375 Return error from PingConsulServer
In order to report why a Ping failed, change the signature of PingConsulServers to include an error message.
2016-03-28 13:38:58 -07:00
Sean Chittenden 86d1bad541 Add a test for ConnPool.PingConsulServer
Spin up 5x servers, join and ping each server
2016-03-26 23:52:06 -07:00
Sean Chittenden 5873b7e28e Correct a bogus goimport rewrite for tests 2016-03-23 22:35:49 -07:00
Sean Chittenden d5f72e8c07 Rename GetNumServers to NumServers()
Matches the style of the rest of the repo
2016-03-23 22:10:50 -07:00
Sean Chittenden 0c519aa90d Unbreak client tests by reverting to original test
Debugging code crept into the actual test and hung out for much longer than it should have.
2016-03-23 22:10:50 -07:00
Sean Chittenden 26e51376d9 Introduce asynchronous management of consul server lists
Instead of blocking the RPC call path and performing a potentially expensive calculation (including a call to `c.LANMembers()`), introduce a channel to request a rebalance.  Some events don't force a reshuffle, instead the extend the duration of the current rebalance window because the environment thrashed enough to redistribute a client's load.
2016-03-23 22:10:50 -07:00
Sean Chittenden 3730eaf6df Commit miss re: consuls variable rename 2016-03-23 16:24:29 -07:00
Sean Chittenden b33648ca5c Move consul.serverConfig out of the consul package
Relocated to its own package, server_manager.  This now greatly simplifies the RPC() call path and appropriately hides the locking behind the package boundary.  More work is needed to be done here
2016-03-23 16:16:22 -07:00
Sean Chittenden 09d4c6439c Refactor out the management of Consul servers
Move the management of c.consulServers (fka c.consuls) into consul/server_manager.go.

This commit brings in a background task that proactively manages the server list and:

*) reshuffles the list
*) manages the timer out of the RPC() path
*) uses atomics to detect a server has failed

This is a WIP, more work in testing needs to be completed.
2016-03-23 16:15:47 -07:00
Sean Chittenden 2a0c12460d Commit miss re: consuls variable rename 2016-03-23 16:13:49 -07:00
Ryan Uber 33741f9156 consul: initial pass at refactoring RPC using net-rpc-msgpackrpc 2015-10-13 22:15:23 -07:00
James Phillips 0efc49b510 Fixes #1165 by having threads wait for any outstanding connect to finish. 2015-08-12 18:48:15 -07:00
Ryan Uber 6d27334ee1 consul: fixes for events and related tests 2015-07-14 11:38:26 -07:00
Ryan Uber e6923a4832 consul: always fire events from server nodes 2015-06-18 18:13:29 -07:00
Armon Dadgar a66a765ca9 consul: Adding merge delegate to prevent mixing clusters 2015-01-06 15:48:46 -08:00
Ryan Uber 2661bbfa27 consul: more tests, remove unused KeyManager() method 2014-11-19 16:37:40 -08:00
Armon Dadgar 9eddff083a consul: Testing user events 2014-08-26 19:26:55 -07:00
William Tisäter 95e7752721 Wait for joining member in `TestClient_JoinLAN` 2014-05-27 00:55:20 +02:00
William Tisäter 405dfd99c1 Fix tests on Go 1.3 and greater
Go 1.3 and greater require ServerName or InsecureSkipVerify to be set.

https://codereview.appspot.com/67010043/
2014-05-27 00:47:47 +02:00
William Tisäter 0ddc86bbcc Revert "Remove `WaitForResult` from two failing tests"
This reverts commit f0842409cde194ce2102ff2baf3de821cb6ac9d5.
2014-05-26 23:49:23 +02:00
William Tisäter f990e9fcb2 Remove `WaitForResult` from two failing tests 2014-05-26 23:44:37 +02:00
Armon Dadgar 9470f9a5a2 gofmt 2014-05-20 16:25:29 -07:00
William Tisäter cb54c4cb49 Remove all sleeps from `client_test.go` 2014-05-09 02:29:51 +02:00
Armon Dadgar bd0cc256b7 consul: Adding client/server TLS tests 2014-04-07 15:07:00 -07:00
Armon Dadgar 0588a9cb3f Fixing some tests 2013-12-31 16:46:56 -08:00
Armon Dadgar 8fee0cd50a Changing the Join interfaces to match Serf 2013-12-30 12:20:17 -08:00
Armon Dadgar 5b0ca256ff Testing Client RPC to server 2013-12-19 15:08:55 -08:00
Armon Dadgar 8b35f6b22a Test joining a Serf pool 2013-12-19 14:56:38 -08:00
Armon Dadgar 9249bb65a5 Adding client tests 2013-12-19 14:54:32 -08:00