Commit Graph

220 Commits

Author SHA1 Message Date
James Phillips 9b43b31900
Forces user-supplied node IDs to lower case for consistency. 2017-03-13 19:51:56 -07:00
Kyle Havlovitz 8bcab6c6d7
Add autopilot server health tracking
This adds two goroutines to perform autopilot tasks on the leader - one
to monitor the health of servers and another to periodically clean up
dead servers with a limit on removal count. Also adds a new http endpoint,
`/v1/operator/autopilot/health`, for querying this information through an
operator RPC endpoint.
2017-03-06 16:00:10 -08:00
Kyle Havlovitz 44f0b08db7 Merge pull request #2771 from hashicorp/f-autopilot
Autopilot dead server cleanup, config, and raft version compatibility
2017-02-28 15:04:16 -08:00
Kyle Havlovitz 23c492a74e
Rename DeadServerCleanup and make wording adjustments 2017-02-28 14:45:21 -08:00
Kyle Havlovitz 3b67c50c1d
Remove the RPC client interface and update docs 2017-02-28 13:41:09 -08:00
Kyle Havlovitz 56e22a719f
Add state store table and endpoints for autopilot 2017-02-23 20:32:13 -08:00
Kyle Havlovitz f9588b8d7f
Add raft version 2/3 compatibility 2017-02-22 12:53:32 -08:00
Kyle Havlovitz f48cd78bc5 Merge pull request #2699 from hashicorp/f-tls-min-version
Add TLSMinVersion to config options
2017-02-01 16:31:53 -05:00
Kyle Havlovitz 086975ac0a
Add TLSMinVersion to config options 2017-02-01 16:20:33 -05:00
Sean Chittenden 9e2557c53c
Re-cherry-pick 71d807f607589f2eb4fea4e83e3876d122c8afc0 and e2320d69b6b155d8223758415aabafc60a0e9d3b. 2017-02-01 10:27:04 -08:00
James Phillips d1824ce361 Revert "Adds gopsutil in the loop when trying to make the node ID." 2017-01-31 19:13:49 -08:00
James Phillips e7b683dad8
Gets rid of a goto. 2017-01-31 19:02:25 -08:00
James Phillips d9b999109a
Adds gopsutil into node ID process and attempts to use host ID, if availabile. 2017-01-31 08:51:33 -08:00
Kyle Havlovitz cc271aa436
Merge branch 'master' into f-prepared-query-nodemeta 2017-01-23 20:17:48 -05:00
Kyle Havlovitz 3cd054e81f
Add tests for node meta in prepared queries and update docs 2017-01-23 19:17:30 -05:00
James Phillips 96bff003b7
Adds basic support for node IDs. 2017-01-17 22:47:59 -08:00
Kyle Havlovitz 51a1834667
Add -node-meta to agent command line options 2017-01-11 16:09:04 -05:00
Kyle Havlovitz d77890a011
Validate metadata config earlier and handle multiple filters 2017-01-11 15:12:03 -05:00
Kyle Havlovitz 6b5cf20b1c
Fix formatting 2017-01-09 13:49:33 -08:00
Kyle Havlovitz aee766baba
Add meta key validations and more tests 2017-01-09 11:21:49 -08:00
Kyle Havlovitz b25f4c7d35
Add tests for node metadata functionality 2017-01-05 17:21:56 -08:00
Kyle Havlovitz 2a423c6e2c
Add support for setting node metadata fields 2017-01-05 14:10:26 -08:00
James Phillips 927aa3789a
Adds ACL management support to the agent. 2016-12-14 07:07:41 -08:00
James Phillips 8db53b4ac2
Adds complete ACL coverage for /v1/coordinate/nodes and Coordinate.Update RPC. 2016-12-12 14:52:27 -08:00
James Phillips 67f1fe8d0b
Adds support for a new "acl_agent_token" which is used for internal
catalog operations.
2016-12-12 14:52:27 -08:00
James Phillips 2b19a5340c
Removes the exception for the "consul" service in the catalog. 2016-12-07 17:58:23 -08:00
Sean Chittenden 1ca285221a
Run all known addresses through go-sockaddr/template.
The following is now possible:

```
$ consul agent -dev -client="{{GetPrivateIP}}" -bind='{{GetInterfaceIP "en0"}}'
```
2016-12-02 16:35:38 +11:00
Kyle Havlovitz dd05afb32e Add reload/leave http endpoints (#2516) 2016-11-30 13:29:42 -05:00
Seth Vargo 1c55429a38
Add an API method for determining the best status
Given a list of HealthChecks, this determines the "best" status for the
collective group. This is useful for nodes and services, which may have
multiple checks associated with them.
2016-11-29 18:41:46 -05:00
Kyle Havlovitz ccab51b07c Add logWriter to agent Create() method 2016-11-28 18:36:26 -05:00
Kyle Havlovitz d91854f3b0 Add monitor http endpoint 2016-11-28 18:36:26 -05:00
Kyle Havlovitz 9b6847d13f Use uuids in persist temp files to avoid race (#2494) 2016-11-09 15:22:53 -08:00
Kyle Havlovitz b170282d0f Improve logging when deregistering a nonexistent service (#2492)
Log a warning instead of a success message when attempting to deregister a nonexistent service. In Consul 0.8 this can be changed to giving an error outright, but for now we can keep the idempotent delete behavior.
2016-11-09 16:56:54 -05:00
Kyle Havlovitz d71cdf9dc2 Merge pull request #2480 from hashicorp/b-atomic-writes
Atomic writes for persisting service/check state
2016-11-07 15:36:35 -05:00
Kyle Havlovitz 06fac8a918
Add a note about not calling sync for persistCheckState 2016-11-07 15:24:31 -05:00
Kyle Havlovitz 6b6601093c
Call fsync() for saving check/service state 2016-11-07 13:51:03 -05:00
Kyle McCullough 697ea0f8eb Add setting to skip ssl certificate verification for HTTP checks (#1984)
* http check: add setting to skip ssl certificate verification

* update http check documentation

* fix typo in documentation

* Add TLSSkipVerify to agent api
2016-11-03 13:17:30 -07:00
James Phillips 18926728d4 Supports WAN and LAN Serf Bind Addresses. (#2468)
* * adding cli config and config file support for specifying the serf wan and lan bind addresses
* updating documentation for serf wan and lan options
Fixes #2007

* Cleans up some small things from #2380.

* Uses the bind default for the agent test for Serf WAN and LAN.
2016-11-03 12:58:58 -07:00
James Phillips bc29610124 Adds support for snapshots and restores. (#2396)
* Updates Raft library to get new snapshot/restore API.

* Basic backup and restore working, but need some cleanup.

* Breaks out a snapshot module and adds a SHA256 integrity check.

* Adds snapshot ACL and fills in some missing comments.

* Require a consistent read for snapshots.

* Make sure snapshot works if ACLs aren't enabled.

* Adds a bit of package documentation.

* Returns an empty response from restore to avoid EOF errors.

* Adds API client support for snapshots.

* Makes internal file names match on-disk file snapshots.

* Adds DC and token coverage for snapshot API test.

* Adds missing documentation.

* Adds a unit test for the snapshot client endpoint.

* Moves the connection pool out of the client for easier testing.

* Fixes an incidental issue in the prepared query unit test.

I realized I had two servers in bootstrap mode so this wasn't a good setup.

* Adds a half close to the TCP stream and fixes panic on error.

* Adds client and endpoint tests for snapshots.

* Moves the pool back into the snapshot RPC client.

* Adds a TLS test and fixes half-closes for TLS connections.

* Tweaks some comments.

* Adds a low-level snapshot test.

This is independent of Consul so we can pull this out into a library
later if we want to.

* Cleans up snapshot and archive and completes archive tests.

* Sends a clear error for snapshot operations in dev mode.

Snapshots require the Raft snapshots to be readable, which isn't supported
in dev mode. Send a clear error instead of a deep-down Raft one.

* Adds docs for the snapshot endpoint.

* Adds a stale mode and index feedback for snapshot saves.

This gives folks a way to extract data even if the cluster has no
leader.

* Changes the internal format of a snapshot from zip to tgz.

* Pulls in Raft fix to cancel inflight before a restore.

* Pulls in new Raft restore interface.

* Adds metadata to snapshot saves and a verify function.

* Adds basic save and restore snapshot CLI commands.

* Gets rid of tarball extensions and adds restore message.

* Fixes an incidental bad link in the KV docs.

* Adds documentation for the snapshot CLI commands.

* Scuttle any request body when a snapshot is saved.

* Fixes archive unit test error message check.

* Allows for nil output writers in snapshot RPC handlers.

* Renames hash list Decode to DecodeAndVerify.

* Closes the client connection for snapshot ops.

* Lowers timeout for restore ops.

* Updates Raft vendor to get new Restore signature and integrates with Consul.

* Bounces the leader's internal state when we do a restore.
2016-10-25 19:20:24 -07:00
James Phillips dace771f06 Merge pull request #2389 from hashicorp/jbs-2019
Lower Service tag DNS warning to DEBUG for #2019
2016-10-24 17:05:02 -07:00
Brian Shumate 36e1a194c2
Lower Service tag DNS warning to DEBUG for #2019 2016-10-05 08:45:01 -04:00
Adam Wolfe Gordon ae5bd0f2cc agent: Stop reaping child processes (resolves #1988)
The consul docker image now uses dumb-init to reap child processes, so
there's no need to reap them ourselves.
2016-10-04 09:36:41 -06:00
James Phillips b339b0d2fc
Adds performance tuning capability for Raft, detuned defaults, and supplemental docs. 2016-08-24 21:58:37 -07:00
James Phillips db9a72dde1
Merge branch 'master' into f-deregister-critical 2016-08-16 12:53:21 -07:00
James Phillips 607595f99e
Cleans up based on code review feedback. 2016-08-16 12:52:30 -07:00
James Phillips 817d7e93e4
Adds an "lan" tagged address so we have a way to get them all.
If we didn't have this, then there would be no way to know the LAN
address if address translation was turned on.
2016-08-16 10:49:03 -07:00
James Phillips b4f981c837
Adds ability to deregister a service based on critical check state longer than a timeout. 2016-08-16 01:00:26 -07:00
James Phillips 9cece515c0
Adds basic ACL replication plumbing. 2016-08-03 21:24:04 -07:00
Cameron Davison 40d1c279dd
atomic write service state and checks files, fixes #1221 2016-08-03 10:34:12 -05:00
Sean Chittenden c90fc057d4
Give log reviewers a hint as to which check is failing 2016-06-20 15:25:21 -07:00