Commit Graph

310 Commits

Author SHA1 Message Date
James Phillips 037f72813b Merge pull request #3105 from hashicorp/test-stability
Tweaks things for better test stability.
2017-06-03 10:09:12 -07:00
Frank Schroeder 92b2192b75 agent: fix graceful http server shutdown 2017-06-03 19:08:50 +02:00
Frank Schroeder 6262d85dab agent: move config reloading into the agent 2017-06-03 19:08:50 +02:00
Frank Schroeder 02ce45e5bb agent: move parseMetaPair to config and export 2017-06-03 19:08:50 +02:00
Frank Schroeder 63d336e11c agent: move retry join into agent 2017-06-03 19:08:50 +02:00
Frank Schroeder 488beeb208 agent: add GossipEncrypted function
Remove dead code and expose a GossipEncrypted() method
on the agent instead of accessing the internal delegate.
2017-06-03 19:08:50 +02:00
James Phillips 26677b4117
Cleans up an unrelated vet error. 2017-06-02 21:07:15 -07:00
Frank Schroeder b06c969deb
agent: use agent logger for consul client and deps 2017-05-31 11:05:02 +02:00
Frank Schroeder 4e6e9cb05a
agent: set tcpKeepAliveListener when there is no error 2017-05-31 10:24:32 +02:00
Frank Schroeder c587561572
agent: simplify shutdown timeout logging
Note that we are currently timing out every time. This still
needs to be investigated.
2017-05-31 09:41:58 +02:00
Frank Schroeder b220133785
agent: set proto to http since we are logging it now 2017-05-31 09:41:16 +02:00
Frank Schroeder bb8a21c385
Address review comments 2017-05-31 01:05:21 +02:00
Frank Schroeder 45a9c2be8b
agent: refactor DNS and HTTP server
* refactor DNS server to be ready for multiple bind addresses
* drop tcpKeepAliveListener since it is default for the HTTP servers
* add startup timeout watcher for HTTP servers identical to DNS server
2017-05-31 00:29:29 +02:00
Frank Schroeder 2b627b95dd
agent: fix logging
* use agent logger for consul/serf/raft/dns/agent/...
* support optional id for concurrent tests
2017-05-31 00:29:27 +02:00
Frank Schroeder ae5270a1e1
agent: print more useful shutdown message 2017-05-31 00:29:27 +02:00
Frank Schroeder ad153f0e76
test: fix data race with endpoints 2017-05-31 00:29:26 +02:00
Frank Schroeder 449eceec11
agent: shutdown delegate if created
When the TestAgent shuts down a half-started agent
the delegate may not have been created at this point.
2017-05-31 00:29:26 +02:00
Frank Schroeder 48bff52614
test: address pull request comments 2017-05-31 00:29:25 +02:00
Frank Schroeder 76fbe43f56
agent: simplify agent creation
This patch creates an agent with just a config struct
and allows for other fields to be set as required.
2017-05-31 00:29:23 +02:00
Frank Schroeder 8d9f5b9a64
agent: move http/dns endpoints into agent
Move the HTTP and DNS endpoints into the agent and control
their lifespan via the agent.

This removes the requirement to manage HTTP and DNS servers
indpendent of the agent since the agent is mostly useless
without an endpoint and the endpoints without the agent.
2017-05-31 00:29:23 +02:00
Kyle Havlovitz e196576c07 Add settings for upshifting to encrypted gossip (#3079) 2017-05-30 08:51:37 -07:00
Frank Schroeder 9f4286b1ce
agent: support custom check id and name
This patch adds support for a custom check id and name when
registering a service.

This is achieved by adding a CheckID and a Name field to the
CheckType structure which is used to register checks with a
service and when returning health check definitions.

CheckDefinition is a superset of CheckType which duplicates
some of the fields of CheckType. This patch decouples these
two structures by removing the embedding of CheckType in
CheckDefinition.

Fixes #3047
2017-05-17 20:17:08 +02:00
Frank Schroeder 69125e3a58 agent: move isAddrANY to separate package 2017-05-15 22:44:43 +02:00
Frank Schroeder d6eb1d434f agent: Replace client/server with delegate interface
This patch adds a new internal interface clientServer
which defines the common methods of consul.Client and
consul.Server. This allows to replace the following
code

    if a.server != nil {
        a.server.do()
    } else {
        a.client.do()
    }

with

    a.delegate.do()

In case a specific type is required a type check can
be performed:

    if srv, ok := a.delegate.(*consul.Server); ok {
        srv.doSrv()
    }
2017-05-15 18:35:38 +02:00
Frank Schroeder e350b7aafa agent: allow "::" as IPv6 bind address
Fixes #2285
2017-05-15 17:51:33 +02:00
Frank Schroeder 3e4dc6e133 agent: use bind address as src unless INADDR_ANY
Use the bind address as source address for outgoing
RPC connections unless it is INADDR_ANY.

The current code uses the advertise address which will
not work in certain environments where the advertise
address is not routable in the network of the agent,
e.g. NAT environment, container... After all, that is
the purpose of the advertise address.

See #2822
2017-05-11 00:34:14 +02:00
Frank Schroeder 27e951d213 agent: use helper for INADDR_ANY 2017-05-11 00:34:14 +02:00
Kyle Havlovitz e4af6583f5 Add a path for transitioning to TLS on an existing cluster (#3001)
Fixes #1705
2017-05-10 14:25:48 -07:00
Frank Schroeder f894a4cb7d Use bind address as source for outgoing connections (#2822)
This patch configures consul to use the bind address as the
source address for outgoing connections.

Fixes #2822
2017-05-04 01:41:47 +02:00
Frank Schroeder c772cecaab Do not modify config after creation II
Move code for finding the advertise address via a
template into consulConfig() so that the config
object is not modified after creation.
2017-05-04 01:41:47 +02:00
Frank Schroeder 6b96c9ff91 Do not modify config after creation
Make sure the RPCAdvertise address is always set
so that the configuration does not have to be modified
after creation.
2017-05-04 01:41:47 +02:00
Frank Schroeder a0b98948d8 Cleanup agent config 2017-05-04 01:41:47 +02:00
Kyle Havlovitz b5ed2ba536 Add separate option for verifying incoming HTTPS traffic (#2974)
* Add separate option for verifying incoming HTTPS traffic
2017-04-28 16:15:55 -07:00
Kyle Havlovitz 42cf797ad9 Add TLS cipher suite options and CA path support (#2963)
This patch adds options to configure the available
TLS cipher suites and adds support for a path
for multiple CA certificates.

Fixes #2959
2017-04-27 01:29:39 -07:00
Frank Schroeder 9de4555c0c golint: Untangle if blocks with return in else 2017-04-25 09:26:13 -07:00
Frank Schroeder 9f8f258d4d Remove duplicate constants
This patch removes duplicate internal copies of constants in the structs
package which are also defined in the api package. The api.KVOp type
with all its values for the TXN endpoint and the api.HealthXXX constants
are now used throughout the codebase.

This resulted in some circular dependencies in the testutil package
which have been resolved by copying code and constants and moving the
WaitForLeader function into a separate testrpc package.
2017-04-20 09:54:49 -07:00
James Phillips 20bf47d2b4
Adds a new -disable-host-node-id option to help when testing with containers.
Fixes #2877.
2017-04-12 22:07:18 -07:00
Alex Dadgar 13f026dec4 Hash host ID so its stable and well distributed
This PR takes the host ID and runs it through a hash so that it is well
distributed. This makes it so that machines that report similar host IDs
are easily distinguished.

Instances of similar IDs occur on EC2 where the ID is prefixed and on
motherboards created in the same batch.
2017-04-10 11:57:24 -07:00
James Phillips f2211d2489
Keeps the service and check tokens around for deregistration.
We fixed a few related issues while we were in here. We now only let
services register checks with a matching token, and we also close out
service and check delete operations if the catalog deregister claims
it doesn't know about the ID of the service or check being deleted.
2017-03-24 17:15:20 -07:00
Kyle Havlovitz 37ea20cb44
Add advanced autopilot features 2017-03-22 15:25:16 -07:00
James Phillips 9b43b31900
Forces user-supplied node IDs to lower case for consistency. 2017-03-13 19:51:56 -07:00
Kyle Havlovitz 8bcab6c6d7
Add autopilot server health tracking
This adds two goroutines to perform autopilot tasks on the leader - one
to monitor the health of servers and another to periodically clean up
dead servers with a limit on removal count. Also adds a new http endpoint,
`/v1/operator/autopilot/health`, for querying this information through an
operator RPC endpoint.
2017-03-06 16:00:10 -08:00
Kyle Havlovitz 44f0b08db7 Merge pull request #2771 from hashicorp/f-autopilot
Autopilot dead server cleanup, config, and raft version compatibility
2017-02-28 15:04:16 -08:00
Kyle Havlovitz 23c492a74e
Rename DeadServerCleanup and make wording adjustments 2017-02-28 14:45:21 -08:00
Kyle Havlovitz 3b67c50c1d
Remove the RPC client interface and update docs 2017-02-28 13:41:09 -08:00
Kyle Havlovitz 56e22a719f
Add state store table and endpoints for autopilot 2017-02-23 20:32:13 -08:00
Kyle Havlovitz f9588b8d7f
Add raft version 2/3 compatibility 2017-02-22 12:53:32 -08:00
Kyle Havlovitz f48cd78bc5 Merge pull request #2699 from hashicorp/f-tls-min-version
Add TLSMinVersion to config options
2017-02-01 16:31:53 -05:00
Kyle Havlovitz 086975ac0a
Add TLSMinVersion to config options 2017-02-01 16:20:33 -05:00
Sean Chittenden 9e2557c53c
Re-cherry-pick 71d807f607589f2eb4fea4e83e3876d122c8afc0 and e2320d69b6b155d8223758415aabafc60a0e9d3b. 2017-02-01 10:27:04 -08:00
James Phillips d1824ce361 Revert "Adds gopsutil in the loop when trying to make the node ID." 2017-01-31 19:13:49 -08:00
James Phillips e7b683dad8
Gets rid of a goto. 2017-01-31 19:02:25 -08:00
James Phillips d9b999109a
Adds gopsutil into node ID process and attempts to use host ID, if availabile. 2017-01-31 08:51:33 -08:00
Kyle Havlovitz cc271aa436
Merge branch 'master' into f-prepared-query-nodemeta 2017-01-23 20:17:48 -05:00
Kyle Havlovitz 3cd054e81f
Add tests for node meta in prepared queries and update docs 2017-01-23 19:17:30 -05:00
James Phillips 96bff003b7
Adds basic support for node IDs. 2017-01-17 22:47:59 -08:00
Kyle Havlovitz 51a1834667
Add -node-meta to agent command line options 2017-01-11 16:09:04 -05:00
Kyle Havlovitz d77890a011
Validate metadata config earlier and handle multiple filters 2017-01-11 15:12:03 -05:00
Kyle Havlovitz 6b5cf20b1c
Fix formatting 2017-01-09 13:49:33 -08:00
Kyle Havlovitz aee766baba
Add meta key validations and more tests 2017-01-09 11:21:49 -08:00
Kyle Havlovitz b25f4c7d35
Add tests for node metadata functionality 2017-01-05 17:21:56 -08:00
Kyle Havlovitz 2a423c6e2c
Add support for setting node metadata fields 2017-01-05 14:10:26 -08:00
James Phillips 927aa3789a
Adds ACL management support to the agent. 2016-12-14 07:07:41 -08:00
James Phillips 8db53b4ac2
Adds complete ACL coverage for /v1/coordinate/nodes and Coordinate.Update RPC. 2016-12-12 14:52:27 -08:00
James Phillips 67f1fe8d0b
Adds support for a new "acl_agent_token" which is used for internal
catalog operations.
2016-12-12 14:52:27 -08:00
James Phillips 2b19a5340c
Removes the exception for the "consul" service in the catalog. 2016-12-07 17:58:23 -08:00
Sean Chittenden 1ca285221a
Run all known addresses through go-sockaddr/template.
The following is now possible:

```
$ consul agent -dev -client="{{GetPrivateIP}}" -bind='{{GetInterfaceIP "en0"}}'
```
2016-12-02 16:35:38 +11:00
Kyle Havlovitz dd05afb32e Add reload/leave http endpoints (#2516) 2016-11-30 13:29:42 -05:00
Seth Vargo 1c55429a38
Add an API method for determining the best status
Given a list of HealthChecks, this determines the "best" status for the
collective group. This is useful for nodes and services, which may have
multiple checks associated with them.
2016-11-29 18:41:46 -05:00
Kyle Havlovitz ccab51b07c Add logWriter to agent Create() method 2016-11-28 18:36:26 -05:00
Kyle Havlovitz d91854f3b0 Add monitor http endpoint 2016-11-28 18:36:26 -05:00
Kyle Havlovitz 9b6847d13f Use uuids in persist temp files to avoid race (#2494) 2016-11-09 15:22:53 -08:00
Kyle Havlovitz b170282d0f Improve logging when deregistering a nonexistent service (#2492)
Log a warning instead of a success message when attempting to deregister a nonexistent service. In Consul 0.8 this can be changed to giving an error outright, but for now we can keep the idempotent delete behavior.
2016-11-09 16:56:54 -05:00
Kyle Havlovitz d71cdf9dc2 Merge pull request #2480 from hashicorp/b-atomic-writes
Atomic writes for persisting service/check state
2016-11-07 15:36:35 -05:00
Kyle Havlovitz 06fac8a918
Add a note about not calling sync for persistCheckState 2016-11-07 15:24:31 -05:00
Kyle Havlovitz 6b6601093c
Call fsync() for saving check/service state 2016-11-07 13:51:03 -05:00
Kyle McCullough 697ea0f8eb Add setting to skip ssl certificate verification for HTTP checks (#1984)
* http check: add setting to skip ssl certificate verification

* update http check documentation

* fix typo in documentation

* Add TLSSkipVerify to agent api
2016-11-03 13:17:30 -07:00
James Phillips 18926728d4 Supports WAN and LAN Serf Bind Addresses. (#2468)
* * adding cli config and config file support for specifying the serf wan and lan bind addresses
* updating documentation for serf wan and lan options
Fixes #2007

* Cleans up some small things from #2380.

* Uses the bind default for the agent test for Serf WAN and LAN.
2016-11-03 12:58:58 -07:00
James Phillips bc29610124 Adds support for snapshots and restores. (#2396)
* Updates Raft library to get new snapshot/restore API.

* Basic backup and restore working, but need some cleanup.

* Breaks out a snapshot module and adds a SHA256 integrity check.

* Adds snapshot ACL and fills in some missing comments.

* Require a consistent read for snapshots.

* Make sure snapshot works if ACLs aren't enabled.

* Adds a bit of package documentation.

* Returns an empty response from restore to avoid EOF errors.

* Adds API client support for snapshots.

* Makes internal file names match on-disk file snapshots.

* Adds DC and token coverage for snapshot API test.

* Adds missing documentation.

* Adds a unit test for the snapshot client endpoint.

* Moves the connection pool out of the client for easier testing.

* Fixes an incidental issue in the prepared query unit test.

I realized I had two servers in bootstrap mode so this wasn't a good setup.

* Adds a half close to the TCP stream and fixes panic on error.

* Adds client and endpoint tests for snapshots.

* Moves the pool back into the snapshot RPC client.

* Adds a TLS test and fixes half-closes for TLS connections.

* Tweaks some comments.

* Adds a low-level snapshot test.

This is independent of Consul so we can pull this out into a library
later if we want to.

* Cleans up snapshot and archive and completes archive tests.

* Sends a clear error for snapshot operations in dev mode.

Snapshots require the Raft snapshots to be readable, which isn't supported
in dev mode. Send a clear error instead of a deep-down Raft one.

* Adds docs for the snapshot endpoint.

* Adds a stale mode and index feedback for snapshot saves.

This gives folks a way to extract data even if the cluster has no
leader.

* Changes the internal format of a snapshot from zip to tgz.

* Pulls in Raft fix to cancel inflight before a restore.

* Pulls in new Raft restore interface.

* Adds metadata to snapshot saves and a verify function.

* Adds basic save and restore snapshot CLI commands.

* Gets rid of tarball extensions and adds restore message.

* Fixes an incidental bad link in the KV docs.

* Adds documentation for the snapshot CLI commands.

* Scuttle any request body when a snapshot is saved.

* Fixes archive unit test error message check.

* Allows for nil output writers in snapshot RPC handlers.

* Renames hash list Decode to DecodeAndVerify.

* Closes the client connection for snapshot ops.

* Lowers timeout for restore ops.

* Updates Raft vendor to get new Restore signature and integrates with Consul.

* Bounces the leader's internal state when we do a restore.
2016-10-25 19:20:24 -07:00
James Phillips dace771f06 Merge pull request #2389 from hashicorp/jbs-2019
Lower Service tag DNS warning to DEBUG for #2019
2016-10-24 17:05:02 -07:00
Brian Shumate 36e1a194c2
Lower Service tag DNS warning to DEBUG for #2019 2016-10-05 08:45:01 -04:00
Adam Wolfe Gordon ae5bd0f2cc agent: Stop reaping child processes (resolves #1988)
The consul docker image now uses dumb-init to reap child processes, so
there's no need to reap them ourselves.
2016-10-04 09:36:41 -06:00
James Phillips b339b0d2fc
Adds performance tuning capability for Raft, detuned defaults, and supplemental docs. 2016-08-24 21:58:37 -07:00
James Phillips db9a72dde1
Merge branch 'master' into f-deregister-critical 2016-08-16 12:53:21 -07:00
James Phillips 607595f99e
Cleans up based on code review feedback. 2016-08-16 12:52:30 -07:00
James Phillips 817d7e93e4
Adds an "lan" tagged address so we have a way to get them all.
If we didn't have this, then there would be no way to know the LAN
address if address translation was turned on.
2016-08-16 10:49:03 -07:00
James Phillips b4f981c837
Adds ability to deregister a service based on critical check state longer than a timeout. 2016-08-16 01:00:26 -07:00
James Phillips 9cece515c0
Adds basic ACL replication plumbing. 2016-08-03 21:24:04 -07:00
Cameron Davison 40d1c279dd
atomic write service state and checks files, fixes #1221 2016-08-03 10:34:12 -05:00
Sean Chittenden c90fc057d4
Give log reviewers a hint as to which check is failing 2016-06-20 15:25:21 -07:00
Sean Chittenden ff45f8c8ff
Revert "Move `structs.CheckID` to a new top-level package, `types`."
This reverts commit 2bbd52e3b44ff1b60939a8400264d534662d6d51.
2016-06-07 16:59:02 -04:00
Sean Chittenden a4554b945c
Move `structs.CheckID` to a new top-level package, `types`.
Per discussion w/ @slackpad, move this type to its own top-level package
2016-06-07 16:59:02 -04:00
Sean Chittenden cd68cd3868
Move `structs.CheckID` to a new top-level package, `types`.
Per discussion w/ @slackpad, move this type to its own top-level package
2016-06-07 16:59:02 -04:00
Sean Chittenden 0857e93d0b
Float a type balloon. Some strings are square pegs in round holes.
This experiment was brought about because of variable naming
confusion where name and checkIDs were interchanged.  Gave CheckID
an Qualified Type Name and chased downstream changes.
2016-06-07 16:59:02 -04:00
James Phillips 0f7f07d2f5 Fixes some bad error returns in the persist service and check paths. 2016-04-26 15:03:26 -07:00
James Phillips 03b0c196e0 Merge pull request #1762 from mshean/script-timeout
Add Timeout field to CheckMonitor
2016-04-24 23:08:06 -07:00
Matt Shean 7e617d6086 add Timeout field to CheckMonitor 2016-04-20 11:41:30 -07:00
James Phillips 32389a9822 Makes reap time configurable for LAN and WAN. 2016-04-11 00:38:25 -07:00
Wim 508bc796a8 Allow [::] as a bind address (binds to first public IPv6 address) 2016-03-18 23:59:44 +01:00
James Phillips 79bd1fd4bb Sets up config for more address tags down the road, renames struct members. 2016-02-07 10:37:34 -08:00