Commit Graph

8 Commits

Author SHA1 Message Date
Frank Schröder 9b2e3c2091 agent: add option to discard health output (#3562)
* agent: add option to discard health output

In high volatile environments consul will have checks with "noisy"
output which changes every time even though the status does not change.
Since the output is stored in the raft log every health check update
unblocks a blocking call on health checks since the raft index has
changed even though the status of the health checks may not have changed
at all. By discarding the output of the health checks the users can
choose a different tradeoff. Less visibility on why a check failed in
exchange for a reduced change rate on the raft log.

* agent: discard output also when adding a check

* agent: add test for discard check output

* agent: update docs

* go vet

* Adds discard_check_output to reloadable config table.

* Updates the change log.
2017-10-10 17:04:52 -07:00
James Phillips d1ad538345 Makes RPC handling more robust when rolling servers. (#3561)
* Adds client-side retry for no leader errors.

This paves over the case where the client was connected to the leader
when it loses leadership.

* Adds a configurable server RPC drain time and a fail-fast path for RPCs.

When a server leaves it gets removed from the Raft configuration, so it will
never know who the new leader server ends up being. Without this we'd be
doomed to wait out the RPC hold timeout and then fail. This makes things fail
a little quicker while a sever is draining, and since we added a client retry
AND since the server doing this has already shut down and left the Serf LAN,
clients should retry against some other server.

* Makes the RPC hold timeout configurable.

* Reorders struct members.

* Sets the RPC hold timeout default for test servers.

* Bumps the leave drain time up to 5 seconds.

* Robustifies retries with a simpler client-side RPC hold.

* Reverts untended delete.
2017-10-10 15:19:50 -07:00
Kyle Havlovitz 0063516e5e
Update metric names and add a legacy config flag 2017-10-04 16:43:27 -07:00
Preetha Appan d5acfc3982 Introduces new 'list' permission that applies to KV store recursive reads, and enforced only when opted in. 2017-10-02 17:10:21 -05:00
James Phillips d677999258
Adds a comment about Datacenter and NodeName being stable interfaces
in the runtime config strucutre.
2017-09-27 11:59:22 -07:00
Frank Schröder cda0eacff1 Recursive sanitize (#3505)
* vendor: add github.com/sergi/go-diff/diffmatchpatch for diff'ing test output

* config: refactor Sanitize to recursively clean runtime config and format complex fields

* Removes an extra int cast.

* Adds a top-level check test case for sanitization.
2017-09-27 11:47:40 -07:00
Frank Schröder 707f8e329a Metrics service prefix (#3498)
* metrics: replace statsite_prefix with service_prefix

The metrics prefix isn't statsite specific and is in fact used
for all metrics providers. Since we are deprecating fields
anyway we should fix this one as well.

Fixes #3293

* Updates docs and sorts telemetry section.

* Renames to "metrics_prefix" to disambiguate with Consul services.

* Updates the change log.
2017-09-26 17:49:55 -07:00
Frank Schröder 69a088ca85 New config parser, HCL support, multiple bind addrs (#3480)
* new config parser for agent

This patch implements a new config parser for the consul agent which
makes the following changes to the previous implementation:

 * add HCL support
 * all configuration fragments in tests and for default config are
   expressed as HCL fragments
 * HCL fragments can be provided on the command line so that they
   can eventually replace the command line flags.
 * HCL/JSON fragments are parsed into a temporary Config structure
   which can be merged using reflection (all values are pointers).
   The existing merge logic of overwrite for values and append
   for slices has been preserved.
 * A single builder process generates a typed runtime configuration
   for the agent.

The new implementation is more strict and fails in the builder process
if no valid runtime configuration can be generated. Therefore,
additional validations in other parts of the code should be removed.

The builder also pre-computes all required network addresses so that no
address/port magic should be required where the configuration is used
and should therefore be removed.

* Upgrade github.com/hashicorp/hcl to support int64

* improve error messages

* fix directory permission test

* Fix rtt test

* Fix ForceLeave test

* Skip performance test for now until we know what to do

* Update github.com/hashicorp/memberlist to update log prefix

* Make memberlist use the default logger

* improve config error handling

* do not fail on non-existing data-dir

* experiment with non-uniform timeouts to get a handle on stalled leader elections

* Run tests for packages separately to eliminate the spurious port conflicts

* refactor private address detection and unify approach for ipv4 and ipv6.

Fixes #2825

* do not allow unix sockets for DNS

* improve bind and advertise addr error handling

* go through builder using test coverage

* minimal update to the docs

* more coverage tests fixed

* more tests

* fix makefile

* cleanup

* fix port conflicts with external port server 'porter'

* stop test server on error

* do not run api test that change global ENV concurrently with the other tests

* Run remaining api tests concurrently

* no need for retry with the port number service

* monkey patch race condition in go-sockaddr until we understand why that fails

* monkey patch hcl decoder race condidtion until we understand why that fails

* monkey patch spurious errors in strings.EqualFold from here

* add test for hcl decoder race condition. Run with go test -parallel 128

* Increase timeout again

* cleanup

* don't log port allocations by default

* use base command arg parsing to format help output properly

* handle -dc deprecation case in Build

* switch autopilot.max_trailing_logs to int

* remove duplicate test case

* remove unused methods

* remove comments about flag/config value inconsistencies

* switch got and want around since the error message was misleading.

* Removes a stray debug log.

* Removes a stray newline in imports.

* Fixes TestACL_Version8.

* Runs go fmt.

* Adds a default case for unknown address types.

* Reoders and reformats some imports.

* Adds some comments and fixes typos.

* Reorders imports.

* add unix socket support for dns later

* drop all deprecated flags and arguments

* fix wrong field name

* remove stray node-id file

* drop unnecessary patch section in test

* drop duplicate test

* add test for LeaveOnTerm and SkipLeaveOnInt in client mode

* drop "bla" and add clarifying comment for the test

* split up tests to support enterprise/non-enterprise tests

* drop raft multiplier and derive values during build phase

* sanitize runtime config reflectively and add test

* detect invalid config fields

* fix tests with invalid config fields

* use different values for wan sanitiziation test

* drop recursor in favor of recursors

* allow dns_config.udp_answer_limit to be zero

* make sure tests run on machines with multiple ips

* Fix failing tests in a few more places by providing a bind address in the test

* Gets rid of skipped TestAgent_CheckPerformanceSettings and adds case for builder.

* Add porter to server_test.go to make tests there less flaky

* go fmt
2017-09-25 11:40:42 -07:00