Commit graph

282 commits

Author SHA1 Message Date
Ryan Slade 6f05ea91a3 Replace time.Now().Sub(x) with time.Since(x) 2017-10-17 20:38:24 +02:00
James Phillips 39f2359804 Fixes an XSS issue with unescaped node names. (#3578)
* Fixes an XSS issue with node names in the tomography graph.

* Updates built-in static web assets.

* Updates the change log.
2017-10-16 09:12:36 -07:00
James Phillips fdd08c78a9 Adds a brief wait and poll period to update check status after a timeout. (#3573)
* Adds a brief wait and poll period to update the check status
if we get stucking waiting for the processes to terminate.

Fixes #3570

* Jumps out of timeout case and includes script output.
2017-10-12 13:49:46 -07:00
James Phillips e9670761f9
Cleans up some drift between the OSS and Enterprise trees. 2017-10-11 15:53:07 -07:00
Kyle Havlovitz eea2bd2753 Kill check processes after the timeout is reached (#3567)
* Kill check processes after the timeout is reached

Kill the subprocess spawned by a script check once the timeout is reached. Previously Consul just marked the check critical and left the subprocess around.

Fixes #3565.

* Set err to non-nil when timeout occurs

* Fix check timeout test

* Kill entire process subtree on check timeout

* Add a docs note about windows subprocess termination
2017-10-11 11:57:39 -07:00
Frank Schroeder c4215bc04f
config: remove redundant code 2017-10-11 10:16:21 +02:00
Frank Schroeder 8cda75454a
config: fix check for segment.port <= 0 and add test 2017-10-11 10:15:55 +02:00
James Phillips a16dbc0212
Adds check to make sure port is given so we avoid a nil bind address. 2017-10-10 18:11:21 -07:00
James Phillips 275e83de08
Removes obsolete segment stub. 2017-10-10 17:21:32 -07:00
Frank Schröder 9b2e3c2091 agent: add option to discard health output (#3562)
* agent: add option to discard health output

In high volatile environments consul will have checks with "noisy"
output which changes every time even though the status does not change.
Since the output is stored in the raft log every health check update
unblocks a blocking call on health checks since the raft index has
changed even though the status of the health checks may not have changed
at all. By discarding the output of the health checks the users can
choose a different tradeoff. Less visibility on why a check failed in
exchange for a reduced change rate on the raft log.

* agent: discard output also when adding a check

* agent: add test for discard check output

* agent: update docs

* go vet

* Adds discard_check_output to reloadable config table.

* Updates the change log.
2017-10-10 17:04:52 -07:00
preetapan f6066f8305 Fixes agent error handling when check definition is invalid. Distingu… (#3560)
* Fixes agent error handling when check definition is invalid. Distinguishes between empty checks vs invalid checks

* Made CheckTypes return Checks from service definition struct rather than a new copy, and other changes from code review. This also errors when json payload contains empty structs

* Simplify and improve validate method, and make sure that CheckTypes always returns a new copy of validated check definitions

* Tweaks some small style things and error messages.

* Updates the change log.
2017-10-10 16:54:06 -07:00
Frank Schröder fa22ad4573 config: add generic method to translate between CamelCase and snake_case (#3557)
* doc: document discrepancy between id and CheckID

* doc: document enable_tag_override change

* config: add TranslateKeys helper

TranslateKeys makes it easier to map between different representations
of internal structures. It allows to recursively map alias keys to
canonical keys in structured maps.

* config: use TranslateKeys for config file

This also adds support for 'enabletagoverride' and removes
the need for a separate CheckID alias field.

* config: remove dead code

* agent: use TranslateKeys for FixupCheckType

* agent: translate enable_tag_override during service registration

* doc: add '.hcl' as valid extension

* config: map ScriptArgs to args

* config: add comment for TranslateKeys
2017-10-10 16:40:59 -07:00
James Phillips d1ad538345 Makes RPC handling more robust when rolling servers. (#3561)
* Adds client-side retry for no leader errors.

This paves over the case where the client was connected to the leader
when it loses leadership.

* Adds a configurable server RPC drain time and a fail-fast path for RPCs.

When a server leaves it gets removed from the Raft configuration, so it will
never know who the new leader server ends up being. Without this we'd be
doomed to wait out the RPC hold timeout and then fail. This makes things fail
a little quicker while a sever is draining, and since we added a client retry
AND since the server doing this has already shut down and left the Serf LAN,
clients should retry against some other server.

* Makes the RPC hold timeout configurable.

* Reorders struct members.

* Sets the RPC hold timeout default for test servers.

* Bumps the leave drain time up to 5 seconds.

* Robustifies retries with a simpler client-side RPC hold.

* Reverts untended delete.
2017-10-10 15:19:50 -07:00
Preetha Appan 25e64b5362 Fix unit test after dns library upgrade to account for correct data length 2017-10-06 17:40:17 -05:00
James Phillips a1db119d02 Fixes handling of stop channel and failed barrier attempts. (#3546)
* Fixes handling of stop channel and failed barrier attempts.

There were two issues here. First, we needed to not exit when there
was a timeout trying to write the barrier, because Raft might not
step down, so we'd be left as the leader but having run all the step
down actions.

Second, we didn't close over the stopCh correctly, so it was possible
to nil that out and have the leaderLoop never exit. We close over it
properly AND sequence the nil-ing of it AFTER the leaderLoop exits for
good measure, so the code is more robust.

Fixes #3545

* Cleans up based on code review feedback.

* Tweaks comments.

* Renames variables and removes comments.
2017-10-06 07:54:49 -07:00
Victor Boivie 77f7008363 Minor typo (boostrap) 2017-10-05 16:28:48 +02:00
James Phillips 97b580f593
Adds script warning and fixes Docker args recognition. 2017-10-04 21:41:27 -07:00
Kyle Havlovitz dde743700f Merge pull request #3535 from hashicorp/metric-docs
Update metric names and add a legacy config flag
2017-10-04 17:39:16 -07:00
Kyle Havlovitz d5fec6b7ac
Add a test for legacy metrics with a whitelist filter 2017-10-04 17:27:57 -07:00
Kyle Havlovitz be04bfed34 Clean up subprocess handling and make shell use optional (#3509)
* Clean up handling of subprocesses and make using a shell optional

* Update docs for subprocess changes

* Fix tests for new subprocess behavior

* More cleanup of subprocesses

* Minor adjustments and cleanup for subprocess logic

* Makes the watch handler reload test use the new path.

* Adds check tests for new args path, and updates existing tests to use new path.

* Adds support for script args in Docker checks.

* Fixes the sanitize unit test.

* Adds panic for unknown watch type, and reverts back to Run().

* Adds shell option back to consul lock command.

* Adds shell option back to consul exec command.

* Adds shell back into consul watch command.

* Refactors signal forwarding and makes Windows-friendly.

* Adds a clarifying comment.

* Changes error wording to a warning.

* Scopes signals to interrupt and kill.

This avoids us trying to send SIGCHILD to the dead process.

* Adds an error for shell=false for consul exec.

* Adds notes about the deprecated script and handler fields.

* De-nests an if statement.
2017-10-04 16:48:00 -07:00
Kyle Havlovitz 0063516e5e
Update metric names and add a legacy config flag 2017-10-04 16:43:27 -07:00
Frank Schröder b2c4dc4360 Provide stable config for agent/self (#3532)
* config: provide stable config for /v1/agent/self (#3530)

This patch adds a stable subset of the previous Config struct to the
agent/self response. The actual runtime configuration is moved into
DebugConfig and will be documented to change.

Fixes #3530

* config: fix tests

* doc: update api documentation for /v1/agent/self
2017-10-04 10:43:17 -07:00
James Phillips 6529c505a5 Merge pull request #3531 from hashicorp/pr-3521-slackpad
ui: Use monospace font for textarea controls.
2017-10-04 09:53:41 -07:00
James Phillips 539285cf1f
Updates checked in web assets to pick up CSS change.
Closes #3521
2017-10-04 09:52:15 -07:00
Preetha Appan f38d20eb40 Remove extra newline 2017-10-03 15:19:31 -05:00
Preetha Appan 3c81e2db7c Only allow 'list' policies within 'key' policy definitions. Consolidated two similar tests into one and fixed alignment. 2017-10-03 15:15:56 -05:00
Preetha Appan d5acfc3982 Introduces new 'list' permission that applies to KV store recursive reads, and enforced only when opted in. 2017-10-02 17:10:21 -05:00
Frank Schroeder 6b3a957c5e use ports from derived addresses 2017-09-29 20:26:43 +02:00
Frank Schroeder 8d8e2523eb config: drop advertise_addrs
Fixes #3516
2017-09-29 20:26:43 +02:00
Frank Schroeder f0efe2a3de
Fix tests after config refactor 2017-09-28 12:32:46 +02:00
Patrick Sodré 55c2746963
Implement encodeKVasRFC1464 function 2017-09-28 12:32:46 +02:00
Patrick Sodré d880634cfa
Add RFC1464 tests 2017-09-28 12:32:45 +02:00
Patrick Sodré 7083f9fb14
Turn encodeKVasRFC1464 into a plain function 2017-09-28 12:32:45 +02:00
Patrick Sodré be258a3315
Use verify for NodeLookup CNAME, and TXT tests 2017-09-28 12:32:45 +02:00
Patrick Sodré 8982719f5b
Refactor formatTxtRecords as encodeKVasRFC1464
- Move the logic of rfc1035 out of the encoding function
  - Left basic version of encodingKV as 'k=v'
2017-09-28 12:32:45 +02:00
Patrick Sodré a16e0f7419
Fix editorial suggestions 2017-09-28 12:32:45 +02:00
Patrick Sodré 4b2d1546fa
Remove redundant check of Node.Meta size 2017-09-28 12:32:45 +02:00
Patrick Sodré b8369b54fb
Return Node.Meta info using the DNS interface 2017-09-28 12:32:45 +02:00
Patrick Sodré b8905dd065
Add test for NoteLookup ANY request 2017-09-28 12:32:45 +02:00
Patrick Sodré 354765c549
Add test for querying Node.Meta with DNS TXT
- Lookup TXT records using recursive lookups
  - Expect TXT record equal to value if key starts with rfc1035-
  - Expect TXT record in rfc1464 otherwise, i.e. (k=v)

ref #2709
2017-09-28 12:32:45 +02:00
Frank Schröder 5f6d0fd8c5 fail early when advertise addr is set to ANY (#3507) 2017-09-27 13:57:55 -07:00
Frank Schröder beb803f0d9 only detect advertise address if derived value is any (#3506)
* only detect advertise address if derived value is any

* determine detect function only when advertise addr is any
2017-09-27 12:59:47 -07:00
James Phillips d677999258
Adds a comment about Datacenter and NodeName being stable interfaces
in the runtime config strucutre.
2017-09-27 11:59:22 -07:00
Frank Schröder cda0eacff1 Recursive sanitize (#3505)
* vendor: add github.com/sergi/go-diff/diffmatchpatch for diff'ing test output

* config: refactor Sanitize to recursively clean runtime config and format complex fields

* Removes an extra int cast.

* Adds a top-level check test case for sanitization.
2017-09-27 11:47:40 -07:00
James Phillips 330ce87851
Gets rid of flaky clause in stats fetcher unit test.
Given how the rutine is coded we can still get data so this wasn't
a reliable thing to check.
2017-09-26 20:53:06 -07:00
preetapan 783e24be64 Issue 3452 (#3500)
* Make sure that id and address are set in member created during reaping of catalog nodes that have been removed from serf

* Get address from node table in the state store rather than from service address

* Fix incorrect lookup by checkname instead of node name

* Make sure that serverlookup is called with the right address format, added unit test.

* Address code review comments

* Tweaks style stuff.
2017-09-26 20:49:41 -07:00
Frank Schröder 707f8e329a Metrics service prefix (#3498)
* metrics: replace statsite_prefix with service_prefix

The metrics prefix isn't statsite specific and is in fact used
for all metrics providers. Since we are deprecating fields
anyway we should fix this one as well.

Fixes #3293

* Updates docs and sorts telemetry section.

* Renames to "metrics_prefix" to disambiguate with Consul services.

* Updates the change log.
2017-09-26 17:49:55 -07:00
James Phillips 3130fcaccc Merge pull request #3501 from hashicorp/snapshot-test-hang
Cleans up some edge cases in TestSnapshot_Forward_Leader.
2017-09-26 14:08:33 -07:00
James Phillips 4b17c9618f
Cleans up some edge cases in TestSnapshot_Forward_Leader.
These could cause the tests to hang.
2017-09-26 14:07:28 -07:00
Kyle Havlovitz 3460506264 Fix watch error when http & https are disabled (#3493)
Remove an error in watch reloading that happens when http and https
are both disabled, and use an https address for running watches if
no http addresses are present.

Fixes #3425.
2017-09-26 13:47:27 -07:00