Commit Graph

1053 Commits

Author SHA1 Message Date
Pierre Souchay c3713dbbf1 Performance optimization for services having more than 2k records 2018-03-08 00:26:41 +01:00
Pierre Souchay 1085d5a7b4 Avoid issue with compression of DNS messages causing overflow 2018-03-07 23:33:41 +01:00
Pierre Souchay 241c7e5f5f Cleaner Unit tests from suggestions from @preetapan 2018-03-07 18:24:41 +01:00
Pierre Souchay b672707552 64000 max limit to DNS messages since there is overhead
Added debug log to give information about truncation.
2018-03-07 16:14:41 +01:00
Pierre Souchay 06afb4d02c [BUGFIX] do not break when TCP DNS answer exceeds 64k
It will avoid having discovery broken when having large number
of instances of a service (works with SRV and A* records).

Fixes https://github.com/hashicorp/consul/issues/3850
2018-03-07 10:08:06 +01:00
Mitchell Hashimoto fbac58280e
agent/consul/fsm: begin using testify/assert 2018-03-06 09:48:15 -08:00
Pierre Souchay 09970479b5 Allow to control the number of A/AAAA Record returned by DNS
This allows to have randomized resource records (i.e. each
answer contains only one IP, but the IP changes every request) for
A, AAAA records.

It will fix https://github.com/hashicorp/consul/issues/3355 and
https://github.com/hashicorp/consul/issues/3937

See https://github.com/hashicorp/consul/issues/3937#issuecomment-370610509
for details.

It basically add a new option called `a_record_limit` and will not
return more than a_record_limit when performing A, AAAA or ANY DNS
requests.

The existing `udp_answer_limit` option is still working but should
be considered as deprecated since it works only with DNS clients
not supporting EDNS.
2018-03-06 02:07:42 +01:00
Edd Steel 413cb3d3b5
Re-use defined endpoints for tests 2018-03-03 11:19:18 -08:00
Paul Banks 628dcc9793
Merge pull request #3899 from pierresouchay/fix_blocking_queries_index
Services Indexes modified per service instead of using a global Index
2018-03-02 16:24:43 +00:00
Pierre Souchay 85b73f8163 Simplified error handling for maxIndexForService
* added unit tests to ensure service index is properly garbage collected
* added Upgrade from Version 1.0.6 to higher section in documentation
2018-03-01 14:09:36 +01:00
Paul Banks f7ecbce39a
Fix test running in non-bash shells 2018-02-22 14:06:06 +00:00
Paul Banks 0ee77a5e02
Merge pull request #3900 from hashicorp/fix-monitor-sigint-3891
Fixes #3891: agent monitor no longer unresponsive before logs stream.
2018-02-21 21:28:33 +00:00
Preetha Appan 77d35f1829
Remove extra newline 2018-02-21 13:21:47 -06:00
Preetha Appan 573500dc51
Unit test that calls revokeLeadership twice to make sure its idempotent 2018-02-21 12:48:53 -06:00
Preetha Appan bd270b02ba
Make sure revokeLeadership is called if establishLeadership errors 2018-02-21 12:33:22 -06:00
Alex Dadgar 535842004c Test autopilots start/stop idempotency 2018-02-21 10:19:30 -08:00
Alex Dadgar 4d99696f02 Improve autopilot shutdown to be idempotent 2018-02-20 15:51:59 -08:00
Pierre Souchay e6d85cb36a Fixed comments for function maxIndexForService 2018-02-20 23:57:28 +01:00
Pierre Souchay b26ea3c230 [Revert] Only update services if tags are different
This patch did give some better results, but break watches on
the services of a node.

It is possible to apply the same optimization for nodes than
to services (one index per instance), but it would complicate
further the patch.

Let's do it in another PR.
2018-02-20 23:34:42 +01:00
Pierre Souchay 903e866835 Only update services if tags are different 2018-02-20 23:08:04 +01:00
Pierre Souchay 56d5c0bf22 Enable Raft index optimization per service name on health endpoint
Had to fix unit test in order to check properly indexes.
2018-02-20 01:35:50 +01:00
Paul Banks 69ebbf3e79
Fixes #3891: agent monitor no longer unresponsive before logs stream.
The root cause is actually that the agent's streaming HTTP API didn't flush until the first log line was found which commonly was pretty soon since the default level is INFO. In cases where there were no logs immediately due to level for instance, the client gets stuck in the HTTP code waiting on a response packet from the server before we enter the loop that checks the shutdown channel from the signal handler.

This fix flushes the initial status immediately on the streaming endpoint which lets the client code get into it's expected state where it's listening for shutdown or log lines.
2018-02-19 21:53:10 +00:00
Pierre Souchay ec1b278595 Get only first service to test whether we have to cleanup index of a service 2018-02-19 22:44:49 +01:00
Pierre Souchay 523feb0be4 Fixed comment about raftIndex + use test.Helper() 2018-02-19 19:30:25 +01:00
Pierre Souchay 4c188c1d08 Services Indexes modified per service instead of using a global Index
This patch improves the watches for services on large cluster:
each service has now its own index, such watches on a specific service
are not modified by changes in the global catalog.

It should improve a lot the performance of tools such as consul-template
or libraries performing watches on very large clusters with many
services/watches.
2018-02-19 18:29:22 +01:00
Edd Steel 35c2083422
Clarify comments 2018-02-17 17:46:11 -08:00
Edd Steel 61be181f6f Test every endpoint for OPTIONS/MethodNotFound 2018-02-17 17:34:13 -08:00
Edd Steel 6c33163959 Allow endpoints to handle OPTIONS/MethodNotFound themselves 2018-02-17 17:34:03 -08:00
Edd Steel 4dc9d2ebd7
Initialise `allowedMethods` in init() 2018-02-17 17:31:24 -08:00
Kyle Havlovitz ea452c6032
Fix the coordinate update endpoint not passing the ACL token 2018-02-15 11:58:02 -08:00
Edd Steel 40eefc9f7d
Support OPTIONS requests
- register endpoints with supported methods
- support OPTIONS requests, indicating supported methods
- extract method validation (error 405) from individual endpoints
- on 405 where multiple methods are allowed, create a single Allow
  header with comma-separated values, not multiple Allow headers.
2018-02-12 10:15:31 -08:00
Andrei Burd dbb010c865 adding human readability for dns requests debug log (#3751) 2018-02-11 09:02:28 -06:00
Pierre Souchay 824b72cf90 Merge remote-tracking branch 'origin/master' into service_metadata 2018-02-11 13:20:49 +01:00
Pierre Souchay e99bf584c9 Fixed TestSanitize unit test 2018-02-11 12:11:11 +01:00
James Phillips 37cf6583db
Fixes a panic on TCP-based DNS lookups.
This came in via the monkey patch in #3861.

Fixes #3877
2018-02-08 17:57:41 -08:00
Pierre Souchay f2df4005fe Added unit tests for structs and fixed PartialClone() 2018-02-09 01:37:45 +01:00
James Phillips 4f3b4d0e55
Addresses additional state mutations.
Did a sweep of 84d6ac2d51
and checked them all.
2018-02-07 07:02:10 -08:00
James Phillips ca461f8890
Fixes all the racy output-side updates to tags. 2018-02-06 20:35:55 -08:00
James Phillips e7dd7b2d13
Adds a more robust unit test for index churn. 2018-02-06 20:35:38 -08:00
Pierre Souchay 3acc5b58d4 Added support for Service Metadata 2018-02-07 01:54:42 +01:00
James Phillips 41e3fcf205
Makes server manager shift away from failed servers from Serf events.
Because this code was doing pointer equality checks, it would work for
the case of a failed attempted RPC because the objects are from the
manager itself:

https://github.com/hashicorp/consul/blob/v1.0.3/agent/consul/rpc.go#L283-L302

But the pointer check would always fail for events coming in from the
Serf path because the server object is newly-created:

https://github.com/hashicorp/consul/blob/v1.0.3/agent/router/serf_adapter.go#L14-L40

This means that we didn't proactively shift RPC traffic away from a
failed server, we'd have to wait for an RPC to fail, which exposes
the error to the calling client.

By switching over to a name check vs. a pointer check we get the correct
behavior. We added a DEBUG log as well to help observe this behavior during
integrated testing.

Related to #3863 since the fix here needed the same logic duplicated, owing
to the complicated atomic stuff.

/cc @dadgar for a heads up in case this also affects Nomad.
2018-02-05 17:56:00 -08:00
James Phillips c718459e49
Adds a before/after test for #3845. 2018-02-05 16:18:29 -08:00
James Phillips 5b245c0201
Merge pull request #3845 from 42wim/tagfix
Fix service tags not added to health check. Part two
2018-02-05 16:18:00 -08:00
Kyle Havlovitz 46745eb89b
Add enterprise default config section 2018-02-05 13:33:59 -08:00
James Phillips 0aa05cc5f0
Merge pull request #3855 from hashicorp/pr-3782-slackpad
Adds support for gRPC health checks.
2018-02-02 17:57:27 -08:00
James Phillips 1a08e8c0f1
Changes "TLS" to "GRPCUseTLS" since it only applies to GRPC checks. 2018-02-02 17:29:34 -08:00
Wim 5cc76cce09 Fix service tags not added to health check. Part two 2018-01-29 20:32:44 +01:00
Veselkov Konstantin c2395d9bd0 fix refactoring 2018-01-28 22:53:30 +04:00
Veselkov Konstantin c4ad54e057 fix refactoring 2018-01-28 22:48:21 +04:00
Veselkov Konstantin 05666113a4 remove golint warnings 2018-01-28 22:40:13 +04:00
James Phillips 443250c76c
Improves user lookup error message.
Closes #3188
Closes #3184
2018-01-26 07:56:44 -08:00
Kyle Havlovitz 32dbb51c3b
Remove nonvoter from metadata.Server 2018-01-25 17:08:03 -08:00
James Phillips 38f5b2e7ce
Gets rid of named return parameters.
This wasn't wrong before but we don't generally use this style in
Consul.
2018-01-25 14:29:50 -08:00
James Phillips 1acaaecbdd
Moves non-stdlib includes into their own section. 2018-01-25 14:26:15 -08:00
Kyle Havlovitz 0e76d62846
Reset clusterHealth when autopilot starts 2018-01-23 12:52:28 -08:00
Kyle Havlovitz 6d1dbe6cc4
Move autopilot health loop into leader operations 2018-01-23 11:17:41 -08:00
James Phillips a4c3a3433c
Updates web assets to latest. 2018-01-22 14:46:07 -08:00
Kyle Havlovitz c4528a6110
Merge pull request #3821 from hashicorp/persist-file-handling
Add graceful handling of malformed persisted service/check files.
2018-01-22 12:31:33 -08:00
Kyle Havlovitz bb068b4c93
Merge pull request #3820 from hashicorp/serfwan-port-fix
Enforce a valid port for the Serf WAN since it can't be disabled.
2018-01-19 15:40:56 -08:00
James Phillips 77ab587ae1
Moves the coordinate fetch after the ACL check. 2018-01-19 15:25:22 -08:00
Kyle Havlovitz b651253cb2
Don't remove the files, just log an error 2018-01-19 14:25:51 -08:00
Kyle Havlovitz f191eb2df3
Enforce a valid port for the Serf WAN since it can't be disabled.
Fixes #3817
2018-01-19 14:22:23 -08:00
Kyle Havlovitz 17ec4a9394
Add graceful handling of malformed persisted service/check files.
Previously a change was made to make the file writing atomic,
but that wasn't enough to cover something like an OS crash so we
needed something here to handle the situation more gracefully.

Fixes #1221.
2018-01-19 14:07:36 -08:00
James Hartig 81d0ffc959 Resolve symlinks in config directory
Docker/Openshift/Kubernetes mount the config file as a symbolic link and
IsDir returns true if the file is a symlink. Before calling IsDir, the
symlink should be resolved to determine if it points at a file or
directory.

Fixes #3753
2018-01-12 15:43:38 -05:00
James Phillips ca43623734
Adds the NodeID field back to the /v1/agent/self Config block.
Fixes #3778
2018-01-10 15:17:54 -08:00
James Phillips ff2aae98f4
Adds more info about how to fix the private IP error.
Closes #3790
2018-01-10 09:53:41 -08:00
James Phillips e282e9285c
Fixes crash where body was optional for PQ endpoint (it is not).
Fixes #3791
2018-01-10 09:33:49 -08:00
Dmytro Kostiuchenko a45f6ad740 Add gRPC health-check #3073 2018-01-04 16:42:30 -05:00
Diptanu Choudhury f597d66392 Using labels 2017-12-21 20:30:29 -08:00
Diptanu Choudhury ac50568a1a Added telemetry around Catalog APIs 2017-12-21 16:35:12 -08:00
James Phillips 1cd94dec8a
Updates the checked in web assets. 2017-12-20 19:51:04 -08:00
James Phillips bfce81d721
Updates the built-in web assets. 2017-12-20 17:48:51 -08:00
James Phillips 8fd94a2e7c
Wraps HTTP mux to ban all non-printable characters from paths. 2017-12-20 15:47:53 -08:00
James Phillips 6f09cf48db
Updates the built-in web UI assets. 2017-12-20 13:43:52 -08:00
James Phillips 62e97a6602
Fixes a `go fmt` cleanup. 2017-12-20 13:43:38 -08:00
Kyle Havlovitz 74b0c58831
Fix vet error 2017-12-18 18:04:42 -08:00
Kyle Havlovitz dfc165a47b
Move autopilot initializing to oss file 2017-12-18 18:02:44 -08:00
Kyle Havlovitz 044c38aa7b
Move autopilot setup to a separate file 2017-12-18 16:55:51 -08:00
Kyle Havlovitz 9e1ba6fb4e
Make some final tweaks to autopilot package 2017-12-18 12:26:47 -08:00
Kyle Havlovitz 6b58df5898
Merge pull request #3737 from hashicorp/autopilot-refactor
Move autopilot to a standalone package
2017-12-15 14:09:40 -08:00
James Phillips 262cbbd9ca
Merge pull request #3728 from weiwei04/fix_globalRPC_goroutine_leak
fix globalRPC goroutine leak
2017-12-14 17:54:19 -08:00
James Phillips 518ab954bc
Merge pull request #3642 from yfouquet/master
[Fix] Service tags not added to health checks
2017-12-14 13:59:39 -08:00
James Phillips 47cd775b3d
Works around mapstructure behavior to enable sessions with no checks.
Fixes #3732
2017-12-14 09:07:56 -08:00
Kyle Havlovitz 798aca92c5
Expose IsPotentialVoter for advanced autopilot logic 2017-12-13 17:53:51 -08:00
James Phillips 68c94a5047
Changes maps to merge vs. overwrite when processing configs.
Fixes #3716
2017-12-13 16:06:01 -08:00
Kyle Havlovitz a4ac148077
Merge branch 'master' into autopilot-refactor 2017-12-13 11:54:32 -08:00
Kyle Havlovitz 6c985132de
A few last autopilot adjustments 2017-12-13 11:19:17 -08:00
Kyle Havlovitz 77d92bf15c
More autopilot reorganizing 2017-12-13 10:57:37 -08:00
James Phillips 984de6e2e0
Adds TODOs referencing #3744. 2017-12-13 10:52:06 -08:00
James Phillips 63011dd393
Copies the autopilot settings from the runtime config.
Fixes #3730
2017-12-13 10:32:05 -08:00
Kyle Havlovitz f347c8a531
More refactoring to make autopilot consul-agnostic 2017-12-12 17:46:28 -08:00
Yoann Fouquet f4f7db0059 [Fix] Service tags not added to health checks
Since commit 9685bdcd0ba4b4b3adb04f9c1dd67d637ca7894e, service tags are added to the health checks.
Otherwise, when adding a service, tags are not added to its check.

In updateSyncState, we compare the checks of the local agent with the checks of the catalog.
It appears that the service tags are different (missing in one case), and so the check is synchronized.
That increase the ModifyIndex periodically when nothing changes.

Fixed it by adding serviceTags to the check.

Note that the issue appeared in version 0.8.2.
Looks related to #3259.
2017-12-12 13:39:37 +01:00
Kyle Havlovitz 8546a1d3c6
Move autopilot to a standalone package 2017-12-11 16:45:33 -08:00
James Phillips 32b64575d1
Moves Serf helper into lib to fix import cycle in consul-enterprise. 2017-12-07 16:57:58 -08:00
James Phillips c16cce80bb
Turns of intent queue warnings and enables dynamic queue sizing. 2017-12-07 16:27:06 -08:00
Wei Wei 04531ff0fb fix globalRPC goroutine leak
Signed-off-by: Wei Wei <weiwei.inf@gmail.com>
2017-12-05 11:53:30 +08:00
James Phillips c4bc89a187
Creates a registration mechanism for snapshot and restore. 2017-11-29 18:36:53 -08:00
James Phillips 8571555703
Begins split out of snapshots from the main FSM class. 2017-11-29 18:36:53 -08:00
James Phillips 4eaee8e0ba
Creates a registration mechanism for FSM commands. 2017-11-29 18:36:53 -08:00
James Phillips 3e7ea1931c
Moves the FSM into its own package.
This will help make it clearer what happens when we add some registration
plumbing for the different operations and snapshots.
2017-11-29 18:36:53 -08:00
James Phillips 7f3783f4be
Resolves an FSM snapshot TODO.
This adds checks for sink write calls before we continue the refactor, which
will resolve the other TODO comment we deleted as part of this change.
2017-11-29 18:36:53 -08:00
James Phillips 5a24d37ac0
Creates a registration mechanism for schemas.
This also splits out the registration into the table-specific source
files.
2017-11-29 18:36:52 -08:00
James Phillips 36bb30e67a
Creates a registration mechanism for RPC endpoints. 2017-11-29 18:36:52 -08:00
James Phillips 8f802411c4
Creates HTTP endpoint registry. 2017-11-29 18:36:52 -08:00
James Phillips 8da7f0ff7e
Moves coordinate disabled logic down into endpoints.
Similar rationale to the previous change for ACLs.
2017-11-29 18:36:52 -08:00
James Phillips cf30d409c9
Moves ACL disabled response logic down into endpoints.
This lets us make the registration of endpoints less fancy, on the
road to adding a registration mechanism.
2017-11-29 18:36:52 -08:00
James Phillips 6234f0bd46
Renames "segments" to "segment" to be consistent with other files. 2017-11-29 18:36:52 -08:00
James Phillips ba56669ea8
Renames stubs to be more consistent. 2017-11-29 18:36:52 -08:00
James Phillips 56552095c9
Sheds monotonic time info so tombstone GC bins work properly. 2017-11-29 10:34:24 -08:00
James Phillips 8656b7a3e9
Gives back the lock before writing to the expire channel.
The lock isn't needed after we clean up the expire bin, and as seen
in #3700 we can get into a deadlock waiting to place the expire index
into the channel while holding this lock.

Fixes #3700
2017-11-19 16:24:16 -08:00
James Phillips ae85cc4070
Skips files with unknown extensions when not forcing a format.
Fixes #3685
2017-11-10 18:06:07 -08:00
James Phillips d5bf4e9c6e
Adds a snapshot agent stub to the config structure.
Fixes #3678
2017-11-10 13:50:45 -08:00
James Phillips 50cdff36e5
Cleans up check logging.
There were places where we still didn't have the script vs. args sorted
correctly so changed all the logging to be just based on check IDs and
also made everything uniform.

Also removed some annoying debug logging, and moved some of the large output
logging to TRACE level.

Closes #3602
2017-11-10 12:48:44 -08:00
James Phillips 8210523b1b
Moves the LAN event handler after the router is created.
Fixes #3680
2017-11-10 12:26:48 -08:00
James Phillips bfbbfb62ca
Revert "Adds a small sleep to make sure we are in the next GC bucket." 2017-11-08 22:18:37 -08:00
James Phillips d6328a5bf8
Adds a sleep to make sure we are in the next GC bucket, ups time.
Fixes #3670
2017-11-08 22:02:40 -08:00
James Phillips 91824375be
Skips the tombstone GC test in Travis for now.
Related to #3670
2017-11-08 20:14:20 -08:00
James Phillips c060df20de
Adds missing os import. 2017-11-08 20:02:22 -08:00
James Phillips b94ba8aeb4
Removes bogus getPort() in favor of freeport. 2017-11-08 19:55:50 -08:00
James Phillips 04a7907a7e
Skips IPv6 test in Travis. 2017-11-08 18:28:45 -08:00
James Phillips c52824bab7
Adds a longer retry period for the AE deferred output test.
There's some justification in the comments about this and a TODO to
improve this later.

Fixes #3668
2017-11-08 18:10:13 -08:00
James Phillips 444a345a3a
Tightens timing up and reorders GC test to be less flaky. 2017-11-08 15:09:29 -08:00
James Phillips e00624425b
Doubles the GC timing. 2017-11-08 15:01:11 -08:00
James Phillips 8eb91777d9
Opens up test timing a little more. 2017-11-08 14:01:19 -08:00
James Phillips d45c2a01f1
Shifts off a gran boundary to help make test less flaky. 2017-11-08 13:57:17 -08:00
James Phillips 757e353334
Opens up the tombstone GC test timing. 2017-11-08 13:43:39 -08:00
James Phillips 532cafe0af
Adds enable_agent_tls_for_checks configuration option which allows (#3661)
HTTP health checks for services requiring 2-way TLS to be checked
using the agent's credentials.
2017-11-07 18:22:09 -08:00
James Phillips 9de2d8921f
Saves the cycled server list after a failed ping when rebalancing. (#3662)
Fixes #3463
2017-11-07 18:13:23 -08:00
James Phillips d938493671
Double-books the HTTP metrics w/ and w/o the "consul" prefix.
Fixes #3654
2017-11-07 16:32:45 -08:00
James Phillips 8709f65afd
Adds HTTP/2 support to Consul's HTTPS server. (#3657)
* Refactors the HTTP listen path to create servers in the same spot.

* Adds HTTP/2 support to Consul's HTTPS server.

* Vendors Go HTTP/2 library and associated deps.
2017-11-07 15:06:59 -08:00
James Phillips 021373d72e
Makes the metrics ACL test call the right endpoint.
This also required setting up a proper in-mem sink so we don't get
metrics-related errors.

Fixes #3655
2017-11-06 21:50:04 -08:00
Preetha Appan ae9e204b3a Sets tty in docker client back to true, as a potential fix for docker exec weirdness 2017-11-05 09:44:55 -06:00
Kyle Havlovitz 068ca11eb8
Move check definition to a sub-struct 2017-11-01 14:54:46 -07:00
Kyle Havlovitz bc3ba5f873
Merge branch 'master' into esm-changes 2017-11-01 11:37:48 -07:00
Kyle Havlovitz 83524f44c4
Merge pull request #3622 from hashicorp/coordinate-node-endpoint
agent: add /v1/coordianate/node/:node endpoint
2017-11-01 11:35:50 -07:00
Kyle Havlovitz 3542d7fcb6
Remove redundant lines from coordinate test 2017-11-01 11:25:33 -07:00
Kyle Havlovitz 9909b661ac
Fill out the tests around coordinate/node functionality 2017-10-31 15:36:44 -07:00
Frank Schröder 3cb1cd3723 config: add -config-format option (#3626)
* config: refactor ReadPath(s) methods without side-effects

Return the sources instead of modifying the state.

* config: clean data dir before every test

* config: add tests for config-file and config-dir

* config: add -config-format option

Starting with Consul 1.0 all config files must have a '.json' or '.hcl'
extension to make it unambigous how the data should be parsed. Some
automation tools generate temporary files by appending a random string
to the generated file which obfuscates the extension and prevents the
file type detection.

This patch adds a -config-format option which can be used to override
the auto-detection behavior by forcing all config files or all files
within a config directory independent of their extension to be
interpreted as of this format.

Fixes #3620
2017-10-31 17:30:01 -05:00
Frank Schröder 56561523cf vendor: update go-discover (#3634)
* vendor: update go-discover

Pull in providers:

 * Aliyun (Alibaba Cloud)
 * Digital Ocean
 * OpenStack (os)
 * Scaleway

* doc: use ... instead of xxx

* doc: strip trailing whitespace

* doc: add docs for aliyun, digitalocean, os and scaleway

* agent: fix test
2017-10-31 17:03:54 -05:00
Kyle Havlovitz fd4d9f1c16
Factor out registerNodes function 2017-10-31 13:34:49 -07:00
James Phillips c6e0366c02
Relaxes Autopilot promotion logic. (#3623)
* Relaxes Autopilot promotion logic.

When we defaulted the Raft protocol version to 3 in #3477 we made
the numPeers() routine more strict to only count voters (this is
more conservative and more correct). This had the side effect of
breaking rolling updates because it's at odds with the Autopilot
non-voter promotion logic.

That logic used to wait to only promote to maintain an odd quorum
of servers. During a rolling update (add one new server, wait, and
then kill an old server) the dead server cleanup would still count
the old server as a peer, which is conservative and the right thing
to do, and no longer count the non-voter. This would wait to promote,
so you could get into a stalemate. It is safer to promote early than
remove early, so by promoting as soon as possible we have chosen
that as the solution here.

Fixes #3611

* Gets rid of unnecessary extra not-a-voter check.
2017-10-31 15:16:56 -05:00
Frank Schroeder 82a52d3b50
docker: fix failing test 2017-10-31 09:26:34 +01:00
Frank Schroeder ed1b1b54cd
docker: render errors with %v since they can be nil 2017-10-31 09:19:20 +01:00
Kyle Havlovitz 2c7f7799bb
Add tests around coordinate update endpoint 2017-10-26 20:12:54 -07:00
Kyle Havlovitz 496dd7ab5b
Merge branch 'coordinate-node-endpoint' of github.com:hashicorp/consul into esm-changes 2017-10-26 19:20:24 -07:00
Kyle Havlovitz f80e70271d
Added Coordinate.Node rpc endpoint and client api method 2017-10-26 19:16:40 -07:00
Frank Schroeder 87206133be
agent: add /v1/coordianate/node/:node endpoint
This patch adds a /v1/coordinate/node/:node endpoint to get the network
coordinates for a single node in the network.

Since Consul Enterprise supports network segments it is still possible
to receive mutiple entries for a single node - one per segment.
2017-10-26 14:24:42 +02:00
Frank Schroeder 712447026f
docker: add comment about "connection reset by peer" error 2017-10-26 12:14:19 +02:00
Frank Schroeder 7d05e55734
docker: stop previous check on replace 2017-10-26 12:03:07 +02:00
Frank Schroeder bf98779d84
docker: close idle connections on stop 2017-10-26 12:02:39 +02:00
Frank Schroeder 0a9d2a367e
docker: do not alloc a tty since this is not interactive 2017-10-26 11:56:54 +02:00
Frank Schroeder b1a5a6b64d
docker: make sure to log the error when we fall through 2017-10-26 11:56:36 +02:00
Frank Schroeder b907c4611d
docker: ignore "connection reset by peer"
The Docker agent closes the connection during read after we have
read the body. This causes a "connection reset by peer" even though
the command was successful.

We ignore that error here since we got the correct status code
and a response body.
2017-10-26 11:56:08 +02:00
Kyle Havlovitz 16908be034
Add deregister critical service field and refactor duration parsing 2017-10-25 19:17:41 -07:00
Kyle Havlovitz ab3dac2379
Added coordinate update http endpoint 2017-10-25 19:37:30 +02:00
Kyle Havlovitz 7d82ece118
Added remaining HTTP health check fields to structs 2017-10-25 19:37:30 +02:00
Kyle Havlovitz 84a07ea113
Expose SkipNodeUpdate field and some health check info in the http api 2017-10-25 19:37:30 +02:00
Frank Schroeder 2719194b6d
fix go vet issue 2017-10-25 19:30:35 +02:00
Frank Schroeder 1eb3d0e0d4
replace custom unique id with a UUID 2017-10-25 19:30:35 +02:00
Frank Schroeder 1dab004335
Decouple the code that executes checks from the agent 2017-10-25 11:18:07 +02:00
Frank Schroeder 1d2ae14719
local state: fix go vet issue 2017-10-23 10:56:05 +02:00
Frank Schroeder a818414bb6
local state: remove stale comment 2017-10-23 10:56:05 +02:00
Frank Schroeder 329fdc40a8
local state: make test more robust 2017-10-23 10:56:05 +02:00
Frank Schroeder f5a3d73b27
local state: clone check to avoid side effect 2017-10-23 10:56:05 +02:00
Frank Schroeder b36613e7ff
local state: use synchronized access to internal maps 2017-10-23 10:56:05 +02:00
Frank Schroeder e5318061d1
ae: do not trigger on Resume while holding the lock 2017-10-23 10:56:05 +02:00
Frank Schroeder 7d5dfa9c53
ae: add remaining test cases 2017-10-23 10:56:05 +02:00
Frank Schroeder ae7269458c
ae: refactor StateSyncer to state machine for better testing 2017-10-23 10:56:05 +02:00
Frank Schroeder a72b68c562
ae: add test that we run a full before a partial sync 2017-10-23 10:56:05 +02:00
Frank Schroeder 27bc11f005
ae: make control flow more explicit 2017-10-23 10:56:05 +02:00
Frank Schroeder 29435004f6
ae: fix typo in constructor name 2017-10-23 10:56:05 +02:00
Frank Schroeder 1bb1a6787e
ae: add test for resume triggering SyncChanges 2017-10-23 10:56:05 +02:00
Frank Schroeder 6de645a8b8
ae: add test for ifNotPausedRun 2017-10-23 10:56:05 +02:00
Frank Schroeder 4f79ee3188
ae: make stagger function pluggable for testing 2017-10-23 10:56:05 +02:00
Frank Schroeder 5d6089b6fa
ae: restore previous pause/resume behavior 2017-10-23 10:56:04 +02:00
Frank Schroeder 6064a2a764
ae: ensure that syncs are blocked when paused 2017-10-23 10:56:04 +02:00
Frank Schroeder f187c37c27
local state: rename Add{Check,Service}State to Set{Check,Service}State 2017-10-23 10:56:04 +02:00
Frank Schroeder 209e67b2f9
local state: move Metadata methods together 2017-10-23 10:56:04 +02:00
Frank Schroeder 9513a042be
local state: update documentation of updateSyncState 2017-10-23 10:56:04 +02:00
Frank Schroeder 2e3b72d2c3
local state: update comments 2017-10-23 10:56:04 +02:00
Frank Schroeder da604495a0
local state: address review comments
* move non-blocking notification mechanism into ae.Trigger
* move Pause/Resume into separate type
2017-10-23 10:56:04 +02:00
Frank Schroeder c39bc770b3
local state: refactor TestAgentAntiEntropy_EnableTagOverride
Make intent clearer by being more explicit and adding some comments.
Use verify.Values to compare service entries.
2017-10-23 10:56:04 +02:00
Frank Schroeder e1358a541d
local state: fix TestAgentAntiEntropy_EnableTagOverride
The test had a race condition where it relied on the first service to be
synced to the remote catalog which sometimes failed.
2017-10-23 10:56:04 +02:00
Frank Schroeder b3195006b1
local state: rename tests 2017-10-23 10:56:04 +02:00
Frank Schroeder d9a4b440a8
local state: drop retry loops from tests
Since the tests are now using synchronous calls for state syncing
we no longer need to use retry loops to wait for the changes to
propagate.
2017-10-23 10:56:04 +02:00
Frank Schroeder 462de8b2cb
agent: skip non-sensical TestCatalogRegister
It is not clear what this test is supposed to verify.
2017-10-23 10:56:04 +02:00
Frank Schroeder 32c2d1b217
local state: fix anti-entropy state tests
The anti-entropy tests relied on the side-effect of the StartSync()
method to perform a full sync instead of a partial sync. This lead to
multiple anti-entropy go routines being started unnecessary retry loops.

This change changes the behavior to perform synchronous full syncs when
necessary removing the need for all of the time.Sleep and most of the
retry loops.
2017-10-23 10:56:04 +02:00
Frank Schroeder 6b966e48ce
local state: fix test with updated error message 2017-10-23 10:56:04 +02:00
Frank Schroeder 632c43e575
local state: fix failing tests 2017-10-23 10:56:03 +02:00
Frank Schroeder ea92ee308a
local state: tests compile 2017-10-23 10:56:03 +02:00
Frank Schroeder 7289576988
local state: replace multi-map state with structs
The state of the service and health check records was spread out over
multiple maps guarded by a single lock. Access to the maps has to happen
in a coordinated effort and the tests often violated this which made
them brittle and racy.

This patch replaces the multiple maps with a single one for both checks
and services to make the code less fragile.

This is also necessary since moving the local state into its own package
creates circular dependencies for the tests. To avoid this the tests can
no longer access internal data structures which they should not be doing
in the first place.

The tests still don't compile but this is a ncessary step in that
direction.
2017-10-23 10:56:03 +02:00
Frank Schroeder bc7571cccf
local state: move to separate package
This patch moves the local state to a separate package to further
decouple it from the agent code.

The code compiles but the tests do not yet.
2017-10-23 10:56:03 +02:00
Frank Schroeder b363586813
agent: simplify some loops 2017-10-23 10:56:03 +02:00
Frank Schroeder 218cd4d501
agent: refactor sync loop to linear flow of control 2017-10-23 10:56:03 +02:00
Frank Schroeder 8cc4ee886d
agent: cleanup StateSyncer
This patch cleans up the state syncer code by renaming fields, adding
helpers and documentation.
2017-10-23 10:56:03 +02:00
Frank Schroeder e7051da8d1
agent: decouple anti-entropy from local state
The anti-entropy code manages background synchronizations of the local
state on a regular basis or on demand when either the state has changed
or a new consul server has been added.

This patch moves the anti-entropy code into its own package and
decouples it from the local state code since they are performing
two different functions.

To simplify code-review this revision does not make any optimizations,
renames or refactorings. This will happen in subsequent commits.
2017-10-23 10:56:03 +02:00
Frank Schroeder c2a3f5e007
Merge pull request #3585 from hashicorp/document-runtime-config
Moving the previous `agent/config.go` documentation to
`agent/config/runtime.go`.
2017-10-23 10:51:22 +02:00
Frank Schroeder f3ecef290c
Merge pull request #3598 from hashicorp/issue-3397-error-with-extra-flags
The `consul agent` command was ignoring extra command line arguments
which can lead to confusion when the user has for example forgotten to
add a dash in front of an argument or is not using an `=` when setting
boolean flags to `true`. `-bootstrap true` is not the same as
`-bootstrap=true`, for example.

Since all command line flags are known and we don't expect unparsed
arguments we can return an error. However, this may make it slightly
more difficult in the future if we ever wanted to have these kinds of
arguments.

Fixes #3397
2017-10-23 10:47:04 +02:00
Frank Schroeder 6517d7720a
Merge pull request #3600 from hashicorp/support-go-sockaddr-for-dns-recursors
DNS recursors can be added through go-sockaddr templates. Entries
are deduplicated while the order is maintained.

Originally proposed by @taylorchu

See #2932
2017-10-23 10:45:36 +02:00
Frank Schroeder 0ec82b0445
Revert "config: add support for go-sockaddr templates for DNS recursors"
This reverts commit 72bee6284d44e0ed3e18e6819188f1d32528478c.
2017-10-23 10:08:35 +02:00
Frank Schroeder 39d1ceffb5
Revert "config: do not allow an ANY address as DNS recursor"
This reverts commit 1db8d3cb00bec980480d525571cea76c6c49c3e7.
2017-10-23 10:08:35 +02:00
Frank Schroeder 759350b61f
Revert "agent: decouple anti-entropy from local state"
This reverts commit a842dc9c2bf00855ef93211232da36b2d91eab5b.
2017-10-23 10:08:35 +02:00
Frank Schroeder 0d0233acd7
Revert "agent: cleanup StateSyncer"
This reverts commit b7136e100bef727144c202ae55f81152ac6a8b5f.
2017-10-23 10:08:35 +02:00
Frank Schroeder 113ba546e7
Revert "agent: refactor sync loop to linear flow of control"
This reverts commit 7a2af206ea964fc0846f9b80c10ea9d91cb3c99e.
2017-10-23 10:08:35 +02:00
Frank Schroeder a96798f441
Revert "agent: simplify some loops"
This reverts commit b5dbad910c63b29dadf10731808d6891067a2bfa.
2017-10-23 10:08:34 +02:00
Frank Schroeder 443fe8e4db
Revert "local state: move to separate package"
This reverts commit d447e823c63720c74bb02459a985724f035f023e.
2017-10-23 10:08:34 +02:00
Frank Schroeder 435b442c8b
Revert "local state: replace multi-map state with structs"
This reverts commit ccbae7da5bceeb2328ab7993a8badbf2e72a4597.
2017-10-23 10:08:34 +02:00
Frank Schroeder 138aa25280
Revert "local state: tests compile"
This reverts commit 1af52bf7be02d952e16e14209899a9715451f7ba.
2017-10-23 10:08:34 +02:00
Frank Schroeder 571637cf7d
Revert "local state: fix failing tests"
This reverts commit 76682da4a048b623a1c201eb56710639437a9ed9.
2017-10-23 10:08:34 +02:00
Frank Schroeder 80d9df69e4
Revert "local state: fix test with updated error message"
This reverts commit e9149f64d9afb38246f9432edd806321c1eefb83.
2017-10-23 10:08:34 +02:00
Frank Schroeder ded6f79b6a
Revert "local state: fix anti-entropy state tests"
This reverts commit f8e20cd9960e19bbe61e258c445250723870816f.
2017-10-23 10:08:34 +02:00
Frank Schroeder 9b40a3cde6
Revert "agent: skip non-sensical TestCatalogRegister"
This reverts commit ce1f35373e6ad1486cb83398fce70f5c54f910d3.
2017-10-23 10:08:34 +02:00
Frank Schroeder 5aa77fb9e4
Revert "local state: drop retry loops from tests"
This reverts commit 2bdba8ab06d1c9dd99d5e7cf8370c94b4f7adfaa.
2017-10-23 10:08:34 +02:00
Frank Schroeder d7bb81a940
Revert "local state: rename tests"
This reverts commit ff62eaf0634a4c09377c53d4623685437f217b49.
2017-10-23 10:08:34 +02:00
Frank Schroeder 3d67ce9000
Revert "local state: fix TestAgentAntiEntropy_EnableTagOverride"
This reverts commit 86f7ea601342d6f3ceb9d0dc74bd5b33dae0b8d8.
2017-10-23 10:08:34 +02:00
Frank Schroeder 1bd73d2a6e
Revert "local state: refactor TestAgentAntiEntropy_EnableTagOverride"
This reverts commit c28e23eac8ada7a668b13e9a4a3d8066457488ef.
2017-10-23 10:08:33 +02:00
Frank Schroeder c72d21813b
Revert "local state: address review comments"
This reverts commit 1d315075b15647db7fcd42986c9c5673cbb77a77.
2017-10-23 10:08:33 +02:00
Frank Schroeder 4177bad4f3
Revert "local state: update comments"
This reverts commit 42188164f885188e3bc8cff70ea5aeb47d633b83.
2017-10-23 10:08:33 +02:00
Frank Schroeder d1e514cedc
Revert "local state: update documentation of updateSyncState"
This reverts commit e86521e637d742bce1e460b6b960037cef3578ed.
2017-10-23 10:08:33 +02:00
Frank Schroeder 133b23fb77
Revert "local state: move Metadata methods together"
This reverts commit 9bc8127728a62beb94b28849070b6ac35c181404.
2017-10-23 10:08:33 +02:00
Frank Schroeder 67135cc33e
Revert "local state: rename Add{Check,Service}State to Set{Check,Service}State"
This reverts commit 9280841a80d98b253a8f23967875e45e5e37e093.
2017-10-23 10:08:33 +02:00
Frank Schroeder 79405b7ce3
Revert "ae: ensure that syncs are blocked when paused"
This reverts commit ffb265dd939cefd7e865d624d0a4ba81f88e4505.
2017-10-23 10:08:33 +02:00
Frank Schroeder cb28151210
Revert "ae: restore previous pause/resume behavior"
This reverts commit 126046be2366afde97fdf01d0dceece2977dd6d6.
2017-10-23 10:08:33 +02:00
Frank Schroeder ec186ca7ba
Revert "ae: make stagger function pluggable for testing"
This reverts commit 066ad01c3804a39161c4d46147aa2eb86e48cd67.
2017-10-23 10:08:33 +02:00
Frank Schroeder 272b729ad9
Revert "ae: add test for ifNotPausedRun"
This reverts commit f5177ef332b8396aea368b7dea18a976e029a54f.
2017-10-23 10:08:33 +02:00
Frank Schroeder 54cebeda74
Revert "ae: add test for resume triggering SyncChanges"
This reverts commit cd0262744deab9294af17e4039ed2e2a27818cde.
2017-10-23 10:08:32 +02:00
Frank Schroeder 7a92839663
Revert "ae: fix typo in constructor name"
This reverts commit e88f49e2ccc79f77b5395de28d8fbdfc6eeb36f6.
2017-10-23 10:08:32 +02:00
Frank Schroeder 4bdf161d5c
Revert "ae: make control flow more explicit"
This reverts commit b9a8b53d5257d679c427422c324e6f124664db60.
2017-10-23 10:08:32 +02:00
Frank Schroeder 6eae18ac63
Revert "ae: add test that we run a full before a partial sync"
This reverts commit 8158cec82969a1262de793ee46e13f63cc2954e6.
2017-10-23 10:08:32 +02:00
Frank Schroeder 701ed61230
Revert "ae: refactor StateSyncer to state machine for better testing"
This reverts commit 8a45365f68147dd13c1232bc574bd54e4a12e478.
2017-10-23 10:08:32 +02:00
Frank Schroeder 27b8e55dc4
Revert "ae: add remaining test cases"
This reverts commit c32915bb4ff28b1670e88edaf0bbb9779dc6e2bc.
2017-10-23 10:08:32 +02:00
Frank Schroeder c27a2f1c66
Revert "ae: do not trigger on Resume while holding the lock"
This reverts commit bd00814301bbf5db037e739e91a0b66ebb564821.
2017-10-23 10:08:32 +02:00
Frank Schroeder 655a24e383
Revert "local state: use synchronized access to internal maps"
This reverts commit 39a2d8d25e629823e183e384e8414171edcf4164.
2017-10-23 10:08:32 +02:00
Frank Schroeder fe0f7c961d
Revert "local state: clone check to avoid side effect"
This reverts commit af1243c7251fe6291145bbe4f4dacd374779c425.
2017-10-23 10:08:32 +02:00
Frank Schroeder a6e55af1e0
Revert "local state: make test more robust"
This reverts commit f9267380dbcdb2746393eee5067e5037dff2b184.
2017-10-23 10:08:32 +02:00
Frank Schroeder 64d9529d94
Revert "local state: remove stale comment"
This reverts commit 35f4acdddc38f0371543e12b0d2ccb27a0a64869.
2017-10-23 10:08:32 +02:00
Frank Schroeder aaf6e376e2
Revert "config: document telemetry options"
This reverts commit 7fede4472d4db87b4652253510556906821ac756.
2017-10-23 10:08:31 +02:00
Frank Schroeder dcb70f7def
Revert "config: document http options"
This reverts commit 5f598574483e59af377ae447857d40872563ac87.
2017-10-23 10:08:31 +02:00
Frank Schroeder 1ce9c1ba45
Revert "config: document dns options"
This reverts commit 5e57e9273ea6e27eb79ed531189fd3c69e19c7b2.
2017-10-23 10:08:31 +02:00
Frank Schroeder 3638b2932d
Revert "config: document autopilot options"
This reverts commit 7685ef409c9340227e3969a86e0bfa627eafdd0f.
2017-10-23 10:08:31 +02:00
Frank Schroeder 32c8d7e0db
Revert "config: document acl options"
This reverts commit 7396bd31fd1c35347d0c88284ea358bfbd9ca948.
2017-10-23 10:08:31 +02:00
Frank Schroeder 83bfe0d223
Revert "config: document config options"
This reverts commit a38c69d7844a4e4a9d54051470977c7446fc8e57.
2017-10-23 10:08:31 +02:00
Frank Schroeder ff42a00ef1
Revert "config: document more acl options"
This reverts commit 2682ce0b825528f379487cbd11c2c2a321fd94e8.
2017-10-23 10:08:31 +02:00
Frank Schroeder 6ae5c0594a
Revert "config: document more config options"
This reverts commit bf81cdea6eb90ae099201c545910780ccd020540.
2017-10-23 10:08:31 +02:00
Frank Schroeder 2d96f700c0
Revert "config: document more config options"
This reverts commit 9864609201f333cd90af9535e2e4d4c7c1e65c4a.
2017-10-23 10:08:31 +02:00
Frank Schroeder bd72f3676d
Revert "config: document remaining config options"
This reverts commit 0ed4561f12cf52a44ab1f1887d4f4bfd27df77f4.
2017-10-23 10:08:31 +02:00
Frank Schroeder 64ac9f4f42
Revert "config: address review comments"
This reverts commit f6e9ad99ec6d230c1eb7f655d2a83e5a4e33807d.
2017-10-23 10:08:31 +02:00
Frank Schroeder c5c586c7f1
Revert "fix go vet issue"
This reverts commit 078e50b635846920a17a3b06c678b2b63b40677a.
2017-10-23 10:00:26 +02:00
Frank Schröder b580f7a533 Revert "config: rename test struct field to args"
This reverts commit f2a291a4704c9954c441567860ca740ac0f58308.
2017-10-23 09:58:37 +02:00
Frank Schröder 0f0b510f37 Revert "config: return error on extra command line arguments (#3397)"
This reverts commit ce935cef55f7298e85843b3ca17ad13415e40e0b.
2017-10-23 09:58:37 +02:00
Frank Schroeder ef3108abc3
fix go vet issue 2017-10-23 08:13:52 +02:00
Frank Schroeder 24689c4f5e config: return error on extra command line arguments (#3397)
The `consul agent` command was ignoring extra command line arguments
which can lead to confusion when the user has for example forgotten to
add a dash in front of an argument or is not using an `=` when setting
boolean flags to `true`. `-bootstrap true` is not the same as
`-bootstrap=true`, for example.

Since all command line flags are known and we don't expect unparsed
arguments we can return an error. However, this may make it slightly
more difficult in the future if we ever wanted to have these kinds of
arguments.

Fixes #3397
2017-10-23 08:08:09 +02:00
Frank Schroeder c8ed3044de config: rename test struct field to args 2017-10-23 08:08:09 +02:00
Frank Schroeder c624c72d5c
config: return error on extra command line arguments (#3397)
The `consul agent` command was ignoring extra command line arguments
which can lead to confusion when the user has for example forgotten to
add a dash in front of an argument or is not using an `=` when setting
boolean flags to `true`. `-bootstrap true` is not the same as
`-bootstrap=true`, for example.

Since all command line flags are known and we don't expect unparsed
arguments we can return an error. However, this may make it slightly
more difficult in the future if we ever wanted to have these kinds of
arguments.

Fixes #3397
2017-10-23 08:07:48 +02:00
Frank Schroeder c1d249c681
config: rename test struct field to args 2017-10-23 08:07:48 +02:00
Frank Schroeder 88a1f3aa53 config: address review comments 2017-10-23 08:06:26 +02:00
Frank Schroeder 97277de196 config: document remaining config options 2017-10-23 08:06:26 +02:00
Frank Schroeder 638bc0e0ac config: document more config options 2017-10-23 08:06:26 +02:00
Frank Schroeder f326eae23e config: document more config options 2017-10-23 08:06:26 +02:00
Frank Schroeder 6332f4d6a7 config: document more acl options 2017-10-23 08:06:26 +02:00
Frank Schroeder 61c8ee564a config: document config options 2017-10-23 08:06:26 +02:00
Frank Schroeder 71112a6510 config: document acl options 2017-10-23 08:06:26 +02:00
Frank Schroeder e00869647c config: document autopilot options 2017-10-23 08:06:26 +02:00
Frank Schroeder 05c1038d58 config: document dns options 2017-10-23 08:06:26 +02:00
Frank Schroeder a17d711b39 config: document http options 2017-10-23 08:06:26 +02:00
Frank Schroeder f95b8f5060 config: document telemetry options 2017-10-23 08:06:26 +02:00
Frank Schroeder 44cc334fc4
config: address review comments 2017-10-23 08:05:47 +02:00
Frank Schroeder 28083cb330
config: document remaining config options 2017-10-23 08:04:03 +02:00
Frank Schroeder b2c2386d75
config: document more config options 2017-10-23 08:04:03 +02:00
Frank Schroeder 7b1f4d5838
config: document more config options 2017-10-23 08:04:03 +02:00
Frank Schroeder b01702f129
config: document more acl options 2017-10-23 08:04:03 +02:00
Frank Schroeder 32ed645df5
config: document config options 2017-10-23 08:04:03 +02:00
Frank Schroeder 3d68185206
config: document acl options 2017-10-23 08:04:03 +02:00
Frank Schroeder 4fc973a256
config: document autopilot options 2017-10-23 08:04:03 +02:00
Frank Schroeder 66e1b9fb64
config: document dns options 2017-10-23 08:04:03 +02:00
Frank Schroeder dc4e90e6c0
config: document http options 2017-10-23 08:04:03 +02:00
Frank Schroeder 018b8d66d6
config: document telemetry options 2017-10-23 08:04:03 +02:00
Frank Schroeder 39dddd433e local state: remove stale comment 2017-10-23 08:03:18 +02:00
Frank Schroeder 779bf04aa3 local state: make test more robust 2017-10-23 08:03:18 +02:00
Frank Schroeder e16c8f3982 local state: clone check to avoid side effect 2017-10-23 08:03:18 +02:00
Frank Schroeder f1028d2486 local state: use synchronized access to internal maps 2017-10-23 08:03:18 +02:00
Frank Schroeder 606e13a569 ae: do not trigger on Resume while holding the lock 2017-10-23 08:03:18 +02:00
Frank Schroeder 82e9494f7f ae: add remaining test cases 2017-10-23 08:03:18 +02:00
Frank Schroeder cf6e6c72ac ae: refactor StateSyncer to state machine for better testing 2017-10-23 08:03:18 +02:00
Frank Schroeder ba8cb5e3d0 ae: add test that we run a full before a partial sync 2017-10-23 08:03:18 +02:00
Frank Schroeder 0d73333c8e ae: make control flow more explicit 2017-10-23 08:03:18 +02:00
Frank Schroeder 0cfe0a42f1 ae: fix typo in constructor name 2017-10-23 08:03:18 +02:00
Frank Schroeder 25ee03e165 ae: add test for resume triggering SyncChanges 2017-10-23 08:03:18 +02:00
Frank Schroeder f61ad4efcd ae: add test for ifNotPausedRun 2017-10-23 08:03:18 +02:00
Frank Schroeder c2ab582d47 ae: make stagger function pluggable for testing 2017-10-23 08:03:18 +02:00
Frank Schroeder f352952fb3 ae: restore previous pause/resume behavior 2017-10-23 08:03:18 +02:00
Frank Schroeder c1badf1034 ae: ensure that syncs are blocked when paused 2017-10-23 08:03:18 +02:00
Frank Schroeder a4d03797ae local state: rename Add{Check,Service}State to Set{Check,Service}State 2017-10-23 08:03:18 +02:00
Frank Schroeder 7414fe0a2a local state: move Metadata methods together 2017-10-23 08:03:18 +02:00
Frank Schroeder 3e953f67da local state: update documentation of updateSyncState 2017-10-23 08:03:18 +02:00
Frank Schroeder ecea35d955 local state: update comments 2017-10-23 08:03:18 +02:00
Frank Schroeder 40e17f9f01 local state: address review comments
* move non-blocking notification mechanism into ae.Trigger
* move Pause/Resume into separate type
2017-10-23 08:03:18 +02:00
Frank Schroeder cfc8bd092f local state: refactor TestAgentAntiEntropy_EnableTagOverride
Make intent clearer by being more explicit and adding some comments.
Use verify.Values to compare service entries.
2017-10-23 08:03:18 +02:00
Frank Schroeder b288c1eb9b local state: fix TestAgentAntiEntropy_EnableTagOverride
The test had a race condition where it relied on the first service to be
synced to the remote catalog which sometimes failed.
2017-10-23 08:03:18 +02:00
Frank Schroeder 71e4d7cd1c local state: rename tests 2017-10-23 08:03:18 +02:00
Frank Schroeder f43972fdc2 local state: drop retry loops from tests
Since the tests are now using synchronous calls for state syncing
we no longer need to use retry loops to wait for the changes to
propagate.
2017-10-23 08:03:18 +02:00
Frank Schroeder 3d10312cd6 agent: skip non-sensical TestCatalogRegister
It is not clear what this test is supposed to verify.
2017-10-23 08:03:18 +02:00
Frank Schroeder 2bef1d4d51 local state: fix anti-entropy state tests
The anti-entropy tests relied on the side-effect of the StartSync()
method to perform a full sync instead of a partial sync. This lead to
multiple anti-entropy go routines being started unnecessary retry loops.

This change changes the behavior to perform synchronous full syncs when
necessary removing the need for all of the time.Sleep and most of the
retry loops.
2017-10-23 08:03:18 +02:00
Frank Schroeder 36677bc90d local state: fix test with updated error message 2017-10-23 08:03:18 +02:00
Frank Schroeder 37c8492e5e local state: fix failing tests 2017-10-23 08:03:18 +02:00
Frank Schroeder 884f98f8aa local state: tests compile 2017-10-23 08:03:18 +02:00
Frank Schroeder 60095484c4 local state: replace multi-map state with structs
The state of the service and health check records was spread out over
multiple maps guarded by a single lock. Access to the maps has to happen
in a coordinated effort and the tests often violated this which made
them brittle and racy.

This patch replaces the multiple maps with a single one for both checks
and services to make the code less fragile.

This is also necessary since moving the local state into its own package
creates circular dependencies for the tests. To avoid this the tests can
no longer access internal data structures which they should not be doing
in the first place.

The tests still don't compile but this is a ncessary step in that
direction.
2017-10-23 08:03:18 +02:00
Frank Schroeder ef9aa6b3b6 local state: move to separate package
This patch moves the local state to a separate package to further
decouple it from the agent code.

The code compiles but the tests do not yet.
2017-10-23 08:03:18 +02:00
Frank Schroeder c03eba91d0 agent: simplify some loops 2017-10-23 08:03:18 +02:00
Frank Schroeder 98e5dc86fb agent: refactor sync loop to linear flow of control 2017-10-23 08:03:18 +02:00
Frank Schroeder 5302479ad5 agent: cleanup StateSyncer
This patch cleans up the state syncer code by renaming fields, adding
helpers and documentation.
2017-10-23 08:03:18 +02:00
Frank Schroeder 034ee43cef agent: decouple anti-entropy from local state
The anti-entropy code manages background synchronizations of the local
state on a regular basis or on demand when either the state has changed
or a new consul server has been added.

This patch moves the anti-entropy code into its own package and
decouples it from the local state code since they are performing
two different functions.

To simplify code-review this revision does not make any optimizations,
renames or refactorings. This will happen in subsequent commits.
2017-10-23 08:03:18 +02:00
Frank Schroeder 6df6ac03b7 config: do not allow an ANY address as DNS recursor 2017-10-23 08:01:25 +02:00
Frank Schroeder 3b13290144 config: add support for go-sockaddr templates for DNS recursors
DNS recursors can be added through go-sockaddr templates. Entries
are deduplicated while the order is maintained.

Originally proposed by @taylorchu

See #2932
2017-10-23 08:01:25 +02:00
James Phillips 2e7d048345
Cleans up import sorting. 2017-10-21 20:08:11 -07:00
Hadar Greinsmark 0c5f5e2821 Implement HTTP Watch handler (#3413)
Implement HTTP Watch handler
2017-10-21 20:39:09 -05:00
Frank Schroeder 74859ff3c0 test: replace porter tool with freeport lib
This patch removes the porter tool which hands out free ports from a
given range with a library which does the same thing. The challenge for
acquiring free ports in concurrent go test runs is that go packages are
tested concurrently and run in separate processes. There has to be some
inter-process synchronization in preventing processes allocating the
same ports.

freeport allocates blocks of ports from a range expected to be not in
heavy use and implements a system-wide mutex by binding to the first
port of that block for the lifetime of the application. Ports are then
provided sequentially from that block and are tested on localhost before
being returned as available.
2017-10-21 22:01:09 +02:00
Frank Schröder d26b0406e4 dns: return NXDOMAIN if datacenter is invalid (#3200) (#3596)
Queries to the DNS server can contain an optional datacenter
name in the query name. You can query for 'foo.service.consul'
or 'foo.service.dc.consul' to get a response for either the
default or a specific datacenter.

Datacenter names cannot have dots, therefore the datacenter
name can refer to only one element in the DNS query name.

The DNS server allowed extra labels between the optional
datacenter name and the domain and returned a valid response
instead of returning NXDOMAIN. For example, if the domain
is set to '.consul' then 'foo.service.dc1.extra.consul'
should return NXDOMAIN because of 'extra' being between
the datacenter name 'dc1' and the domain '.consul'.

Fixes #3200
2017-10-20 16:49:17 -07:00
Frank Schroeder 6628ca1cf1
config: do not allow an ANY address as DNS recursor 2017-10-20 20:00:45 +02:00
Frank Schroeder 2c4f98cf12
config: add support for go-sockaddr templates for DNS recursors
DNS recursors can be added through go-sockaddr templates. Entries
are deduplicated while the order is maintained.

Originally proposed by @taylorchu

See #2932
2017-10-20 15:51:49 +02:00
James Phillips 3d52f42715 Fixes API client for ScriptArgs and updates documentation. (#3589)
* Updates the API client to support the current `ScriptArgs` parameter
for checks.

* Updates docs for checks to explain the `ScriptArgs` parameter issue.

* Adds mappings for "args" and "script-args" to give th API parity
with config.

* Adds checks on return codes.

* Removes debug logging that shows empty when args are used.
2017-10-18 11:28:39 -07:00
Ryan Slade 6f05ea91a3 Replace time.Now().Sub(x) with time.Since(x) 2017-10-17 20:38:24 +02:00
James Phillips 39f2359804 Fixes an XSS issue with unescaped node names. (#3578)
* Fixes an XSS issue with node names in the tomography graph.

* Updates built-in static web assets.

* Updates the change log.
2017-10-16 09:12:36 -07:00
James Phillips fdd08c78a9 Adds a brief wait and poll period to update check status after a timeout. (#3573)
* Adds a brief wait and poll period to update the check status
if we get stucking waiting for the processes to terminate.

Fixes #3570

* Jumps out of timeout case and includes script output.
2017-10-12 13:49:46 -07:00
James Phillips e9670761f9
Cleans up some drift between the OSS and Enterprise trees. 2017-10-11 15:53:07 -07:00
Kyle Havlovitz eea2bd2753 Kill check processes after the timeout is reached (#3567)
* Kill check processes after the timeout is reached

Kill the subprocess spawned by a script check once the timeout is reached. Previously Consul just marked the check critical and left the subprocess around.

Fixes #3565.

* Set err to non-nil when timeout occurs

* Fix check timeout test

* Kill entire process subtree on check timeout

* Add a docs note about windows subprocess termination
2017-10-11 11:57:39 -07:00
Frank Schroeder c4215bc04f
config: remove redundant code 2017-10-11 10:16:21 +02:00
Frank Schroeder 8cda75454a
config: fix check for segment.port <= 0 and add test 2017-10-11 10:15:55 +02:00
James Phillips a16dbc0212
Adds check to make sure port is given so we avoid a nil bind address. 2017-10-10 18:11:21 -07:00
James Phillips 275e83de08
Removes obsolete segment stub. 2017-10-10 17:21:32 -07:00
Frank Schröder 9b2e3c2091 agent: add option to discard health output (#3562)
* agent: add option to discard health output

In high volatile environments consul will have checks with "noisy"
output which changes every time even though the status does not change.
Since the output is stored in the raft log every health check update
unblocks a blocking call on health checks since the raft index has
changed even though the status of the health checks may not have changed
at all. By discarding the output of the health checks the users can
choose a different tradeoff. Less visibility on why a check failed in
exchange for a reduced change rate on the raft log.

* agent: discard output also when adding a check

* agent: add test for discard check output

* agent: update docs

* go vet

* Adds discard_check_output to reloadable config table.

* Updates the change log.
2017-10-10 17:04:52 -07:00
preetapan f6066f8305 Fixes agent error handling when check definition is invalid. Distingu… (#3560)
* Fixes agent error handling when check definition is invalid. Distinguishes between empty checks vs invalid checks

* Made CheckTypes return Checks from service definition struct rather than a new copy, and other changes from code review. This also errors when json payload contains empty structs

* Simplify and improve validate method, and make sure that CheckTypes always returns a new copy of validated check definitions

* Tweaks some small style things and error messages.

* Updates the change log.
2017-10-10 16:54:06 -07:00
Frank Schröder fa22ad4573 config: add generic method to translate between CamelCase and snake_case (#3557)
* doc: document discrepancy between id and CheckID

* doc: document enable_tag_override change

* config: add TranslateKeys helper

TranslateKeys makes it easier to map between different representations
of internal structures. It allows to recursively map alias keys to
canonical keys in structured maps.

* config: use TranslateKeys for config file

This also adds support for 'enabletagoverride' and removes
the need for a separate CheckID alias field.

* config: remove dead code

* agent: use TranslateKeys for FixupCheckType

* agent: translate enable_tag_override during service registration

* doc: add '.hcl' as valid extension

* config: map ScriptArgs to args

* config: add comment for TranslateKeys
2017-10-10 16:40:59 -07:00
James Phillips d1ad538345 Makes RPC handling more robust when rolling servers. (#3561)
* Adds client-side retry for no leader errors.

This paves over the case where the client was connected to the leader
when it loses leadership.

* Adds a configurable server RPC drain time and a fail-fast path for RPCs.

When a server leaves it gets removed from the Raft configuration, so it will
never know who the new leader server ends up being. Without this we'd be
doomed to wait out the RPC hold timeout and then fail. This makes things fail
a little quicker while a sever is draining, and since we added a client retry
AND since the server doing this has already shut down and left the Serf LAN,
clients should retry against some other server.

* Makes the RPC hold timeout configurable.

* Reorders struct members.

* Sets the RPC hold timeout default for test servers.

* Bumps the leave drain time up to 5 seconds.

* Robustifies retries with a simpler client-side RPC hold.

* Reverts untended delete.
2017-10-10 15:19:50 -07:00
Preetha Appan 25e64b5362 Fix unit test after dns library upgrade to account for correct data length 2017-10-06 17:40:17 -05:00
James Phillips a1db119d02 Fixes handling of stop channel and failed barrier attempts. (#3546)
* Fixes handling of stop channel and failed barrier attempts.

There were two issues here. First, we needed to not exit when there
was a timeout trying to write the barrier, because Raft might not
step down, so we'd be left as the leader but having run all the step
down actions.

Second, we didn't close over the stopCh correctly, so it was possible
to nil that out and have the leaderLoop never exit. We close over it
properly AND sequence the nil-ing of it AFTER the leaderLoop exits for
good measure, so the code is more robust.

Fixes #3545

* Cleans up based on code review feedback.

* Tweaks comments.

* Renames variables and removes comments.
2017-10-06 07:54:49 -07:00
Victor Boivie 77f7008363 Minor typo (boostrap) 2017-10-05 16:28:48 +02:00
James Phillips 97b580f593
Adds script warning and fixes Docker args recognition. 2017-10-04 21:41:27 -07:00
Kyle Havlovitz dde743700f Merge pull request #3535 from hashicorp/metric-docs
Update metric names and add a legacy config flag
2017-10-04 17:39:16 -07:00
Kyle Havlovitz d5fec6b7ac
Add a test for legacy metrics with a whitelist filter 2017-10-04 17:27:57 -07:00
Kyle Havlovitz be04bfed34 Clean up subprocess handling and make shell use optional (#3509)
* Clean up handling of subprocesses and make using a shell optional

* Update docs for subprocess changes

* Fix tests for new subprocess behavior

* More cleanup of subprocesses

* Minor adjustments and cleanup for subprocess logic

* Makes the watch handler reload test use the new path.

* Adds check tests for new args path, and updates existing tests to use new path.

* Adds support for script args in Docker checks.

* Fixes the sanitize unit test.

* Adds panic for unknown watch type, and reverts back to Run().

* Adds shell option back to consul lock command.

* Adds shell option back to consul exec command.

* Adds shell back into consul watch command.

* Refactors signal forwarding and makes Windows-friendly.

* Adds a clarifying comment.

* Changes error wording to a warning.

* Scopes signals to interrupt and kill.

This avoids us trying to send SIGCHILD to the dead process.

* Adds an error for shell=false for consul exec.

* Adds notes about the deprecated script and handler fields.

* De-nests an if statement.
2017-10-04 16:48:00 -07:00
Kyle Havlovitz 0063516e5e
Update metric names and add a legacy config flag 2017-10-04 16:43:27 -07:00
Frank Schröder b2c4dc4360 Provide stable config for agent/self (#3532)
* config: provide stable config for /v1/agent/self (#3530)

This patch adds a stable subset of the previous Config struct to the
agent/self response. The actual runtime configuration is moved into
DebugConfig and will be documented to change.

Fixes #3530

* config: fix tests

* doc: update api documentation for /v1/agent/self
2017-10-04 10:43:17 -07:00
James Phillips 6529c505a5 Merge pull request #3531 from hashicorp/pr-3521-slackpad
ui: Use monospace font for textarea controls.
2017-10-04 09:53:41 -07:00
James Phillips 539285cf1f
Updates checked in web assets to pick up CSS change.
Closes #3521
2017-10-04 09:52:15 -07:00
Preetha Appan f38d20eb40 Remove extra newline 2017-10-03 15:19:31 -05:00
Preetha Appan 3c81e2db7c Only allow 'list' policies within 'key' policy definitions. Consolidated two similar tests into one and fixed alignment. 2017-10-03 15:15:56 -05:00
Preetha Appan d5acfc3982 Introduces new 'list' permission that applies to KV store recursive reads, and enforced only when opted in. 2017-10-02 17:10:21 -05:00
Frank Schroeder 6b3a957c5e use ports from derived addresses 2017-09-29 20:26:43 +02:00
Frank Schroeder 8d8e2523eb config: drop advertise_addrs
Fixes #3516
2017-09-29 20:26:43 +02:00
Frank Schroeder f0efe2a3de
Fix tests after config refactor 2017-09-28 12:32:46 +02:00
Patrick Sodré 55c2746963
Implement encodeKVasRFC1464 function 2017-09-28 12:32:46 +02:00
Patrick Sodré d880634cfa
Add RFC1464 tests 2017-09-28 12:32:45 +02:00
Patrick Sodré 7083f9fb14
Turn encodeKVasRFC1464 into a plain function 2017-09-28 12:32:45 +02:00
Patrick Sodré be258a3315
Use verify for NodeLookup CNAME, and TXT tests 2017-09-28 12:32:45 +02:00
Patrick Sodré 8982719f5b
Refactor formatTxtRecords as encodeKVasRFC1464
- Move the logic of rfc1035 out of the encoding function
  - Left basic version of encodingKV as 'k=v'
2017-09-28 12:32:45 +02:00
Patrick Sodré a16e0f7419
Fix editorial suggestions 2017-09-28 12:32:45 +02:00
Patrick Sodré 4b2d1546fa
Remove redundant check of Node.Meta size 2017-09-28 12:32:45 +02:00
Patrick Sodré b8369b54fb
Return Node.Meta info using the DNS interface 2017-09-28 12:32:45 +02:00
Patrick Sodré b8905dd065
Add test for NoteLookup ANY request 2017-09-28 12:32:45 +02:00
Patrick Sodré 354765c549
Add test for querying Node.Meta with DNS TXT
- Lookup TXT records using recursive lookups
  - Expect TXT record equal to value if key starts with rfc1035-
  - Expect TXT record in rfc1464 otherwise, i.e. (k=v)

ref #2709
2017-09-28 12:32:45 +02:00
Frank Schröder 5f6d0fd8c5 fail early when advertise addr is set to ANY (#3507) 2017-09-27 13:57:55 -07:00
Frank Schröder beb803f0d9 only detect advertise address if derived value is any (#3506)
* only detect advertise address if derived value is any

* determine detect function only when advertise addr is any
2017-09-27 12:59:47 -07:00
James Phillips d677999258
Adds a comment about Datacenter and NodeName being stable interfaces
in the runtime config strucutre.
2017-09-27 11:59:22 -07:00
Frank Schröder cda0eacff1 Recursive sanitize (#3505)
* vendor: add github.com/sergi/go-diff/diffmatchpatch for diff'ing test output

* config: refactor Sanitize to recursively clean runtime config and format complex fields

* Removes an extra int cast.

* Adds a top-level check test case for sanitization.
2017-09-27 11:47:40 -07:00
James Phillips 330ce87851
Gets rid of flaky clause in stats fetcher unit test.
Given how the rutine is coded we can still get data so this wasn't
a reliable thing to check.
2017-09-26 20:53:06 -07:00
preetapan 783e24be64 Issue 3452 (#3500)
* Make sure that id and address are set in member created during reaping of catalog nodes that have been removed from serf

* Get address from node table in the state store rather than from service address

* Fix incorrect lookup by checkname instead of node name

* Make sure that serverlookup is called with the right address format, added unit test.

* Address code review comments

* Tweaks style stuff.
2017-09-26 20:49:41 -07:00
Frank Schröder 707f8e329a Metrics service prefix (#3498)
* metrics: replace statsite_prefix with service_prefix

The metrics prefix isn't statsite specific and is in fact used
for all metrics providers. Since we are deprecating fields
anyway we should fix this one as well.

Fixes #3293

* Updates docs and sorts telemetry section.

* Renames to "metrics_prefix" to disambiguate with Consul services.

* Updates the change log.
2017-09-26 17:49:55 -07:00
James Phillips 3130fcaccc Merge pull request #3501 from hashicorp/snapshot-test-hang
Cleans up some edge cases in TestSnapshot_Forward_Leader.
2017-09-26 14:08:33 -07:00
James Phillips 4b17c9618f
Cleans up some edge cases in TestSnapshot_Forward_Leader.
These could cause the tests to hang.
2017-09-26 14:07:28 -07:00
Kyle Havlovitz 3460506264 Fix watch error when http & https are disabled (#3493)
Remove an error in watch reloading that happens when http and https
are both disabled, and use an https address for running watches if
no http addresses are present.

Fixes #3425.
2017-09-26 13:47:27 -07:00
Preetha Appan 318d0232f7 Move Raft protocol version for list peers end point to server side, fix unit tests. This fixes #3449 2017-09-26 09:35:39 -05:00
Frank Schroeder 94fbae4732
fix data race
Since state.Checks() returns a shallow copy
its elements must not be modified. Copying
the elements in the handler does not guarantee
consistency since that list is guarded by a different
lock. Therefore, the only solution is to have state.Checks()
return a deep copy.
2017-09-26 13:42:10 +02:00
Frank Schroeder a1d65cbe78 config: do not clobber multiple check and service definitions
This patch ensures that multiple files with single 'check' or 'service'
definitions result in the combination of them.
2017-09-26 10:24:18 +02:00
James Phillips 23de0f9ea9
Renames `enable_ui` to `ui` to keep compatibility with existing configs. 2017-09-26 00:05:55 -07:00
Frank Schröder c7cc62ab5a agent: consolidate handling of 405 Method Not Allowed (#3405)
* agent: consolidate http method not allowed checks

This patch uses the error handling of the http handlers to handle HTTP
method not allowed errors across all available endpoints. It also adds a
test for testing whether the endpoints respond with the correct status
code.

* agent: do not panic on metrics tests

* agent: drop other tests for MethodNotAllowed

* agent: align /agent/join with reality

/agent/join uses PUT instead of GET as documented.

* agent: align /agent/check/{fail,warn,pass} with reality

/agent/check/{fail,warn,pass} uses PUT instead of GET as documented.

* fix some tests

* Drop more tests for method not allowed

* Align TestAgent_RegisterService_InvalidAddress with reality

* Changes API client join to use PUT instead of GET.

* Fixes agent endpoint verbs and removes obsolete tests.

* Updates the change log.
2017-09-25 23:11:19 -07:00
preetapan 4ced57c1f8 Merge pull request #3494 from hashicorp/enforce_json_extension
Enforce json or hcl extension to Consul config files, updated unit tests
2017-09-25 17:30:33 -05:00
James Phillips fcaa889116 Bumps default Raft protocol to version 3. (#3477)
* Changes default Raft protocol to 3.

* Changes numPeers() to report only voters.

This should have been there before, but it's more obvious that this
is incorrect now that we default the Raft protocol to 3, which puts
new servers in a read-only state while Autopilot waits for them to
become healthy.

* Fixes TestLeader_RollRaftServer.

* Fixes TestOperator_RaftRemovePeerByAddress.

* Fixes TestServer_*.

Relaxed the check for a given number of voter peers and instead do
a thorough check that all servers see each other in their Raft
configurations.

* Fixes TestACL_*.

These now just check for Raft replication to be set up, and don't
care about the number of voter peers.

* Fixes TestOperator_Raft_ListPeers.

* Fixes TestAutopilot_CleanupDeadServerPeriodic.

* Fixes TestCatalog_ListNodes_ConsistentRead_Fail.

* Fixes TestLeader_ChangeServerID and adjusts the conn pool to throw away
sockets when it sees io.EOF.

* Changes version to 1.0.0 in the options doc.

* Makes metrics test more deterministic with autopilot metrics possible.
2017-09-25 15:27:04 -07:00
Preetha Appan 1e8385df2c Enforce json or hcl extension to Consul config files, updated unit tests 2017-09-25 17:17:12 -05:00
James Phillips 5208a1ac96
Removes unused imports in agent_test.go. 2017-09-25 13:42:15 -07:00
Preetha Appan 8394ad08db Introduce Code Policy validation via sentinel, with a noop implementation 2017-09-25 13:44:55 -05:00
Frank Schröder 69a088ca85 New config parser, HCL support, multiple bind addrs (#3480)
* new config parser for agent

This patch implements a new config parser for the consul agent which
makes the following changes to the previous implementation:

 * add HCL support
 * all configuration fragments in tests and for default config are
   expressed as HCL fragments
 * HCL fragments can be provided on the command line so that they
   can eventually replace the command line flags.
 * HCL/JSON fragments are parsed into a temporary Config structure
   which can be merged using reflection (all values are pointers).
   The existing merge logic of overwrite for values and append
   for slices has been preserved.
 * A single builder process generates a typed runtime configuration
   for the agent.

The new implementation is more strict and fails in the builder process
if no valid runtime configuration can be generated. Therefore,
additional validations in other parts of the code should be removed.

The builder also pre-computes all required network addresses so that no
address/port magic should be required where the configuration is used
and should therefore be removed.

* Upgrade github.com/hashicorp/hcl to support int64

* improve error messages

* fix directory permission test

* Fix rtt test

* Fix ForceLeave test

* Skip performance test for now until we know what to do

* Update github.com/hashicorp/memberlist to update log prefix

* Make memberlist use the default logger

* improve config error handling

* do not fail on non-existing data-dir

* experiment with non-uniform timeouts to get a handle on stalled leader elections

* Run tests for packages separately to eliminate the spurious port conflicts

* refactor private address detection and unify approach for ipv4 and ipv6.

Fixes #2825

* do not allow unix sockets for DNS

* improve bind and advertise addr error handling

* go through builder using test coverage

* minimal update to the docs

* more coverage tests fixed

* more tests

* fix makefile

* cleanup

* fix port conflicts with external port server 'porter'

* stop test server on error

* do not run api test that change global ENV concurrently with the other tests

* Run remaining api tests concurrently

* no need for retry with the port number service

* monkey patch race condition in go-sockaddr until we understand why that fails

* monkey patch hcl decoder race condidtion until we understand why that fails

* monkey patch spurious errors in strings.EqualFold from here

* add test for hcl decoder race condition. Run with go test -parallel 128

* Increase timeout again

* cleanup

* don't log port allocations by default

* use base command arg parsing to format help output properly

* handle -dc deprecation case in Build

* switch autopilot.max_trailing_logs to int

* remove duplicate test case

* remove unused methods

* remove comments about flag/config value inconsistencies

* switch got and want around since the error message was misleading.

* Removes a stray debug log.

* Removes a stray newline in imports.

* Fixes TestACL_Version8.

* Runs go fmt.

* Adds a default case for unknown address types.

* Reoders and reformats some imports.

* Adds some comments and fixes typos.

* Reorders imports.

* add unix socket support for dns later

* drop all deprecated flags and arguments

* fix wrong field name

* remove stray node-id file

* drop unnecessary patch section in test

* drop duplicate test

* add test for LeaveOnTerm and SkipLeaveOnInt in client mode

* drop "bla" and add clarifying comment for the test

* split up tests to support enterprise/non-enterprise tests

* drop raft multiplier and derive values during build phase

* sanitize runtime config reflectively and add test

* detect invalid config fields

* fix tests with invalid config fields

* use different values for wan sanitiziation test

* drop recursor in favor of recursors

* allow dns_config.udp_answer_limit to be zero

* make sure tests run on machines with multiple ips

* Fix failing tests in a few more places by providing a bind address in the test

* Gets rid of skipped TestAgent_CheckPerformanceSettings and adds case for builder.

* Add porter to server_test.go to make tests there less flaky

* go fmt
2017-09-25 11:40:42 -07:00
James Phillips 268018c558
Robustifies check in TestCatalog_ListNodes_ConsistentRead_Fail test.
Fixes #3469
2017-09-13 21:22:53 -07:00
James Phillips 8be4ee766a
Revert "Manages segments list via a pointer."
This reverts commit c277a4250461443cbd63de0259e5e32766f651ea.
2017-09-07 16:37:11 -07:00
James Phillips 5008aabb62
Manages segments list via a pointer. 2017-09-07 16:21:07 -07:00
James Phillips 908f7be97f
Cleans up formatting. 2017-09-07 12:26:58 -07:00
James Phillips 02a3f3f27b
Shows the segment name in the keyring API and command output. 2017-09-07 12:17:39 -07:00
James Phillips 34bae2487d
Populates the segment keyrings based on the LAN keyring. 2017-09-07 12:17:20 -07:00
James Phillips 7c616e3768
Moves reconcile loop into segment stub. 2017-09-06 18:01:53 -07:00
James Phillips 4e34c2af06
Takes the skip out of the client check.
Without this the merge delegate won't check the segment for non-servers
a little below here.
2017-09-06 17:05:40 -07:00
James Phillips 78ac144fff Merge pull request #3447 from hashicorp/issue-3070
Skips unique node ID check for old versions of Consul.
2017-09-06 13:24:15 -07:00
James Phillips 62d9299646
Fixes incorrect comment. 2017-09-06 13:23:19 -07:00
James Phillips 031f1874d0
Pulls down some code for the check loop. 2017-09-06 13:07:42 -07:00
James Phillips 2fd9328b21
Uses the Raft configuration for the self-add skip check. 2017-09-06 13:05:51 -07:00
Preetha Appan 1eae9f1e2f Change member join reconcile step to process joining itself, to handle node IP address changes correctly when number of servers < 3 2017-09-06 13:53:01 -05:00
James Phillips 353e037c9b
Skips unique node ID check for old versions of Consul.
Fixes #3070.
2017-09-05 22:57:29 -07:00
James Phillips aecf890054
Allow _all for WAN as a no-op. 2017-09-05 13:40:19 -07:00
James Phillips c629773b40
Makes the all segments query explict, and the default for `consul members`. 2017-09-05 12:22:20 -07:00
James Phillips bc9780baad Adds simple rate limiting for client agent RPC calls to Consul servers. (#3440)
* Added rate limiting for agent RPC calls.
* Initializes the rate limiter based on the config.
* Adds the rate limiter into the snapshot RPC path.
* Adds unit tests for the RPC rate limiter.
* Groups the RPC limit parameters under "limits" in the config.
* Adds some documentation about the RPC limiter.
* Sends a 429 response when the rate limiter kicks in.
* Adds docs for new telemetry.
* Makes snapshot telemetry look like RPC telemetry and cleans up comments.
2017-09-01 15:02:50 -07:00
Kyle Havlovitz 334e082848 Merge pull request #3431 from hashicorp/network-segments-oss 2017-09-01 10:24:58 -07:00
Kyle Havlovitz ff994e9ade
Pass listeners into setupSegments 2017-08-31 17:56:43 -07:00
Kyle Havlovitz 5cc4b32a5d
Organize segments for a cleaner split between enterprise and OSS 2017-08-31 17:39:46 -07:00
Kyle Havlovitz 5ebea7f049
Fill in the segment in the QuerySource for prepared query lookups 2017-08-31 03:35:59 -07:00
Kyle Havlovitz b77a0aa932
Fix some inconsistencies with segment logic and comments 2017-08-30 17:43:46 -07:00
Kyle Havlovitz 3b0df3350f
Default bind/advertise for segments to BindAddr/AdvertiseAddr 2017-08-30 12:51:10 -07:00
Preetha Appan 0728a04dbb Wire server provider for raft layer only on protocol version 3 and above, and update changelog 2017-08-30 14:36:47 -05:00
Kyle Havlovitz d9fc2b3d75
Update coord display in ui to account for segments 2017-08-30 11:58:29 -07:00
Kyle Havlovitz 6ded43131a
Add segment addr field to tags for LAN flood joiner 2017-08-30 11:58:29 -07:00
Kyle Havlovitz 1c04f1537a
Add agent.segment interpolation to prepared queries 2017-08-30 11:58:29 -07:00
Kyle Havlovitz 107d7f6c5a
Add rpc_listener option to segment config 2017-08-30 11:58:29 -07:00
Kyle Havlovitz e582d02079
Add segment config validation 2017-08-30 11:58:29 -07:00
James Phillips 6a6eadd8c7
Adds open source side of network segments (feature is Enterprise-only). 2017-08-30 11:58:29 -07:00
Preetha Appan e944370cde More cleanup from code review 2017-08-30 12:31:36 -05:00
Preetha Appan a215c764cd Remove copy pasted duplicate line, update documentation. 2017-08-30 10:02:10 -05:00
Preetha Appan 5a29eb7486 Consolidate server lookup into one place and replace usages of localConsuls. 2017-08-30 09:30:33 -05:00
Preetha Appan cac1c29ec5 Remove unused function 2017-08-30 09:30:33 -05:00
Preetha Appan d8fe01db4c Remove stray commented line 2017-08-30 09:30:33 -05:00
Preetha Appan ca48e7e4c2 Remove server address tracking logic from manager/router and maintain it as part of lan event listener instead. Used sync.Map to track this, and added unit tests 2017-08-30 09:30:33 -05:00
Preetha Appan b4a9d77d49 ServerAddressProvider interface also returns an error now 2017-08-30 09:30:33 -05:00
Preetha Appan edb408bc22 Use config struct to create NetworkTransport layer when setting up raft 2017-08-30 09:30:33 -05:00
Preetha Appan 01f8e469aa Implement AddressProvider and wire that up to raft transport layer to support server nodes changing their IP addresses in containerized environments 2017-08-30 09:30:33 -05:00
Frank Schroeder 62c77d70f0 build: make tests independent of build tags
When the metadata server is scanning the agents for potential servers
it is parsing the version number which the agent provided when it
joined. This version number has to conform to a certain format, i.e.
'n.n.n'. Without this version number properly set some tests fail with
error messages that disguise the root cause.

The default version number is currently set to 'unknown' in
version/version.go which does not parse and triggers the tests to fail.
The work around is to use a build tag 'consul' which will use the
version number set in version_base.go instead which has the correct
format and is set to the current release version.

In addition, some parts of the code also require the version number to
be of a certain value. Setting it to '0.0.0' for example makes some
tests pass and others fail since they don't pass the semantic check.

When using go build/install/test one has to remember to use '-tags
consul' or tests will fail with non-obvious error messages.

Using build tags makes the build process more complex and error prone
since it prevents the use of the plain go toolchain and - at least in
its current form - introduces subtle build and test issues. We should
try to eliminate build tags for anything else but platform specific
code.

This patch removes all references to specific version numbers in the
code and tests and sets the default version to '9.9.9' which is
syntactically correct and passes the semantic check. This solves the
issue of running go build/install/test without tags for the OSS build.
2017-08-30 13:40:18 +02:00
Frank Schroeder 84a1bf0a99 agent: drop status code comments 2017-08-23 22:36:23 +02:00
Frank Schroeder b06584f631 agent: use http.StatusRequestEntityTooLarge instead of 413 2017-08-23 22:36:23 +02:00
Frank Schroeder cc83590962 agent: use http.StatusInternalServerError instead of 500 2017-08-23 22:36:23 +02:00
Frank Schroeder bf426beb45 agent: use http.StatusMethodNotAllowed instead of 405 2017-08-23 22:36:23 +02:00
Frank Schroeder 0c3534cbf7 agent: use http.StatusNotFound instead of 404 2017-08-23 22:36:23 +02:00
Frank Schroeder 970a7f97ec agent: use http.StatusForbidden instead of 403 2017-08-23 22:36:23 +02:00
Frank Schroeder 2e586be5aa agent: use http.StatusUnauthorized instead of 401 2017-08-23 22:36:23 +02:00
Frank Schroeder 923f8e2364 agent: use http.StatusBadRequest instead of 400 2017-08-23 22:36:23 +02:00
Frank Schroeder a32eab5923 agent: support go-discover retry-join for wan 2017-08-23 21:23:34 +02:00
Frank Schröder 44e6b8122d acl: consolidate error handling (#3401)
The error handling of the ACL code relies on the presence of certain
magic error messages. Since the error values are sent via RPC between
older and newer consul agents we cannot just replace the magic values
with typed errors and switch to type checks since this would break
compatibility with older clients.

Therefore, this patch moves all magic ACL error messages into the acl
package and provides default error values and helper functions which
determine the type of error.
2017-08-23 16:52:48 +02:00
Frank Schroeder d9e2a51887 agent: drop unused code
This code from http://github.com/hashicorp/consul/pull/3353 is no longer
required.
2017-08-22 00:02:46 +02:00
Frank Schroeder 4bfcf7b613 dns: replace nameserver lookup with consistent rpc call
This patch replaces the code which determines the list of servers in the
current cluster with an RPC call to get the list of active consul
service instances which only run on servers.

This replaces the previous implementation which was more complex and
relied on serf messages which can provide a different view than the
consistent response from the raft log.

As a side effect it makes the implementation independent of the server
and the agent which means it works consistently across both. Different
behavior for server and agent was the root cause for the bug in
http://github.com/hashicorp/consul/issue/3047.

Fixes #3407
2017-08-22 00:02:46 +02:00
Frank Schroeder 8e1f9b9b68 dns: split node lookup from request handling 2017-08-22 00:02:46 +02:00
Frank Schroeder db8ad8922e dns: refactor label by unrolling loop 2017-08-22 00:02:46 +02:00
Frank Schroeder c35206db07 dns: move ttl closer to usage 2017-08-22 00:02:46 +02:00
James Phillips 738ac55d96
Switches to using a read lock for the agent's RPC dispatcher.
This prevents RPC calls from getting serialized in this spot.

Fixes #3376
2017-08-09 18:51:55 -07:00
Frank Schröder 32d4eecc1a agent: honor deprecated flags for retry-join-{ec2,azure,gce} (#3384) 2017-08-09 16:18:30 -07:00
James Phillips 3518e27a76 Revert "Return 403 rather than a 404 when acls cause all results to be filter…" 2017-08-09 15:06:57 -07:00
James Phillips 91205b2cd6 Revert "Ensure that we return a permission denied only if the list of keys/en…" 2017-08-09 15:06:20 -07:00
Preetha Appan 121326161e Added unit test case to kvs_endpointtest 2017-08-09 15:50:22 -05:00
Preetha Appan d06002dc62 Ensure that we return a permission denied only if the list of keys/entries prior to filtering by ACL is non empty 2017-08-09 15:32:18 -05:00
Frank Schroeder c38dcf2d17
agent: move agent/consul/agent to agent/metadata 2017-08-09 14:36:52 +02:00
Frank Schroeder 85bdb77d90
agent: move agent/consul/servers to agent/router 2017-08-09 14:36:37 +02:00
Frank Schroeder 1d0bbfed9c
agent: move agent/consul/structs to agent/structs 2017-08-09 14:32:12 +02:00
James Phillips a600f681d6
Cleans up some go fmt issues. 2017-08-08 21:52:50 -07:00
James Phillips 7b4d3d5576
Fixes a vet error. 2017-08-08 16:00:18 -07:00
Kyle Havlovitz 8c2e422074 Merge pull request #3369 from hashicorp/metrics-enhancements
Add support for labels/filters from go-metrics
2017-08-08 13:55:30 -07:00
Kyle Havlovitz 160395d3c7
Add doc links for metrics endpoint 2017-08-08 13:05:38 -07:00
Kyle Havlovitz 308d7b785d
Update docs for metrics endpoint 2017-08-08 12:33:30 -07:00
Frank Schroeder 0f4986dcc7
dns: minor cleanups 2017-08-08 13:55:58 +02:00
Kyle Havlovitz 975ded2714
Add support for labels/filters from go-metrics 2017-08-08 01:45:10 -07:00
Preetha Appan 2df084968c Go back to using <nodename>.node.dc.consul as the name of the ns record being returned. 2017-08-07 16:02:33 -05:00
Frank Schroeder b571cb8097
dns: keep NS names in consul domain 2017-08-07 11:11:55 +02:00
Frank Schroeder 7b39af2b2d
dns: postmaster -> hostmaster 2017-08-07 11:11:55 +02:00
Frank Schroeder 98de22e13e
dns: we do not support zone transfers 2017-08-07 11:11:55 +02:00
Frank Schroeder e1bcbc6832
dns: drop CNAME for primary name server 2017-08-07 11:11:55 +02:00
Preetha Appan 393a0eae93
Added test case with IPV6 bind address for NS records, rewrote tests to use verify library and other code review feedback 2017-08-07 11:11:55 +02:00
Preetha Appan 52075bda1c
Added back glue records in NS response, expanded unit test. Also reused same function used in node lookup for adding A/AAAA records in the extra section of the NS response 2017-08-07 11:11:55 +02:00
Preetha Appan c7c4100503
Don't add A records for NS requests, because the record being returned already resolves correctly. Also fixed all the unit tests, and ignored hostnames that don't meet valid dns hostname criteria 2017-08-07 11:11:55 +02:00
Frank Schroeder 450d8a69b5
dns: provide correct SOA and NS responses
This patch changes the behavior of the DNS server as follows:

* The SOA response contains the SOA record in the Answer section instead
  of the Authority section. It also contains NS records in the Authority
  and the corresponding A glue records in the Extra section.
  In addition, CNAMEs are added to the Extra section to make the
  MNAME of the SOA record resolvable.

  AAAA glue records are not yet supported.

* The NS response returns up to three random servers from the
  consul cluster in the Answer section and the glue A
  records in the Extra section.

  AAAA glue records are not yet supported.
2017-08-07 11:11:55 +02:00
Preetha Appan bff45ee1da
Unify regex used to identify invalid dns characters 2017-08-07 11:11:55 +02:00
Preetha Appan 6bac9355fd
Use sanitized version of node name of server in NS record, and start with "server" rather than "ns" 2017-08-07 11:11:55 +02:00
Preetha Appan 7e9d683ab1
Removed a copy pasted irrelevant comment, and other code review feedback 2017-08-07 11:11:54 +02:00
Preetha Appan c38906daad
Add NS records and A records for each server. Constructs ns host names using the advertise address of the server. 2017-08-07 11:11:54 +02:00
James Phillips 803ed9a245 Adds secure introduction for the ACL replication token. (#3357)
Adds secure introduction for the ACL replication token, as well as a separate enable config for ACL replication.
2017-08-03 15:39:31 -07:00
Frank Schroeder 1cb602e085
agent: fix code for updated go-discover signature
Closes #3351
2017-08-03 21:32:11 +02:00
James Phillips c31b56a03e Adds a new /v1/acl/bootstrap API (#3349) 2017-08-02 17:05:18 -07:00
Miguel Prokop ea6d610dee agent: Fix script quoting on windows (#1875)
This patch fixes the quoting for executing scripts on windows
and splits the platform dependent code.

Fixes #1875
2017-08-02 17:01:21 +02:00
Frank Schroeder 68e8f3d0f7 agent: use github.com/hashicorp/go-discover
Replace the provider specific node discovery code
with go-discover to support AWS, Azure and GCE.

Fixes #3282
2017-08-01 11:41:43 +02:00
Preetha Appan 307049e17f Return nil instead of empty list when returning a PermissionDenied error, updated unit test 2017-07-31 17:23:20 -05:00
Preetha Appan da29b74d03 Return 403 rather than a 404 when acls cause all results to be filtered out. This fixes #2637 2017-07-31 13:50:29 -05:00
preetapan 677949b14d Merge pull request #3332 from hashicorp/issue_3322
This fixes #3322
2017-07-28 17:54:30 -05:00
Preetha Appan 3b12545844 Tweaked parsing error message to quote properly 2017-07-28 17:52:35 -05:00
James Phillips 8f1f762ddd Adds missing autopilot snapshot test and avoids snapshotting nil. (#3333) 2017-07-28 15:48:42 -07:00
Preetha Appan 86b9e3c5f3 Validate unix sockets and ip addresses as needed, more test cases 2017-07-28 17:18:10 -05:00
Preetha Appan ac068de336 Modify ResolveTmplAddrs to parse advertise IPs, added test cases that fail to parse correctly 2017-07-28 15:01:32 -05:00
Preetha Appan 4b82d09df0 Removed extra newlines 2017-07-28 10:51:11 -05:00
Preetha Appan 7b99f7ca08 Fix comments, and remove redundant TestConfig init from a couple of unit tests 2017-07-28 10:40:43 -05:00
Frank Schroeder f27202b608
add tests for go-sockaddr template parsing 2017-07-28 15:40:22 +02:00
Frank Schroeder 5ee498cbc5
agent: unix sockets are not ip addrs 2017-07-28 14:53:21 +02:00
Frank Schroeder 0b13a38d90
config: refactor tmpl resolution fn 2017-07-28 12:20:49 +02:00
Preetha Appan 28016190e0 Moved handling advertise address to readConfig and out of the agent's constructor, plus unit test fixes 2017-07-27 22:06:31 -05:00
Preetha Appan 398c1e450c Move go-socketaddr template parsing into config package to make it happen before creating a new agent. Also removed redundant parsetemplate calls from agent.go. 2017-07-27 16:17:35 -05:00
James Phillips 6b51744ddf Adds option to prepared queries to remove empty tags. (#3330) 2017-07-26 22:46:43 -07:00
James Phillips 6e794ea1b3 Adds support for agent-side ACL token management via API instead of config files. (#3324)
* Adds token store and removes all runtime use of config for ACL tokens.
* Adds a new API for changing agent tokens on the fly.
2017-07-26 11:03:43 -07:00
Preetha Appan 4692b1478e Add extra test case for deleting entire tree with empty prefix 2017-07-26 09:42:07 -05:00
Preetha Appan 74ba4c3c6b Don't insert tombstone for empty prefix delete. Other minor unit test fixes 2017-07-25 21:54:11 -05:00
Preetha Appan a6b7e66e9a Removed redundant comments and unit test 2017-07-25 20:39:33 -05:00
Preetha Appan 1503d63595 Removed redundant call to reap tombstone from unit test 2017-07-25 19:39:05 -05:00
Preetha Appan 996302c085 Improved unit test per code review 2017-07-25 19:17:40 -05:00
Preetha Appan f4cccf44e3 Use new DeletePrefixMethod for implementing KVSDeleteTree operation. This makes deletes on sub trees larger than one million nodes about 100 times faster. Added unit tests. 2017-07-25 17:21:18 -05:00
James Phillips cf7b1aaf04 Removes an unnecessary close. 2017-07-24 21:41:18 -07:00
Preetha Appan 213af3650f Removed redundant logging 2017-07-24 21:07:48 -05:00
Preetha Appan c08ff6c8ae Clean up temporary files on write errors, and ignore any temporary service files on load with a warning. This fixes #3207 2017-07-24 12:42:51 -05:00
James Phillips a0867b5d49
Tweaks the error when scripts are disabled.
This will hopefully help people self-serve if they upgrade without accounting
for this.
2017-07-19 22:15:04 -07:00
Kyle Havlovitz c74d7558a5 Fix UpgradeVersionTag field not being passed correctly (#3304) 2017-07-19 17:39:48 -07:00
Preetha Appan 9116186b4c Made unit test for AddCheck error check the actual error string 2017-07-19 11:00:56 -05:00
Preetha Appan f790c7279a Unit test for failure case of AddCheck 2017-07-19 10:28:52 -05:00