Commit graph

600 commits

Author SHA1 Message Date
Pierre Souchay eb2a4eaea3 Refactoring to have clearer code without weird bool 2018-03-19 16:12:54 +01:00
Pierre Souchay a5f6ac0df4 [BUGFIX] When a node level check is removed, ensure all services of node are notified
Bugfix for https://github.com/hashicorp/consul/pull/3899

When a node level check is removed (example: maintenance),
some watchers on services might have to recompute their state.

If those nodes are performing blocking queries, they have to be notified.
While their state was updated when node-level state did change or was added
this was not the case when the check was removed. This fixes it.
2018-03-19 14:14:03 +01:00
Preetha Appan 84bd6dc5d1
cleanup unit test code a bit 2018-03-16 09:36:57 -05:00
Preetha 164fb3f48c
Merge pull request #3885 from eddsteel/support-options-requests
Support OPTIONS requests
2018-03-16 09:20:16 -05:00
Preetha 401215230c
Merge pull request #3940 from pierresouchay/dns_max_size
Allow to control the number of A/AAAA Record returned by DNS
2018-03-09 07:35:32 -06:00
Pierre Souchay 241c7e5f5f Cleaner Unit tests from suggestions from @preetapan 2018-03-07 18:24:41 +01:00
Mitchell Hashimoto fbac58280e
agent/consul/fsm: begin using testify/assert 2018-03-06 09:48:15 -08:00
Pierre Souchay 09970479b5 Allow to control the number of A/AAAA Record returned by DNS
This allows to have randomized resource records (i.e. each
answer contains only one IP, but the IP changes every request) for
A, AAAA records.

It will fix https://github.com/hashicorp/consul/issues/3355 and
https://github.com/hashicorp/consul/issues/3937

See https://github.com/hashicorp/consul/issues/3937#issuecomment-370610509
for details.

It basically add a new option called `a_record_limit` and will not
return more than a_record_limit when performing A, AAAA or ANY DNS
requests.

The existing `udp_answer_limit` option is still working but should
be considered as deprecated since it works only with DNS clients
not supporting EDNS.
2018-03-06 02:07:42 +01:00
Edd Steel 413cb3d3b5
Re-use defined endpoints for tests 2018-03-03 11:19:18 -08:00
Paul Banks 628dcc9793
Merge pull request #3899 from pierresouchay/fix_blocking_queries_index
Services Indexes modified per service instead of using a global Index
2018-03-02 16:24:43 +00:00
Pierre Souchay 85b73f8163 Simplified error handling for maxIndexForService
* added unit tests to ensure service index is properly garbage collected
* added Upgrade from Version 1.0.6 to higher section in documentation
2018-03-01 14:09:36 +01:00
Paul Banks f7ecbce39a
Fix test running in non-bash shells 2018-02-22 14:06:06 +00:00
Paul Banks 0ee77a5e02
Merge pull request #3900 from hashicorp/fix-monitor-sigint-3891
Fixes #3891: agent monitor no longer unresponsive before logs stream.
2018-02-21 21:28:33 +00:00
Preetha Appan 77d35f1829
Remove extra newline 2018-02-21 13:21:47 -06:00
Preetha Appan 573500dc51
Unit test that calls revokeLeadership twice to make sure its idempotent 2018-02-21 12:48:53 -06:00
Preetha Appan bd270b02ba
Make sure revokeLeadership is called if establishLeadership errors 2018-02-21 12:33:22 -06:00
Alex Dadgar 535842004c Test autopilots start/stop idempotency 2018-02-21 10:19:30 -08:00
Alex Dadgar 4d99696f02 Improve autopilot shutdown to be idempotent 2018-02-20 15:51:59 -08:00
Pierre Souchay e6d85cb36a Fixed comments for function maxIndexForService 2018-02-20 23:57:28 +01:00
Pierre Souchay b26ea3c230 [Revert] Only update services if tags are different
This patch did give some better results, but break watches on
the services of a node.

It is possible to apply the same optimization for nodes than
to services (one index per instance), but it would complicate
further the patch.

Let's do it in another PR.
2018-02-20 23:34:42 +01:00
Pierre Souchay 903e866835 Only update services if tags are different 2018-02-20 23:08:04 +01:00
Pierre Souchay 56d5c0bf22 Enable Raft index optimization per service name on health endpoint
Had to fix unit test in order to check properly indexes.
2018-02-20 01:35:50 +01:00
Paul Banks 69ebbf3e79
Fixes #3891: agent monitor no longer unresponsive before logs stream.
The root cause is actually that the agent's streaming HTTP API didn't flush until the first log line was found which commonly was pretty soon since the default level is INFO. In cases where there were no logs immediately due to level for instance, the client gets stuck in the HTTP code waiting on a response packet from the server before we enter the loop that checks the shutdown channel from the signal handler.

This fix flushes the initial status immediately on the streaming endpoint which lets the client code get into it's expected state where it's listening for shutdown or log lines.
2018-02-19 21:53:10 +00:00
Pierre Souchay ec1b278595 Get only first service to test whether we have to cleanup index of a service 2018-02-19 22:44:49 +01:00
Pierre Souchay 523feb0be4 Fixed comment about raftIndex + use test.Helper() 2018-02-19 19:30:25 +01:00
Pierre Souchay 4c188c1d08 Services Indexes modified per service instead of using a global Index
This patch improves the watches for services on large cluster:
each service has now its own index, such watches on a specific service
are not modified by changes in the global catalog.

It should improve a lot the performance of tools such as consul-template
or libraries performing watches on very large clusters with many
services/watches.
2018-02-19 18:29:22 +01:00
Edd Steel 35c2083422
Clarify comments 2018-02-17 17:46:11 -08:00
Edd Steel 61be181f6f Test every endpoint for OPTIONS/MethodNotFound 2018-02-17 17:34:13 -08:00
Edd Steel 6c33163959 Allow endpoints to handle OPTIONS/MethodNotFound themselves 2018-02-17 17:34:03 -08:00
Edd Steel 4dc9d2ebd7
Initialise allowedMethods in init() 2018-02-17 17:31:24 -08:00
Kyle Havlovitz ea452c6032
Fix the coordinate update endpoint not passing the ACL token 2018-02-15 11:58:02 -08:00
Edd Steel 40eefc9f7d
Support OPTIONS requests
- register endpoints with supported methods
- support OPTIONS requests, indicating supported methods
- extract method validation (error 405) from individual endpoints
- on 405 where multiple methods are allowed, create a single Allow
  header with comma-separated values, not multiple Allow headers.
2018-02-12 10:15:31 -08:00
Andrei Burd dbb010c865 adding human readability for dns requests debug log (#3751) 2018-02-11 09:02:28 -06:00
James Phillips 37cf6583db
Fixes a panic on TCP-based DNS lookups.
This came in via the monkey patch in #3861.

Fixes #3877
2018-02-08 17:57:41 -08:00
James Phillips 4f3b4d0e55
Addresses additional state mutations.
Did a sweep of 84d6ac2d51
and checked them all.
2018-02-07 07:02:10 -08:00
James Phillips ca461f8890
Fixes all the racy output-side updates to tags. 2018-02-06 20:35:55 -08:00
James Phillips e7dd7b2d13
Adds a more robust unit test for index churn. 2018-02-06 20:35:38 -08:00
James Phillips 41e3fcf205
Makes server manager shift away from failed servers from Serf events.
Because this code was doing pointer equality checks, it would work for
the case of a failed attempted RPC because the objects are from the
manager itself:

https://github.com/hashicorp/consul/blob/v1.0.3/agent/consul/rpc.go#L283-L302

But the pointer check would always fail for events coming in from the
Serf path because the server object is newly-created:

https://github.com/hashicorp/consul/blob/v1.0.3/agent/router/serf_adapter.go#L14-L40

This means that we didn't proactively shift RPC traffic away from a
failed server, we'd have to wait for an RPC to fail, which exposes
the error to the calling client.

By switching over to a name check vs. a pointer check we get the correct
behavior. We added a DEBUG log as well to help observe this behavior during
integrated testing.

Related to #3863 since the fix here needed the same logic duplicated, owing
to the complicated atomic stuff.

/cc @dadgar for a heads up in case this also affects Nomad.
2018-02-05 17:56:00 -08:00
James Phillips c718459e49
Adds a before/after test for #3845. 2018-02-05 16:18:29 -08:00
James Phillips 5b245c0201
Merge pull request #3845 from 42wim/tagfix
Fix service tags not added to health check. Part two
2018-02-05 16:18:00 -08:00
Kyle Havlovitz 46745eb89b
Add enterprise default config section 2018-02-05 13:33:59 -08:00
James Phillips 0aa05cc5f0
Merge pull request #3855 from hashicorp/pr-3782-slackpad
Adds support for gRPC health checks.
2018-02-02 17:57:27 -08:00
James Phillips 1a08e8c0f1
Changes "TLS" to "GRPCUseTLS" since it only applies to GRPC checks. 2018-02-02 17:29:34 -08:00
Wim 5cc76cce09 Fix service tags not added to health check. Part two 2018-01-29 20:32:44 +01:00
Veselkov Konstantin c2395d9bd0 fix refactoring 2018-01-28 22:53:30 +04:00
Veselkov Konstantin c4ad54e057 fix refactoring 2018-01-28 22:48:21 +04:00
Veselkov Konstantin 05666113a4 remove golint warnings 2018-01-28 22:40:13 +04:00
James Phillips 443250c76c
Improves user lookup error message.
Closes #3188
Closes #3184
2018-01-26 07:56:44 -08:00
Kyle Havlovitz 32dbb51c3b
Remove nonvoter from metadata.Server 2018-01-25 17:08:03 -08:00
James Phillips 38f5b2e7ce
Gets rid of named return parameters.
This wasn't wrong before but we don't generally use this style in
Consul.
2018-01-25 14:29:50 -08:00