Commit Graph

4231 Commits

Author SHA1 Message Date
R.B. Boyer 642b75b60b
health: ensure /v1/health/service/:service endpoint returns the most recent results when a filter is used with streaming (#12640)
The primary bug here is in the streaming subsystem that makes the overall v1/health/service/:service request behave incorrectly when servicing a blocking request with a filter provided.

There is a secondary non-streaming bug being fixed here that is much less obvious related to when to update the `reply` variable in a `blockingQuery` evaluation. It is unlikely that it is triggerable in practical environments and I could not actually get the bug to manifest, but I fixed it anyway while investigating the original issue.

Simple reproduction (streaming):

1. Register a service with a tag.

        curl -sL --request PUT 'http://localhost:8500/v1/agent/service/register' \
            --header 'Content-Type: application/json' \
            --data-raw '{ "ID": "ID1", "Name": "test", "Tags":[ "a" ], "EnableTagOverride": true }'

2. Do an initial filter query that matches on the tag.

        curl -sLi --get 'http://localhost:8500/v1/health/service/test' --data-urlencode 'filter=a in Service.Tags'

3. Note you get one result. Use the `X-Consul-Index` header to establish
   a blocking query in another terminal, this should not return yet.

        curl -sLi --get 'http://localhost:8500/v1/health/service/test?index=$INDEX' --data-urlencode 'filter=a in Service.Tags'

4. Re-register that service with a different tag.

        curl -sL --request PUT 'http://localhost:8500/v1/agent/service/register' \
            --header 'Content-Type: application/json' \
            --data-raw '{ "ID": "ID1", "Name": "test", "Tags":[ "b" ], "EnableTagOverride": true }'

5. Your blocking query from (3) should return with a header
   `X-Consul-Query-Backend: streaming` and empty results if it works
   correctly `[]`.

Attempts to reproduce with non-streaming failed (where you add `&near=_agent` to the read queries and ensure `X-Consul-Query-Backend: blocking-query` shows up in the results).
2022-04-27 10:39:45 -05:00
R.B. Boyer c949de9f50
structs: ensure exported-services PeerName field can be addressed as peer_name (#12862) 2022-04-27 10:27:21 -05:00
Dhia Ayachi 9dc5200155
update raft to v1.3.8 (#12844)
* update raft to v1.3.7

* add changelog

* fix compilation error

* fix HeartbeatTimeout

* fix ElectionTimeout to reload only if value is valid

* fix default values for `ElectionTimeout` and `HeartbeatTimeout`

* fix test defaults

* bump raft to v1.3.8
2022-04-25 10:19:26 -04:00
R.B. Boyer 809344a6f5
peering: initial sync (#12842)
- Add endpoints related to peering: read, list, generate token, initiate peering
- Update node/service/check table indexing to account for peers
- Foundational changes for pushing service updates to a peer
- Plumb peer name through Health.ServiceNodes path

see: ENT-1765, ENT-1280, ENT-1283, ENT-1283, ENT-1756, ENT-1739, ENT-1750, ENT-1679,
     ENT-1709, ENT-1704, ENT-1690, ENT-1689, ENT-1702, ENT-1701, ENT-1683, ENT-1663,
     ENT-1650, ENT-1678, ENT-1628, ENT-1658, ENT-1640, ENT-1637, ENT-1597, ENT-1634,
     ENT-1613, ENT-1616, ENT-1617, ENT-1591, ENT-1588, ENT-1596, ENT-1572, ENT-1555

Co-authored-by: R.B. Boyer <rb@hashicorp.com>
Co-authored-by: freddygv <freddy@hashicorp.com>
Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
Co-authored-by: Evan Culver <eculver@hashicorp.com>
Co-authored-by: Nitya Dhanushkodi <nitya@hashicorp.com>
2022-04-21 17:34:40 -05:00
Will Jordan 45ffdc360e
Add timeout to Client RPC calls (#11500)
Adds a timeout (deadline) to client RPC calls, so that streams will no longer hang indefinitely in unstable network conditions.

Co-authored-by: kisunji <ckim@hashicorp.com>
2022-04-21 16:21:35 -04:00
Matt Keeler f49adfaaf0
Implement the ServerDiscovery.WatchServers gRPC endpoint (#12819)
* Implement the ServerDiscovery.WatchServers gRPC endpoint
* Fix the ConnectCA.Sign gRPC endpoints metadata forwarding.
* Unify public gRPC endpoints around the public.TraceID function for request_id logging
2022-04-21 12:56:18 -04:00
Blake Covarrubias 2beea7eb7c
acl: Clarify node/service identities must be lowercase (#12807)
Modify ACL error message for invalid node/service identities names to
clearly state only lowercase alphanumeric characters are supported.
2022-04-21 09:29:16 -07:00
R.B. Boyer bbd38e95ce
chore: upgrade mockery to v2 and regenerate (#12836) 2022-04-21 09:48:21 -05:00
R.B. Boyer cf0c5110be
ca: fix a bug that caused a non blocking leaf cert query after a blocking leaf cert query to block (#12820)
Fixes #12048

Fixes #12319

Regression introduced in #11693

Local reproduction steps:
1. `consul agent -dev`
2. `curl -sLiv 'localhost:8500/v1/agent/connect/ca/leaf/web'`
3. make note of the `X-Consul-Index` header returned 
4. `curl -sLi 'localhost:8500/v1/agent/connect/ca/leaf/web?index=<VALUE_FROM_STEP_3>'`
5. Kill the above curl when it hangs with Ctrl-C
6. Repeat (2) and it should not hang.
2022-04-20 12:21:47 -05:00
Riddhi Shah 1d49f5c84e
[OSS] gRPC call to get envoy bootstrap params (#12825)
Adds a new gRPC endpoint to get envoy bootstrap params. The new consul-dataplane service will use this
endpoint to generate an envoy bootstrap configuration.
2022-04-19 17:24:21 -07:00
Matt Keeler 3badd4c35c
Add event generation for autopilot state updates (#12626)
Whenever autopilot updates its state it notifies Consul. That notification will then trigger Consul to extract out the ready server information. If the ready servers have changed, then an event will be published to notify any subscribers of the full set of ready servers.

All these ready server event things are contained within an autopilotevents package instead of the consul package to make importing them into the grpc related packages possible
2022-04-19 13:03:03 -04:00
Evan Culver 9d0b5bf8e9
connect: Add Envoy 1.22 to integration tests, remove Envoy 1.18 (#12805)
Co-authored-by: R.B. Boyer <rb@hashicorp.com>
2022-04-18 09:36:07 -07:00
DanStough a050aa39b9 Update go version to 1.18.1 2022-04-18 11:41:10 -04:00
Kyle Havlovitz 449a0f855c
Add an internal env var for managed cluster config in the ui (#12796) 2022-04-15 09:55:52 -07:00
John Murret 1e118a5410 set vault namespaces on vault client prior to logging in with the vault auth method 2022-04-14 12:18:06 -06:00
Evan Culver e62745c82c
connect: Add Envoy 1.21.1 to support matrix, remove 1.17.4 (#12777) 2022-04-14 10:44:42 -07:00
Dan Upton 769d1d6e8e
ConnectCA.Sign gRPC Endpoint (#12787)
Introduces a gRPC endpoint for signing Connect leaf certificates. It's also
the first of the public gRPC endpoints to perform leader-forwarding, so
establishes the pattern of forwarding over the multiplexed internal RPC port.
2022-04-14 14:26:14 +01:00
Kyle Havlovitz 199f1c7200
Fix namespace default field names in expanded token output 2022-04-13 16:46:39 -07:00
Paul Glass 5eea62b47a
acl: Adjust region handling in AWS IAM auth method (#12774)
* acl: Adjust region handling in AWS IAM auth method
2022-04-13 14:31:37 -05:00
Eric Haberkorn 7961901153
Merge pull request #12773 from hashicorp/fix-lambda-intentions-and-routing
Implement Routing and Intentions for Lambdas
2022-04-13 13:01:15 -04:00
Eric b062f8c2aa Implement routing and intentions for AWS Lambdas 2022-04-13 11:45:25 -04:00
Karl Cardenas b0b197964c
Merge pull request #12562 from hashicorp/docs/blake-agent-config
docs: Agent configuration hierarchy reorganization
2022-04-12 12:33:42 -07:00
FFMMM cf7e6484aa
add more labels to RequestRecorder (#12727)
Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>
2022-04-12 10:50:25 -07:00
Matt Keeler 2a4ca71d3f
Move to using a shared EventPublisher (#12673)
Previously we had 1 EventPublisher per state.Store. When a state store was closed/abandoned such as during a consul snapshot restore, this had the behavior of force closing subscriptions for that topic and evicting event snapshots from the cache.

The intention of this commit is to keep all that behavior. To that end, the shared EventPublisher now supports the ability to refresh a topic. That will perform the force close + eviction. The FSM upon abandoning the previous state.Store will call RefreshTopic for all the topics with events generated by the state.Store.
2022-04-12 09:47:42 -04:00
Blake Covarrubias 3175bf6b1b Remove .html extensions from docs URLs 2022-04-11 17:38:49 -07:00
Natalie Smith cd17e98800 docs: fix yet more references to agent/options 2022-04-11 17:38:49 -07:00
R.B. Boyer f5a882f66c
fix broken test (#12741) 2022-04-11 10:56:57 -05:00
Jared Kirschner 9a81a77264
Merge pull request #12725 from hashicorp/clarify-service-deregister-after-critical-message
improve error msg for deregister critical service
2022-04-07 18:01:54 -04:00
R.B. Boyer f4eac06b21
xds: ensure that all connect timeout configs can apply equally to tproxy direct dial connections (#12711)
Just like standard upstreams the order of applicability in descending precedence:

1. caller's `service-defaults` upstream override for destination
2. caller's `service-defaults` upstream defaults
3. destination's `service-resolver` ConnectTimeout
4. system default of 5s

Co-authored-by: mrspanishviking <kcardenas@hashicorp.com>
2022-04-07 16:58:21 -05:00
Jared Kirschner 1835e761fd improve error msg for deregister critical service
If a service is automatically registered because it has a critical health check
for longer than deregister_critical_service_after, the error message will now
include:
- mention of the deregister_critical_service_after option
- the value of deregister_critical_service_after for that check
2022-04-07 14:50:02 -07:00
Kyle Havlovitz a2c28526a7
Merge pull request #12685 from hashicorp/http-check-redirect-option
Add a field to disable following redirects on http checks
2022-04-07 11:29:27 -07:00
Matt Keeler 3447880091
Enable running autopilot state updates on all servers (#12617)
* Fixes a lint warning about t.Errorf not supporting %w

* Enable running autopilot on all servers

On the non-leader servers all they do is update the state and do not attempt any modifications.

* Fix the RPC conn limiting tests

Technically they were relying on racey behavior before. Now they should be reliable.
2022-04-07 10:48:48 -04:00
FFMMM 0f68bf879a
[rpc/middleware][consul] plumb intercept off, add server level happy test (#12692) 2022-04-06 14:33:05 -07:00
FFMMM 6bdde40d5e
lower log to trace (#12708) 2022-04-06 11:37:08 -07:00
Kyle Havlovitz b3db499c74 Add a field to disable following redirects on http checks 2022-04-05 16:12:18 -07:00
Mark Anderson ed3e42296d Fixup acl.EnterpriseMeta
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-04-05 15:11:49 -07:00
Mark Anderson 0905c1d83d Manual Structs fixup
Change things by hand that I couldn't figure out how to automate

Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-04-05 14:51:10 -07:00
Mark Anderson 497b300c76 add new entmeta stuff.
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-04-05 14:49:31 -07:00
R.B. Boyer 12523197f9
syncing changes back from enterprise (#12701) 2022-04-05 15:46:56 -05:00
Riddhi Shah 0e5d46e9c4
Merge pull request #12695 from hashicorp/feature-negotiation-grpc-api-oss
[OSS] Supported dataplane features gRPC endpoint
2022-04-05 11:26:33 -07:00
Dan Upton e3d2b91e34
ca: move ConnectCA.Sign authorization logic to CAManager (#12609)
OSS sync of enterprise changes at 8d6fd125
2022-04-05 13:16:20 -05:00
Kyle Havlovitz 9380343689
Merge pull request #12672 from hashicorp/tgate-san-validation
Respect SNI with terminating gateways and log a warning if it isn't set alongside TLS
2022-04-05 11:15:59 -07:00
Riddhi Shah f053279c4e [OSS] Supported dataplane features gRPC endpoint
Adds a new gRPC service and endpoint to return the list of supported
consul dataplane features. The Consul Dataplane will use this API to
customize its interaction with that particular server.
2022-04-05 07:38:58 -07:00
Dan Upton e48c1611ee
WatchRoots gRPC endpoint (#12678)
Adds a new gRPC streaming endpoint (WatchRoots) that dataplane clients will
use to fetch the current list of active Connect CA roots and receive new
lists whenever the roots are rotated.
2022-04-05 15:26:14 +01:00
Dhia Ayachi cdcb249449
add a rate limiter to config auto-reload (#12490)
* add config watcher to the config package

* add logging to watcher

* add test and refactor to add WatcherEvent.

* add all API calls and fix a bug with recreated files

* add tests for watcher

* remove the unnecessary use of context

* Add debug log and a test for file rename

* use inode to detect if the file is recreated/replaced and only listen to create events.

* tidy ups (#1535)

* tidy ups

* Add tests for inode reconcile

* fix linux vs windows syscall

* fix linux vs windows syscall

* fix windows compile error

* increase timeout

* use ctime ID

* remove remove/creation test as it's a use case that fail in linux

* fix linux/windows to use Ino/CreationTime

* fix the watcher to only overwrite current file id

* fix linter error

* fix remove/create test

* set reconcile loop to 200 Milliseconds

* fix watcher to not trigger event on remove, add more tests

* on a remove event try to add the file back to the watcher and trigger the handler if success

* fix race condition

* fix flaky test

* fix race conditions

* set level to info

* fix when file is removed and get an event for it after

* fix to trigger handler when we get a remove but re-add fail

* fix error message

* add tests for directory watch and fixes

* detect if a file is a symlink and return an error on Add

* rename Watcher to FileWatcher and remove symlink deref

* add fsnotify@v1.5.1

* fix go mod

* do not reset timer on errors, rename OS specific files

* rename New func

* events trigger on write and rename

* add missing test

* fix flaking tests

* fix flaky test

* check reconcile when removed

* delete invalid file

* fix test to create files with different mod time.

* back date file instead of sleeping

* add watching file in agent command.

* fix watcher call to use new API

* add configuration and stop watcher when server stop

* add certs as watched files

* move FileWatcher to the agent start instead of the command code

* stop watcher before replacing it

* save watched files in agent

* add add and remove interfaces to the file watcher

* fix remove to not return an error

* use `Add` and `Remove` to update certs files

* fix tests

* close events channel on the file watcher even when the context is done

* extract `NotAutoReloadableRuntimeConfig` is a separate struct

* fix linter errors

* add Ca configs and outgoing verify to the not auto reloadable config

* add some logs and fix to use background context

* add tests to auto-config reload

* remove stale test

* add tests to changes to config files

* add check to see if old cert files still trigger updates

* rename `NotAutoReloadableRuntimeConfig` to `StaticRuntimeConfig`

* fix to re add both key and cert file. Add test to cover this case.

* review suggestion

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* add check to static runtime config changes

* fix test

* add changelog file

* fix review comments

* Apply suggestions from code review

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* update flag description

Co-authored-by: FFMMM <FFMMM@users.noreply.github.com>

* fix compilation error

* add static runtime config support

* fix test

* fix review comments

* fix log test

* Update .changelog/12329.txt

Co-authored-by: Dan Upton <daniel@floppy.co>

* transfer tests to runtime_test.go

* fix filewatcher Replace to not deadlock.

* avoid having lingering locks

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* split ReloadConfig func

* fix warning message

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* convert `FileWatcher` into an interface

* fix compilation errors

* fix tests

* extract func for adding and removing files

* add a coalesceTimer with a very small timer

* extract coaelsce Timer and add a shim for testing

* add tests to coalesceTimer fix to send remaining events

* set `coalesceTimer` to 1 Second

* support symlink, fix a nil deref.

* fix compile error

* fix compile error

* refactor file watcher rate limiting to be a Watcher implementation

* fix linter issue

* fix runtime config

* fix runtime test

* fix flaky tests

* fix compile error

* Apply suggestions from code review

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* fix agent New to return an error if File watcher New return an error

* quit timer loop if ctx is canceled

* Apply suggestions from code review

Co-authored-by: Chris S. Kim <ckim@hashicorp.com>

Co-authored-by: Ashwin Venkatesh <ashwin@hashicorp.com>
Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
Co-authored-by: FFMMM <FFMMM@users.noreply.github.com>
Co-authored-by: Daniel Upton <daniel@floppy.co>
Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
2022-04-04 11:31:39 -04:00
Eric Haberkorn 66c40e073a
Merge pull request #12681 from hashicorp/lambda-patching-tweaks
Tweak the Lambda Envoy configuration generated by the serverless patcher
2022-04-01 19:59:30 -04:00
FFMMM b0cba2ec03
mark disable_compat_1.9 to deprecate in 1.13, change default to true (#12675)
Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
2022-04-01 10:35:56 -07:00
R.B. Boyer 3060b5cb8f
xds: errors from the xds serverless plugin are fatal (#12682) 2022-04-01 10:30:26 -05:00
Eric 66391186ce Tweak the Lambda Envoy configuration generated by the serverless patcher
- Move from `strip_matching_host_port` to `strip_any_host_port`
- Remove `auto_host_rewrite` since it conflicts with `strip_any_host_port`
2022-04-01 11:13:44 -04:00
Eric Haberkorn 9b291d362a
Merge pull request #12676 from hashicorp/implement-lambda-patching
Implement Lambda Patching in the Serverless Plugin
2022-04-01 09:58:56 -04:00