Commit Graph

3215 Commits

Author SHA1 Message Date
hc-github-team-consul-core 1f1c6a0f5e auto-updated agent/uiserver/bindata_assetfs.go from commit 07e11f32e 2021-05-04 16:56:40 +00:00
hc-github-team-consul-core 7c7b240685 auto-updated agent/uiserver/bindata_assetfs.go from commit fc7fe23b4 2021-05-04 16:36:21 +00:00
Freddy ec38cf3206
Fixup discovery chain handling in transparent mode (#10168)
Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

Previously we would associate the address of a discovery chain target
with the discovery chain's filter chain. This was broken for a few reasons:

- If the upstream is a virtual service, the client proxy has no way of
dialing it because virtual services are not targets of their discovery
chains. The targets are distinct services. This is addressed by watching
the endpoints of all upstream services, not just their discovery chain
targets.

- If multiple discovery chains resolve to the same target, that would
lead to multiple filter chains attempting to match on the target's
virtual IP. This is addressed by only matching on the upstream's virtual
IP.

NOTE: this implementation requires an intention to the redirecting
virtual service and not just to the final destination. This is how
we can know that the virtual service is an upstream to watch.

A later PR will look into traversing discovery chains when computing
upstreams so that intentions are only required to the discovery chain
targets.
2021-05-04 08:45:19 -06:00
Paul Banks d47eea3a3f
Make Raft trailing logs and snapshot timing reloadable (#10129)
* WIP reloadable raft config

* Pre-define new raft gauges

* Update go-metrics to change gauge reset behaviour

* Update raft to pull in new metric and reloadable config

* Add snapshot persistance timing and installSnapshot to our 'protected' list as they can be infrequent but are important

* Update telemetry docs

* Update config and telemetry docs

* Add note to oldestLogAge on when it is visible

* Add changelog entry

* Update website/content/docs/agent/options.mdx

Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>

Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>
2021-05-04 15:36:53 +01:00
Luke Kysow eb84a856c4
Give descriptive error if auth method not found (#10163)
* Give descriptive error if auth method not found

Previously during a `consul login -method=blah`, if the auth method was not found, the
error returned would be "ACL not found". This is potentially confusing
because there may be many different ACLs involved in a login: the ACL of
the Consul client, perhaps the binding rule or the auth method.

Now the error will be "auth method blah not found", which is much easier
to debug.
2021-05-03 13:39:13 -07:00
Freddy 5427a1465c
Only consider virtual IPs for transparent proxies (#10162)
Initially we were loading every potential upstream address into Envoy
and then routing traffic to the logical upstream service. The downside
of this behavior is that traffic meant to go to a specific instance
would be load balanced across ALL instances.

Traffic to specific instance IPs should be forwarded to the original
destination and if it's a destination in the mesh then we should ensure
the appropriate certificates are used.

This PR makes transparent proxying a Kubernetes-only feature for now
since support for other environments requires generating virtual IPs,
and Consul does not do that at the moment.
2021-05-03 14:15:22 -06:00
hc-github-team-consul-core 5146f86dfc auto-updated agent/uiserver/bindata_assetfs.go from commit 211bc824a 2021-05-03 15:24:01 +00:00
Daniel Nephin dcb5b924dc
Merge pull request #10149 from hashicorp/dnephin/config-use-streaming-backend-defualt-true
config: default UseStreamingBackend to true
2021-04-30 16:29:11 -04:00
R.B. Boyer 97e57aedfb
connect: update supported envoy versions to 1.18.2, 1.17.2, 1.16.3, and 1.15.4 (#10101)
The only thing that needed fixing up pertained to this section of the 1.18.x release notes:

> grpc_stats: the default value for stats_for_all_methods is switched from true to false, in order to avoid possible memory exhaustion due to an untrusted downstream sending a large number of unique method names. The previous default value was deprecated in version 1.14.0. This only changes the behavior when the value is not set. The previous behavior can be used by setting the value to true. This behavior change by be overridden by setting runtime feature envoy.deprecated_features.grpc_stats_filter_enable_stats_for_all_methods_by_default.

For now to maintain status-quo I'm explicitly setting `stats_for_all_methods=true` in all versions to avoid relying upon the default.

Additionally the naming of the emitted metrics for these gRPC requests changed slightly so the integration test assertions for `case-grpc` needed adjusting.
2021-04-29 15:22:03 -05:00
R.B. Boyer 06cd0aaa8d
xds: ensure that all envoyproxy/go-control-plane protobuf symbols are linked into the final binary (#10131)
This ensures that if someone does include some extension Consul does not currently make use of, that extension is actually usable. Without linking these envoy protobufs into the main binary it can't round trip the escape hatches to send them down to envoy.

Whenenver the go-control-plane library is upgraded next we just have to re-run 'make envoy-library'.
2021-04-29 14:58:26 -05:00
R.B. Boyer 91bee6246f
Support Incremental xDS mode (#9855)
This adds support for the Incremental xDS protocol when using xDS v3. This is best reviewed commit-by-commit and will not be squashed when merged.

Union of all commit messages follows to give an overarching summary:

xds: exclusively support incremental xDS when using xDS v3

Attempts to use SoTW via v3 will fail, much like attempts to use incremental via v2 will fail.
Work around a strange older envoy behavior involving empty CDS responses over incremental xDS.
xds: various cleanups and refactors that don't strictly concern the addition of incremental xDS support

Dissolve the connectionInfo struct in favor of per-connection ResourceGenerators instead.
Do a better job of ensuring the xds code uses a well configured logger that accurately describes the connected client.
xds: pull out checkStreamACLs method in advance of a later commit

xds: rewrite SoTW xDS protocol tests to use protobufs rather than hand-rolled json strings

In the test we very lightly reuse some of the more boring protobuf construction helper code that is also technically under test. The important thing of the protocol tests is testing the protocol. The actual inputs and outputs are largely already handled by the xds golden output tests now so these protocol tests don't have to do double-duty.

This also updates the SoTW protocol test to exclusively use xDS v2 which is the only variant of SoTW that will be supported in Consul 1.10.

xds: default xds.Server.AuthCheckFrequency at use-time instead of construction-time
2021-04-29 13:54:05 -05:00
hc-github-team-consul-core 93971cefd3 auto-updated agent/uiserver/bindata_assetfs.go from commit 5d89ba031 2021-04-29 09:46:32 +00:00
Freddy a5fc796d1f
Rename cluster config files to mesh as well (#10148) 2021-04-28 18:15:26 -06:00
Daniel Nephin 5fa077cf0d config: default UseStreamingBackend to true 2021-04-28 18:58:02 -04:00
Freddy 401f3010e0
Rename "cluster" config entry to "mesh" (#10127)
This config entry is being renamed primarily because in k8s the name
cluster could be confusing given that the config entry applies across
federated datacenters.

Additionally, this config entry will only apply to Consul as a service
mesh, so the more generic "cluster" name is not needed.
2021-04-28 16:13:29 -06:00
Daniel Nephin b4362552fb
Merge pull request #10112 from hashicorp/dnephin/remove-streaming-from-cache
streaming: replace agent/cache with submatview.Store
2021-04-28 17:31:42 -04:00
Daniel Nephin 65da18a986 submatview: fix godoc and comment typos
Co-authored-by: Paul Banks <banks@banksco.de>
2021-04-28 12:27:38 -04:00
hc-github-team-consul-core 16f1995d6d auto-updated agent/uiserver/bindata_assetfs.go from commit 345765f76 2021-04-28 11:21:09 +00:00
Daniel Nephin 872bb9db14 submatview: avoid sorting results unnecessarily
Previous getFromView would call view.Result when the result may not have been returned
(because the index is updated past the minIndex. This would allocate a slice and sort it
for no reason, because the values would never be returned.

Fix this by re-ordering the operations in getFromView.

The test changes in this commit were an attempt to cover the case where
an update is received but the index does not exceed the minIndex.
2021-04-27 19:03:17 -04:00
Daniel Nephin cf8520d85c submatview: only return materializer from getEntry
Also rename it to readEntry now that it doesn't return the entire entry. Based on feedback
in PR review, the full entry is not used by the caller, and accessing the fields wouldn't be
safe outside the lock, so it is safer to return only the Materializer
2021-04-27 19:03:17 -04:00
Daniel Nephin 768e0a7d03 submatview: godoc 2021-04-27 19:03:17 -04:00
Daniel Nephin 2dfacb2d49 rpcclient/health: convert tests to the new submatview.Store interface
Also fixes a minor data race in Materializer. Capture the error before
releasing the lock.
2021-04-27 19:03:17 -04:00
Daniel Nephin 1a6bff1109 agent: fix data race in tests caused by grpc log init
grpclog.SetLoggerV2 is meant to be called only once before any gRPC requests are received, but
each test that uses TestAgent will call NewBaseDeps again. Use a sync.Once to prevent the grpc
logging from being re-initialized by each test.

This will mean that a test can't use a fake logger to capture logs from the gRPC server.
2021-04-27 19:03:17 -04:00
Daniel Nephin 43b14bf57f submatview: fix two flaky tests
These tests can flake when we get a notification for an earlier event.
Retry the read from update channel a few times to make sure we get the
event we expect.
2021-04-27 19:03:17 -04:00
Daniel Nephin 8f47bbe89a rpcclient:health: fix a data race and flake in tests
Split the TestStreamingClient into the two logical components the real
client uses. This allows us to test multiple clients properly.

Previously writing of ctx from multiple Subscribe calls was showing a
data race.

Once this was fixed a test started to fail because the request had to be
made with a greater index, so that the store.Get call did not return
immediately.
2021-04-27 19:03:17 -04:00
Daniel Nephin 95c92343e4 rpcclient/health: fix data race in a test
The idleTTL was being written and read concurrently. Instead move the idleTTL to a struct
field so that when one test patches the TTL it does not impact others.

The background goroutines for the store can outlive a test because context cancellation
is async.
2021-04-27 19:03:17 -04:00
Daniel Nephin d257acee24 rpcclient: close the grpc.ClientConn on shutdown 2021-04-27 19:03:16 -04:00
Daniel Nephin 3cda0a7cc4 health: create health.Client in Agent.New 2021-04-27 19:03:16 -04:00
Daniel Nephin 0ea49c3e65 rpcclient/health: move all backend routing logic to client 2021-04-27 19:03:16 -04:00
Daniel Nephin 318bbd3e30 health: use blocking queries for near query parameter 2021-04-27 19:03:16 -04:00
Daniel Nephin 18c9e73832 connect: do not set QuerySource.Node
Setting this field to a value is equivalent to using the 'near' query paramter.
The intent is to sort the results by proximity to the node requesting
them. However with connect we send the results to envoy, which doesn't
care about the order, so setting this field is increasing the work
performed for no gain.

It is necessary to unset this field now because we would like connect
to use streaming, but streaming does not support sorting by proximity.
2021-04-27 19:03:16 -04:00
Daniel Nephin 440ab3e0ae submatview: move error return to NewMaterializer
So that we don't have to create views ahead of time, when we will never use that view.
2021-04-27 19:03:16 -04:00
Daniel Nephin aadb46b209 rpcclient/health: integrate submatview.Store into rpcclient/health 2021-04-27 19:03:16 -04:00
Daniel Nephin 26c44aacde rpcclient: move streaming cache tests 2021-04-27 19:03:16 -04:00
Daniel Nephin 31c9519712 submatview: test Store.Run 2021-04-27 19:03:16 -04:00
Daniel Nephin d23fd66aeb submatview: test store with Get and Notify calls together 2021-04-27 19:03:16 -04:00
Daniel Nephin 54a402d772 submatview: more test cases for Store.Get
And simplify the return value by using a Result type.
2021-04-27 19:03:16 -04:00
Daniel Nephin f7190b1c61 submatview: reduce the getFromView implementation
Remove View.Result error return value, it was always nil, and seems like it will likely always remain nill
since it is simply reading a stored value.

Also replace some cache types with local types.
2021-04-27 19:03:16 -04:00
Daniel Nephin f4573177ba submatview: track requests instead of notifiers
And only start expiration time when the last request ends. This makes tracking expiry simpler, and
ensures that no entry can be expired while there are active requests.
2021-04-27 19:03:16 -04:00
Daniel Nephin 468469bd27 submatview: set up expiry of materializers 2021-04-27 19:03:16 -04:00
Daniel Nephin f46a830e48 submatview: setup testing structure 2021-04-27 19:03:16 -04:00
Daniel Nephin e4c503c28e submatview: rough outline of the Get and Notify methods. 2021-04-27 19:03:16 -04:00
Daniel Nephin c23e98a5e6 submatview: setup the interface for the store 2021-04-27 19:03:15 -04:00
Matt Keeler 6c639be8ec
Add prometheus guage definitions for replication metrics. (#10109) 2021-04-23 17:05:33 -04:00
Matt Keeler 09bf05ec5d
Add replication metrics (#10073) 2021-04-22 11:20:53 -04:00
hashicorp-ci af54f11426 auto-updated agent/uiserver/bindata_assetfs.go from commit 7a095e7d2 2021-04-22 11:28:27 +00:00
hashicorp-ci 4dfa2ce1ab auto-updated agent/uiserver/bindata_assetfs.go from commit f6f282ac7 2021-04-22 11:21:21 +00:00
Daniel Nephin 281d7616fa
Merge pull request #10045 from hashicorp/dnephin/state-proxy-defaults
state: remove config-entries kind index
2021-04-20 17:12:50 -04:00
Daniel Nephin 6d1a5b3629 Handle ErrChunkingResubmit.Error properly
Previously canRetry was attempting to retrieve this error from args, however there was never
any callers that would pass an error to args.

With the change to raftApply to move this error to the error return value, it is now possible
to receive this error from the err argument.

This commit updates canRetry to check for ErrChunkingResubmit in err.
2021-04-20 13:29:31 -04:00
Daniel Nephin 8654adfc53 Handle FSM.Apply errors in raftApply
Previously we were inconsistently checking the response for errors. This
PR moves the response-is-error check into raftApply, so that all callers
can look at only the error response, instead of having to know that
errors could come from two places.

This should expose a few more errors that were previously hidden because
in some calls to raftApply we were ignoring the response return value.

Also handle errors more consistently. In some cases we would log the
error before returning it. This can be very confusing because it can
result in the same error being logged multiple times. Instead return
a wrapped error.
2021-04-20 13:29:29 -04:00