Commit Graph

17432 Commits

Author SHA1 Message Date
R.B. Boyer 3b12a5179f
test: fix flaky test TestEventBufferFuzz (#13175) 2022-05-23 09:22:30 -05:00
Matt Keeler c629e89289
Fix tests broken in #13173 (#13178)
I changed the error type returned in a situation but didn’t update the tests to expect that error.
2022-05-23 10:00:06 -04:00
Matt Keeler 8a968299dd
Fix flaky tests in the agent/grpc/public/services/serverdiscovery package (#13173)
We had occasionally seen TestWatchServers_ACLToken_PermissionDenied flagged as flaky in CircleCI. This change should fix that.

Why it fixes it is complicated. The test was failing with a panic when a mocked ACL Resolver was called more times than expected. I struggled for a while to determine how that could be. This test should call authorize once and only once, and the error returned should cause the stream to be terminated and the error returned to the gRPC client. Another oddity was that no amount of running this test locally could reproduce the issue. I ran the test hundreds of thousands of times and it always passed.

It turns out that there is nothing wrong with the test. It just so happens that the panic from the unexpected invocation of a mocked call happened during this test but was caused by a previous test (specifically the TestWatchServers_StreamLifecycle test).

The stream from the previous test remained open after all the test Cleanup functions were run. When the EventPublisher eventually picked up that the context was cancelled during cleanup, it force-closed all subscriptions, which caused some loops to be re-entered and the streams to be reauthorized. It’s that looping in response to forced subscription closures that causes the mock to eventually panic. All the components (publisher, server, client) operate based on contexts. We cancel all those contexts, but there is no synchronous way to know when they have stopped.

We could have implemented a synchronous stop, but in the context of an actual running Consul, context cancellation plus async stopping is perfectly fine. What we (Dan and I) eventually concluded was that the behavior of gRPC streams such as this when a server is shutting down wasn’t very helpful. What we want is for a client to be able to distinguish between a subscription closed because something may have changed, requiring re-authentication, and a subscription closed because the server is shutting down. That way we can send back appropriate error messages saying that the server is shutting down and not confuse users into thinking they may need to resubscribe.

So that’s what this PR does. We have introduced a shutting down state to our event subscriptions, and the various streaming gRPC services that rely on the event publisher will now stop the stream (and not attempt transparent reauthorization) if this particular error is the one we get from the stream. Additionally, the error that gets transmitted back through gRPC when this occurs indicates to the consumer that the server is going away. That is more helpful because a client can then attempt to reconnect to another server.
2022-05-23 08:59:13 -04:00
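The distinction described in the commit message above can be illustrated with a minimal Go sketch. This is not Consul's code: `errForceClosed`, `errShuttingDown`, `serveStream`, `drain`, `scripted` and `allow` are hypothetical names chosen for illustration, and the subscription is reduced to a single `next` function. The point is only that a forced closure leads back to authorization, while a shutdown error ends the stream.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinel errors; Consul's real identifiers may differ.
var (
	errForceClosed  = errors.New("subscription closed by server, resubscribe required")
	errShuttingDown = errors.New("subscription closed, server is shutting down")
)

// drain reads events until the subscription reports an error.
func drain(next func() error) error {
	for {
		if err := next(); err != nil {
			return err
		}
	}
}

// serveStream authorizes, then consumes the subscription. A forced closure
// means something may have changed, so it re-authorizes and resubscribes.
// Any other error, including errShuttingDown, ends the stream.
// It returns how many times authorize was called and the final error.
func serveStream(authorize, next func() error) (int, error) {
	authCalls := 0
	for {
		authCalls++
		if err := authorize(); err != nil {
			return authCalls, err
		}
		err := drain(next)
		if errors.Is(err, errForceClosed) {
			continue // re-enter the loop: reauthorize and resubscribe
		}
		return authCalls, err // shutdown (or any other error): stop here
	}
}

// scripted returns a next function that yields the given results in order.
func scripted(results ...error) func() error {
	i := 0
	return func() error {
		err := results[i]
		i++
		return err
	}
}

// allow is an authorizer that always succeeds.
func allow() error { return nil }

func main() {
	calls, err := serveStream(allow, scripted(nil, errForceClosed, nil, errShuttingDown))
	fmt.Println(calls, err)
}
```

In this sketch a mock authorizer is only called again after a forced closure, never after a shutdown, which is the property the flaky test depended on.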
Evan Culver ce89d07428
ci: name unnamed GHA workflows (#13162) 2022-05-20 16:06:35 -07:00
R.B. Boyer 69d3e729a4
agent: allow for service discovery queries involving peer name to use streaming (#13168) 2022-05-20 15:27:01 -05:00
Dan Upton 30775ed54d
proxycfg: remove dependency on `cache.UpdateEvent` (#13144)
OSS portion of enterprise PR 1857.

This removes (most) references to the `cache.UpdateEvent` type in the
`proxycfg` package.

As we're going to replace direct usage of the agent cache with interfaces that
can be satisfied by alternative server-local datasources, it doesn't make
sense to depend on this type everywhere anymore (particularly on the
`state.ch` channel).

We also plan to extract `proxycfg` out of Consul into a shared library in
the future, which would require removing this dependency.

Aside from a fairly rote find-and-replace, the main change is that the
`cache.Cache` and `health.Client` types now accept a callback function
parameter, rather than a `chan<- cache.UpdateEvent`. This allows us to
do the type conversion without running another goroutine.
2022-05-20 15:47:40 +01:00
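A minimal sketch of the callback approach described in the commit message above, written in Go. All types here are stand-ins: `UpdateEvent` only imitates `cache.UpdateEvent` (its fields are assumptions), and `localEvent`, `Callback`, `notifier`, `NotifyCallback`, `emit` and `toLocal` are hypothetical names, not the actual `proxycfg` or `cache` API. It shows why a callback lets the type conversion happen inline, where a channel of the source type would need a goroutine to read, convert and forward.

```go
package main

import "fmt"

// UpdateEvent is a stand-in for cache.UpdateEvent (fields are assumptions).
type UpdateEvent struct {
	CorrelationID string
	Result        interface{}
}

// localEvent is a hypothetical package-local event type.
type localEvent struct {
	ID     string
	Result interface{}
}

// Callback receives each update directly from the data source.
type Callback func(UpdateEvent)

// notifier is a toy data source that delivers updates to a callback.
type notifier struct{ cb Callback }

// NotifyCallback registers the callback that will receive updates.
func (n *notifier) NotifyCallback(cb Callback) { n.cb = cb }

// emit simulates the data source producing an update.
func (n *notifier) emit(e UpdateEvent) { n.cb(e) }

// toLocal converts the source's event type into the local one.
func toLocal(e UpdateEvent) localEvent {
	return localEvent{ID: e.CorrelationID, Result: e.Result}
}

func main() {
	out := make(chan localEvent, 1)
	n := &notifier{}

	// The conversion runs inside the callback, on the caller's goroutine,
	// so no extra goroutine is needed to translate between channel types.
	n.NotifyCallback(func(e UpdateEvent) { out <- toLocal(e) })

	n.emit(UpdateEvent{CorrelationID: "roots", Result: 42})
	fmt.Println(<-out)
}
```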
Evan Culver 5554a40c53
telemetry: remove unused arg (#13161) 2022-05-19 19:17:30 -07:00
funkiestj 386106a139
api: add the ability to specify a path prefix (#12914)
Specifically meant for when consul is behind a reverse proxy / API gateway

Co-authored-by: Evan Culver <eculver@hashicorp.com>
2022-05-19 16:07:59 -07:00
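As an illustration of what a path prefix means for a client behind a reverse proxy, here is a small Go sketch. `withPrefix` is a hypothetical helper, not the function added by the commit above, and it assumes every API path begins with a slash.

```go
package main

import (
	"fmt"
	"strings"
)

// withPrefix prepends a proxy path prefix to an API path. Leading and
// trailing slashes on the prefix are normalized; an empty prefix leaves
// the API path unchanged. apiPath is assumed to start with "/".
func withPrefix(prefix, apiPath string) string {
	prefix = strings.Trim(prefix, "/")
	if prefix == "" {
		return apiPath
	}
	return "/" + prefix + apiPath
}

func main() {
	fmt.Println(withPrefix("/consul/", "/v1/agent/self"))
	fmt.Println(withPrefix("", "/v1/agent/self"))
}
```

String concatenation is used here instead of `path.Join` because `path.Join` cleans the result and would drop a trailing slash from the API path.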
R.B. Boyer 63a9175bd6
peering: accept replication stream of discovery chain information at the importing side (#13151) 2022-05-19 16:37:52 -05:00
R.B. Boyer 68789effeb
test: TestServer_RPC_MetricsIntercept should use a concurrency-safe metrics store (#13157) 2022-05-19 15:39:28 -05:00
DanStough b8f32b5862 chore(ci): fix rate-limiting for backport-assistant 2022-05-19 16:20:04 -04:00
cskh df27fa0c84
Retry on bad dogstatsd connection (#13091)
- Introduce a new configurable telemetry parameter, retry_failed_connection. Set it to true to let the Consul agent continue its startup process when the connection to the Datadog server fails. When set to false, the agent stops on a failed connection. The default is true.

Co-authored-by: Dan Upton <daniel@floppy.co>
Co-authored-by: Evan Culver <eculver@users.noreply.github.com>
2022-05-19 16:03:46 -04:00
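The startup behavior described in the commit message above can be sketched in Go as follows. This is an assumption-laden illustration, not the agent's implementation: `sinkConfig`, `initSink`, `failingDial` and `workingDial` are hypothetical, and only the `retry_failed_connection` semantics come from the message.

```go
package main

import (
	"errors"
	"fmt"
)

// sinkConfig mirrors the retry_failed_connection setting; the struct and
// field name are hypothetical.
type sinkConfig struct {
	RetryFailedConnection bool
}

// initSink tries to connect to the metrics sink. If the connection fails
// and RetryFailedConnection is set, it reports ready=false with no error so
// that startup continues and the connection can be retried later. Otherwise
// the connection error is returned and startup stops.
func initSink(cfg sinkConfig, dial func() error) (bool, error) {
	if err := dial(); err != nil {
		if cfg.RetryFailedConnection {
			return false, nil
		}
		return false, fmt.Errorf("telemetry sink init failed: %w", err)
	}
	return true, nil
}

// failingDial simulates an unreachable dogstatsd endpoint.
func failingDial() error { return errors.New("connection refused") }

// workingDial simulates a reachable dogstatsd endpoint.
func workingDial() error { return nil }

func main() {
	fmt.Println(initSink(sinkConfig{RetryFailedConnection: true}, failingDial))
	fmt.Println(initSink(sinkConfig{RetryFailedConnection: false}, failingDial))
}
```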
R.B. Boyer 9389b8b6fa
Merge pull request #13150 from hashicorp/peering-replicate-connect-endpoints
peering: replicate discovery chains information to importing peers
2022-05-19 14:33:30 -05:00
R.B. Boyer 91691eca87 peering: replicate discovery chains information to importing peers
Treat each exported service as a "discovery chain" and replicate one
synthetic CheckServiceNode for each chain and remote mesh gateway.

The health will be a flattened generated check of the checks for that
mesh gateway node.
2022-05-19 14:21:44 -05:00
R.B. Boyer bf05e8c1f1 prefactor some functions out of the monolithic file 2022-05-19 14:21:29 -05:00
R.B. Boyer 6427784846
test: fix more flakes in the compatibility test (#13145) 2022-05-19 14:05:41 -05:00
R.B. Boyer 09861a2792
test: fix incorrect use of t instead of r in retry test (#13146) 2022-05-19 14:00:07 -05:00
DanStough 2b13cfdddf chore(ci): squash backport-assistant commits in PRs 2022-05-19 12:35:13 -04:00
Dhia Ayachi 58ca5cb15c
fix 'releases/*.*.x' to 'release/*.*.x' (#13132) 2022-05-19 10:01:57 -04:00
Dmytro 5f48222db7
docs: Update service/node identities text on ACL index (#13022)
Modify node and service identities paragraphs on ACL index to better
conform with the style guide.

Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>
2022-05-18 18:55:35 -07:00
R.B. Boyer e8bbcaca86
test: cleanup and unflake parts of the upgrade compat tests (#13126) 2022-05-18 14:52:26 -05:00
John Murret 5fdfec1275
Updating compatibility matrix for consul-k8s version 0.44.0 (#13111) 2022-05-18 10:05:41 -06:00
Dan Upton 7492357b43
config: prevent top-level `verify_incoming` enabling mTLS on gRPC port (#13118)
Fixes #13088

This is a backwards-compatibility bug introduced in 1.12.
2022-05-18 16:15:57 +01:00
Jasmine W 34f5f99423
Merge pull request #13110 from hashicorp/ui/bugfix/misspelling
Corrected "visualization" typo
2022-05-17 20:26:13 -04:00
Jasmine W 1df1e0811a Corrected "visualization" typo 2022-05-17 16:27:10 -04:00
cskh 9d4dc6ec1b
Merge pull request #13108 from hashicorp/remove-unused-constant
Remove unused constant in retry.go
2022-05-17 15:45:36 -04:00
Hui Kang c983f32e1c Remove unused constant in retry.go 2022-05-17 15:34:04 -04:00
Michele Degges 68e72fd0db
[CI-only] Build and publish dev dockerhub images (#13084) 2022-05-17 12:23:03 -07:00
Dhia Ayachi 9ecaa6547f
fix yaml alignment (#13105) 2022-05-17 11:41:23 -04:00
Dhia Ayachi 9762875f7b
add a github workflow to trigger ent->oss merge on every PR merged (#13072)
* add a github workflow to trigger ent->oss merge on every PR merged

* remove the workflow automation ref to trigger-oss-merge in circle-ci

* remove workflow automation

* revert circle-ci changes

* add actor

* remove cherrypicker

* add condition to avoid running in enterprise
2022-05-17 11:32:40 -04:00
R.B. Boyer 91bea38118
update the version of the underlying vm for these two test types (#13099) 2022-05-17 10:25:38 -05:00
Michael Schurter 69c30069cf
docs: specify 1.12.0+ent incompat with nomad (#12883)
Due to build changes in Consul 1.12.0 the `+ent` modifier is missing
from the version reported by `/v1/agent/self`.

Nomad looks for the `ent` modifier when determining whether to reconcile
services in non-default namespaces. Without the modifier Nomad will only
end up removing services from the default Consul namespace.
2022-05-16 18:41:13 -04:00
cskh 7d56ae2f66
Merge pull request #13098 from hashicorp/fix-insecure-load-test
fix insecure cidr_blocks in load test
2022-05-16 17:08:48 -04:00
Hui Kang 3a6547740f change to var.vpc_cidr 2022-05-16 16:49:46 -04:00
Hui Kang bc5f13e331 fix insecure cidr_blocks in load test 2022-05-16 16:37:45 -04:00
Dhia Ayachi 4b5919f95a
upgrade serf to v0.9.8 (#13062)
* upgrade serf to v0.9.8

* add changelog

* Update .changelog/13062.txt

Co-authored-by: Dan Upton <daniel@floppy.co>
2022-05-16 14:13:23 -04:00
DanStough 0854b848d1 chore(ci): fixes and split nightly tests per release branch 2022-05-16 12:51:06 -04:00
Dan Stough abed0bf5b0 Update pr-labeler.yml
Changing from `pull_request` to `pull_request_target` so that forks can get the write permissions to add labels. See [this thread](https://github.com/actions/labeler/issues/121) in the action repo.
2022-05-16 12:03:10 -04:00
Freddy 6c868b6c0e
Patches to peering initiation for POC demo (#13076)
Co-authored-by: R.B. Boyer <rb@hashicorp.com>
2022-05-13 13:01:00 -06:00
DanStough 48ee44c060 chore(ci): fix automerge spelling 2022-05-13 14:15:58 -04:00
Dhia Ayachi 70b93ea693
When a host header is defined override `req.Host` in the metrics ui (#13071)
* When a host header is defined override the req.Host in the metrics ui endpoint.

* add changelog
2022-05-13 14:05:22 -04:00
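Why a configured host header has to be applied to `req.Host` can be shown with a short Go sketch. In Go's `net/http`, a `Host` entry in `req.Header` is ignored when a client request is written; the `Host` field of the request is what gets sent. `applyHeaders` and `newMetricsRequest` are hypothetical helpers and the URLs are made up for the example; this is not the code from the metrics UI endpoint.

```go
package main

import (
	"fmt"
	"net/http"
)

// applyHeaders copies configured headers onto an outgoing request. A
// configured Host header is applied through req.Host, because net/http
// does not send "Host" entries found in req.Header on client requests.
func applyHeaders(req *http.Request, headers map[string]string) {
	for name, value := range headers {
		if http.CanonicalHeaderKey(name) == "Host" {
			req.Host = value
			continue
		}
		req.Header.Set(name, value)
	}
}

// newMetricsRequest builds a request to a made-up metrics backend.
func newMetricsRequest() *http.Request {
	req, err := http.NewRequest(http.MethodGet, "http://prometheus.internal:9090/api/v1/query", nil)
	if err != nil {
		panic(err)
	}
	return req
}

func main() {
	req := newMetricsRequest()
	applyHeaders(req, map[string]string{"host": "metrics.example.com", "X-Scope": "demo"})
	fmt.Println(req.Host, req.Header.Get("X-Scope"))
}
```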
Blake Covarrubias d33d06e033
Merge pull request #12972 from hashicorp/docs/simplify-acl-policy-examples
* Fix JSON formatted example ACL policies
* Remove HCL examples of roles since only JSON payloads can be used
* Remove unnecessary quotes from HCL variables
* Clarify purpose of example ACL rules and policies
* Redirect /api-docs/acl/acl to the new URL
2022-05-13 10:39:37 -07:00
Blake Covarrubias e653cb58ad Redirect /api-docs/acl/acl to the new URL 2022-05-13 10:37:16 -07:00
Blake Covarrubias 2848721d82 Clarify purpose of example ACL rules and policies 2022-05-13 10:37:16 -07:00
Blake Covarrubias 3d0abe7aef Remove unnecessary quotes from HCL variables 2022-05-13 10:35:24 -07:00
Blake Covarrubias 9afeac9643 Remove HCL examples of roles since only JSON payloads can be used 2022-05-13 10:35:24 -07:00
Blake Covarrubias 948fc47e75 docs: Fix JSON formatted example ACL policies
Commit 9333fad added JSON formatted examples for all ACL policies.
Most of these examples are not valid JSON, and thus an error is
raised when attempting to create the example policies/rules in Consul.

This commit fixes the example JSON formatted ACL rules so that they
are valid JSON. This enables readers to use the policies as-is from
the documentation to successfully create policies in Consul.

It also removes unnecessary arrays from the example policies so that
the policies are easier for practitioners to read and write.
2022-05-13 10:35:24 -07:00
Evan Culver 7de052dfbe
peering: fix panic in test (#13069) 2022-05-13 10:13:02 -07:00
DanStough af4c674709 chore(ci): deprecate cherry-pick script and enable backport merge 2022-05-13 11:13:12 -04:00
Freddy 160acdf876
Actually block when syncing subscriptions (#13066)
By changing to use WatchCtx we will actually block for changes to the peering list. WatchCh creates a goroutine to collect errors from WatchCtx and returns immediately.

The existing behavior wouldn't result in a tight loop because of the rate limiting in the surrounding function, but it would still lead to more work than is necessary.
2022-05-12 17:36:14 -06:00
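The difference between the two watch styles mentioned in the commit message above can be sketched with the standard library alone. `watchCtx` and `watchCh` below are simplified stand-ins written for this example, not the WatchCtx and WatchCh methods themselves, and `run` is a hypothetical driver. The blocking variant returns only when a change fires or the context ends; the channel variant starts a goroutine and returns right away, so a caller that does not receive from the channel never waits.

```go
package main

import (
	"context"
	"fmt"
)

// watchCtx blocks until the watched channel fires or ctx is done.
func watchCtx(ctx context.Context, changed <-chan struct{}) error {
	select {
	case <-changed:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// watchCh returns immediately. The result of the watch arrives later on
// the returned channel, delivered by a goroutine.
func watchCh(ctx context.Context, changed <-chan struct{}) <-chan error {
	out := make(chan error, 1)
	go func() { out <- watchCtx(ctx, changed) }()
	return out
}

// run exercises both variants against the same change signal. When
// cancelFirst is true the context is cancelled instead of a change firing.
func run(cancelFirst bool) (fromCh, fromCtx error) {
	changed := make(chan struct{})
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	pending := watchCh(ctx, changed) // does not block

	if cancelFirst {
		cancel()
	} else {
		close(changed) // simulate a change to the watched data
	}
	return <-pending, watchCtx(ctx, changed)
}

func main() {
	fmt.Println(run(false))
	fmt.Println(run(true))
}
```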