Matt Keeler 8a968299dd
Fix flaky tests in the agent/grpc/public/services/serverdiscovery package (#13173)
Occasionally we had seen TestWatchServers_ACLToken_PermissionDenied flagged as flaky in CircleCI. This change should fix that.

Why it fixes it is complicated. The test was failing with a panic when a mocked ACL Resolver was being called more times than expected. I struggled for a while to determine how that could be. This test should call authorize once and only once, and the error returned should cause the stream to be terminated and the error returned to the gRPC client. Another oddity was that no amount of running this test locally seemed to reproduce the issue. I ran the test hundreds of thousands of times and it always passed.

It turns out that there is nothing wrong with the test. The panic from the unexpected invocation of a mocked call happened while this test was running, but it was caused by a previous test (specifically the TestWatchServers_StreamLifecycle test).

The stream from the previous test remained open after all the test Cleanup functions had run. When the EventPublisher eventually noticed that its context had been cancelled during cleanup, it force-closed all subscriptions, which caused some loops to be re-entered and the streams to be reauthorized. It's that looping in response to forced subscription closures that causes the mock to eventually panic. All the components (publisher, server, client) operate based on contexts. We cancel all those contexts, but there is no synchronous way to know when they have stopped.

We could have implemented a synchronous stop, but in the context of an actual running Consul, context cancellation plus async stopping is perfectly fine. What we (Dan and I) eventually concluded was that the behavior of gRPC streams such as this when a server is shutting down wasn't very helpful. What we want is for a client to be able to distinguish between a subscription closed because something may have changed that requires re-authentication and a subscription closed because the server is shutting down. That way we can send back an appropriate error message saying that the server is shutting down and not confuse users into thinking they may need to resubscribe.

So that's what this PR does. We have introduced a shutting down state to our event subscriptions. The various streaming gRPC services that rely on the event publisher will now stop the stream (rather than attempt transparent reauthorization) if this particular error is the one we get from the subscription. Additionally, the error that gets transmitted back through gRPC when this occurs tells the consumer that the server is going away. That is more helpful, because a client can then attempt to reconnect to another server.
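The distinction described above can be sketched with a toy model in Go. This is not the code from the PR: `errSubscriptionClosed`, `errShuttingDown`, `subscription`, and `serveStream` are hypothetical names chosen for illustration, and real gRPC streams and the real event publisher are replaced by plain function values. The sketch only shows the control flow: a forced closure re-enters the loop and re-authorizes, while a shutdown error ends the stream.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinel errors; the names are assumptions for this sketch.
var (
	errSubscriptionClosed = errors.New("subscription closed: re-authorization required")
	errShuttingDown       = errors.New("subscription closed: server is shutting down")
)

// subscription is a toy stand-in for an event subscription. It yields its
// queued events and then reports closeErr.
type subscription struct {
	events   []string
	closeErr error
}

// Next returns the next queued event, or closeErr once the queue is empty.
func (s *subscription) Next() (string, error) {
	if len(s.events) == 0 {
		return "", s.closeErr
	}
	ev := s.events[0]
	s.events = s.events[1:]
	return ev, nil
}

// serveStream authorizes, subscribes, and forwards events. A forced closure
// loops back to re-authorize and resubscribe; a shutdown error stops the
// stream and is returned to the caller.
func serveStream(authorize func() error, subscribe func() *subscription, send func(string)) error {
	for {
		if err := authorize(); err != nil {
			return err
		}
		sub := subscribe()
		for {
			ev, err := sub.Next()
			if errors.Is(err, errShuttingDown) {
				return err // do not loop: the server is going away
			}
			if errors.Is(err, errSubscriptionClosed) {
				break // re-enter the outer loop and re-authorize
			}
			if err != nil {
				return err
			}
			send(ev)
		}
	}
}

func main() {
	subs := []*subscription{
		{events: []string{"a"}, closeErr: errSubscriptionClosed},
		{events: []string{"b"}, closeErr: errShuttingDown},
	}
	authCalls := 0
	err := serveStream(
		func() error { authCalls++; return nil },
		func() *subscription { s := subs[0]; subs = subs[1:]; return s },
		func(ev string) { fmt.Println("event:", ev) },
	)
	fmt.Println("authorize calls:", authCalls)
	fmt.Println("stream ended with:", err)
}
```

Without the separate shutdown error, the inner loop would have no way to tell the two closures apart and would keep re-authorizing, which is the behavior that tripped the mock in the flaky test.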
2022-05-23 08:59:13 -04:00
.changelog api: add the ability to specify a path prefix (#12914) 2022-05-19 16:07:59 -07:00
.circleci update the version of the underlying vm for these two test types (#13099) 2022-05-17 10:25:38 -05:00
.github ci: name unnamed GHA workflows (#13162) 2022-05-20 16:06:35 -07:00
.release [CI-only] Build and publish dev dockerhub images (#13084) 2022-05-17 12:23:03 -07:00
acl [OSS] Add upsert handling for receiving CheckServiceNode (#13061) 2022-05-12 15:04:44 -06:00
agent Fix flaky tests in the agent/grpc/public/services/serverdiscovery package (#13173) 2022-05-23 08:59:13 -04:00
api api: add the ability to specify a path prefix (#12914) 2022-05-19 16:07:59 -07:00
bench
build-support
command Merge pull request #12878 from hashicorp/ma/x-forwarded-client-cert 2022-05-04 11:05:44 -07:00
connect Retry on bad dogstatsd connection (#13091) 2022-05-19 16:03:46 -04:00
contributing
docs Add some docs on pprof usage (#12551) 2022-05-09 14:15:21 -06:00
grafana
internal internal: port RPC glue changes from Enterprise (#13034) 2022-05-11 15:03:07 -07:00
ipaddr
lib telemetry: remove unused arg (#13161) 2022-05-19 19:17:30 -07:00
logging peering: initial sync (#12842) 2022-04-21 17:34:40 -05:00
proto agent: allow for service discovery queries involving peer name to use streaming (#13168) 2022-05-20 15:27:01 -05:00
proto-public acl: gRPC login and logout endpoints (#12935) 2022-05-04 17:38:45 +01:00
sdk add general runstep test helper instead of copying it all over the place (#13013) 2022-05-10 15:25:51 -05:00
sentinel
service_os
snapshot
test test: fix more flakes in the compatibility test (#13145) 2022-05-19 14:05:41 -05:00
testrpc peering: initial sync (#12842) 2022-04-21 17:34:40 -05:00
tlsutil
types
ui Corrected "visualization" typo 2022-05-17 16:27:10 -04:00
version Update version/version.go to match enterprise 2022-05-09 10:07:19 -07:00
website docs: Update service/node identities text on ACL index (#13022) 2022-05-18 18:55:35 -07:00
.dockerignore
.gitattributes
.gitignore changelog: snapshot-agent acl token fix for CLI and ENV 2022-04-25 16:46:55 -04:00
.golangci.yml build: speed up linting by 1.5x (#12908) 2022-05-05 12:42:52 -05:00
CHANGELOG.md Update changelog for 1.12.0 2022-04-20 17:10:03 -07:00
Dockerfile
fixup_acl_move.sh
GNUmakefile test: cleanup and unflake parts of the upgrade compat tests (#13126) 2022-05-18 14:52:26 -05:00
go.mod peering: accept replication stream of discovery chain information at the importing side (#13151) 2022-05-19 16:37:52 -05:00
go.sum peering: accept replication stream of discovery chain information at the importing side (#13151) 2022-05-19 16:37:52 -05:00
LICENSE
main.go
NOTICE.md
README.md
Vagrantfile

Consul


Consul is a distributed, highly available, and datacenter-aware solution to connect and configure applications across dynamic, distributed infrastructure.

Consul provides several key features:

  • Multi-Datacenter - Consul is built to be datacenter-aware, and can support any number of regions without complex configuration.

  • Service Mesh/Service Segmentation - Consul Connect enables secure service-to-service communication with automatic TLS encryption and identity-based authorization. Applications can use sidecar proxies in a service mesh configuration to establish TLS connections for inbound and outbound connections without being aware of Connect at all.

  • Service Discovery - Consul makes it simple for services to register themselves and to discover other services via a DNS or HTTP interface. External services such as SaaS providers can be registered as well.

  • Health Checking - Health Checking enables Consul to quickly alert operators about any issues in a cluster. The integration with service discovery prevents routing traffic to unhealthy hosts and enables service level circuit breakers.

  • Key/Value Storage - A flexible key/value store enables storing dynamic configuration, feature flagging, coordination, leader election and more. The simple HTTP API makes it easy to use anywhere.
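As a small illustration of the key/value HTTP API mentioned in the last item, the following Go sketch decodes the JSON body that a KV read returns. It assumes the documented response shape (a JSON array of entries whose `Value` field is base64-encoded); the `kvEntry` type, the `decodeKV` helper, and the sample key and value are made up for this example, and no request is sent to a real agent.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// kvEntry holds a subset of the fields in a KV read response.
// The type name is hypothetical; the field names follow the HTTP API.
type kvEntry struct {
	Key         string
	Value       []byte // encoding/json decodes a base64 string into []byte
	ModifyIndex uint64
}

// decodeKV parses the body of a KV read response into entries.
func decodeKV(body []byte) ([]kvEntry, error) {
	var entries []kvEntry
	if err := json.Unmarshal(body, &entries); err != nil {
		return nil, err
	}
	return entries, nil
}

func main() {
	// Sample body invented for illustration; "ZW5hYmxlZA==" is base64 for "enabled".
	sample := []byte(`[{"Key":"feature/dark-mode","Value":"ZW5hYmxlZA==","ModifyIndex":42}]`)

	entries, err := decodeKV(sample)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	for _, e := range entries {
		fmt.Printf("%s = %s (ModifyIndex %d)\n", e.Key, e.Value, e.ModifyIndex)
	}
}
```

In practice the official Go client in the `api` directory handles this decoding, so a hand-written type like this is only useful for understanding the wire format.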

Consul runs on Linux, macOS, FreeBSD, Solaris, and Windows and includes an optional browser-based UI. A commercial version called Consul Enterprise is also available.

Please note: We take Consul's security and our users' trust very seriously. If you believe you have found a security issue in Consul, please responsibly disclose by contacting us at security@hashicorp.com.

Quick Start

A few quick start guides are available on the Consul website:

Documentation

Full, comprehensive documentation is available on the Consul website:

https://www.consul.io/docs

Contributing

Thank you for your interest in contributing! Please refer to CONTRIBUTING.md for guidance. For contributions specifically to the browser-based UI, please refer to the UI's README.md for guidance.