We have seen test flakes caused by 'concurrent map read and map write', and the race detector
reports the problem as well (prevent us from running some tests with -race).
The root of the problem is the grpc expects resolvers to be registered at init time
before any requests are made, but we were using a separate resolver for each test.
This commit introduces a resolver registry. The registry is registered as the single
resolver for the consul scheme. Each test uses the Authority section of the target
(instead of the scheme) to identify the resolver that should be used for the test.
The scheme is used for lookup, which is why it can no longer be used as the unique
key.
This allows us to use a lock around the map of resolvers, preventing the data race.
The bulk of this commit is moving the LeaderRoutineManager from the agent/consul package into its own package: lib/gort. It also got a renaming and its Start method now requires a context. Requiring that context required updating a whole bunch of other places in the code.
* WIP reloadable raft config
* Pre-define new raft gauges
* Update go-metrics to change gauge reset behaviour
* Update raft to pull in new metric and reloadable config
* Add snapshot persistance timing and installSnapshot to our 'protected' list as they can be infrequent but are important
* Update telemetry docs
* Update config and telemetry docs
* Add note to oldestLogAge on when it is visible
* Add changelog entry
* Update website/content/docs/agent/options.mdx
Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>
Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>
grpclog.SetLoggerV2 is meant to be called only once before any gRPC requests are received, but
each test that uses TestAgent will call NewBaseDeps again. Use a sync.Once to prevent the grpc
logging from being re-initialized by each test.
This will mean that a test can't use a fake logger to capture logs from the gRPC server.
This commit reduces the interface to Load() a bit, in preparation for
unexporting NewBuilder and having everything call Load.
The three arguments are reduced to a single argument by moving the other
two into the options struct.
The three return values are reduced to two by moving the RuntimeConfig
and Warnings into a LoadResult struct.
* create consul version metric with version label
* agent/agent.go: add pre-release Version as well as label
Co-Authored-By: Radha13 <kumari.radha3@gmail.com>
* verion and pre-release version labels.
* hyphen/- breaks prometheus
* Add Prometheus gauge defintion for version metric
* Add new metric to telemetry docs
Co-authored-by: Radha Kumari <kumari.radha3@gmail.com>
Co-authored-by: Aestek <thib.gilles@gmail.com>
Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
The router.Manager is already rebalancing servers for other connection pools, so it can call into our resolver to do the same.
This change allows us to remove the serf dependency from resolverBuilder, and remove Datacenter from the config.
Also revert the change to refreshServerRebalanceTimer
Rename GRPCClient to ClientConnPool. This type appears to be more of a
conn pool than a client. The clients receive the connections from this
pool.
Reduce some dependencies by adjusting the interface baoundaries.
Remove the need to create a second slice of Servers, just to pick one and throw the rest away.
Unexport serverResolver, it is not used outside the package.
Use a RWMutex for ServerResolverBuilder, some locking is read-only.
Add more godoc.
In an upcoming change we will need to pass a grpc.ClientConnPool from
BaseDeps into Server. While looking at that change I noticed all of the
existing consulOption fields are already on BaseDeps.
Instead of duplicating the fields, we can create a struct used by
agent/consul, and use that struct in BaseDeps. This allows us to pass
along dependencies without translating them into different
representations.
I also looked at moving all of BaseDeps in agent/consul, however that
created some circular imports. Resolving those cycles wouldn't be too
bad (it was only an error in agent/consul being imported from
cache-types), however this change seems a little better by starting to
introduce some structure to BaseDeps.
This change is also a small step in reducing the scope of Agent.
Also remove some constants that were only used by tests, and move the
relevant comment to where the live configuration is set.
Removed some validation from NewServer and NewClient, as these are not
really runtime errors. They would be code errors, which will cause a
panic anyway, so no reason to handle them specially here.
And into token.Store. This change isolates any awareness of token
persistence in a single place.
It is a small step in allowing Agent.New to accept its dependencies.
This might be better handled by allowing configuration for the InMemSink interval and retail, and disabling
the global. For now this is a smaller change to remove the goroutine leak caused by tests because go-metrics
does not provide any way of shutting down the global goroutine.
With this change, Agent.New() accepts many of the dependencies instead
of creating them in New. Accepting fully constructed dependencies from
a constructor makes the type easier to test, and easier to change.
There are still a number of dependencies created in Start() which can
be addressed in a follow up.