10 minutes is the default blocking query timeout. Using the same value results in us hitting
the expired cache entry bug frequently. By extending this TTL we at least mitigate the problem.
The underlying bug still needs to be fixed.
The materializer is often reset when an error is received. By resetting
the retryWaiter we effectively never wait. The retryWaiter should only
be reset when we get an event without error. This is done in
Materializer.updateView().
Streaming can not be used for these queries because the near query
paramter indicates a specific sort of the results, and that sort
requires data that is not available to the client from the streaming
API.
This fixes an issue where leaf certificates issued in primary
datacenters using Vault as a Connect CA would be reissued very
frequently (every ~20 seconds) because the logic meant to detect root
rotation was errantly triggering.
The hash of the rootCA was being compared against a hash of the
intermediateCA and always failing. This doesn't apply to the Consul
built-in CA provider because there is no intermediate in use in the
primary DC.
This is reminiscent of #6513
In a situation where the mesh gateway is configured to bind to multiple
network interfaces, we use a feature called 'tagged addresses'.
Sometimes an address is duplicated across multiple tags such as 'lan'
and 'lan_ipv4'.
There is code to deduplicate these things when creating envoy listeners,
but that code doesn't ensure that the same tag wins every time. If the
winning tag flaps between xDS discovery requests it will cause the
listener to be drained and replaced.
An old PR (#7623) was merged after #9585. The old code was incompatible with the new
changes, but none of the lines caused a git conflict so the merge was allowed.
The incompatible changes caused the tests to fail. This fixes the old code to
work with the new changes.
registerSchema creates some indirection which is not necessary in this
case. newDBSchema can call each of the tables.
Enterprise tables can be added from the existing withEnterpriseSchema
shim.
Deadlock scenario:
1. Due to scheduling, the state runner sends one snapshot into
snapCh and then attempts to send a second. The first send succeeds
because the channel is buffered, but the second blocks.
2. Separately, Manager.Watch is called by the xDS server after
getting a discovery request from Envoy. This function acquires the
manager lock and then blocks on receiving the CurrentSnapshot from
the state runner.
3. Separately, there is a Manager goroutine that reads the snapshots
from the channel in step 1. These reads are done to notify proxy
watchers, but they require holding the manager lock. This goroutine
goes to acquire that lock, but can't because it is held by step 2.
Now, the goroutine from step 3 is waiting on the one from step 2 to
release the lock. The goroutine from step 2 won't release the lock until
the goroutine in step 1 advances. But the goroutine in step 1 is waiting
for the one in step 3. Deadlock.
By making this send non-blocking step 1 above can proceed. The coalesce
timer will be reset and a new valid snapshot will be delivered after it
elapses or when one is requested by xDS.