Previously we believe it was necessary for all code that required ports
to use freeport to prevent conflicts.
https://github.com/dnephin/freeport-test shows that it is actually save
to use port 0 (`127.0.0.1:0`) as long as it is passed directly to
`net.Listen`, and the listener holds the port for as long as it is
needed.
This works because freeport explicitly avoids the ephemeral port range,
and port 0 always uses that range. As you can see from the test output
of https://github.com/dnephin/freeport-test, the two systems never use
overlapping ports.
This commit converts all uses of freeport that were being passed
directly to a net.Listen to use port 0 instead. This allows us to remove
a bit of wrapping we had around httptest, in a couple places.
Some global variables are patched to shorter values in these tests. But the goroutines that read
them can outlive the test because nothing waited for them to exit.
This commit adds a Wait() method to the routine manager, so that tests can wait for the goroutines
to exit. This prevents the data race because the 'reset to original value' can happen
after all other goroutines have stopped.
The bulk of this commit is moving the LeaderRoutineManager from the agent/consul package into its own package: lib/gort. It also got a renaming and its Start method now requires a context. Requiring that context required updating a whole bunch of other places in the code.
I changed this in https://github.com/hashicorp/consul/pull/8802#pullrequestreview-500779357 because
exceeding the MaxWait seemed wrong, but as other have pointed out, that behaviour is probably correct.
When multiple waiters hit the max value, we don't want them to converge, so restore the behaviour of
allowing jitter to exceed max, and document it.
Add a skip condition to all tests slower than 100ms.
This change was made using `gotestsum tool slowest` with data from the
last 3 CI runs of master.
See https://github.com/gotestyourself/gotestsum#finding-and-skipping-slow-tests
With this change:
```
$ time go test -count=1 -short ./agent
ok github.com/hashicorp/consul/agent 0.743s
real 0m4.791s
$ time go test -count=1 -short ./agent/consul
ok github.com/hashicorp/consul/agent/consul 4.229s
real 0m8.769s
```
Prepopulate was setting entry.Expiry.HeapIndex to 0. Previously this would result in a call to heap.Fix(0)
which wasn't correct, but was also not really a problem because at worse it would re-notify.
With the recent change to extract cachettl it was changed to call Update(idx), which would have updated
the wrong entry.
A previous commit removed the setting of entry.Expiry so that the HeapIndex would be reported
as -1, and this commit adds a test and handles the -1 heap index.
This allows for client agent to be run in a more stateless manner where they may be abruptly terminated and not expected to come back. If advertising a per-agent reconnect timeout using the advertise_reconnect_timeout configuration when that agent leaves, other agents will wait only that amount of time for the agent to come back before reaping it.
This has the advantageous side effect of causing servers to deregister the node/services/checks for that agent sooner than if the global reconnect_timeout was used.
Reduce Jitter to one function
Rename NewRetryWaiter
Fix a bug in calculateWait where maxWait was applied before jitter, which would make it
possible to wait longer than maxWait.
The fields are only ever read by Waiter, and
setting the fields makes the calling code read much better without
having to create a bunch of constants that only ever get used once.
IsErrEOF returns false when it should return true in a couple of cases:
1. if the error has been wrapped in another error (for example, if EOF
is wrapped in an RPC error)
2. if the error has been created from an Error field in an RPC response
(as it is the case in CallWithCodec in the net-rpc-msgpackrpc package
for example)
This might be better handled by allowing configuration for the InMemSink interval and retail, and disabling
the global. For now this is a smaller change to remove the goroutine leak caused by tests because go-metrics
does not provide any way of shutting down the global goroutine.
Also handle []interface{} in HookWeakDecodeFromSlice
Without this change only the top level []map[string]interface{} will be
unpacked as a single item. With this change any nested config will be
unpacked.
Currently opaque config blocks (config entries, and CA provider config) are
modified by PatchSliceOfMaps, making it impossible for these opaque
config sections to contain slices of maps.
In order to fix this problem, any lazy-decoding of these blocks needs to support
weak decoding of []map[string]interface{} to a struct type before
PatchSliceOfMaps is replaces. This is necessary because these config
blobs are persisted, and during an upgrade an older version of Consul
could read one of the new configuration values, which would cause an error.
To support the upgrade path, this commit first introduces the new hooks
for weak decoding of []map[string]interface{} and uses them only in the
lazy-decode paths. That way, in a future release, new style
configuration will be supported by the older version of Consul.
This decode hook has a number of advantages:
1. It no longer panics. It allows mapstructure to report the error
2. It no longer requires the user to declare which fields are slices of
structs. It can deduce that information from the 'to' value.
3. It will make it possible to preserve opaque configuration, allowing
for structured opaque config.
* testing: replace most goe/verify.Values with require.Equal
One difference between these two comparisons is that go/verify considers
nil slices/maps to be equal to empty slices/maps, where as testify/require
does not, and does not appear to provide any way to enable that behaviour.
Because of this difference some expected values were changed from empty
slices to nil slices, and some calls to verify.Values were left.
* Remove github.com/pascaldekloe/goe/verify
Reduce the number of assertion packages we use from 2 to 1