The current retry framework in testutil/testprc.WaitForResult uses
a func() (bool, error) callback until it succeeds or times out.
It captures the last error and returns it.
if err := testutil.WaitForResult(t, func() (bool, error) {
if err := foo(); err != nil {
return false, err
}
...
return true, nil
}); err != nil {
t.Fatal(err)
}
This makes the test functions more complex than they need to be since
both the boolean and the error indicate a success or a failure.
The retry.Run framework uses a an approach similar to t.Run()
from the testing framework.
retry.Run(t, func(r *retry.R) {
if err := foo(); err != nil {
r.Fatal(err)
}
})
The behavior of the Run function is configurable so that different
timeouts can be used for different tests.
`AddAccessibleService` works just like `AddService` but also passing
"address" and "port". It is helpfu when you need to prepare a
fakeService that will be accessed later in target source code.
This patch removes duplicate internal copies of constants in the structs
package which are also defined in the api package. The api.KVOp type
with all its values for the TXN endpoint and the api.HealthXXX constants
are now used throughout the codebase.
This resulted in some circular dependencies in the testutil package
which have been resolved by copying code and constants and moving the
WaitForLeader function into a separate testrpc package.
Ended up removing the leader_test.go server address change test as part
of this. The join was failing becase we were using a new node name with
the new logic here, but realized this was hitting some of the memberlist
conflict logic and not working as we expected. We need some additional
work to fully support address changes, so removed the test for now.
There's likely a race (related to https://github.com/hashicorp/consul/issues/2644) where the catalog update might be in but the leader tracking doesn't report a leader, so this blocks forever and then times out. As a workaround we can lower the query wait time to always allow for a few retries.
The last change here made the time overall theoretically the same, but the overhead of running so quickly before probably meant that we were spending longer. Tests seemed marginal in Travis so doubling this to see how things go.
This bit me on CI. The current behavior of the testutil server is to skip if consul isn't present. When lots of output is scrolling by, you're likely to miss the message that the test was skipped. Instead, I propose that we hard fatal if consul doesn't exist, and upstream consumers can skip the tests if they want.
The testutil server uses an atomic incrementer to generate unique port
numbers. This works great until tests are run in parallel, _across
packages_. Because each package starts at the same "offset" idx, they
collide.
One way to overcome this is to run each packages' test in isolation, but
that makes the test suite much longer as it does not maximize
parallelization. Alternatively, instead of having "predictable" ports,
we can let the OS choose a random open port automatically.
This still has a (albeit smaller) race condition in that the OS could
return an open port twice, before the server has a chance to actually
start and occupy said port. In practice, I have not been able to hit
this race condition, so it either doesn't happen or it happens far less
frequently that the existing implementation.
I'm not sure how I feel about the panic, but this is just test code, so
I'm including to say it's okay?
I've removed unneeded conversions by performing the following commands:
$ go get -u github.com/mdempsky/unconvert
$ go list ./... | grep -v vendor | xargs unconvert -apply
Signed-off-by: Miquel Sabaté Solà <msabate@suse.com>
Two of the changes are in tests; the one of consequence is in the API.
As explained in #1308 this can cause conflicts with downstream programs.
Fixes#1308.