When this test flakes, it sometimes fails like this:
--- FAIL: TestCoordinate_Node (1.69s)
panic: interface conversion: interface {} is nil, not structs.Coordinates [recovered]
FAIL github.com/hashicorp/consul/agent 19.999s
Exit code: 1
panic: interface conversion: interface {} is nil, not structs.Coordinates [recovered]
panic: interface conversion: interface {} is nil, not structs.Coordinates
There is definitely a bug lurking, but the code seems to imply this can
only return nil on 404. The tests previously were not checking the
status code.
The underlying cause of the flake is unknown, but this should turn the
failure into a more normal test failure.
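As a rough illustration of the idea (not the actual test code), checking the
status code first and using the two-value form of the type assertion turns an
unexpected nil into an ordinary error. `Coordinates`, `coordinateNode` and
`checkCoordinates` are hypothetical stand-ins, not the real agent types or
handlers.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Coordinates is a hypothetical stand-in for structs.Coordinates.
type Coordinates []string

// coordinateNode is a hypothetical handler: it writes a 404 and returns
// (nil, nil) when nothing is found, otherwise it returns the coordinates.
func coordinateNode(resp http.ResponseWriter, found bool) (interface{}, error) {
	if !found {
		resp.WriteHeader(http.StatusNotFound)
		return nil, nil
	}
	return Coordinates{"node1"}, nil
}

// checkCoordinates verifies the status code before touching the result, then
// uses the comma-ok type assertion so a nil result cannot cause a panic.
func checkCoordinates(found bool) (Coordinates, error) {
	resp := httptest.NewRecorder()
	obj, err := coordinateNode(resp, found)
	if err != nil {
		return nil, err
	}
	if resp.Code != http.StatusOK {
		return nil, fmt.Errorf("unexpected status code: %d", resp.Code)
	}
	coords, ok := obj.(Coordinates)
	if !ok {
		return nil, fmt.Errorf("expected Coordinates, got %T", obj)
	}
	return coords, nil
}

func main() {
	_, err := checkCoordinates(false)
	fmt.Println(err)
}
```

In a real test the two error returns would be assertion failures, so a flake
is reported as a normal failed check instead of a panic.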
When there is a node name conflict, Consul logs messages such as:
`consul.fsm: EnsureRegistration failed: failed inserting node: Error while renaming Node ID: "e1d456bc-f72d-98e5-ebb3-26ae80d785cf": Node name node001 is reserved by node 05f10209-1b9c-b90c-e3e2-059e64556d4a with name node001`
While it is easy to find the node that has reserved the name, it is hard to find
the node trying to acquire the name: since it is not registered, it
is not part of the `consul members` output.
This PR displays the IP of the offender, which makes these issues far easier to solve.
Since FUNCNAME is not defined when running outside a function,
the trap does not work and displays the wrong error message.
Example from https://circleci.com/gh/hashicorp/consul/69506 :
```
⨯ FAIL
/home/circleci/project/test/integration/connect/envoy/run-tests.sh: line 1: FUNCNAME[0]: unbound variable
make: *** [GNUmakefile:363: test-envoy-integ] Error 1
```
This fix will avoid this error message and display the real cause.
The embedded `Server` field on a `DNSServer` is only set inside of the
`ListenAndServe` method. If that method fails, for example because the
address is already in use and cannot be bound, then the `Server` field will
not be set and the overall `Agent.Start()` will fail.
This will trigger the inner loop of `TestAgent.Start()` to invoke
`ShutdownEndpoints` which will attempt to pretty print the DNS servers
using fields on that inner `Server` field. Because it was never set,
this causes a nil pointer dereference and crashes the test.
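A minimal sketch of the kind of guard that avoids the crash, assuming a
simplified layout: `server` stands in for the embedded DNS library server, and
`describe` is a hypothetical helper, not the function used in the agent.

```go
package main

import "fmt"

// server is a hypothetical stand-in for the embedded DNS library server.
type server struct {
	Net  string
	Addr string
}

// DNSServer mirrors the shape described above: the embedded pointer is only
// populated once ListenAndServe has run successfully.
type DNSServer struct {
	*server
}

// describe returns a printable label and tolerates a server whose embedded
// pointer was never set, instead of dereferencing nil.
func describe(s *DNSServer) string {
	if s == nil || s.server == nil {
		return "DNS server (not started)"
	}
	return fmt.Sprintf("DNS server (%s) %s", s.Net, s.Addr)
}

func main() {
	fmt.Println(describe(&DNSServer{}))
	fmt.Println(describe(&DNSServer{server: &server{Net: "udp", Addr: "127.0.0.1:8600"}}))
}
```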
* website: Update middleman-hashicorp container and Gemfile.lock
Time marches on, and so do security vulnerabilities in Nokogiri. So it's time
for a new container.
As with last time, here's a reminder for the next person who needs to update
this:
- You shouldn't just update the dependency in Gemfile.lock, because your build
times will go to heck as you compile Nokogiri from source on every run. So you
need an updated container with all the dependencies.
- To update the container, you need to push a new tag to the middleman-hashicorp
repo. Teamcity does the rest, and will ship a new container to Docker Hub
(unless its credentials are out of date, in which case go ask team-eng-serv.)
- Once that's pushed:
- Update Makefile
- Update the Gemfile
- Delete Gemfile.lock
- `make website` until it comes up, then ctrl-C
- Commit the changes
* website: Specify a different json version in Gemfile.lock
The Consul website uses different containers for preview and deploy, and this
oddball JSON version was causing issues. This commit sacrifices a little bit
of preview startup speed for (hopefully) working deploys.
Previously `verify_incoming` was required when turning on `auto_encrypt.allow_tls`, but that doesn't work together with the HTTPS UI in some scenarios. This change adds `verify_incoming_rpc` to the allowed configurations.
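The relaxed rule could look roughly like the sketch below. The `tlsConfig`
struct, the `validateAutoEncrypt` function and the error text are assumptions
made for illustration; the real config builder is structured differently.

```go
package main

import (
	"errors"
	"fmt"
)

// tlsConfig is a hypothetical, flattened view of the relevant settings.
type tlsConfig struct {
	AutoEncryptAllowTLS bool // auto_encrypt.allow_tls
	VerifyIncoming      bool // verify_incoming
	VerifyIncomingRPC   bool // verify_incoming_rpc
}

// validateAutoEncrypt accepts either verify_incoming or verify_incoming_rpc
// when auto_encrypt.allow_tls is enabled.
func validateAutoEncrypt(c tlsConfig) error {
	if c.AutoEncryptAllowTLS && !c.VerifyIncoming && !c.VerifyIncomingRPC {
		return errors.New("auto_encrypt.allow_tls requires verify_incoming or verify_incoming_rpc")
	}
	return nil
}

func main() {
	// Only the RPC check is enabled, which leaves the HTTPS UI unaffected.
	fmt.Println(validateAutoEncrypt(tlsConfig{AutoEncryptAllowTLS: true, VerifyIncomingRPC: true}))
	fmt.Println(validateAutoEncrypt(tlsConfig{AutoEncryptAllowTLS: true}))
}
```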
AutoEncrypt needs the server-port because it talks to the servers via RPC. Information from gossip might not be available at that point, which is why the server-port is used.
- Bootstrap escape hatches are OK.
- Public listener/cluster escape hatches are OK.
- Upstream listener/cluster escape hatches are not supported.
If an unsupported escape hatch is configured and the discovery chain is
activated, log a warning and act as if it was not configured.
Fixes #6160
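The upstream rule above can be sketched as follows. The `upstreamConfig` type,
its field names and `effectiveUpstreamConfig` are hypothetical and only
illustrate the "warn and ignore" behaviour, not the actual xDS code.

```go
package main

import "fmt"

// upstreamConfig is a hypothetical holder for the upstream escape hatches.
type upstreamConfig struct {
	ListenerJSON string
	ClusterJSON  string
}

// effectiveUpstreamConfig drops the escape hatches and reports a warning when
// the discovery chain is active; otherwise it returns the config unchanged.
func effectiveUpstreamConfig(cfg upstreamConfig, chainActive bool, warn func(string)) upstreamConfig {
	if chainActive && (cfg.ListenerJSON != "" || cfg.ClusterJSON != "") {
		warn("ignoring upstream escape hatch settings because the discovery chain is active")
		return upstreamConfig{}
	}
	return cfg
}

func main() {
	cfg := upstreamConfig{ClusterJSON: `{"name": "custom"}`}
	out := effectiveUpstreamConfig(cfg, true, func(msg string) { fmt.Println("[WARN]", msg) })
	fmt.Printf("%+v\n", out)
}
```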