Once a peering is marked for deletion a new leader routine will now
clean up all imported resources and then the peering itself.
A lot of the logic was grabbed from the namespace/partitions deferred
deletions but with a handful of simplifications:
- The rate limiting is not configurable.
- Deleting imported nodes/services/checks is done by deleting nodes with
the Txn API. The services and checks are deleted as a side-effect.
- There is no "round rate limiter" like with namespaces and partitions.
This is because peerings are purely local, and deleting a peering in
the datacenter does not depend on deleting data from other DCs like
with WAN-federated namespaces. All rate limiting is handled by the
Raft rate limiter.
* Fixes a lint warning about t.Errorf not supporting %w
* Enable running autopilot on all servers
On the non-leader servers all they do is update the state and do not attempt any modifications.
* Fix the RPC conn limiting tests
Technically they were relying on racey behavior before. Now they should be reliable.
Many places in consul already treated node names case insensitively.
The state store indexes already do it, but there are a few places that
did a direct byte comparison which have now been corrected.
One place of particular consideration is ensureCheckIfNodeMatches
which is executed during snapshot restore (among other places). If a
node check used a slightly different casing than the casing of the node
during register then the snapshot restore here would deterministically
fail. This has been fixed.
Primary approach:
git grep -i "node.*[!=]=.*node" -- ':!*_test.go' ':!docs'
git grep -i '\[[^]]*member[^]]*\]
git grep -i '\[[^]]*\(member\|name\|node\)[^]]*\]' -- ':!*_test.go' ':!website' ':!ui' ':!agent/proxycfg/testing.go:' ':!*.md'
Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
Co-authored-by: Dan Upton <daniel@floppy.co>
This query has been incorrectly querying by accessor ID since New ACLs
were added. However, the legacy token compat allowed this to continue to
work, since it made a fallback query for the anonymousToken ID.
PR #11184 removed this legacy token query, which means that the query by
accessor ID is now the only check for the anonymous token's existence.
This PR updates the GetBySecret call to use the secret ID of the token.
It seems like this was missing. Previously this was only called by init of ACLs during an upgrade.
Now that legacy ACLs are removed, nothing was calling stop.
Also remove an unused method from client.
TestAgentLeaks_Server was reporting a goroutine leak without this. Not sure if it would actually
be a leak in production or if this is due to the test setup, but seems easy enough to call it
this way until we remove legacyACLTokenUpgrade.
This field has been unnecessary for a while now. It was always set to the same value
as PrimaryDatacenter. So we can remove the duplicate field and use PrimaryDatacenter
directly.
This change was made by GoLand refactor, which did most of the work for me.
The bulk of this commit is moving the LeaderRoutineManager from the agent/consul package into its own package: lib/gort. It also got a renaming and its Start method now requires a context. Requiring that context required updating a whole bunch of other places in the code.
Previously we were inconsistently checking the response for errors. This
PR moves the response-is-error check into raftApply, so that all callers
can look at only the error response, instead of having to know that
errors could come from two places.
This should expose a few more errors that were previously hidden because
in some calls to raftApply we were ignoring the response return value.
Also handle errors more consistently. In some cases we would log the
error before returning it. This can be very confusing because it can
result in the same error being logged multiple times. Instead return
a wrapped error.
This way we only have to wait for the serf barrier to pass once before
we can make use of federation state APIs Without this patch every
restart needs to re-compute the change.
After fixing that bug I uncovered a couple more:
Fix an issue where we might try to cross sign a cert when we never had a valid root.
Fix a potential issue where reconfiguring the CA could cause either the Vault or AWS PCA CA providers to delete resources that are still required by the new incarnation of the CA.
* Consul Service meta wrongly computes and exposes non_voter meta
In Serf Tags, entreprise members being non-voters use the tag
`nonvoter=1`, not `non_voter = false`, so non-voters in members
were wrongly displayed as voter.
Demonstration:
```
consul members -detailed|grep voter
consul20-hk5 10.200.100.110:8301 alive acls=1,build=1.8.4+ent,dc=hk5,expect=3,ft_fs=1,ft_ns=1,id=xxxxxxxx-5629-08f2-3a79-10a1ab3849d5,nonvoter=1,port=8300,raft_vsn=3,role=consul,segment=<all>,use_tls=1,vsn=2,vsn_max=3,vsn_min=2,wan_join_port=8302
```
* Added changelog
* Added changelog entry