open-consul/.changelog/19866.txt
hc-github-team-consul-core 48b9a0ff8c
Backport of Fix xDS missing endpoint race condition. into release/1.16.x (#19873)
Fix xDS missing endpoint race condition.

This fixes the following race condition:
- Send update endpoints
- Send update cluster
- Recv ACK endpoints
- Recv ACK cluster

Prior to this fix, it would have resulted in the endpoints NOT existing in
Envoy. This occurred because the cluster update implicitly clears the endpoints
in Envoy, but we would never re-send the endpoint data to compensate for the
loss, because we would incorrectly ACK the invalid old endpoint hash. Since the
endpoint's hash did not actually change, they would not be resent.

The fix for this is to effectively clear out the invalid pending ACKs for child
resources whenever the parent changes. This ensures that we do not store the
child's hash as accepted when the race occurs.

An escape-hatch environment variable `XDS_PROTOCOL_LEGACY_CHILD_RESEND` was
added so that users can revert back to the old legacy behavior in the event
that this produces unknown side-effects.

This bug report and fix was mostly implemented by @ksmiley with some minor
tweaks.

Co-authored-by: Derek Menteer <derek.menteer@hashicorp.com>
Co-authored-by: Keith Smiley <ksmiley@salesforce.com>
2023-12-08 12:16:43 -06:00

4 lines
147 B
Plaintext

```release-note:bug
xds: ensure child resources are re-sent to Envoy when the parent is updated even if the child already has pending updates.
```