open-nomad/.changelog/18200.txt at 71f8405b2d58bf862b971ea98f25b8228f856298 - luxolus/open-nomad - Stateless Git Forge


Tim Gross 0a19fe3b60 fix multiple overflow errors in exponential backoff (#18200)

We use capped exponential backoff in several places in the code when handling
failures. The code we've copy-and-pasted all over has a check to see if the
backoff is greater than the limit, but this check happens after the bitshift and
we always increment the number of attempts. This causes an overflow with a
fairly small number of failures (e.g., in one place I tested, it occurs after
only 24 iterations), resulting in a negative backoff which then never recovers. The
backoff becomes a tight loop consuming resources and/or DoS'ing a Nomad RPC
handler or an external API such as Vault. Note this doesn't occur in places
where we cap the number of iterations so the loop breaks (usually to return an
error), so long as the number of iterations is reasonable.

Introduce a helper with a check on the cap before the bitshift to avoid overflow
in all places this can occur.

Fixes: #18199
Co-authored-by: stswidwinski <stan.swidwinski@gmail.com>

2023-08-15 14:39:09 -04:00

4 lines

104 B

Plaintext


```release-note:bug
core: Fixed a bug where exponential backoff could result in excessive CPU usage
```