open-nomad

Commit Graph

Author	SHA1	Message	Date
Tim Gross	5fc63ace0b	scheduler: count implicit spread targets as a single target (#17195) When calculating the score in the `SpreadIterator`, the score boost is proportional to the difference between the current and desired count. But when there are implicit spread targets, the current count is the sum of the possible implicit targets, which results in incorrect scoring unless there's only one implicit target. This changeset updates the `propertySet` struct to accept a set of explicit target values so it can detect when a property value falls into the implicit set and should be combined with other implicit values. Fixes: #11823	2023-05-17 10:25:00 -04:00
Tim Gross	2426aae832	scheduler: prevent -Inf in spread scoring (#17198) When spread targets have a percent value of zero it's possible for them to return -Inf scoring because of a float divide by zero. This is very hard for operators to debug because the string "-Inf" is returned in the API and that breaks the presentation of debugging data. Most scoring iterators are bracketed to -1/+1, but spread iterators do not so that they can handle greatly unbalanced scoring so we can't simply return a -1 score without generating a score that might be greater than the negative scores set by other spread targets. Instead, track the lowest-seen spread boost and use that as the spread boost for any cases where we'd divide by zero. Fixes: #8863	2023-05-16 16:01:32 -04:00
hashicorp-copywrite[bot]	005636afa0	[COMPLIANCE] Add Copyright and License Headers	2023-04-10 15:36:59 +00:00
Piotr Kazmierczak	14b53df3b6	renamed stanza to block for consistency with other projects (#15941)	2023-01-30 15:48:43 +01:00
Tim Gross	74486d86fb	scheduler: prevent panic in spread iterator during alloc stop The spread iterator can panic when processing an evaluation, resulting in an unrecoverable state in the cluster. Whenever a panicked server restarts and quorum is restored, the next server to dequeue the evaluation will panic. To trigger this state: * The job must have `max_parallel = 0` and a `canary >= 1`. * The job must not have a `spread` block. * The job must have a previous version. * The previous version must have a `spread` block and at least one failed allocation. In this scenario, the desired changes include `(place 1+) (stop 1+), (ignore n) (canary 1)`. Before the scheduler can place the canary allocation, it tries to find out which allocations can be stopped. This passes back through the stack so that we can determine previous-node penalties, etc. We call `SetJob` on the stack with the previous version of the job, which will include assessing the `spread` block (even though the results are unused). The task group spread info state from that pass through the spread iterator is not reset when we call `SetJob` again. When the new job version iterates over the `groupPropertySets`, it will get an empty `spreadAttributeMap`, resulting in an unexpected nil pointer dereference. This changeset resets the spread iterator internal state when setting the job, logging with a bypass around the bug in case we hit similar cases, and a test that panics the scheduler without the patch.	2022-02-09 19:53:06 -05:00
Preetha Appan	afff27b69b	More error->debug for logging in the bin packing iterator	2019-12-12 15:50:16 -06:00
Preetha Appan	374eee421f	Fix comment and assert score in test case	2019-05-15 12:35:57 -05:00
Nick Ethier	f0b9f8e37a	fix missing brace	2019-05-15 13:02:04 -04:00
Nick Ethier	0d851b5d11	scheduler: add check to prohibit returning inf during spread boost calculation	2019-05-15 13:00:24 -04:00
Alex Dadgar	41265d4d61	Change types of weights on spread/affinity	2019-01-30 12:20:38 -08:00
Alex Dadgar	3c19d01d7a	server	2018-09-15 16:23:13 -07:00
Preetha Appan	9bc0962527	Track top k nodes by norm score rather than top k nodes per scorer	2018-09-04 16:10:11 -05:00
Preetha Appan	65cf4373b3	fix linting error	2018-09-04 16:10:11 -05:00
Preetha Appan	dd5fe6373f	Fix scoring logic for uneven spread to incorporate current alloc count Also addressed other small code review comments	2018-09-04 16:10:11 -05:00
Preetha Appan	e72c0fe527	more cleanup	2018-09-04 16:10:11 -05:00
Preetha Appan	4c624424e6	added some unit tests for -1 spread score	2018-09-04 16:10:11 -05:00
Preetha Appan	92d37acc2a	comment and formatting cleanup	2018-09-04 16:10:11 -05:00
Preetha Appan	7b0a27cad6	fix scoring algorithm when min count == current count	2018-09-04 16:10:11 -05:00
Preetha Appan	bad075f640	Remove hardcoded boosts for even spread. instead, calculate them based on delta between current and minimum value	2018-09-04 16:10:11 -05:00
Preetha Appan	c56873ff37	Implement support for even spread across datacenters, with unit test	2018-09-04 16:10:11 -05:00
Preetha Appan	d091c00dd3	Support implicit spread target to account for remaining desired counts	2018-09-04 16:10:11 -05:00
Preetha Appan	33779abe5f	fix comments	2018-09-04 16:10:11 -05:00
Preetha Appan	55f276c189	Include spreads configured at job level when precomputing weights/desired counts.	2018-09-04 16:10:11 -05:00
Preetha Appan	db0d95b09c	Implement spread iterator that scores according to percentage of desired count in each target. Added this as a new step in the stack and some unit tests	2018-09-04 16:10:11 -05:00

24 Commits