4c5a0cae35
The BinPackIter accounted for node reservations twice when scoring nodes which could bias scores toward nodes with reservations. Pseudo-code for previous algorithm: ``` proposed = reservedResources + sum(allocsResources) available = nodeResources - reservedResources score = 1 - (proposed / available) ``` The node's reserved resources are added to the total resources used by allocations, and then the node's reserved resources are later substracted from the node's overall resources. The new algorithm is: ``` proposed = sum(allocResources) available = nodeResources - reservedResources score = 1 - (proposed / available) ``` The node's reserved resources are no longer added to the total resources used by allocations. My guess as to how this bug happened is that the resource utilization variable (`util`) is calculated and returned by the `AllocsFit` function which needs to take reserved resources into account as a basic feasibility check. To avoid re-calculating alloc resource usage (because there may be a large number of allocs), we reused `util` in the `ScoreFit` function. `ScoreFit` properly accounts for reserved resources by subtracting them from the node's overall resources. However since `util` _also_ took reserved resources into account the score would be incorrect. Prior to the fix the added test output: ``` Node: reserved Score: 1.0000 Node: reserved2 Score: 1.0000 Node: no-reserved Score: 0.9741 ``` The scores being 1.0 for *both* nodes with reserved resources is a good hint something is wrong as they should receive different scores. Upon further inspection the double accounting of reserved resources caused their scores to be >1.0 and clamped. After the fix the added test outputs: ``` Node: no-reserved Score: 0.9741 Node: reserved Score: 0.9480 Node: reserved2 Score: 0.8717 ``` |
||
---|---|---|
.. | ||
config | ||
batch_future.go | ||
batch_future_test.go | ||
bitmap.go | ||
bitmap_test.go | ||
csi.go | ||
csi_test.go | ||
devices.go | ||
devices_test.go | ||
diff.go | ||
diff_test.go | ||
errors.go | ||
errors_test.go | ||
funcs.go | ||
funcs_test.go | ||
generate.sh | ||
network.go | ||
network_test.go | ||
node.go | ||
node_class.go | ||
node_class_test.go | ||
node_test.go | ||
operator.go | ||
service_identities.go | ||
services.go | ||
services_test.go | ||
streaming_rpc.go | ||
structs.go | ||
structs_codegen.go | ||
structs_test.go | ||
testing.go | ||
volumes.go |