Commit graph

53 commits

Author SHA1 Message Date
Mahmood Ali b9e3cde865 tests and some clean up 2020-05-01 13:13:30 -04:00
Charlie Voiselle d8e5e02398 Wiring algorithm to scheduler calls 2020-05-01 13:13:29 -04:00
Michael Schurter 4c5a0cae35 core: fix node reservation scoring
The BinPackIter accounted for node reservations twice when scoring nodes
which could bias scores toward nodes with reservations.

Pseudo-code for previous algorithm:
```
	proposed  = reservedResources + sum(allocsResources)
	available = nodeResources - reservedResources
	score     = 1 - (proposed / available)
```

The node's reserved resources are added to the total resources used by
allocations, and then the node's reserved resources are later
substracted from the node's overall resources.

The new algorithm is:
```
	proposed  = sum(allocResources)
	available = nodeResources - reservedResources
	score     = 1 - (proposed / available)
```

The node's reserved resources are no longer added to the total resources
used by allocations.

My guess as to how this bug happened is that the resource utilization
variable (`util`) is calculated and returned by the `AllocsFit` function
which needs to take reserved resources into account as a basic
feasibility check.

To avoid re-calculating alloc resource usage (because there may be a
large number of allocs), we reused `util` in the `ScoreFit` function.
`ScoreFit` properly accounts for reserved resources by subtracting them
from the node's overall resources. However since `util` _also_ took
reserved resources into account the score would be incorrect.

Prior to the fix the added test output:
```
Node: reserved     Score: 1.0000
Node: reserved2    Score: 1.0000
Node: no-reserved  Score: 0.9741
```

The scores being 1.0 for *both* nodes with reserved resources is a good
hint something is wrong as they should receive different scores. Upon
further inspection the double accounting of reserved resources caused
their scores to be >1.0 and clamped.

After the fix the added test outputs:
```
Node: no-reserved  Score: 0.9741
Node: reserved     Score: 0.9480
Node: reserved2    Score: 0.8717
```
2020-04-15 15:13:30 -07:00
James Rasell f125b5fb2d scaling: ensure min and max int64s are in toplevel of block. 2020-03-24 13:57:15 +00:00
Chris Baker abc7a52f56 finished refactoring state store, schema, etc 2020-03-24 13:57:14 +00:00
Chris Baker 6665d0bfb0 wip: added policy get endpoint, added UUID to policy 2020-03-24 13:55:20 +00:00
Chris Baker 65d92f1fbf WIP: adding ScalingPolicy to api/structs and state store 2020-03-24 13:55:18 +00:00
Alex Dadgar e3cbb2c82e allocs fit checks if devices get oversubscribed 2018-11-07 10:33:22 -08:00
Alex Dadgar 01f8e5b95f renames 2018-10-04 14:57:25 -07:00
Alex Dadgar bac5cb1e8b Scheduler uses allocated resources 2018-10-02 17:08:25 -07:00
Alex Dadgar 99498da6ed Denormalize jobs in plan and ignore resources of terminal allocs
Denormalize jobs in AppendAllocs:
AppendAlloc was originally only ever called for inplace upgrades and new
allocations. Both these code paths would remove the job from the
allocation. Now we use this to also add fields such as FollowupEvalID
which did not normalize the job. This is only a performance enhancement.

Ignore terminal allocs:
Failed allocations are annotated with the followup Eval ID when one is
created to replace the failed allocation. However, in the plan applier,
when we check if allocations fit, these terminal allocations were not
filtered. This could result in the plan being rejected if the node would
be overcommited if the terminal allocations resources were considered.
2018-09-24 13:53:43 -07:00
Preetha Appan 9bc0962527
Track top k nodes by norm score rather than top k nodes per scorer 2018-09-04 16:10:11 -05:00
Preetha Appan 6ed527c636
Use heap to store top K scoring nodes.
Scoring metadata is now aggregated by scorer type to make it easier
to parse when reading it in the CLI.
2018-09-04 16:10:11 -05:00
Preetha Appan 0037d72fa8
Structs and validation for spread 2018-09-04 16:10:11 -05:00
Preetha Appan 9f0caa9c3d
Affinity parsing, api and structs 2018-09-04 16:10:11 -05:00
Alex Dadgar 6dd1c9f49d Refactor 2018-02-15 13:59:00 -08:00
Michael Schurter a66c53d45a Remove structs import from api
Goes a step further and removes structs import from api's tests as well
by moving GenerateUUID to its own package.
2017-09-29 10:36:08 -07:00
Alex Dadgar 9b997d2670 fix multierror merge 2017-09-13 21:48:52 -07:00
Alex Dadgar a2363e7583 sync acls 2017-09-13 11:38:29 -07:00
Armon Dadgar 76a03f2d8e Address @dadgar feedback 2017-09-04 13:05:53 -07:00
Armon Dadgar e7586a80df nomad: Switch from SHA1 to Blake2 @chelseakomlo 2017-09-04 13:05:36 -07:00
Armon Dadgar fc23a4e7e5 structs: sort policies to avoid order dependence for caching 2017-09-04 13:05:36 -07:00
Armon Dadgar 98e0f98f7e structs: Adding ACL compilation helper 2017-09-04 13:05:35 -07:00
Armon Dadgar 583e654246 structs: cache key helper for policy list 2017-09-04 13:05:35 -07:00
Alex Dadgar 1cb877699a Disallow update stanza on batch jobs
This PR:
* disallows update stanzas on batch jobs
* undeprecates the stagger field
* changes the way warnings are returned
2017-07-07 12:11:39 -07:00
Alex Dadgar c77944ed29 assign names 2017-07-07 12:03:11 -07:00
Alex Dadgar b67c40f717 Proper denormalization in optimistic state store 2017-05-01 14:49:57 -07:00
Diptanu Choudhury e927de02d2 Moved functions to helper from structs 2017-01-18 15:55:14 -08:00
Alex Dadgar cfd9593e7a dispatch beginning 2016-11-25 18:04:55 -08:00
Alex Dadgar 54bcde8e36 Dispatch structs 2016-11-23 15:03:13 -08:00
Alex Dadgar aadc9e3017 Add implicit signal constraint and validate that a driver can handle the signal. Also fixes a bug with plan and implicit constraints by adding them to the job being planned 2016-10-20 13:55:35 -07:00
Diptanu Choudhury d94bb45ad3 Added some more comments 2016-08-31 14:06:31 -07:00
Diptanu Choudhury 52e9946da9 Implemented SetPrefferingNodes in stack 2016-08-30 16:17:50 -07:00
Diptanu Choudhury bfee7b30a3 Introducing shared resources in alloc 2016-08-29 13:49:25 -07:00
Diptanu Choudhury e79cb67391 Changing implementation of AllocsFit 2016-08-26 17:28:29 -05:00
Alex Dadgar 94b870a58b Start 2016-08-19 16:40:37 -07:00
Alex Dadgar 9bd9948c5b Job Register endpoint validates token 2016-08-17 16:25:38 -07:00
Alex Dadgar 7d899b6c60 Pass Vault config to client 2016-08-17 16:23:29 -07:00
Alex Dadgar ad92e50a24 Avoid serializes Allocation.Resources 2016-03-01 14:09:25 -08:00
Armon Dadgar 9784bb7285 nomad: cache bitmaps to avoid GC pressure 2016-02-20 12:18:22 -08:00
Alex Dadgar bb36684fec Copy handles empty case 2016-02-11 09:08:20 -08:00
Alex Dadgar 4d7ed4f164 Strip as much copystructure as possible 2016-02-10 17:54:43 -08:00
Armon Dadgar b213462cb4 Change CPU from float64 to int 2015-09-23 11:14:32 -07:00
Armon Dadgar cbc9b6dae2 nomad: thread alloc fit failure reason through 2015-09-13 18:38:11 -07:00
Armon Dadgar 1884296ff8 nomad: remove PortsOvercommited in favor of NetworkIndex 2015-09-13 14:56:51 -07:00
Armon Dadgar 5b6d365709 nomad: removing public field of network 2015-09-12 17:32:08 -07:00
Armon Dadgar 293e44474b nomad: adding helper structs 2015-09-07 15:08:50 -07:00
Armon Dadgar 42f9d4c1b6 nomad: plan supports more than just evict 2015-08-25 16:52:56 -07:00
Armon Dadgar 5b2dc385ec nomad: adding evict state for allocs 2015-08-22 18:27:51 -07:00
Armon Dadgar df21ab3d10 scheduler: working on bin pack 2015-08-13 11:54:59 -07:00