Commit Graph

580 Commits

Author SHA1 Message Date
Arshneet Singh d4e7a5c005 Add comments to functions, and use require instead of assert 2019-04-23 09:57:21 -07:00
Arshneet Singh 4cf4324b8f Remove allowPlanOptimization from schedulers 2019-04-23 09:18:02 -07:00
Arshneet Singh 0dd4c109e8 Compat tags 2019-04-23 09:18:01 -07:00
Arshneet Singh 65f5fab131 Add tests for plan normalization 2019-04-23 09:18:01 -07:00
Arshneet Singh b977748a4b Add code for plan normalization 2019-04-23 09:18:01 -07:00
Danielle 198a838b61
Merge pull request #5512 from hashicorp/dani/f-alloc-stop
alloc-lifecycle: nomad alloc stop
2019-04-23 13:05:08 +02:00
Danielle Lancashire 832f607433 allocs: Add nomad alloc stop
This adds a `nomad alloc stop` command that can be used to stop and
force migrate an allocation to a different node.

This is built on top of the AllocUpdateDesiredTransitionRequest and
explicitly limits the scope of access to that transition to expose it
under the alloc-lifecycle ACL.

The API returns the follow up eval that can be used as part of
monitoring in the CLI or parsed and used in an external tool.
2019-04-23 12:50:23 +02:00
Preetha Appan bcb5c8c70d
remove stray new line 2019-04-12 10:32:48 -05:00
Preetha Appan 8ddc076c1d
Refactor scheduler package to enable preemption for batch/service jobs 2019-04-10 20:24:01 -05:00
James Rasell 9470507cf4
Add NodeName to the alloc/job status outputs.
Currently when operators need to log onto a machine where an alloc
is running they will need to perform both an alloc/job status
call and then a call to discover the node name from the node list.

This updates both the job status and alloc status output to include
the node name within the information to make operator use easier.

Closes #2359
Cloess #1180
2019-04-10 10:34:10 -05:00
Preetha Appan da1ce9bcea
Fix bug where scoring metadata would be overridden during an inplace upgrade. 2019-03-12 23:36:46 -05:00
Alex Dadgar 41265d4d61 Change types of weights on spread/affinity 2019-01-30 12:20:38 -08:00
Nick Ethier 24cbf42798
scheduler: fix NPE when deployment is nil, but placement is a canary 2019-01-28 20:22:59 -06:00
Alex Dadgar 5198ff05c3 convert driver to device for device constraint/attributes 2019-01-23 10:58:45 -08:00
Alex Dadgar 4bdccab550 goimports 2019-01-22 15:44:31 -08:00
Preetha Appan 3b054d6135
Remove unnecessary usage of alloc.Resource 2019-01-10 16:36:47 -06:00
Mahmood Ali 0dfa93a3c1 appease linter 2019-01-08 10:58:49 -05:00
Alex Dadgar 8a35d7b1dd Test recovery 2019-01-07 14:49:41 -08:00
Preetha f406e66ab8
Merge pull request #4881 from hashicorp/f-device-preemption
Device preemption
2018-12-11 18:34:19 -06:00
Preetha Appan 977a4a540d
Early continue after meeting needed count
Also adds another optimization that filters out un-needed allocations
as a final filtering step
2018-12-11 10:12:18 -06:00
Preetha Appan f60c52c8ba
Score combinations of allocs from multiple devices for preemption 2018-12-07 18:35:47 -06:00
Alex Dadgar 1e3c3cb287 Deprecate IOPS
IOPS have been modelled as a resource since Nomad 0.1 but has never
actually been detected and there is no plan in the short term to add
detection. This is because IOPS is a bit simplistic of a unit to define
the performance requirements from the underlying storage system. In its
current state it adds unnecessary confusion and can be removed without
impacting any users. This PR leaves IOPS defined at the jobspec parsing
level and in the api/ resources since these are the two public uses of
the field. These should be considered deprecated and only exist to allow
users to stop using them during the Nomad 0.9.x release. In the future,
there should be no expectation that the field will exist.
2018-12-06 15:09:26 -08:00
Preetha Appan 63681fac0c
use structured logging everywhere consistently 2018-12-03 08:31:41 -06:00
Preetha Appan 766820def3
addresses some code clarity review comments 2018-11-27 11:02:06 -06:00
Mahmood Ali 96ffe044e7
Simplify map count update logic
Co-Authored-By: preetapan <preetha@hashicorp.com>
2018-11-27 10:03:11 -06:00
Mahmood Ali 57b94c2d50
code review suggestion
Co-Authored-By: preetapan <preetha@hashicorp.com>
2018-11-27 09:59:57 -06:00
Preetha Appan 86f416a984
Fix formatting 2018-11-16 20:45:52 -06:00
Preetha Appan 8efe6171e4
Fix preemption logic bug, need to group allocations by device first.
This ensures that the set of allocations chosen for preemption all share
the same device where ID is <vendor/type/device>
2018-11-16 20:32:10 -06:00
Danielle Tomlinson 9c72dafc95 scheduler: Add is_set/is_not_set constraints
This adds constraints for asserting that a given attribute or value
exists, or does not exist. This acts as a companion to =, or !=
operators, e.g:

```hcl
constraint {
        attribute = "${attrs.type}"
        operator  = "!="
        value     = "database"
}

constraint {
        attribute = "${attrs.type}"
        operator  = "is_set"
}
```
2018-11-15 11:00:32 -08:00
Preetha Appan 998968f57a
fix linting 2018-11-15 12:27:32 -06:00
Preetha Appan e5de50fba8
Initial implementation of device preemption 2018-11-15 11:09:26 -06:00
Danielle Tomlinson e5c641daa9 scheduler: Allow comparisons of nil values
This commit allows the ConstraintChecker to test values that do not exist.
This is useful when wanting to _exclude_ given nodes from executing a
job, for example, if you wanted to give canary nodes an attribute, and
not run critical services on them, you may specify something like the
below, but not want to tag all other nodes with the inverse.

```hcl
constraint {
  attribute = "${node.attr.canary}
  operator = "!="
  value = "1"
}
```

This also requires all constraint checkers to allow for nil target
values, as they will no longer be short circuited by resolving a target.
2018-11-13 13:36:51 -08:00
Alex Dadgar 08dc2ea702
Merge pull request #4867 from hashicorp/b-deployment-progress-deadline
Blocked evaluation fixes
2018-11-13 10:29:03 -08:00
Preetha Appan de890b9d5c
blank line 2018-11-12 15:50:14 -06:00
Preetha Appan 20af09a1ef
Fix logic bug in tracking sum of matched affinity weights
We need to track the sum of matching weights per device, but only
change the final return value if its the highest scoring choice
2018-11-12 15:06:45 -06:00
Preetha Appan 285b9b6001
Normalize scores correctly 2018-11-08 17:01:58 -06:00
Preetha Appan f20f2ca8e9
Fixes device scheduling unit tests
Also changes the logic for score when there is more than one task
requesting a device. Since inter task affinities are already normalized,
we take the average of the scores across tasks.
2018-11-08 10:31:19 -06:00
Alex Dadgar dbb05357bc fix test 2018-11-07 11:59:24 -08:00
Alex Dadgar a7ca737fb6 review comments 2018-11-07 11:31:52 -08:00
Alex Dadgar 36abd3a3d8 review comments 2018-11-07 10:33:22 -08:00
Alex Dadgar e3cbb2c82e allocs fit checks if devices get oversubscribed 2018-11-07 10:33:22 -08:00
Alex Dadgar 4f9b3ede87 Split device accounter and allocator 2018-11-07 10:32:03 -08:00
Alex Dadgar 6fa893c801 affinities 2018-11-07 10:32:03 -08:00
Alex Dadgar feb83a2be3 assign devices 2018-11-07 10:32:03 -08:00
Alex Dadgar 6d8bb3a7bd Duplicate blocked evals cancelling improved
The old logic for cancelling duplicate blocked evaluations by job id had
the issue where the newer evaluation could have additional node classes
that it is (in)eligible for that we would not capture. This could make
it such that cluster state could change such that the job would make
progress but no evaluation was unblocked.
2018-11-07 10:08:23 -08:00
Preetha Appan a6b714b81c
update preemption tests to use new node resource structs
also includes a fix to remove unnecessary subtraction of network mbits
2018-11-02 17:59:53 -05:00
Preetha b2b52b1ada
Merge pull request #4794 from hashicorp/f-preemption-systemjobs
Preemption for system jobs
2018-11-02 16:28:06 -05:00
Preetha Appan 56de32f363
Address more minor code review feedback 2018-11-02 16:26:34 -05:00
Preetha Appan 253a351532
Fix test setup 2018-11-02 16:06:25 -05:00
Preetha Appan fba24e5a8a
dereference safely 2018-11-02 15:58:59 -05:00