Commit graph

13210 commits

Author SHA1 Message Date
Mahmood Ali 6f9126f475 show Device Stats header in alloc status 2018-11-16 17:34:37 -05:00
Mahmood Ali 00ffd02ced Show stable order of device attributes 2018-11-16 17:34:37 -05:00
Mahmood Ali f139234372 address review comments 2018-11-16 17:13:01 -05:00
Mahmood Ali 33c96a803a tweak whitespace in device stats output 2018-11-16 10:37:39 -05:00
Mahmood Ali 159b8f866a Display device stats in nomad alloc status 2018-11-16 10:26:32 -05:00
Mahmood Ali d88a3f8413 Prepare to reuse device resources printing 2018-11-16 10:26:32 -05:00
Mahmood Ali f72e599ee7 Populate alloc stats API with device stats
This change makes few compromises:

* Looks up the devices associated with tasks at look up time.  Given
that `nomad alloc status` is called rarely generally (compared to stats
telemetry and general job reporting), it seems fine.  However, the
lookup overhead grows bounded by number of `tasks x total-host-devices`,
which can be significant.

* `client.Client` performs the task devices->statistics lookup.  It
passes self to alloc/task runners so they can look up the device statistics
allocated to them.
  * Currently alloc/task runners are responsible for constructing the
entire RPC response for stats
  * The alternatives for making task runners device statistics aware
don't seem appealing (e.g. having task runners contain reference to hostStats)

* On the alloc aggregation resource usage, I did a naive merging of task device statistics.
  * Personally, I question the value of such aggregation, compared to
costs of struct duplication and bloating the response - but opted to be
consistent in the API.
  * With naive concatination, device instances from a single device group used by separate tasks in the alloc, would be aggregated in two separate device group statistics.
2018-11-16 10:26:32 -05:00
Preetha 7873b212af
Merge pull request #4882 from sportebois/f-update-docs-env-stanza-coercion
Change misleading boolean example in env stanza coercion section (#4820)
2018-11-16 08:00:56 -06:00
Michael Schurter 6f3712ed48 gofmt -s -w command/helper_stats_test.go
Fixes the static checks build
2018-11-15 14:14:05 -08:00
Danielle Tomlinson 8bf17fe22d
Merge pull request #4875 from hashicorp/f-constraints
scheduler: Make != constraints more flexible
2018-11-15 11:04:21 -08:00
Mahmood Ali 1ec6f551a6
Merge pull request #4879 from hashicorp/f-node-devices-cli
Show device stats in nomad CLI
2018-11-15 14:02:22 -05:00
Danielle Tomlinson 890bde3482 Add changelog entry for != operator 2018-11-15 11:00:32 -08:00
Danielle Tomlinson 023ef40cd0 docs: Add is_set/is_not_set 2018-11-15 11:00:32 -08:00
Danielle Tomlinson 9c72dafc95 scheduler: Add is_set/is_not_set constraints
This adds constraints for asserting that a given attribute or value
exists, or does not exist. This acts as a companion to =, or !=
operators, e.g:

```hcl
constraint {
        attribute = "${attrs.type}"
        operator  = "!="
        value     = "database"
}

constraint {
        attribute = "${attrs.type}"
        operator  = "is_set"
}
```
2018-11-15 11:00:32 -08:00
Sébastien Portebois bf494cb86c Change misleading boolean example in env stanza coercion section 2018-11-15 13:12:58 -05:00
Mahmood Ali 24b37e0aaf Display StatsObject nested objects as well 2018-11-15 08:09:54 -05:00
Mahmood Ali ee9353fbd6 Use disk display format for devices 2018-11-14 22:13:23 -05:00
Mahmood Ali 0712be643f Print verbose device in nomad node status -stats 2018-11-14 22:13:23 -05:00
Mahmood Ali 93e8fc53f9 device stats summary in node status
Sample output with a mock device:

```
Host Resource Utilization
CPU             Memory          Disk
2651/26400 MHz  9.6 GiB/16 GiB  98 GiB/234 GiB

Device Resource Utilization
nomad/file/mock[README.md]    511 bytes
nomad/file/mock[e2e.go]       239 bytes
nomad/file/mock[e2e_test.go]  128 bytes

Allocations
No allocations placed
```
2018-11-14 22:13:23 -05:00
Mahmood Ali 5e2b31b692
Merge pull request #4872 from hashicorp/f-devices-client-stats
Expose Device Stats in Clients API
2018-11-14 21:18:23 -05:00
Mahmood Ali 1010213db2 format vendor.json 2018-11-14 20:17:11 -05:00
Alex Dadgar 232a78596c
Merge pull request #4874 from lunchbag/jen/update-share-img
Update open graph image
2018-11-14 14:36:44 -08:00
Mahmood Ali 7d3e34fab5 Add NodeResource Device types in api package 2018-11-14 14:42:36 -05:00
Mahmood Ali 046f098bac Track Node Device attributes and serve them in API 2018-11-14 14:42:29 -05:00
Mahmood Ali 63acda956c Add Client Device Stats structs in api package 2018-11-14 14:41:19 -05:00
Mahmood Ali b74ccc742c Expose Device Stats in /client/stats API endpoint 2018-11-14 14:41:19 -05:00
Mahmood Ali c5de71a424 Allow nullable fields in StatValues
In state values, we need to be able to distinguish between zero values
(e.g. `false`) and unset values (e.g. `nil`).

We can alternatively use protobuf `oneOf` and nested map to ensure
consistency of fields that are set together, but the golang
representation does not represent that well and introducing a mismatch
between representations.  Thus, I opted not to use it.
2018-11-14 14:41:19 -05:00
Mahmood Ali 713c9fe683 Move Stat{Object|Value} to plugins/shared/structs
Moving them as they may be useful for other packages/plugins besides
devices.
2018-11-14 09:01:26 -05:00
Mahmood Ali 1f4db08f42 Regenerate proto files with protoc-gen-go@v1.2.0 2018-11-14 09:01:26 -05:00
Mahmood Ali a4a9347501 fix comment typos 2018-11-14 08:36:14 -05:00
Omar Khawaja 0b9f32c6d2
AWS sandbox environment upgrade (#4873)
* upgrade Nomad from 0.8.4 to 0.8.6

* update deprecated nomad and vault commands

* update AMI ID

* add ingress rule for default fabio port and fabio UI

* upgrade Consul and Vault versions

* update AMI ID in README.md and terraform.tfvars
2018-11-13 23:21:01 -05:00
Danielle Tomlinson 0917e93537
Merge pull request #4869 from hashicorp/b-executor-stdout
executor: Fix stdout stderr copy/paste
2018-11-13 19:22:37 -08:00
Jen 78362b5a83 Update open graph image 2018-11-13 17:36:13 -05:00
Danielle Tomlinson e5c641daa9 scheduler: Allow comparisons of nil values
This commit allows the ConstraintChecker to test values that do not exist.
This is useful when wanting to _exclude_ given nodes from executing a
job, for example, if you wanted to give canary nodes an attribute, and
not run critical services on them, you may specify something like the
below, but not want to tag all other nodes with the inverse.

```hcl
constraint {
  attribute = "${node.attr.canary}
  operator = "!="
  value = "1"
}
```

This also requires all constraint checkers to allow for nil target
values, as they will no longer be short circuited by resolving a target.
2018-11-13 13:36:51 -08:00
Mahmood Ali 1e92161f14
Merge pull request #4858 from hashicorp/b-fix-master-20181109
Fix some tests in master
2018-11-13 16:08:26 -05:00
Alex Dadgar 08dc2ea702
Merge pull request #4867 from hashicorp/b-deployment-progress-deadline
Blocked evaluation fixes
2018-11-13 10:29:03 -08:00
Alex Dadgar 17e8446484
Merge pull request #4868 from hashicorp/b-plugin-ctx
Plugin client's handle plugin dying
2018-11-13 10:26:53 -08:00
Mahmood Ali 683a74d153 Ignore apt-get update failures in CI
We run with ~120 apt sources, and apt-get update fails if any of them is
down.

True errors would be raised again at install phase as true dependencies
fetch would fail.
;
2018-11-13 10:21:40 -05:00
Mahmood Ali 356c194acc Use materialized duration fields for driver config 2018-11-13 10:21:40 -05:00
Mahmood Ali 865419e756 convert all config durations to strings in tests 2018-11-13 10:21:40 -05:00
Mahmood Ali 470d20cdf3 Avoid downloading image if present locally 2018-11-13 10:21:40 -05:00
Mahmood Ali ac3b4571eb Address review comments 2018-11-13 10:21:40 -05:00
Mahmood Ali 69f26783e4 avoid setting resource limit on rkt command
Was accidentally modified in 5b14d24bf4626bab420d00783d92bcf25e0b641e .
2018-11-13 10:21:40 -05:00
Mahmood Ali fa146d9b85 fix plugin test 2018-11-13 10:21:40 -05:00
Mahmood Ali 66f4c23848 increase timeout to 30 minutes
nomad/client take very long and exceed 15m sometimes:

In https://travis-ci.org/hashicorp/nomad/jobs/452990197 :

```
panic: test timed out after 15m0s

goroutine 4739 [running]:
testing.(*M).startAlarm.func1()
	/home/travis/.gimme/versions/go1.11.2.linux.amd64/src/testing/testing.go:1296 +0xfd
....
goroutine 4665 [select]:
github.com/hashicorp/nomad/vendor/google.golang.org/grpc.newClientStream.func5(0xc0003dd500, 0xc000420120, 0x2b3f86295588, 0xc000496810)
	/home/travis/gopath/src/github.com/hashicorp/nomad/vendor/google.golang.org/grpc/stream.go:287 +0xd7
created by github.com/hashicorp/nomad/vendor/google.golang.org/grpc.newClientStream
	/home/travis/gopath/src/github.com/hashicorp/nomad/vendor/google.golang.org/grpc/stream.go:286 +0x842
FAIL	github.com/hashicorp/nomad/client/driver	900.036s
```
2018-11-13 10:21:40 -05:00
Mahmood Ali 8fa26f5521 Fix docker log fetching in tests
We no longer use syslog for tracking logs so tracking them explicitly
here
2018-11-13 10:21:40 -05:00
Mahmood Ali 88fa968623 killing should be done with wait client
Incidentally changed in 5b14d24bf4626bab420d00783d92bcf25e0b641e
2018-11-13 10:21:40 -05:00
Mahmood Ali 7690f389a0 Prioritize checking consumer context cancellation
Tests expect that as soon as eventer shuts down immediately on context
cancellations; but golang does not guarantee priority when multiple
pending channels are ready in a select statement.
2018-11-13 10:21:40 -05:00
Mahmood Ali c62ec124c0 Set clean config for mock driver
The default job here contains some exec task config (for setting
command and args) that aren't used for mock driver.  Now, the alloc
runner seems stricter about validating fields and errors on unexpected
fields.

Updating configs in tests so we can have an explicit task config
whenever driver is set explicitly.
2018-11-13 10:21:40 -05:00
Mahmood Ali c7610d8c22 mark and skip failing consul failing tests 2018-11-13 10:21:40 -05:00