12652 Commits

Author SHA1 Message Date
Chris Baker d7aa8e6285 website testing: swapped out trap in website testing script for a simpler error catch 2018-10-09 16:06:12 -04:00
Chris Baker 6cd30be1e7 website testing: minor formatting changes, passing in middleman version from outer Makefile 2018-10-09 15:28:46 -04:00
Chris Baker e1062b7467 docs: broke docs testing out into its own travis stage 2018-10-08 15:11:46 -04:00
Chris Baker e2dc9b33ed docs: modified docs wget testing to be a little less verbose 2018-10-08 15:02:15 -04:00
Chris Baker daa363aaf3 docs: fixed incorrect env var in travis used to enable website testing 2018-10-08 14:54:40 -04:00
Chris Baker bd26f885ce docs: added docs website tests using a simple wget --recursive 2018-10-08 14:44:23 -04:00
Chris Baker 7326a51021 docs: changed localhost URLs for local UI from hyperlinks to pre-text, they were interfering with bad link detection 2018-10-08 14:42:33 -04:00
Chris Baker 6172eb92ca docs: fixed broken links to schedulers page 2018-10-08 13:23:14 -04:00
Alex Dadgar 0183fb4e5c nvidia package restructure + build non-linux 2018-10-05 13:56:04 -07:00
Omar Khawaja b833e4e8f6 editing monitoring.html (#4754) 2018-10-04 18:40:13 -04:00
Omar Khawaja ceb2eed3e5 editing lb guide (#4753) 2018-10-04 18:26:51 -04:00
Alex Dadgar b6d50726e2
Merge pull request #4638 from oleksii-shyman/nvidia-plugin
WIP :: Nvidia Plugin
2018-10-04 15:24:36 -07:00
oleksii.shyman 118e3fe7e9 Introduce nvidia-plugin reserve
- added reserve functionality that returns OCI compliant env variables
  specifying GPU IDs to be injected inside the container
2018-10-04 14:55:34 -07:00
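Note on 118e3fe7e9: the reserve step described above can be sketched as follows. This is a minimal illustration, not the plugin's code. The commit does not name the environment variable; `NVIDIA_VISIBLE_DEVICES` (the variable read by the NVIDIA container runtime) is an assumption here, and `reserve_env` is a hypothetical helper name.

```python
from typing import Dict, List


def reserve_env(gpu_ids: List[str]) -> Dict[str, str]:
    """Build the env vars for a container that should see only gpu_ids.

    Assumption: the GPU IDs are passed as a comma-separated list in
    NVIDIA_VISIBLE_DEVICES. The commit message does not confirm this.
    """
    if not gpu_ids:
        raise ValueError("at least one GPU ID is required")
    return {"NVIDIA_VISIBLE_DEVICES": ",".join(gpu_ids)}
```

A caller would merge the returned mapping into the container's environment before starting the task.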
Omar Khawaja b3937e3fc6
Monitoring and Alerting Guide with Prometheus [WIP] (#4706)
* add prometheus configuration guide

* fixing sub navigation issue

* Add detail to Next Steps

* add alerting component to guide

* update

* change docker image name and shorten job templates

* re-arrange to fix broken links
2018-10-04 17:15:10 -04:00
Omar Khawaja adfd89ded8
Load Balancing with Fabio Guide (#4445)
* add load-balancing guide

* restructure load balancing section

* defining consul lb strategies inline and giving fabio its own bullet point

* update docker image name and shorten job template

* changing system scheduler link to relative link and moving load balancing navigation link to right above Web UI
2018-10-04 16:18:52 -04:00
oleksii.shyman 0ea1dc1776 Introduce Nvidia-plugin stats
- created go-nvml wrapper for stats
- added stats feature to nvidia-plugin
2018-10-03 15:12:05 -07:00
oleksii.shyman b4a4b395e3 Introduce nvidia-plugin fingerprinting
- created go-nvml wrapper for fingerprinting
- added fingerprinting feature to nvidia-plugin
2018-10-03 15:11:56 -07:00
Alex Dadgar 564da575e1 changelog 2018-09-26 14:53:15 -07:00
Alex Dadgar c75dc3d1e2
Merge pull request #4723 from hashicorp/b-autopilot-cli
Fix autopilot set enable custom upgrades flag
2018-09-25 13:53:52 -07:00
Alex Dadgar c031b22d03 Fix autopilot set enable custom upgrades flag 2018-09-25 13:49:35 -07:00
Alex Dadgar 9b793531d6
Merge pull request #4720 from hashicorp/b-jet-fixes
Series of scheduler fixes / debugging enhancements
2018-09-25 13:25:11 -07:00
Alex Dadgar 99c386c076 skip e2e/vault if integration isn't set 2018-09-25 11:29:09 -07:00
Alex Dadgar 10dee5108d
Merge pull request #4712 from hashicorp/b-failed-trigger-reason
Add a missing eval trigger reason
2018-09-25 10:50:16 -07:00
Alex Dadgar bd420692f3 fix logging 2018-09-25 10:49:55 -07:00
Preetha Appan a10118c461 Add failed follow up to the list of allowed eval trigger reasons
needs unit test
2018-09-25 10:49:55 -07:00
Preetha Appan 86e725e84c Added logging around nacked evals in the scheduler worker 2018-09-25 10:49:02 -07:00
Alex Dadgar 6bdd241641
Merge pull request #4717 from barda999/master
changed ${nomad.class} to ${node.class}
2018-09-24 16:51:27 -07:00
barda999 2c9f212dea
changed ${nomad.class} to ${node.class}
I guess that was an unintentional mistake
2018-09-24 16:48:06 -07:00
Alex Dadgar 759a36dc53
Merge pull request #4698 from hashicorp/t-vault-matrix
Vault test matrix
2018-09-24 16:34:35 -07:00
Alex Dadgar f9c60c91d8 proper variable capture 2018-09-24 16:34:15 -07:00
Alex Dadgar a7de6d1bb1
Merge pull request #4716 from hashicorp/f-no-reuse-triggerby
Unique TriggerBy for blocked evals
2018-09-24 16:08:31 -07:00
Alex Dadgar 3497c3c345 Merge branch 'b-plan' into b-jet-fixes 2018-09-24 16:07:29 -07:00
Alex Dadgar 6fa7071194
Merge pull request #4709 from hashicorp/b-deployments
Fix deployment watcher index usage
2018-09-24 16:05:02 -07:00
Alex Dadgar 6a21f9fe96 Unique TriggerBy for blocked evals
Give blocked evals a unique triggerby reason to make debugging a chain
of evaluations easier.
2018-09-24 14:47:49 -07:00
Alex Dadgar e1a102f58c test allocs fit 2018-09-24 13:59:01 -07:00
Alex Dadgar d7f5be9148 Better comment on snapshotindex 2018-09-24 13:53:43 -07:00
Alex Dadgar 99498da6ed Denormalize jobs in plan and ignore resources of terminal allocs
Denormalize jobs in AppendAllocs:
AppendAlloc was originally only ever called for inplace upgrades and new
allocations. Both these code paths would remove the job from the
allocation. Now we use this to also add fields such as FollowupEvalID
which did not normalize the job. This is only a performance enhancement.

Ignore terminal allocs:
Failed allocations are annotated with the followup Eval ID when one is
created to replace the failed allocation. However, in the plan applier,
when we check if allocations fit, these terminal allocations were not
filtered. This could result in the plan being rejected if the node would
be overcommitted if the terminal allocations' resources were considered.
2018-09-24 13:53:43 -07:00
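Note on 99498da6ed: the "Ignore terminal allocs" fix can be sketched roughly as below. This is an illustrative model under stated assumptions, not the plan applier's code: the `Alloc` fields, the status names in `TERMINAL_STATUSES`, and the function `allocs_fit` are all hypothetical, and only CPU and memory are considered.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical status names, chosen for illustration only.
TERMINAL_STATUSES = {"complete", "failed", "lost"}


@dataclass
class Alloc:
    alloc_id: str
    cpu_mhz: int
    memory_mb: int
    client_status: str

    def is_terminal(self) -> bool:
        return self.client_status in TERMINAL_STATUSES


def allocs_fit(node_cpu_mhz: int, node_memory_mb: int, allocs: List[Alloc]) -> bool:
    """Return True if the non-terminal allocations fit on the node."""
    # Terminal allocations no longer consume resources, so drop them
    # before summing usage.
    live = [a for a in allocs if not a.is_terminal()]
    used_cpu = sum(a.cpu_mhz for a in live)
    used_mem = sum(a.memory_mb for a in live)
    return used_cpu <= node_cpu_mhz and used_mem <= node_memory_mb
```

Without the filter, a failed allocation and its replacement would both be counted, and a node with room for only one of them would appear overcommitted.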
Alex Dadgar de442226ae Fix other instances of blocking queries 2018-09-24 13:52:39 -07:00
Preetha Appan f8d9d7a179 update changelog 2018-09-24 11:19:51 -05:00
Preetha 63b58aa92c
Merge pull request #4702 from hashicorp/b-non-voter-boostrap
Do not bootstrap with non voters
2018-09-24 11:14:36 -05:00
Alex Dadgar 7f0d241ef4 always handle failed allocation 2018-09-21 15:13:54 -07:00
Alex Dadgar b2449ae1ce Fix deployment watcher index usage
Fixes three issues:
1. Retrieving the latest evaluation index was not properly selecting the
greatest index. This would undermine checks we had to reduce the number
of evaluations created when the latest eval index was greater than any
alloc change
2. Fix an issue where the blocking query code was using the incorrect
index such that the index was higher than necessary.
3. Special case handling of blocked evaluation since the create/snapshot
index is not particularly useful since they can be reblocked.
2018-09-21 13:59:11 -07:00
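Note on b2449ae1ce: item 1 above amounts to taking the maximum index over all evaluations and comparing it against the latest allocation change. The sketch below shows that idea only. The `Eval` shape, the single `index` field, and both function names are hypothetical, and the special handling of blocked evaluations from item 3 is not modeled.

```python
from typing import Iterable, NamedTuple


class Eval(NamedTuple):
    eval_id: str
    index: int  # hypothetical: the index at which this eval was last written


def latest_eval_index(evals: Iterable[Eval]) -> int:
    """Return the greatest index among the evals, or 0 if there are none."""
    return max((e.index for e in evals), default=0)


def needs_new_eval(evals: Iterable[Eval], latest_alloc_index: int) -> bool:
    """Create a new eval only if an alloc changed after the latest eval."""
    return latest_alloc_index > latest_eval_index(evals)
```

Taking the first or last element instead of the maximum would under-report the latest index whenever the evals are not sorted, which defeats the check meant to avoid redundant evaluations.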
Alex Dadgar 5009566503 do not bootstrap with non voters 2018-09-19 17:17:39 -07:00
Alex Dadgar 5f77a78558
Merge pull request #4693 from Chaosteil/patch-1
Update federation.md command
2018-09-19 11:00:46 -07:00
Alex Dadgar 5e4da194e3 build nomad in e2e tests 2018-09-19 10:38:20 -07:00
Alex Dadgar d58595b0b9 vendor vault api for backwards compatibility 2018-09-19 10:23:18 -07:00
Alex Dadgar 9d85eaa2ab run in matrix 2018-09-19 10:21:57 -07:00
Alex Dadgar 69cd345778 vet 2018-09-19 10:18:10 -07:00
Alex Dadgar 34e704df64 test automation 2018-09-19 10:18:10 -07:00
Alex Dadgar e546215046 add a vault test matrix 2018-09-19 10:18:10 -07:00