2632 commits

Author SHA1 Message Date
Charlie Voiselle 969ddf9c2a Lowered to DEBUG from AD feedback 2017-11-16 14:13:03 -05:00
Charlie Voiselle 1197637251 Dropped loglevel for AWS fingerprinter env reads
Certain environments use WARN for serious logging; however, it's very
possible to have machines without some of the fingerprinted keys
(public-ipv4 and public-hostname specifically).  Setting log level to
INFO seems more consistent with this possibility.
2017-11-15 18:20:59 +00:00
Chelsea Komlo 2dfda33703 Nomad agent reload TLS configuration on SIGHUP (#3479)
* Allow server TLS configuration to be reloaded via SIGHUP

* dynamic tls reloading for nomad agents

* code cleanup and refactoring

* ensure keyloader is initialized, add comments

* allow downgrading from TLS

* initialize keyloader if necessary

* integration test for tls reload

* fix up test to assert success on reloaded TLS configuration

* failure in loading a new TLS config should remain at current

Reload only the config if agent is already using TLS

* reload agent configuration before specific server/client

lock keyloader before loading/caching a new certificate

* introduce a get-or-set method for keyloader

* fixups from code review

* fix up linting errors

* fixups from code review

* add lock for config updates; improve copy of tls config

* GetCertificate only reloads certificates dynamically for the server

* config updates/copies should be on agent

* improve http integration test

* simplify agent reloading storing a local copy of config

* reuse the same keyloader when reloading

* Test that server and client get reloaded but keep keyloader

* Keyloader exposes GetClientCertificate as well for outgoing connections

* Fix spelling

* correct changelog style
2017-11-14 17:53:23 -08:00
Preetha Appan 926c9ed997 Make device mounting unit test verify configuration via docker inspect 2017-11-13 09:56:54 -06:00
Preetha Appan dc2d5fb5a4 Unit test (linux only) that tests mounting a device in the docker driver 2017-11-13 09:56:54 -06:00
Preetha Appan 4834710e45 Add default value for cgroup permissions for device if not set 2017-11-13 09:56:54 -06:00
Preetha Appan 9cdee6991c Remove unnecessary check since validate method already checks this 2017-11-13 09:56:54 -06:00
Preetha Appan 110c1fd4f0 Add support for passing device into docker driver 2017-11-13 09:56:54 -06:00
Alex Dadgar d1358ec1b6 always load all templates 2017-11-10 12:35:51 -08:00
Alex Dadgar a3ea0c17a0 Handle multiple environment templates
Fixes https://github.com/hashicorp/nomad/issues/3498
2017-11-10 11:08:19 -08:00
Alex Dadgar b3edc12dd9
Merge pull request #3411 from cheeseprocedure/f-qemu-graceful-shutdown
Qemu driver: graceful shutdown feature
2017-11-03 16:41:34 -07:00
Michael Schurter 690b8f4cfb Remove noisy log line
Didn't mean to commit this
2017-11-03 16:00:30 -07:00
Matt Mercer 11e2870875 Qemu driver: clean up logging; fail unsupported features on Windows 2017-11-03 15:40:20 -07:00
Alex Dadgar 6034916ad1 fix spelling mistake 2017-11-03 15:04:59 -07:00
Alex Dadgar a23033932a
Merge pull request #3459 from multani/docker-oom-notification
docker: log that a container has been killed by the OOM killer
2017-11-03 13:24:03 -07:00
Matt Mercer cef9ba9770 Qemu driver: tweaks in response to PR feedback
Remove attribute for long qemu monitor path; misc cleanup; update tests
2017-11-03 11:28:56 -07:00
Preetha Appan 0eaef09675 Remove event GenericSource, and address other code review comments. Also added deprecation info in comments. 2017-11-03 10:10:06 -05:00
Preetha Appan 5f09c968b3 Move logic for determining event display message to task_runner, added two new fields DisplayMessage and Details. 2017-11-03 09:13:01 -05:00
Alex Dadgar b4af10edde Alloc Runner doesn't panic on restoration. 2017-11-02 16:14:13 -07:00
Alex Dadgar abd28cbd7d
Merge pull request #3493 from hashicorp/f-remove-atlas
Remove Atlas and Scada from codebase
2017-11-02 16:00:44 -07:00
Michael Schurter eedbe8efbb
Merge pull request #3490 from hashicorp/f-gc-logging
Make unable-to-gc log level adaptive
2017-11-02 14:32:40 -07:00
Diptanu Choudhury cb68889652 Added the node_id as a tag 2017-11-02 13:29:10 -07:00
Alex Dadgar 701f462d33 remove atlas 2017-11-02 11:27:21 -07:00
Michael Schurter fc33c945be Make unable-to-gc log level adaptive
WARNing when someone has over 50 non-terminal allocs was just too
confusing.

Tested manually with `gc_max_allocs = 10` and bumping a job from `count
= 19` to `count = 21`:

```
2017/11/02 17:54:21.076132 [INFO] client.gc: garbage collection due to number of allocations (19) is over the limit (10) skipped because no terminal allocations
...
2017/11/02 17:54:48.634529 [WARN] client.gc: garbage collection due to number of allocations (21) is over the limit (10) skipped because no terminal allocations
```
2017-11-02 10:57:42 -07:00
Diptanu Choudhury 8a9d0d40b1 Added support for tagged metrics 2017-11-02 10:07:57 -07:00
Diptanu Choudhury 5f522c6de3 Incrementing the start counter when we are actually starting a container 2017-11-02 09:51:20 -07:00
Diptanu Choudhury 44535e5d10 Recording counter for dead allocs properly 2017-11-02 09:51:20 -07:00
Diptanu Choudhury 0b34e811b7 Added metrics to track task/alloc start/restarts/dead events 2017-11-02 09:51:20 -07:00
Matt Mercer 00f90323c2 Qemu driver: defer cleanup sooner 2017-11-01 17:37:43 -07:00
Matt Mercer 43256af5f3 Qemu driver: clean up test logging; retry integration test for longer 2017-11-01 17:21:56 -07:00
Matt Mercer b1145705d3 Use strings.Replace() instead of custom function 2017-11-01 15:31:35 -07:00
Matt Mercer d51d174fa0 Qemu driver: basic testing of graceful shutdown feature 2017-11-01 15:31:30 -07:00
Matt Mercer c26013ea0b Qemu driver: include PIDs in log output 2017-11-01 15:31:24 -07:00
Matt Mercer 38d9a391aa Qemu driver: ensure proper cleanup of resources 2017-11-01 15:31:20 -07:00
Matt Mercer 46f7e2fa4c Qemu driver: minor logging fixes 2017-11-01 15:31:14 -07:00
Matt Mercer 4afb9dfa2d Standardize driver.qemu logging prefix 2017-11-01 15:30:44 -07:00
Matt Mercer 5127e75569 Qemu driver: add graceful shutdown feature 2017-11-01 15:30:36 -07:00
Michael Schurter 1769db98b7 Fix regression by returning error on unknown alloc 2017-11-01 15:16:38 -05:00
Michael Schurter 9f26b9a403 Fix race in test 2017-11-01 15:16:38 -05:00
Michael Schurter 73e9b57908 Trigger GCs after alloc changes
GC much more aggressively by triggering GCs when allocations become
terminal as well as after new allocations are added.
2017-11-01 15:16:38 -05:00
Michael Schurter 2a81160dcd Fix GC'd alloc tracking
The Client.allocs map now contains all AllocRunners again, not just
un-GC'd AllocRunners. Client.allocs is only pruned when the server GCs
allocs.

Also stops logging "marked for GC" twice.
2017-11-01 15:16:38 -05:00
Alex Dadgar c710550551 fix test 2017-10-30 12:35:31 -07:00
Alex Dadgar 4831380e57 Node access is done using locked Node copy
Fixes https://github.com/hashicorp/nomad/issues/3454

Reliably reproduced the data race before by having a fingerprinter
change the node's attributes every millisecond and syncing at the same
rate. With fix, did not ever panic.
2017-10-27 13:27:24 -07:00
Jonathan Ballet 5429d1c656 docker: changed OOM killed error message 2017-10-27 20:30:52 +02:00
Jonathan Ballet 12615bde9c docker: log that a container has been killed by the OOM killer
Fix: #2203 (at least for Docker tasks)
2017-10-27 18:05:27 +02:00
Alex Dadgar f117eb28c7 go style vars 2017-10-25 10:49:34 -07:00
Alex Dadgar 3f8495dd0e fix two flaky tests 2017-10-23 18:15:52 -07:00
Alex Dadgar cb0d0ef009 move to consul freeport implementation 2017-10-23 16:51:40 -07:00
Alex Dadgar dbc014b360 Standardize retrieving a free port into a helper package 2017-10-23 16:48:20 -07:00
Alex Dadgar 4a69e1ad15 don't double parallel 2017-10-23 16:48:06 -07:00