open-nomad

Author	SHA1	Message	Date
Charlie Voiselle	7a231897a5	Merge pull request #3556 from angrycub/f-fingerprint-log-level Dropped loglevel for AWS fingerprinter env read misses to DEBUG	2017-11-16 16:27:25 -05:00
Charlie Voiselle	969ddf9c2a	Lowered to DEBUG from AD feedback	2017-11-16 14:13:03 -05:00
Alex Dadgar	07963f0b6d	Merge pull request #3546 from hashicorp/f-heuristic Better interface selection heuristic	2017-11-15 12:51:21 -08:00
Alex Dadgar	97ec3974a9	Use interface attached to default route	2017-11-15 11:32:32 -08:00
Michael Schurter	f86f0bd9ea	Handle leader task being dead in RestoreState Fixes the panic mentioned in https://github.com/hashicorp/nomad/issues/3420#issuecomment-341666932 While a leader task dying serially stops all follower tasks, the synchronizing of state is asynchronous. Nomad can shut down before all follower tasks have updated their state to dead, thus saving the state necessary to hit this panic: a non-terminal alloc with a dead leader. The actual fix is a simple nil check so that we do not assume a non-terminal alloc's leader has a TaskRunner.	2017-11-15 10:36:13 -08:00
Charlie Voiselle	1197637251	Dropped loglevel for AWS fingerprinter env reads Certain environments use WARN for serious logging; however, it's very possible to have machines without some of the fingerprinted keys (public-ipv4 and public-hostname specifically). Setting log level to INFO seems more consistent with this possibility.	2017-11-15 18:20:59 +00:00
Chelsea Komlo	2dfda33703	Nomad agent reload TLS configuration on SIGHUP (#3479) * Allow server TLS configuration to be reloaded via SIGHUP * dynamic tls reloading for nomad agents * code cleanup and refactoring * ensure keyloader is initialized, add comments * allow downgrading from TLS * initialize keyloader if necessary * integration test for tls reload * fix up test to assert success on reloaded TLS configuration * failure in loading a new TLS config should remain at current Reload only the config if agent is already using TLS * reload agent configuration before specific server/client lock keyloader before loading/caching a new certificate * introduce a get-or-set method for keyloader * fixups from code review * fix up linting errors * fixups from code review * add lock for config updates; improve copy of tls config * GetCertificate only reloads certificates dynamically for the server * config updates/copies should be on agent * improve http integration test * simplify agent reloading storing a local copy of config * reuse the same keyloader when reloading * Test that server and client get reloaded but keep keyloader * Keyloader exposes GetClientCertificate as well for outgoing connections * Fix spelling * correct changelog style	2017-11-14 17:53:23 -08:00
Alex Dadgar	ee31e15f51	Better interface selection heuristic This PR introduces a better interface selection heuristic such that we select interfaces with globally routable unicast addresses over link local addresses. Fixes https://github.com/hashicorp/nomad/issues/3487	2017-11-13 15:13:43 -08:00
Preetha Appan	926c9ed997	Make device mounting unit test verify configuration via docker inspect	2017-11-13 09:56:54 -06:00
Preetha Appan	dc2d5fb5a4	Unit test (linux only) that tests mounting a device in the docker driver	2017-11-13 09:56:54 -06:00
Preetha Appan	4834710e45	Add default value for cgroup permissions for device if not set	2017-11-13 09:56:54 -06:00
Preetha Appan	9cdee6991c	Remove unnecessary check since validate method already checks this	2017-11-13 09:56:54 -06:00
Preetha Appan	110c1fd4f0	Add support for passing device into docker driver	2017-11-13 09:56:54 -06:00
Alex Dadgar	d1358ec1b6	always load all templates	2017-11-10 12:35:51 -08:00
Alex Dadgar	a3ea0c17a0	Handle multiple environment templates Fixes https://github.com/hashicorp/nomad/issues/3498	2017-11-10 11:08:19 -08:00
Alex Dadgar	b3edc12dd9	Merge pull request #3411 from cheeseprocedure/f-qemu-graceful-shutdown Qemu driver: graceful shutdown feature	2017-11-03 16:41:34 -07:00
Michael Schurter	690b8f4cfb	Remove noisy log line Didn't mean to commit this	2017-11-03 16:00:30 -07:00
Matt Mercer	11e2870875	Qemu driver: clean up logging; fail unsupported features on Windows	2017-11-03 15:40:20 -07:00
Alex Dadgar	6034916ad1	fix spelling mistake	2017-11-03 15:04:59 -07:00
Alex Dadgar	a23033932a	Merge pull request #3459 from multani/docker-oom-notification docker: log that a container has been killed by the OOM killer	2017-11-03 13:24:03 -07:00
Matt Mercer	cef9ba9770	Qemu driver: tweaks in response to PR feedback Remove attribute for long qemu monitor path; misc cleanup; update tests	2017-11-03 11:28:56 -07:00
Preetha Appan	0eaef09675	Remove event GenericSource, and address other code review comments. Also added deprecation info in comments.	2017-11-03 10:10:06 -05:00
Preetha Appan	5f09c968b3	Move logic for determining event display message to task_runner, added two new fields DisplayMessage and Details.	2017-11-03 09:13:01 -05:00
Alex Dadgar	b4af10edde	Alloc Runner doesn't panic on restoration.	2017-11-02 16:14:13 -07:00
Alex Dadgar	abd28cbd7d	Merge pull request #3493 from hashicorp/f-remove-atlas Remove Atlas and Scada from codebase	2017-11-02 16:00:44 -07:00
Michael Schurter	eedbe8efbb	Merge pull request #3490 from hashicorp/f-gc-logging Make unable-to-gc log level adaptive	2017-11-02 14:32:40 -07:00
Diptanu Choudhury	cb68889652	Added the node_id as a tag	2017-11-02 13:29:10 -07:00
Alex Dadgar	701f462d33	remove atlas	2017-11-02 11:27:21 -07:00
Michael Schurter	fc33c945be	Make unable-to-gc log level adaptive WARNing when someone has over 50 non-terminal allocs was just too confusing. Tested manually with `gc_max_allocs = 10` and bumping a job from `count = 19` to `count = 21`: ``` 2017/11/02 17:54:21.076132 [INFO] client.gc: garbage collection due to number of allocations (19) is over the limit (10) skipped because no terminal allocations ... 2017/11/02 17:54:48.634529 [WARN] client.gc: garbage collection due to number of allocations (21) is over the limit (10) skipped because no terminal allocations ```	2017-11-02 10:57:42 -07:00
Diptanu Choudhury	8a9d0d40b1	Added support for tagged metrics	2017-11-02 10:07:57 -07:00
Diptanu Choudhury	5f522c6de3	Incrementing the start counter when we are actually starting a container	2017-11-02 09:51:20 -07:00
Diptanu Choudhury	44535e5d10	Recording counter for dead allocs properly	2017-11-02 09:51:20 -07:00
Diptanu Choudhury	0b34e811b7	Added metrics to track task/alloc start/restarts/dead events	2017-11-02 09:51:20 -07:00
Matt Mercer	00f90323c2	Qemu driver: defer cleanup sooner	2017-11-01 17:37:43 -07:00
Matt Mercer	43256af5f3	Qemu driver: clean up test logging; retry integration test for longer	2017-11-01 17:21:56 -07:00
Matt Mercer	b1145705d3	Use strings.Replace() instead of custom function	2017-11-01 15:31:35 -07:00
Matt Mercer	d51d174fa0	Qemu driver: basic testing of graceful shutdown feature	2017-11-01 15:31:30 -07:00
Matt Mercer	c26013ea0b	Qemu driver: include PIDs in log output	2017-11-01 15:31:24 -07:00
Matt Mercer	38d9a391aa	Qemu driver: ensure proper cleanup of resources	2017-11-01 15:31:20 -07:00
Matt Mercer	46f7e2fa4c	Qemu driver: minor logging fixes	2017-11-01 15:31:14 -07:00
Matt Mercer	4afb9dfa2d	Standardize driver.qemu logging prefix	2017-11-01 15:30:44 -07:00
Matt Mercer	5127e75569	Qemu driver: add graceful shutdown feature	2017-11-01 15:30:36 -07:00
Michael Schurter	1769db98b7	Fix regression by returning error on unknown alloc	2017-11-01 15:16:38 -05:00
Michael Schurter	9f26b9a403	Fix race in test	2017-11-01 15:16:38 -05:00
Michael Schurter	73e9b57908	Trigger GCs after alloc changes GC much more aggressively by triggering GCs when allocations become terminal as well as after new allocations are added.	2017-11-01 15:16:38 -05:00
Michael Schurter	2a81160dcd	Fix GC'd alloc tracking The Client.allocs map now contains all AllocRunners again, not just un-GC'd AllocRunners. Client.allocs is only pruned when the server GCs allocs. Also stops logging "marked for GC" twice.	2017-11-01 15:16:38 -05:00
Alex Dadgar	c710550551	fix test	2017-10-30 12:35:31 -07:00
Alex Dadgar	4831380e57	Node access is done using locked Node copy Fixes https://github.com/hashicorp/nomad/issues/3454 Reliably reproduced the data race before by having a fingerprinter change the node's attributes every millisecond and syncing at the same rate. With the fix, it never panicked.	2017-10-27 13:27:24 -07:00
Jonathan Ballet	5429d1c656	docker: changed OOM killed error message	2017-10-27 20:30:52 +02:00
Jonathan Ballet	12615bde9c	docker: log that a container has been killed by the OOM killer Fix: #2203 (at least for Docker tasks)	2017-10-27 18:05:27 +02:00

2637 commits