Commit graph

2673 commits

Author SHA1 Message Date
Chelsea Holland Komlo 77ab41124b set default kill signal on executor shutdown 2017-12-07 11:40:15 -05:00
Chelsea Holland Komlo 6cae8fe6e6 extend configurable kill signal to java driver 2017-12-07 11:40:10 -05:00
Chelsea Holland Komlo 350319239c change location of default kill signal 2017-12-06 17:48:25 -05:00
Chelsea Holland Komlo 7dfb64f941 extract signal helper into utils 2017-12-06 14:36:44 -05:00
Chelsea Holland Komlo b08611cfac move kill_signal to task level, extend to docker 2017-12-06 14:36:39 -05:00
Chelsea Holland Komlo 80de7d5ebd allow controlling the stop signal in exec/raw_exec 2017-12-06 11:28:45 -05:00
Chelsea Komlo 9ae849e09c
Merge pull request #3612 from hashicorp/docker-rkt-user
Set user for rkt tasks
2017-12-05 17:45:08 -05:00
Michael Schurter b66aa5b7f6
Merge pull request #3563 from hashicorp/b-snapshot-atomic
Atomic Snapshotting / Sticky Volume Migration
2017-12-05 09:16:33 -08:00
Chelsea Holland Komlo 4463dc607e fix up test 2017-12-05 10:12:40 -05:00
Chelsea Holland Komlo 7284f2385a remove unused user option 2017-12-04 18:01:31 -05:00
Michael Schurter 6ccc4219d3
Merge pull request #3615 from hashicorp/b-rkt-host-ports
rkt: Don't require port_map with host networking
2017-12-04 14:49:42 -08:00
Chelsea Holland Komlo 7c74968452 add ability to specify user for rkt 2017-12-04 14:21:48 -05:00
Michael Schurter 2bf1d6d85e rkt: Don't require port_map with host networking
Also don't try to return a DriverNetwork with host networking. None will
ever exist as that's the point of host networking: rkt won't create a
network namespace.
2017-12-01 17:23:25 -08:00
Chelsea Holland Komlo 4ee2122536 get KillTimeout in seconds, not nanoseconds 2017-12-01 10:43:00 -05:00
Michael Schurter 5e975bbd0f Add comment and normalize err check ordering
as per PR comments
2017-11-29 17:26:11 -08:00
Michael Schurter d996c3a231 Check for error file when receiving snapshots 2017-11-29 17:26:11 -08:00
Michael Schurter ca946679f6 Destroy partially migrated alloc dirs
Test that snapshot errors don't return a valid tar currently fails.
2017-11-29 17:26:11 -08:00
Michael Schurter 23c66e37c5 Handle errors during snapshotting
If an alloc dir is being GC'd (removed) during snapshotting, the walk
func will be passed an error. Previously we didn't check for that error,
so a panic would occur when we tried to use a nil `fileInfo`.
2017-11-29 17:26:11 -08:00
Chelsea Holland Komlo 2208964948 Support StopTimeout for Docker tasks
Update github.com/fsouza/go-dockerclient
2017-11-29 14:33:05 -05:00
Preetha Appan 6ad65c51e6 Missed assert in one place 2017-11-20 13:04:38 -06:00
Preetha Appan 747bd59daa Better error validation, and added test case for invalid sysctl inputs 2017-11-20 12:07:18 -06:00
Preetha Appan c68973747b Address some review comments 2017-11-20 11:15:09 -06:00
Preetha Appan 39ef9ee76d Fix gofmt warnings 2017-11-18 09:23:09 -06:00
Preetha Appan e53dd15f58 Fix test compilation after rebase 2017-11-17 17:46:04 -06:00
Samuel BERTHE 0fca2e19c8 review(docker driver): sysctls -> sysctl + ulimits -> ulimit 2017-11-17 16:30:45 -06:00
Samuel BERTHE 6c93922cb7 Oops 2017-11-17 16:14:14 -06:00
Samuel BERTHE c8363bc44b 💄 2017-11-17 16:03:22 -06:00
Samuel BERTHE 281ab90484 test(docker driver): testing sysctls and ulimits 2017-11-17 16:03:22 -06:00
Samuel BERTHE b9a10ff7fa feat(docker driver): adds sysctls and ulimits configs 2017-11-17 16:03:22 -06:00
Alex Dadgar 69d3bf7392
Merge pull request #3559 from hashicorp/b-metrics
Don't emit metrics for non-running tasks
2017-11-17 10:33:23 -08:00
Michael Schurter 3845c8d200
Merge pull request #3562 from hashicorp/b-3561-rkt-rm
Remove rkt pods when exiting
2017-11-16 17:30:21 -08:00
Michael Schurter 737fb45640
Merge pull request #3551 from hashicorp/b-3419-docker-409-bug
Fix Docker name conflict bug by updating dockerclient
2017-11-16 16:38:54 -08:00
Michael Schurter 437fce9954 Improve rktRemove error message 2017-11-16 15:45:14 -08:00
Michael Schurter 3ceec0caab Remove rkt pods when exiting
Fixes #3561
2017-11-16 14:33:44 -08:00
Charlie Voiselle 7a231897a5
Merge pull request #3556 from angrycub/f-fingerprint-log-level
Dropped loglevel for AWS fingerprinter env read misses to DEBUG
2017-11-16 16:27:25 -05:00
Charlie Voiselle 969ddf9c2a Lowered to DEBUG from AD feedback 2017-11-16 14:13:03 -05:00
Alex Dadgar 05b1588cea Only publish metric when the task is running and dev mode publishes metrics 2017-11-15 13:21:06 -08:00
Alex Dadgar 07963f0b6d
Merge pull request #3546 from hashicorp/f-heuristic
Better interface selection heuristic
2017-11-15 12:51:21 -08:00
Alex Dadgar 97ec3974a9 Use interface attached to default route 2017-11-15 11:32:32 -08:00
Michael Schurter f86f0bd9ea Handle leader task being dead in RestoreState
Fixes the panic mentioned in
https://github.com/hashicorp/nomad/issues/3420#issuecomment-341666932

While a leader task dying serially stops all follower tasks, the
synchronizing of state is asynchronous. Nomad can shut down before all
follower tasks have updated their state to dead, thus saving the state
necessary to hit this panic: *have a non-terminal alloc with a dead
leader.*

The actual fix is a simple nil check so that we no longer assume a
non-terminal alloc's leader has a TaskRunner.
2017-11-15 10:36:13 -08:00
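The kind of nil check this commit describes might look like the sketch below. The `taskRunner` type, the `leaderIsDead` function, and the choice to treat a missing runner as a dead leader are all assumptions made for illustration; they are not taken from the Nomad source.

```go
package main

import "fmt"

// taskRunner is a hypothetical stand-in for a per-task runner.
type taskRunner struct {
	name string
	dead bool
}

// leaderIsDead (hypothetical) reports whether the leader task should be
// treated as dead. A restored alloc may have no runner for its leader,
// so the lookup result is checked for nil before it is dereferenced.
func leaderIsDead(runners map[string]*taskRunner, leader string) bool {
	tr := runners[leader]
	if tr == nil {
		// Assumption: no runner was restored because the leader already died.
		return true
	}
	return tr.dead
}

func main() {
	runners := map[string]*taskRunner{
		"follower": {name: "follower"},
	}
	fmt.Println(leaderIsDead(runners, "leader")) // true
}
```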
Charlie Voiselle 1197637251 Dropped loglevel for AWS fingerprinter env reads
Certain environments use WARN for serious logging; however, it's very
possible to have machines without some of the fingerprinted keys
(public-ipv4 and public-hostname specifically). Setting log level to
INFO seems more consistent with this possibility.
2017-11-15 18:20:59 +00:00
Chelsea Komlo 2dfda33703 Nomad agent reload TLS configuration on SIGHUP (#3479)
* Allow server TLS configuration to be reloaded via SIGHUP

* dynamic tls reloading for nomad agents

* code cleanup and refactoring

* ensure keyloader is initialized, add comments

* allow downgrading from TLS

* initialize keyloader if necessary

* integration test for tls reload

* fix up test to assert success on reloaded TLS configuration

* failure in loading a new TLS config should remain at current

Reload only the config if agent is already using TLS

* reload agent configuration before specific server/client

lock keyloader before loading/caching a new certificate

* introduce a get-or-set method for keyloader

* fixups from code review

* fix up linting errors

* fixups from code review

* add lock for config updates; improve copy of tls config

* GetCertificate only reloads certificates dynamically for the server

* config updates/copies should be on agent

* improve http integration test

* simplify agent reloading storing a local copy of config

* reuse the same keyloader when reloading

* Test that server and client get reloaded but keep keyloader

* Keyloader exposes GetClientCertificate as well for outgoing connections

* Fix spelling

* correct changelog style
2017-11-14 17:53:23 -08:00
Michael Schurter 3023336b39 Add a test demonstrating the bug
Fails on Docker 17.09, passes on Docker 17.06 and earlier
2017-11-14 15:25:52 -08:00
Alex Dadgar ee31e15f51 Better interface selection heuristic
This PR introduces a better interface selection heuristic such that we
select interfaces with globally routable unicast addresses over link
local addresses.

Fixes https://github.com/hashicorp/nomad/issues/3487
2017-11-13 15:13:43 -08:00
Preetha Appan 926c9ed997 Make device mounting unit test verify configuration via docker inspect 2017-11-13 09:56:54 -06:00
Preetha Appan dc2d5fb5a4 Unit test (linux only) that tests mounting a device in the docker driver 2017-11-13 09:56:54 -06:00
Preetha Appan 4834710e45 Add default value for cgroup permissions for device if not set 2017-11-13 09:56:54 -06:00
Preetha Appan 9cdee6991c Remove unnecessary check since validate method already checks this 2017-11-13 09:56:54 -06:00
Preetha Appan 110c1fd4f0 Add support for passing device into docker driver 2017-11-13 09:56:54 -06:00
Alex Dadgar d1358ec1b6 always load all templates 2017-11-10 12:35:51 -08:00