Commit graph

3030 commits

Author SHA1 Message Date
Jesus Vazquez 4f6db56283 Initialize dockerhandle with jobname, taskgroupname, taskname and allocid 2018-05-04 14:02:19 +08:00
Jesus Vazquez 127b764dfb Add Job, taskgroupname, taskname, and allocid to the DockerHandle struct 2018-05-04 14:01:26 +08:00
Jesus Vazquez fd1ff1a0cf Run goimports 2018-05-04 13:46:36 +08:00
Jesus Vazquez 5dd4059527 Add driver.docker counter metric for OOM Killer events 2018-05-04 13:46:36 +08:00
Alex Dadgar de4af37249 version bump and remove generated 2018-04-27 11:10:00 -07:00
Alex Dadgar 845a43864a generated files 2018-04-27 10:45:40 -07:00
Alex Dadgar 35e06ddb31 Remove generated and version bump 2018-04-26 16:49:19 -07:00
Alex Dadgar 43192cefae generated files 2018-04-26 16:28:58 -07:00
Michael Schurter 0e602d4779
Merge pull request #4188 from hashicorp/f-rkt-stats
rkt: create parent cgroup to enable stats
2018-04-24 14:54:36 -07:00
Michael Schurter d687761ebf rkt: test Stats() and always run tests
Remove the NOMAD_TEST_RKT flag as a guard for rkt tests. Still require
Linux, root, and rkt to be installed. Only check for rkt installation
once in hopes of speeding up rkt tests a bit.
2018-04-24 11:05:42 -07:00
Javier Palomo Almena 3e6c01ffa1 docker tests: Fix usage of NewDriverContext 2018-04-23 22:51:06 +02:00
Javier Palomo Almena 74d3c5df07 DriverContext: Add the TaskGroup and the Job name
Adding this fields to the DriverContext object, will allow us to pass
them to the drivers.

An use case for this, will be to emit tagged metrics in the drivers,
which contain all relevant information:
- Job
- TaskGroup
- Task
- ...

Ref: https://github.com/hashicorp/nomad/pull/4185
2018-04-23 00:15:29 +02:00
Michael Schurter 4cee6cca6c rkt: create parent cgroup to enable stats
Having the Nomad executor create parent cgroups that rkt is launched
within allows the stats collection code used for the exec driver to Just
Work. The only downside is that now the Nomad executor's resource
utilization counts against the cgroups resource limits just as it does
for the exec driver.
2018-04-19 15:14:56 -07:00
Michael Schurter 1a85d0c990 run goimports 2018-04-19 11:16:28 -07:00
Michael Schurter d77c265d1f
Merge pull request #4168 from ninoles/b-2117-windows-group-process
B 2117 windows group process
2018-04-19 11:10:51 -07:00
Michael Schurter fdbcbd4e5b
Merge pull request #4058 from hashicorp/f-mock-by-default
[Post-0.8] test: build with mock_driver by default
2018-04-18 15:57:00 -07:00
Michael Schurter d3650fb2cd test: build with mock_driver by default
`make release` and `make prerelease` set a `release` tag to disable
enabling the `mock_driver`
2018-04-18 14:45:33 -07:00
Michael Schurter a991923389 tests: fix race in alloc_runner_test.go
I could not reproduce the failure locally even with `stress -cpu ...`
eating all the cpu it could on my machine.

But I think the race was in one of two places:
* The task could restart which could create new events
* I think there could be a race between the updater's version of events
  and alloc runners as updates are async

I fixed both. Here's hoping that fixes this flaky test.
2018-04-17 17:14:59 -07:00
Fabien Ninoles c81bec48c9 Merge branch 'master' into b-2117-windows-group-process 2018-04-17 13:47:25 -04:00
Fabien Ninoles 35cf641416 Update based on PR request. 2018-04-17 13:43:04 -04:00
Alex Dadgar c4ad76091d
Merge pull request #4166 from hashicorp/b-panic-fix-update
Fixes races accessing node and updating it during fingerprinting
2018-04-17 10:02:19 -07:00
Chelsea Holland Komlo 9b8a079558 fix up comments 2018-04-17 11:53:08 -04:00
Alex Dadgar 9d612c8cb0 Cleanup 2018-04-16 15:48:34 -07:00
Alex Dadgar 32adaf9dfc Copy the config given to the alloc runner 2018-04-16 15:45:52 -07:00
Alex Dadgar 3ff2d4d795 fix race node access 2018-04-16 15:45:51 -07:00
Alex Dadgar 4f2a7b6949 Fix copying drivers 2018-04-16 15:45:51 -07:00
Alex Dadgar 0b799822ff Operate on copy 2018-04-16 15:45:49 -07:00
Fabien Ninoles 27cf4995ce - Clean up for windows compilation.
- Set CREATE_NEW_PROCESS_GROUP for Windows subprocess.
- Ensure we only kill actual process that need to.
2018-04-14 13:58:42 -04:00
Michael Schurter 3836b8a335
Merge pull request #3572 from emate/master
Create new process group on process startup.
2018-04-13 11:56:38 -07:00
Alex Dadgar adaf4fa7e0 Remove generated structs 2018-04-12 16:35:31 -07:00
Alex Dadgar 663c4d0433 Version bump and generated files 2018-04-12 16:21:50 -07:00
Alex Dadgar ff1a1a63e8 Move where attribute for driver detection is set 2018-04-12 15:50:25 -07:00
Chelsea Holland Komlo 5291788b40 delete driver name from only health check attributes 2018-04-12 18:24:41 -04:00
Alex Dadgar 3d53d380f7 Fix tests 2018-04-12 14:29:30 -07:00
Alex Dadgar f24ce2c50c Driver health detection cleanups
This PR does:

1. Health message based on detection has format "Driver XXX detected"
and "Driver XXX not detected"
2. Set initial health description based on detection status and don't
wait for the first health check.
3. Combine updating attributes on the node, fingerprint and health
checking update for drivers into a single call back.
4. Condensed driver info in `node status` only shows detected drivers
and make the output less wide by removing spaces.
2018-04-12 12:46:40 -07:00
Charlie Voiselle ba88f00ccb Changed "til" to "until"
Should be "till" or "until"; chose "until" because it is unambiguous as to meaning.
2018-04-11 12:36:28 -05:00
Chelsea Komlo eb5aac16e6
Merge pull request #4111 from hashicorp/b-undetected-set-health-to-false
Immediately set driver health status to false when driver moves to undetected
2018-04-10 18:30:31 -04:00
Chelsea Holland Komlo d58b3e473c update comment for when the fingerprinter setting health status 2018-04-10 16:53:00 -04:00
Chelsea Holland Komlo f7ef13cc64 fingerprinter should set health check status if health check is not periodic 2018-04-10 15:29:51 -04:00
Chelsea Holland Komlo ede4f518bd add setters for access to the fingerprint manager's node
refactor extracting driver info
2018-04-10 15:29:51 -04:00
Chelsea Holland Komlo f479da19f5 guard against overwriting health status 2018-04-10 15:29:51 -04:00
Chelsea Holland Komlo ece1618815 immediately set healthy to false when driver moves to undetected 2018-04-10 15:29:51 -04:00
Alex Dadgar 3d367d6fd7 Fix client uptime metric missing client prefix 2018-04-10 10:39:36 -07:00
Seth Vargo df4fe7e76c
Set user-agent when talking to GCE metadata 2018-04-10 10:36:46 -04:00
Chelsea Komlo d3bd8fb96e
Merge pull request #4109 from hashicorp/f-shorten-docker-health-timeout
Shorten docker health timeout
2018-04-09 15:38:39 -04:00
Chelsea Holland Komlo ea4b65dd41 only initialize docker clients if they are nil 2018-04-09 14:13:07 -04:00
Chelsea Holland Komlo 288c7a33a1 refacotoring simplification from code review 2018-04-09 10:34:17 -04:00
Chelsea Holland Komlo 6e3b056c37 only run health check if driver moves from undetected to detected 2018-04-09 10:10:43 -04:00
Alex Dadgar ae1f76477e Start rebalance after discovering new servers 2018-04-05 15:41:59 -07:00
Alex Dadgar 929b6823a3
Merge pull request #4106 from hashicorp/b-servers
Improved Client handling of failed RPCs
2018-04-05 13:48:50 -07:00