Commit graph

224 commits

Author SHA1 Message Date
Alex Dadgar 5cd43e837a Merge branch 'master' into b-remount 2017-03-02 19:23:13 -08:00
Michael Schurter e03f64ea6a Safely ensure {dev,proc,alloc} are mounted
If they're unmounted by a reboot they'll be properly remounted.
2017-03-02 13:21:34 -08:00
Alex Dadgar af4e400b36 Update go-getter and add support for git and hg
Fixes https://github.com/hashicorp/nomad/issues/2042
2017-03-01 14:46:04 -08:00
Alex Dadgar fa853c9696 Fix two issues during client restore state
This PR fixes two issues:

1) A close of a nil stopCollection channel when restoring and prestart
fails. The failure will cause the killCh to be triggered which will
close collection before it has been initialized.

2) Fixes a deadlock in which the handleWaitCh is never triggered since
it is not initialized when there is an error in prestart and the killCh
is triggered.

Both fixes are by maintaining the loop invariant that the two channels
are valid after there is a handle.
2017-02-28 10:29:12 -08:00
Alex Dadgar cef7882827 Fix tests and docs 2017-02-22 18:28:07 -08:00
Diptanu Choudhury 98921575af Adding a task event for setup 2017-02-22 18:28:07 -08:00
Michael Schurter 51e4fe9915 Use getters & setters with nil guards 2017-02-09 17:44:58 -08:00
Michael Schurter 37e7e7a3e5 Fix upgrade path for created resources
This *might* be a fix for #2295 -- I've been unable to reproduce the
bug. However, this guard seems wise regardless. I should never be
overwriting an intialized created resources with a nil.
2017-02-09 13:54:33 -08:00
Michael Schurter aef3c2e380 Handle createdResourcs=nil
Combined with b522c472fdf this fixes #2256

Without these two commits in place upgrades to 0.5.3 panics.
2017-01-31 10:51:32 -08:00
Alex Dadgar 8196a58c4c Rename dispatch_input to dispatch_payload 2017-01-25 21:27:44 -08:00
Michael Schurter 054ee8df59 Fix index we get allocs by 2017-01-20 16:30:40 -08:00
Michael Schurter c5f222e4a6 Update created resources before exiting cleanup 2017-01-19 16:48:23 -08:00
Michael Schurter a93d43a9cf Exit early when cleanup succeeds 2017-01-19 15:07:01 -08:00
Michael Schurter 85f68aa00c Fix incorrect lock usage 2017-01-19 11:39:18 -08:00
Michael Schurter a3a3656dbb Switch to use recoverable errors from Cleanup
TaskRunner handles retrying but Cleanup handles all of CreatedResources.
2017-01-13 16:46:08 -08:00
Michael Schurter dc68aa1a5a Return errors from cleanup and let TaskRunner retry 2017-01-12 17:21:54 -08:00
Michael Schurter 4d081490e6 Add Cleanup method to Driver interface
Cleanup can be used for cleaning up resources created by drivers to run
a task. Initially the Docker driver is the only user (to remove
downloaded images).
2017-01-11 17:23:33 -08:00
Alex Dadgar 4127a3ea9d Merge pull request #2173 from hashicorp/b-stats
Don't retrieve Driver Stats if unsupported
2017-01-10 13:32:03 -08:00
Alex Dadgar 2be221d664 Don't retrieve Driver Stats if unsupported
This PR makes us only try to collect stats once if the Driver doesn't
support collecting stats.

Fixes https://github.com/hashicorp/nomad/issues/1986
2017-01-09 13:47:06 -08:00
Alex Dadgar 26e2c5bb74 Merge pull request #2164 from hashicorp/b-dispatch
Create Task directory structure in the Run method
2017-01-09 11:24:46 -08:00
Alex Dadgar 2a5fd85e3b Move to Run() 2017-01-08 13:55:12 -08:00
Alex Dadgar 2affef2972 Create task directory during Prestart() 2017-01-08 13:55:12 -08:00
Alex Dadgar 4ffd9a69e5 Send Driver events to servers immediately
This PR causes driver events to be sent to the server immediately rather
than waiting for Prestart() to finish.
2017-01-08 13:54:43 -08:00
Michael Schurter 3ea09ba16a Move chroot building into TaskRunner
* Refactor AllocDir to have a TaskDir struct per task.
* Drivers expose filesystem isolation preference
* Fix lxc mounting of `secrets/`
2017-01-05 16:31:49 -08:00
Alex Dadgar 8d5f0fea69 Merge pull request #2128 from hashicorp/f-dispatch
Nomad Constructor Jobs and Dispatch
2017-01-06 05:22:49 +08:00
Michael Schurter ea92cd102a Append host env vars on every task env 2016-12-20 12:24:24 -08:00
Michael Schurter 2aa235f8f2 Rename InitializationMessage to DriverMessage 2016-12-20 11:51:09 -08:00
Alex Dadgar 159c819e08 Client writes payload to disk 2016-12-16 15:11:56 -08:00
Michael Schurter 770ed703d0 Add Driver.Prestart method
The Driver.Prestart method currently does very little but lays the
foundation for where lifecycle plugins can interleave execution _after_
task environment setup but _before_ the task starts.

Currently Prestart does two things:

* Any driver specific task environment building
* Download Docker images

This change also attaches a TaskEvent emitter to Drivers, so they can
emit events during task initialization.
2016-12-02 11:03:48 -08:00
Alex Dadgar 960424f086 Merge pull request #1941 from hashicorp/b-complete-transistion
Task state "dead" is terminal
2016-11-04 17:16:10 -07:00
Alex Dadgar e6465e138b More precise marking of dead 2016-11-04 17:11:07 -07:00
Alex Dadgar 0fb7742c3c Task state "dead" is terminal 2016-11-04 16:57:24 -07:00
Alex Dadgar 8b7adb20e9 Fix tests 2016-11-04 15:10:18 -07:00
Alex Dadgar 4e8d39d674 Unique task 2016-11-04 14:53:37 -07:00
Alex Dadgar 4741a4b129 Create container much more robust 2016-11-04 14:39:56 -07:00
Alex Dadgar cd1791ed09 Download artifacts before templates 2016-10-31 11:29:26 -07:00
Alex Dadgar 6618f7a03d Fix passing of recoverable error from docker pull 2016-10-28 17:49:46 -07:00
Alex Dadgar fde7a24865 Consul-template fixes + PreviousAlloc in api 2016-10-28 15:50:35 -07:00
Alex Dadgar 4082732d3a Interpolate and then validate services 2016-10-25 14:27:49 -07:00
Alex Dadgar da8b05ba17 Fix merge 2016-10-24 17:04:10 -07:00
Alex Dadgar 03eba049ed Merge pull request #1848 from hashicorp/f-vault-error
Thread through whether DeriveToken error is recoverable or not
2016-10-24 15:01:18 -07:00
Alex Dadgar ede3a814ba Small fixes 2016-10-22 18:20:50 -07:00
Alex Dadgar 0070178741 Thread through whether DeriveToken error is recoverable or not 2016-10-22 18:08:30 -07:00
Alex Dadgar 46a7d1a0d7 Change how we mark tasks as failed and allow consul-template to fail tasks 2016-10-20 17:27:16 -07:00
Alex Dadgar b384bff053 Feedback 2016-10-18 15:01:04 -07:00
Alex Dadgar ba0b3963ef Comments 2016-10-18 11:36:04 -07:00
Alex Dadgar 4f8bfd7b18 Tests 2016-10-18 11:24:20 -07:00
Alex Dadgar 36cfe6e89e Large refactor of task runner and Vault token rehandling 2016-10-18 11:24:20 -07:00
Alex Dadgar 53eeec9bc1 Merge pull request #1801 from hashicorp/f-signals
Consul-template signal change mode
2016-10-18 11:23:47 -07:00
Ben Barnard 83f647ed84 Replace "the the" with "the" in documentation and comments 2016-10-11 15:31:40 -04:00
Alex Dadgar bc35eaee21 Task runner sends signals 2016-10-10 15:09:00 -07:00
Alex Dadgar e2d49eb4a2 Comments 2016-10-06 15:21:59 -07:00
Alex Dadgar 68c5fe78f8 Tests 2016-10-06 15:17:34 -07:00
Alex Dadgar 8fb07bb083 Fix handling of restart in TaskEvents 2016-10-06 15:06:54 -07:00
Alex Dadgar 8eb7fa91cf Start of integration 2016-10-06 15:05:49 -07:00
Alex Dadgar 50efdb00e9 Merge pull request #1713 from hashicorp/f-alloc-runner-vault
Vault integration in client
2016-09-20 16:15:55 -07:00
Diptanu Choudhury f7a9b39e8c Ensuring that we are not emitting stats when handle is nil (#1723)
* Ensuring that we are not emitting stats when handle is nil

* Updated the changelog
2016-09-20 11:29:34 -07:00
Alex Dadgar ec152a6d12 Clean up vault client 2016-09-14 18:10:56 -07:00
Alex Dadgar 6702a29071 Vault token threaded 2016-09-14 13:30:01 -07:00
Michael Schurter 6cb6d9cdf1 Lock around saving state
Prevent interleaving state syncs as it could conceivably lead to
empty state files as per #1367
2016-09-02 16:07:06 -07:00
Vishal Nayak b6b73545ea Merge pull request #1606 from hashicorp/f-vault-client
VaultClient for Nomad client's interactions with Vault
2016-08-30 13:13:54 -04:00
Michael Schurter d31f373a5b Merge pull request #1653 from hashicorp/b-fix-artifact-retry
Don't fail other tasks when retrying artifact get
2016-08-26 09:53:39 -07:00
Michael Schurter 5ce26f82fe Don't fail other tasks when retrying artifact get
The artifact fetching may be retried and succeed, so don't set the task
as dead.

Fixes #1558
2016-08-25 13:16:41 -07:00
Ivo Verberk 9113244131 Don't duplicate TaskKilled event and check for TaskSiblingFailed. 2016-08-25 20:11:10 +02:00
vishalnayak 56e42cf03d Employ DeriveVaultToken API and flesh-up DeriveToken 2016-08-24 12:29:59 -04:00
Alex Dadgar 1da8566322 Merge pull request #1580 from hashicorp/f-disk-usage-monitoring
Monitor and enforce shared allocation directory disk usage
2016-08-23 09:49:53 -07:00
Diptanu Choudhury 4ca623bcfe blocking chained allocations until previous allocation hasn't terminated 2016-08-22 11:34:24 -05:00
Ivo Verberk 2a17895a83 Disk resource monitoring and enforcement 2016-08-18 07:59:03 +02:00
Diptanu Choudhury 28b3f511e0 Fixed some error messages 2016-08-10 15:17:32 -07:00
Kenjiro Nakayama 5c621b74e5 tiny: Return fmt.Errorf instead of duplicated error messages 2016-08-09 08:57:26 +09:00
Diptanu Choudhury 70d2f8ef1d Merge pull request #1534 from nak3/fix-intask_runner
tiny: print task name and error message for SaveState error
2016-08-08 13:37:25 -04:00
Kenjiro Nakayama e7863ea8ee tiny: print task name and error message for the SaveState error in task_runner 2016-08-07 13:33:58 +09:00
Kenjiro Nakayama 60b58eed84 Update GetArtifact by removing unused logger 2016-08-06 23:37:32 +09:00
Diptanu Choudhury 41b540fbc8 Allow operators to opt into publishing node and alloc metrics 2016-08-01 19:52:20 -07:00
Alex Dadgar 90748cedad Add killing event and mark task as not running when killed 2016-07-21 15:49:54 -07:00
Alex Dadgar c35b1be845 Set running when restoring 2016-06-28 13:47:59 -07:00
Diptanu Choudhury 88ac1b33a4 Not emitting per-pid stats and added the total ticks consumed by a Task 2016-06-20 17:30:25 -07:00
Alex Dadgar fe588a2469 Guard against restoring a nil task in task_runner 2016-06-16 11:55:40 -07:00
Alex Dadgar fdda90229f only support latest and remove ring buffer 2016-06-12 09:32:38 -07:00
Alex Dadgar e952540f6f Allocation resources returned in a struct 2016-06-11 21:04:10 -07:00
Diptanu Choudhury fd60cfd585 Emitting client resource usage metrics as guages instead of k/v pairs 2016-06-11 22:17:32 +02:00
Alex Dadgar b7e3a45fef fix channel being nil on restore 2016-06-07 15:03:08 -07:00
Diptanu Choudhury c21d606ebb Getting inodes used percent back 2016-06-06 16:10:34 -07:00
Alex Dadgar ba1a92eb8c Handle errors during stats collection 2016-06-03 14:23:18 -07:00
Diptanu Choudhury 667b478f3f Merge pull request #1226 from hashicorp/f-push-stats
Push Resource Usage stats to remote sinks
2016-06-02 23:14:59 +02:00
Diptanu Choudhury 35e31c1b81 Enqueing metrics only if they are not nil 2016-06-02 17:14:15 -04:00
Diptanu Choudhury 7efde782fa Sending metrics for tasks as well 2016-06-01 16:42:16 +02:00
Alex Dadgar 4e15611339 fix wait result being nil and some panics in the cli 2016-05-31 23:09:05 +00:00
Diptanu Choudhury f95b1d00c3 Renamed error message in alloc endpoint 2016-05-28 20:03:52 -07:00
Diptanu Choudhury c0dc6cfbf2 Changing the api of the stats endpoints 2016-05-28 19:59:20 -07:00
Diptanu Choudhury fa9b0dd7e8 Implemented the resource usage ts since a time 2016-05-28 19:59:20 -07:00
Diptanu Choudhury 77ac2dd624 Initializing the ring buffer with no cells 2016-05-28 19:59:20 -07:00
Diptanu Choudhury 0b0d0764e4 Changed signature of Allocation Stats Reporter 2016-05-28 19:59:20 -07:00
Diptanu Choudhury c46400597e Making the stats collection interval and number of data points to keep in memory configurable 2016-05-28 19:59:20 -07:00
Diptanu Choudhury d2021e2953 Changed the signature of ResourceUsageTS 2016-05-28 19:59:20 -07:00
Diptanu Choudhury 05c221186b Added disk usage to node status 2016-05-28 19:59:20 -07:00
Diptanu Choudhury 84cd943c48 Stopping stats collection of tasks which has been destroyed 2016-05-28 19:59:20 -07:00
Diptanu Choudhury b9feae89ce Making the conversion to Stats simpler 2016-05-28 19:42:34 -07:00
Diptanu Choudhury 91d2cf319e Added some documentation 2016-05-28 19:42:34 -07:00
Diptanu Choudhury f3d0aecafe Reporting time series of stats 2016-05-28 19:42:34 -07:00