Michael Schurter
e204a287ed
Refactor Consul Syncer into new ServiceClient
...
Fixes #2478 #2474 #1995 #2294
The new client only handles agent and task service advertisement. Server
discovery is mostly unchanged.
The Nomad client agent now handles all Consul operations instead of the
executor handling task related operations. When upgrading from an
earlier version of Nomad existing executors will be told to deregister
from Consul so that the Nomad agent can re-register the task's services
and checks.
Drivers - other than qemu - now support an Exec method for executing
abritrary commands in a task's environment. This is used to implement
script checks.
Interfaces are used extensively to avoid interacting with Consul in
tests that don't assert any Consul related behavior.
2017-04-19 12:42:47 -07:00
Alex Dadgar
6bee23047a
Fix variable capture and add tests
...
This PR fixes token revocation and adds tests to make sure it is
working. The 0.5.6 RC1's token revocation does not work becasue the
token's value is captured at the instantiation of the deferred
stoprenewal statement rather than its exectution.
2017-03-29 13:17:50 -07:00
Michael Schurter
ae3810052d
Merge pull request #2482 from hashicorp/f-2289-better-artifact-err
...
Improve artifact download error message
2017-03-28 12:48:22 -07:00
Alex Dadgar
1e95ae7e6a
Merge pull request #2495 from hashicorp/b-vault-stop-renew
...
Stop Vault token renew on task exit
2017-03-28 11:14:18 -07:00
Alex Dadgar
d1645f47b1
Stop Vault token renew on task exit
...
This PR fixes an oversight in which the client would attempt to renew a
token even after the task exits.
Fixes https://github.com/hashicorp/nomad/issues/2475
2017-03-28 10:53:15 -07:00
Michael Schurter
507862ade3
Add WrapRecoverable helper
2017-03-27 15:37:15 -07:00
Alex Dadgar
4ecebe7d8c
Proper reference counting through task restarts
...
This PR fixes an issue in which the reference count on a Docker image
would become inflated through task restarts.
2017-03-25 17:05:53 -07:00
Michael Schurter
0e6c564406
Improve artifact download error message
...
Fixes #2289
Unfortunately took more RecoverableError hijinx than I would have liked.
There might be a better way.
2017-03-24 15:26:05 -07:00
Alex Dadgar
7e6c08191d
Fix vet
2017-03-24 12:24:47 -07:00
Alex Dadgar
c3551c761e
Fix panic when restarting non-running task
...
This PR fixes an issue that is hit when running templates with restart
mode in which the client could panic when the handle is not running.
Fixes https://github.com/hashicorp/nomad/issues/2479
2017-03-24 12:04:22 -07:00
Alex Dadgar
b5e53652aa
Handle git ssh artifacts
...
This PR adds handling for downloading git artifacts using ssh with the
format git@github.com:hashicorp/go-getter.git
Fixes https://github.com/hashicorp/nomad/issues/2430
2017-03-11 15:12:41 -08:00
Alex Dadgar
3fb285f7d3
Fix TestAllocRunner_SaveRestoreState
2017-03-02 20:45:46 -08:00
Alex Dadgar
5cd43e837a
Merge branch 'master' into b-remount
2017-03-02 19:23:13 -08:00
Michael Schurter
e03f64ea6a
Safely ensure {dev,proc,alloc} are mounted
...
If they're unmounted by a reboot they'll be properly remounted.
2017-03-02 13:21:34 -08:00
Alex Dadgar
af4e400b36
Update go-getter and add support for git and hg
...
Fixes https://github.com/hashicorp/nomad/issues/2042
2017-03-01 14:46:04 -08:00
Alex Dadgar
fa853c9696
Fix two issues during client restore state
...
This PR fixes two issues:
1) A close of a nil stopCollection channel when restoring and prestart
fails. The failure will cause the killCh to be triggered which will
close collection before it has been initialized.
2) Fixes a deadlock in which the handleWaitCh is never triggered since
it is not initialized when there is an error in prestart and the killCh
is triggered.
Both fixes are by maintaining the loop invariant that the two channels
are valid after there is a handle.
2017-02-28 10:29:12 -08:00
Alex Dadgar
cef7882827
Fix tests and docs
2017-02-22 18:28:07 -08:00
Diptanu Choudhury
98921575af
Adding a task event for setup
2017-02-22 18:28:07 -08:00
Michael Schurter
51e4fe9915
Use getters & setters with nil guards
2017-02-09 17:44:58 -08:00
Michael Schurter
37e7e7a3e5
Fix upgrade path for created resources
...
This *might* be a fix for #2295 -- I've been unable to reproduce the
bug. However, this guard seems wise regardless. I should never be
overwriting an intialized created resources with a nil.
2017-02-09 13:54:33 -08:00
Michael Schurter
aef3c2e380
Handle createdResourcs=nil
...
Combined with b522c472fdf this fixes #2256
Without these two commits in place upgrades to 0.5.3 panics.
2017-01-31 10:51:32 -08:00
Alex Dadgar
8196a58c4c
Rename dispatch_input to dispatch_payload
2017-01-25 21:27:44 -08:00
Michael Schurter
054ee8df59
Fix index we get allocs by
2017-01-20 16:30:40 -08:00
Michael Schurter
c5f222e4a6
Update created resources before exiting cleanup
2017-01-19 16:48:23 -08:00
Michael Schurter
a93d43a9cf
Exit early when cleanup succeeds
2017-01-19 15:07:01 -08:00
Michael Schurter
85f68aa00c
Fix incorrect lock usage
2017-01-19 11:39:18 -08:00
Michael Schurter
a3a3656dbb
Switch to use recoverable errors from Cleanup
...
TaskRunner handles retrying but Cleanup handles all of CreatedResources.
2017-01-13 16:46:08 -08:00
Michael Schurter
dc68aa1a5a
Return errors from cleanup and let TaskRunner retry
2017-01-12 17:21:54 -08:00
Michael Schurter
4d081490e6
Add Cleanup method to Driver interface
...
Cleanup can be used for cleaning up resources created by drivers to run
a task. Initially the Docker driver is the only user (to remove
downloaded images).
2017-01-11 17:23:33 -08:00
Alex Dadgar
4127a3ea9d
Merge pull request #2173 from hashicorp/b-stats
...
Don't retrieve Driver Stats if unsupported
2017-01-10 13:32:03 -08:00
Alex Dadgar
2be221d664
Don't retrieve Driver Stats if unsupported
...
This PR makes us only try to collect stats once if the Driver doesn't
support collecting stats.
Fixes https://github.com/hashicorp/nomad/issues/1986
2017-01-09 13:47:06 -08:00
Alex Dadgar
26e2c5bb74
Merge pull request #2164 from hashicorp/b-dispatch
...
Create Task directory structure in the Run method
2017-01-09 11:24:46 -08:00
Alex Dadgar
2a5fd85e3b
Move to Run()
2017-01-08 13:55:12 -08:00
Alex Dadgar
2affef2972
Create task directory during Prestart()
2017-01-08 13:55:12 -08:00
Alex Dadgar
4ffd9a69e5
Send Driver events to servers immediately
...
This PR causes driver events to be sent to the server immediately rather
than waiting for Prestart() to finish.
2017-01-08 13:54:43 -08:00
Michael Schurter
3ea09ba16a
Move chroot building into TaskRunner
...
* Refactor AllocDir to have a TaskDir struct per task.
* Drivers expose filesystem isolation preference
* Fix lxc mounting of `secrets/`
2017-01-05 16:31:49 -08:00
Alex Dadgar
8d5f0fea69
Merge pull request #2128 from hashicorp/f-dispatch
...
Nomad Constructor Jobs and Dispatch
2017-01-06 05:22:49 +08:00
Michael Schurter
ea92cd102a
Append host env vars on every task env
2016-12-20 12:24:24 -08:00
Michael Schurter
2aa235f8f2
Rename InitializationMessage to DriverMessage
2016-12-20 11:51:09 -08:00
Alex Dadgar
159c819e08
Client writes payload to disk
2016-12-16 15:11:56 -08:00
Michael Schurter
770ed703d0
Add Driver.Prestart method
...
The Driver.Prestart method currently does very little but lays the
foundation for where lifecycle plugins can interleave execution _after_
task environment setup but _before_ the task starts.
Currently Prestart does two things:
* Any driver specific task environment building
* Download Docker images
This change also attaches a TaskEvent emitter to Drivers, so they can
emit events during task initialization.
2016-12-02 11:03:48 -08:00
Alex Dadgar
960424f086
Merge pull request #1941 from hashicorp/b-complete-transistion
...
Task state "dead" is terminal
2016-11-04 17:16:10 -07:00
Alex Dadgar
e6465e138b
More precise marking of dead
2016-11-04 17:11:07 -07:00
Alex Dadgar
0fb7742c3c
Task state "dead" is terminal
2016-11-04 16:57:24 -07:00
Alex Dadgar
8b7adb20e9
Fix tests
2016-11-04 15:10:18 -07:00
Alex Dadgar
4e8d39d674
Unique task
2016-11-04 14:53:37 -07:00
Alex Dadgar
4741a4b129
Create container much more robust
2016-11-04 14:39:56 -07:00
Alex Dadgar
cd1791ed09
Download artifacts before templates
2016-10-31 11:29:26 -07:00
Alex Dadgar
6618f7a03d
Fix passing of recoverable error from docker pull
2016-10-28 17:49:46 -07:00
Alex Dadgar
fde7a24865
Consul-template fixes + PreviousAlloc in api
2016-10-28 15:50:35 -07:00
Alex Dadgar
4082732d3a
Interpolate and then validate services
2016-10-25 14:27:49 -07:00
Alex Dadgar
da8b05ba17
Fix merge
2016-10-24 17:04:10 -07:00
Alex Dadgar
03eba049ed
Merge pull request #1848 from hashicorp/f-vault-error
...
Thread through whether DeriveToken error is recoverable or not
2016-10-24 15:01:18 -07:00
Alex Dadgar
ede3a814ba
Small fixes
2016-10-22 18:20:50 -07:00
Alex Dadgar
0070178741
Thread through whether DeriveToken error is recoverable or not
2016-10-22 18:08:30 -07:00
Alex Dadgar
46a7d1a0d7
Change how we mark tasks as failed and allow consul-template to fail tasks
2016-10-20 17:27:16 -07:00
Alex Dadgar
b384bff053
Feedback
2016-10-18 15:01:04 -07:00
Alex Dadgar
ba0b3963ef
Comments
2016-10-18 11:36:04 -07:00
Alex Dadgar
4f8bfd7b18
Tests
2016-10-18 11:24:20 -07:00
Alex Dadgar
36cfe6e89e
Large refactor of task runner and Vault token rehandling
2016-10-18 11:24:20 -07:00
Alex Dadgar
53eeec9bc1
Merge pull request #1801 from hashicorp/f-signals
...
Consul-template signal change mode
2016-10-18 11:23:47 -07:00
Ben Barnard
83f647ed84
Replace "the the" with "the" in documentation and comments
2016-10-11 15:31:40 -04:00
Alex Dadgar
bc35eaee21
Task runner sends signals
2016-10-10 15:09:00 -07:00
Alex Dadgar
e2d49eb4a2
Comments
2016-10-06 15:21:59 -07:00
Alex Dadgar
68c5fe78f8
Tests
2016-10-06 15:17:34 -07:00
Alex Dadgar
8fb07bb083
Fix handling of restart in TaskEvents
2016-10-06 15:06:54 -07:00
Alex Dadgar
8eb7fa91cf
Start of integration
2016-10-06 15:05:49 -07:00
Alex Dadgar
50efdb00e9
Merge pull request #1713 from hashicorp/f-alloc-runner-vault
...
Vault integration in client
2016-09-20 16:15:55 -07:00
Diptanu Choudhury
f7a9b39e8c
Ensuring that we are not emitting stats when handle is nil ( #1723 )
...
* Ensuring that we are not emitting stats when handle is nil
* Updated the changelog
2016-09-20 11:29:34 -07:00
Alex Dadgar
ec152a6d12
Clean up vault client
2016-09-14 18:10:56 -07:00
Alex Dadgar
6702a29071
Vault token threaded
2016-09-14 13:30:01 -07:00
Michael Schurter
6cb6d9cdf1
Lock around saving state
...
Prevent interleaving state syncs as it could conceivably lead to
empty state files as per #1367
2016-09-02 16:07:06 -07:00
Vishal Nayak
b6b73545ea
Merge pull request #1606 from hashicorp/f-vault-client
...
VaultClient for Nomad client's interactions with Vault
2016-08-30 13:13:54 -04:00
Michael Schurter
d31f373a5b
Merge pull request #1653 from hashicorp/b-fix-artifact-retry
...
Don't fail other tasks when retrying artifact get
2016-08-26 09:53:39 -07:00
Michael Schurter
5ce26f82fe
Don't fail other tasks when retrying artifact get
...
The artifact fetching may be retried and succeed, so don't set the task
as dead.
Fixes #1558
2016-08-25 13:16:41 -07:00
Ivo Verberk
9113244131
Don't duplicate TaskKilled event and check for TaskSiblingFailed.
2016-08-25 20:11:10 +02:00
vishalnayak
56e42cf03d
Employ DeriveVaultToken API and flesh-up DeriveToken
2016-08-24 12:29:59 -04:00
Alex Dadgar
1da8566322
Merge pull request #1580 from hashicorp/f-disk-usage-monitoring
...
Monitor and enforce shared allocation directory disk usage
2016-08-23 09:49:53 -07:00
Diptanu Choudhury
4ca623bcfe
blocking chained allocations until previous allocation hasn't terminated
2016-08-22 11:34:24 -05:00
Ivo Verberk
2a17895a83
Disk resource monitoring and enforcement
2016-08-18 07:59:03 +02:00
Diptanu Choudhury
28b3f511e0
Fixed some error messages
2016-08-10 15:17:32 -07:00
Kenjiro Nakayama
5c621b74e5
tiny: Return fmt.Errorf instead of duplicated error messages
2016-08-09 08:57:26 +09:00
Diptanu Choudhury
70d2f8ef1d
Merge pull request #1534 from nak3/fix-intask_runner
...
tiny: print task name and error message for SaveState error
2016-08-08 13:37:25 -04:00
Kenjiro Nakayama
e7863ea8ee
tiny: print task name and error message for the SaveState error in task_runner
2016-08-07 13:33:58 +09:00
Kenjiro Nakayama
60b58eed84
Update GetArtifact by removing unused logger
2016-08-06 23:37:32 +09:00
Diptanu Choudhury
41b540fbc8
Allow operators to opt into publishing node and alloc metrics
2016-08-01 19:52:20 -07:00
Alex Dadgar
90748cedad
Add killing event and mark task as not running when killed
2016-07-21 15:49:54 -07:00
Alex Dadgar
c35b1be845
Set running when restoring
2016-06-28 13:47:59 -07:00
Diptanu Choudhury
88ac1b33a4
Not emitting per-pid stats and added the total ticks consumed by a Task
2016-06-20 17:30:25 -07:00
Alex Dadgar
fe588a2469
Guard against restoring a nil task in task_runner
2016-06-16 11:55:40 -07:00
Alex Dadgar
fdda90229f
only support latest and remove ring buffer
2016-06-12 09:32:38 -07:00
Alex Dadgar
e952540f6f
Allocation resources returned in a struct
2016-06-11 21:04:10 -07:00
Diptanu Choudhury
fd60cfd585
Emitting client resource usage metrics as guages instead of k/v pairs
2016-06-11 22:17:32 +02:00
Alex Dadgar
b7e3a45fef
fix channel being nil on restore
2016-06-07 15:03:08 -07:00
Diptanu Choudhury
c21d606ebb
Getting inodes used percent back
2016-06-06 16:10:34 -07:00
Alex Dadgar
ba1a92eb8c
Handle errors during stats collection
2016-06-03 14:23:18 -07:00
Diptanu Choudhury
667b478f3f
Merge pull request #1226 from hashicorp/f-push-stats
...
Push Resource Usage stats to remote sinks
2016-06-02 23:14:59 +02:00
Diptanu Choudhury
35e31c1b81
Enqueing metrics only if they are not nil
2016-06-02 17:14:15 -04:00
Diptanu Choudhury
7efde782fa
Sending metrics for tasks as well
2016-06-01 16:42:16 +02:00
Alex Dadgar
4e15611339
fix wait result being nil and some panics in the cli
2016-05-31 23:09:05 +00:00