166 Commits

Author SHA1 Message Date
Alex Dadgar c52000f792 FinishedAt only records when the task has actually started 2017-03-31 17:06:05 -07:00
Alex Dadgar 81b78f77e1 Track task start/finish time & improve logs errors
This PR adds tracking to when a task starts and finishes and the logs
API takes advantage of this and returns better errors when asking for
logs that do not exist.
2017-03-31 16:14:11 -07:00
Alex Dadgar 8238a8601e Address comment 2017-03-09 21:05:34 -08:00
Alex Dadgar 9011a7984c Add metrics to show allocations on the client
This PR adds the following metrics to the client:
client.allocations.migrating
client.allocations.blocked
client.allocations.pending
client.allocations.running
client.allocations.terminal

Also adds some missing fields to the API version of the evaluation.
2017-03-09 12:37:41 -08:00
Alex Dadgar 5be806a3df Fix vet script and fix vet problems
This PR fixes our vet script and fixes all the missed vet changes.

It also fixes pointers being printed in `nomad stop <job>` and `nomad
node-status <node>`.
2017-02-27 16:00:19 -08:00
Alex Dadgar 238b4bcafd Add Leader support to client 2017-02-10 17:55:19 -08:00
Michael Schurter acd11f678d Add COMPAT comment 2017-01-06 11:39:17 -08:00
Michael Schurter 86fcf96f72 Put a logger in AllocDir/TaskDir 2017-01-05 16:31:56 -08:00
Michael Schurter 5a6bd19eb7 Fix upgrade path for #2132
AllocRunner's state dropped the Context struct which needs to be
converted to the new AllocDir+TaskDir structs in RestoreState.

TaskRunner added a TaskDirBuilt flag, but it's safe to just let that
default to `false` and rebuild all task dirs once on upgrade.
2017-01-05 16:31:55 -08:00
Michael Schurter 774afd8800 Fail fast on taskdir errors 2017-01-05 16:31:55 -08:00
Michael Schurter 3ea09ba16a Move chroot building into TaskRunner
* Refactor AllocDir to have a TaskDir struct per task.
* Drivers expose filesystem isolation preference
* Fix lxc mounting of `secrets/`
2017-01-05 16:31:49 -08:00
Bastiaan Bakker 2c864172eb use snap.Alloc.TaskStates only after confirming snap.Alloc is not nil 2016-11-07 22:35:00 +01:00
Alex Dadgar 0fb7742c3c Task state "dead" is terminal 2016-11-04 16:57:24 -07:00
Alex Dadgar e85d0ebace Merge pull request #1840 from hashicorp/f-kill-fail
Change how we mark tasks as failed and allow consul-template to fail tasks
2016-10-24 13:40:52 -07:00
Michael Schurter 285e80ac0f Remove disk usage enforcement
Many thanks to @iverberk for the original PR (#1609), but we ended up
not wanting to ship this implementation with 0.5.

We'll come back to it after 0.5 and hopefully find a way to leverage
filesystem accounting and quotas, so we can skip the expensive polling.
2016-10-21 13:55:51 -07:00
Alex Dadgar 46a7d1a0d7 Change how we mark tasks as failed and allow consul-template to fail tasks 2016-10-20 17:27:16 -07:00
Alex Dadgar 36cfe6e89e Large refactor of task runner and Vault token rehandling 2016-10-18 11:24:20 -07:00
Diptanu Choudhury d50c395421 Getting snapshot of allocation from remote node (#1741)
* Added the alloc dir move

* Moving allocdirs when starting allocations

* Added the migrate flag to ephemeral disk

* Stopping migration if the allocation doesn't need migration any more

* Added the GetAllocDir method

* refactored code

* Added a test for alloc runner

* Incorporated review comments
2016-10-03 09:59:57 -07:00
Alex Dadgar 50efdb00e9 Merge pull request #1713 from hashicorp/f-alloc-runner-vault
Vault integration in client
2016-09-20 16:15:55 -07:00
Alex Dadgar 83905075e5 Fix comment 2016-09-17 11:31:17 -07:00
Alex Dadgar 0f40bd41a3 Handle recovery failure 2016-09-15 12:50:44 -07:00
Alex Dadgar 688e616200 Fix token renewal 2016-09-15 11:20:51 -07:00
Alex Dadgar ec152a6d12 Clean up vault client 2016-09-14 18:10:56 -07:00
Alex Dadgar 6702a29071 Vault token threaded 2016-09-14 13:30:01 -07:00
Michael Schurter cd8606b9e3 Revert "A nil context isn't an error"
This reverts commit fe9fe4c26259c1ad3bd7e94bd711418aaf819b20.
2016-09-12 12:56:12 -07:00
Michael Schurter 8a57913a44 A nil context isn't an error 2016-09-02 16:24:53 -07:00
Michael Schurter f601361d58 Don't serialize task states twice in state files 2016-09-02 16:07:06 -07:00
Michael Schurter 6cb6d9cdf1 Lock around saving state
Prevent interleaving state syncs as it could conceivably lead to
empty state files as per #1367
2016-09-02 16:07:06 -07:00
Michael Schurter e7dd443447 Add sanity check to SaveState
Also just reuse the task states snapshot taken by `Alloc()` instead of
doing a redundant copy.
2016-09-02 16:07:06 -07:00
Alex Dadgar 2c8dd8bbd3 Revert "Introduce a Secret/ directory" 2016-09-01 17:23:15 -07:00
Alex Dadgar 5d3b47e648 Address comments and reserve 2016-08-31 18:11:02 -07:00
Alex Dadgar d59e14eed4 Interface + tests 2016-08-30 21:40:32 -07:00
Alex Dadgar 14b7126511 Secret dir, hello world 2016-08-29 15:41:52 -07:00
Ivo Verberk 2a17895a83 Disk resource monitoring and enforcement 2016-08-18 07:59:03 +02:00
Diptanu Choudhury 28b3f511e0 Fixed some error messages 2016-08-10 15:17:32 -07:00
Kenjiro Nakayama 6a810e6f1e Update after review 2016-08-09 08:57:26 +09:00
Kenjiro Nakayama 5c621b74e5 tiny: Return fmt.Errorf instead of duplicated error messages 2016-08-09 08:57:26 +09:00
Alex Dadgar 898435d372 Retrieve task runners in helper 2016-07-21 13:41:01 -07:00
Alex Dadgar 7b83503596 finer grain locking 2016-06-20 10:19:06 -07:00
Alex Dadgar 744270590b Guard against bad restore 2016-06-17 14:58:53 -07:00
Alex Dadgar fdda90229f only support latest and remove ring buffer 2016-06-12 09:32:38 -07:00
Alex Dadgar e952540f6f Allocation resources returned in a struct 2016-06-11 21:04:10 -07:00
Diptanu Choudhury a64062d6a6 Fixed the compilation on linux 2016-05-28 19:59:20 -07:00
Diptanu Choudhury 0b0d0764e4 Changed signature of Allocation Stats Reporter 2016-05-28 19:59:20 -07:00
Diptanu Choudhury 666b419dba Acquiring locks before iterating allocations and tasks 2016-05-28 19:59:20 -07:00
Diptanu Choudhury 91d2cf319e Added some documentation 2016-05-28 19:42:34 -07:00
Diptanu Choudhury f3d0aecafe Reporting time series of stats 2016-05-28 19:42:34 -07:00
Diptanu Choudhury 0fb0e0237f Added a client API to display resource usage of an allocation 2016-05-28 19:42:34 -07:00
Sean Chittenden dc28ab0cb5 Speling police 2016-05-15 09:41:34 -07:00
Alex Dadgar f64f03f87e Test task failure killing TG and fix setting the task as received on a restore 2016-03-25 12:51:40 -07:00
Alex Dadgar dced530c7c kill tasks in alloc when one fails 2016-03-25 12:50:25 -07:00
Alex Dadgar b80e61a66c Merge pull request #975 from hashicorp/f-rename-complete-alloc
Successful allocations are marked as complete instead of dead
2016-03-25 10:35:11 -07:00
Alex Dadgar 94522e7bed Successful allocations are marked as complete instead of dead 2016-03-23 18:08:19 -07:00
Diptanu Choudhury f6a932194f Removing references to old consul services and adding consul config to executor context 2016-03-23 12:19:19 -07:00
Alex Dadgar ad92e50a24 Avoid serializes Allocation.Resources 2016-03-01 14:09:25 -08:00
Alex Dadgar 61972c9ddc Refactor task runner to include driver starting into restart policy and add recoverable errors 2016-02-28 16:56:05 -08:00
Diptanu Choudhury e3d6c4a9dd Adding version information to snapshots 2016-02-24 19:06:30 -08:00
Alex Dadgar 51bacf674e address feedback 2016-02-21 21:32:32 -08:00
Alex Dadgar 281e2ca198 Batch client allocation updates to the server 2016-02-21 21:15:02 -08:00
Alex Dadgar 13e5597ca2 Reduce alloc lock contention in client 2016-02-19 19:51:55 -08:00
Alex Dadgar 99d2c173ff import 2016-02-19 16:31:04 -08:00
Alex Dadgar 2706aa2100 Better comment 2016-02-19 16:02:48 -08:00
Alex Dadgar d1011c9668 Fixes 2016-02-19 15:49:32 -08:00
Alex Dadgar e2a4c4ccc5 Client stores when it receives a task 2016-02-19 14:49:43 -08:00
Alex Dadgar 96fd272422 Increase Alloc channel buffers 2016-02-18 20:43:48 -08:00
Alex Dadgar f3d5598830 Unlock in error path 2016-02-11 08:38:16 -08:00
Alex Dadgar 4d7ed4f164 Strip as much copystructure as possible 2016-02-10 17:54:43 -08:00
Alex Dadgar 0c4c3fc4ee safe but slow 2016-02-10 13:44:53 -08:00
Alex Dadgar c744e2f4f1 Update the consul service when the task/alloc changes 2016-02-06 17:08:20 -08:00
Alex Dadgar e8067029cc Small fixes 2016-02-04 14:19:27 -08:00
Alex Dadgar 117bef6515 Fix AllocRunner not capturing destroy signal and tests 2016-02-04 13:09:53 -08:00
Alex Dadgar 41e1174f72 Client handles updates to KillTimeout and Restart Policy 2016-02-03 19:43:44 -08:00
Alex Dadgar b6f9e9c61c Move restart tracker creation into task runner 2016-02-03 16:16:48 -08:00
Alex Dadgar 6f20d3f435 Restart on-success shouldn't be user specifiable 2016-02-02 17:35:06 -08:00
Alex Dadgar cf1e152f44 Clean interaction between alloc-runner and task-runner 2016-02-02 11:09:29 -08:00
Alex Dadgar a72d39bd04 Don't share task state with the alloc in the task runner 2016-02-01 17:47:53 -08:00
Alex Dadgar b5260fc14e Fix locks and use task runners state not alloc state 2016-02-01 15:43:59 -08:00
Alex Dadgar 2d98c0eadd Fix double pull with introduction of AllocModifyIndex 2016-02-01 15:43:59 -08:00
Alex Dadgar a5e9e2068c Make NewRestartTracker private 2015-12-18 12:17:54 -08:00
Alex Dadgar 31c3e12957 merge 2015-12-18 12:17:13 -08:00
Diptanu Choudhury 76486d71e2 Making the allocs hold service ids 2015-12-14 15:08:35 -08:00
Diptanu Choudhury 135006699b Renamed consul client to service 2015-11-24 12:34:26 -08:00
Diptanu Choudhury 33472fd594 Making sure we sync the real state of services of a Task with Consul 2015-11-18 18:35:22 -08:00
Diptanu Choudhury 404810043a Added the implementation of consul client 2015-11-18 00:50:45 -08:00
Alex Dadgar b649039448 Fix the capacity 2015-11-16 15:14:21 -08:00
Alex Dadgar 82f51601db Track Task State in the client and capture Wait results 2015-11-16 15:14:21 -08:00
Diptanu Choudhury 3b4cb6dbc9 Saving state of the Task Runner while it's trying to update it 2015-11-12 15:53:42 -08:00
Alex Dadgar d3e2455459 Merge pull request #408 from hashicorp/f-client-restore
Client Restore State Fixes
2015-11-11 12:32:11 -08:00
Diptanu Choudhury 210f744503 Returning back if alloc runner isn't saved 2015-11-09 17:04:53 -08:00
Diptanu Choudhury 4ba5434898 Exctracted a method to save the alloc runner state 2015-11-09 16:59:02 -08:00
Diptanu Choudhury c618d7ed58 Inlining error check 2015-11-09 16:45:42 -08:00
Diptanu Choudhury 38a047ec6d Refactored Save State of Alloc runner 2015-11-09 16:15:11 -08:00
Alex Dadgar 19d0c97da7 Client restores state properly 2015-11-09 15:55:31 -08:00
Diptanu Choudhury 5ef34f7560 Returning once we have an error while saving a task that doesn't exist 2015-11-09 12:44:14 -08:00
Diptanu Choudhury 0252b49c17 Updating snapshots of a TaskRunner when status of Task changes 2015-11-09 12:36:07 -08:00
Diptanu Choudhury b035661d7d Passing in the AllocID to exec context so that it can be used in the drivers 2015-11-05 18:34:04 -08:00
Diptanu Choudhury 3d5e02b3d7 Fixed some tests and refactored logic 2015-11-05 17:30:41 -08:00
Diptanu Choudhury 44569d908f Passing restart tracker in the task runner 2015-11-05 16:38:19 -08:00
Diptanu Choudhury 86be2bf0be Cleaned up the logic to calculate restart duration 2015-11-05 15:16:29 -08:00
Diptanu Choudhury b64ed61022 Setting the restart policy to AllocRunner and Task Runners 2015-11-05 11:13:04 -08:00