Commit Graph

2133 Commits

Author SHA1 Message Date
Michael Schurter f94210b4bc Fix image based drivers having host env vars set
Add detailed tests for GetTaskEnv to avoid this issue happening again!

Fixes #2211
2017-01-18 10:27:03 -08:00
Michael Schurter 578272b7f2 Add CreatedResources.Remove and use it 2017-01-17 16:41:59 -08:00
Michael Schurter 1bcf7cdbfe Remove outdated comment 2017-01-17 16:23:29 -08:00
Michael Schurter 82b49d4547 Updated CreatedResources as images are cleaned 2017-01-17 16:13:40 -08:00
Michael Schurter beed31ff6f Remove outdated comment 2017-01-17 16:05:21 -08:00
Michael Schurter b9d6d2c8d6 Return error from Prestart 2017-01-17 16:04:09 -08:00
Michael Schurter ea87091e58 Prevent race between alloc runners
Block ar1's periodic syncing which could recreate the state file ar2 was
destroying.
2017-01-17 13:10:20 -08:00
Michael Schurter 15952e5d17 Try to get test passing in Travis 2017-01-17 12:51:19 -08:00
Michael Schurter 255698e8af Use Image ID instead of Image Name 2017-01-13 16:53:58 -08:00
Michael Schurter a3a3656dbb Switch to use recoverable errors from Cleanup
TaskRunner handles retrying but Cleanup handles all of CreatedResources.
2017-01-13 16:46:08 -08:00
Alex Dadgar 78deb8b292 Support setting class_path and class name.
This PR enhances the java driver to allow setting the class path and
class name to run. It also fixes an issue that would make the Java
driver attempt to chroot regardless of operating system (this never
affected a released version of Nomad).
2017-01-13 16:03:11 -08:00
Michael Schurter c90cd0d874 Stop trying to use mount for image based drivers
Fixes #2178 and allows using Docker and other image based drivers even
when nomad is run as a non-root user.

`client/allocdir` tests can be run as a non-root user to ensure this
behavior and tests that rely on root or non-root users properly detect
their effective user and skip instead of fail.
2017-01-13 13:04:12 -08:00
Michael Schurter 25bf266606 Add ID to output 2017-01-13 12:46:55 -08:00
Michael Schurter dc68aa1a5a Return errors from cleanup and let TaskRunner retry 2017-01-12 17:21:54 -08:00
Diptanu Choudhury 6809a4b104 Added executorconfig 2017-01-12 15:47:58 -08:00
Diptanu Choudhury b1d0078db5 Filter executor log messages 2017-01-12 11:54:19 -08:00
Michael Schurter ec81325ddc Stop being so confusing 2017-01-12 11:17:35 -08:00
Michael Schurter 4d081490e6 Add Cleanup method to Driver interface
Cleanup can be used for cleaning up resources created by drivers to run
a task. Initially the Docker driver is the only user (to remove
downloaded images).
2017-01-11 17:23:33 -08:00
Alex Dadgar aafb9ca8b2 Merge pull request #2177 from hashicorp/b-blocking-getallocs
GetAllocs uses a blocking query
2017-01-11 13:24:32 -08:00
Alex Dadgar 5d2b56b387 Random wait 2017-01-11 13:24:23 -08:00
Cameron Davison c910f9b304 using new ctx instead of getting both params back 2017-01-10 16:54:01 -06:00
Cameron Davison 7ccbd8a000 fixing typo in comment 2017-01-10 16:54:01 -06:00
Cameron Davison 88a462d5b9 add force_pull to docker driver 2017-01-10 16:54:01 -06:00
Alex Dadgar cdc08bbd22 Merge pull request #2179 from hashicorp/b-panic
Fix nil dereference
2017-01-10 14:15:17 -08:00
Alex Dadgar bb329977a4 Fix nil dereference 2017-01-10 14:14:58 -08:00
Michael Schurter 0b7f8163d2 Merge pull request #2174 from hashicorp/b-fix-executor-test
Switch to a less timing dependent test command
2017-01-10 13:45:12 -08:00
Michael Schurter 1347a941b0 Fix missing value in test failure message 2017-01-10 13:39:05 -08:00
Alex Dadgar 4127a3ea9d Merge pull request #2173 from hashicorp/b-stats
Don't retrieve Driver Stats if unsupported
2017-01-10 13:32:03 -08:00
Alex Dadgar c19985244a GetAllocs uses a blocking query
This PR makes GetAllocs use a blocking query as well as adding a sanity
check to the client's watchAllocation code to ensure it gets the correct
allocations.

This PR fixes https://github.com/hashicorp/nomad/issues/2119 and
https://github.com/hashicorp/nomad/issues/2153.

The issue was that the client was talking to two different servers, one
to check which allocations to pull and the other to pull those
allocations. However, the latter call did not use a blocking query, so
the client would not retrieve the allocations it requested.

The logging has been improved to make the problem more clear as well.
2017-01-10 13:30:35 -08:00
Michael Schurter 7462379086 Switch to a less timing dependent test command
`/usr/bin/yes` could produce output very quickly (100s of MBps on my
laptop) and therefore could cause log files to roll over.

A bash loop with a sleep avoids that issue. The test is slower but
should be much more resilient to the massive timing differences between
workstations and Travis.
2017-01-09 15:40:53 -08:00
Alex Dadgar 2be221d664 Don't retrieve Driver Stats if unsupported
This PR makes us only try to collect stats once if the Driver doesn't
support collecting stats.

Fixes https://github.com/hashicorp/nomad/issues/1986
2017-01-09 13:47:06 -08:00
Alex Dadgar 26e2c5bb74 Merge pull request #2164 from hashicorp/b-dispatch
Create Task directory structure in the Run method
2017-01-09 11:24:46 -08:00
Alex Dadgar 2a5fd85e3b Move to Run() 2017-01-08 13:55:12 -08:00
Alex Dadgar 2affef2972 Create task directory during Prestart() 2017-01-08 13:55:12 -08:00
Alex Dadgar 4ffd9a69e5 Send Driver events to servers immediately
This PR causes driver events to be sent to the server immediately rather
than waiting for Prestart() to finish.
2017-01-08 13:54:43 -08:00
Alex Dadgar 724edb6659 Fix fingerprint tests 2017-01-08 13:53:27 -08:00
Diptanu Choudhury eb123416c5 Fixed namespacing for the cpu arch 2017-01-06 14:23:22 -08:00
Michael Schurter 65fb580216 Fix inconsistent task env setting
Consolidate task environment building in GetTaskEnv since it can
determine what kind of filesystem isolation is used.

This means drivers no longer have to manipulate task environment paths.
2017-01-06 12:19:32 -08:00
Michael Schurter d3270799f0 Fix executor tests 2017-01-06 11:39:18 -08:00
Michael Schurter acd11f678d Add COMPAT comment 2017-01-06 11:39:17 -08:00
Michael Schurter e203928d64 Driver is now required in test tasks 2017-01-06 11:39:17 -08:00
Michael Schurter 90f6ac7490 Fix tests post rebase 2017-01-06 11:39:13 -08:00
Michael Schurter 579f378bee Remove debug logging 2017-01-05 16:31:56 -08:00
Michael Schurter baf6f078d6 Remove task name prefix from executor logs 2017-01-05 16:31:56 -08:00
Michael Schurter 86fcf96f72 Put a logger in AllocDir/TaskDir 2017-01-05 16:31:56 -08:00
Michael Schurter f43d3f074a Add comments to TaskDir 2017-01-05 16:31:55 -08:00
Michael Schurter 5a6bd19eb7 Fix upgrade path for #2132
AllocRunner's state dropped the Context struct which needs to be
converted to the new AllocDir+TaskDir structs in RestoreState.

TaskRunner added a TaskDirBuilt flag, but it's safe to just let that
default to `false` and rebuild all task dirs once on upgrade.
2017-01-05 16:31:55 -08:00
Michael Schurter 774afd8800 Fail fast on taskdir errors 2017-01-05 16:31:55 -08:00
Michael Schurter 7260d0bca3 Test tasks now require driver name 2017-01-05 16:31:55 -08:00
Michael Schurter 3ea09ba16a Move chroot building into TaskRunner
* Refactor AllocDir to have a TaskDir struct per task.
* Drivers expose filesystem isolation preference
* Fix lxc mounting of `secrets/`
2017-01-05 16:31:49 -08:00
Alex Dadgar 8d5f0fea69 Merge pull request #2128 from hashicorp/f-dispatch
Nomad Constructor Jobs and Dispatch
2017-01-06 05:22:49 +08:00
Alex Dadgar 34fc25757e Merge pull request #2157 from hashicorp/t-client-tests
Fix client tests deadlocking
2017-01-06 05:21:05 +08:00
Alex Dadgar a29f253a12 use helper 2017-01-05 13:19:01 -08:00
Diptanu Choudhury 247bda9a88 Unlocking if we return before adding a new alloc runner 2017-01-05 13:18:48 -08:00
Alex Dadgar ee523062d1 Fix TestClient_BlockedAllocations 2017-01-05 13:15:08 -08:00
Diptanu Choudhury 9721a1ab04 Fixed how alloc lock is held 2017-01-05 13:06:56 -08:00
Alex Dadgar 205caf341f Fix SaveRestoreState 2017-01-05 12:32:44 -08:00
Michael Schurter 13064768ac Fix race when shutting down in dev mode
Client.Shutdown holds the allocLock when destroying alloc runners in dev
mode.

Client.updateAllocStatus can be called during AllocRunner shutdown and
calls getAllocRunners which tries to acquire allocLock.RLock. This
deadlocks since Client.Shutdown already has the write lock.

Switching Client.Shutdown to use getAllocRunners and not hold a lock
during AllocRunner shutdown is the solution.
2017-01-03 17:21:50 -08:00
Michael Schurter 4a9a574d9d Merge pull request #2054 from hashicorp/f-prestart
Add Driver.Prestart method
2016-12-20 16:18:56 -08:00
Michael Schurter 8e1ae14feb Remove unneeded env building 2016-12-20 16:14:42 -08:00
Michael Schurter 39f587a2af Fix tests broken by TaskEnv change 2016-12-20 14:37:35 -08:00
Michael Schurter 0d90e96925 lxc: Set image local env vars 2016-12-20 14:37:18 -08:00
Michael Schurter 05b49008eb Remove unneeded waitClient field 2016-12-20 14:29:57 -08:00
Michael Schurter ea92cd102a Append host env vars on every task env 2016-12-20 12:24:24 -08:00
Michael Schurter 458c2ed5f1 Fix formatting of downloading image message 2016-12-20 11:57:26 -08:00
Michael Schurter e34d1e5d23 Use startContainer wrapper 2016-12-20 11:55:40 -08:00
Diptanu Choudhury 93091f7902 Fixed a test 2016-12-20 11:53:37 -08:00
Michael Schurter 2aa235f8f2 Rename InitializationMessage to DriverMessage 2016-12-20 11:51:09 -08:00
Michael Schurter 85b0cecff2 Emit "Downloading image" event 2016-12-20 11:40:34 -08:00
Diptanu Choudhury 6c11f38cb0 Merge pull request #2081 from hashicorp/f-gc
Garbage collector for allocations
2016-12-20 11:19:32 -08:00
Diptanu Choudhury b6120e2fc8 Removing the alloc runner from GC if it is destroyed by the server 2016-12-20 11:14:22 -08:00
Diptanu Choudhury 6e6e0d364a Added comments 2016-12-20 10:49:48 -08:00
Alex Dadgar 746d4c7ee3 Small cleanups 2016-12-19 14:22:08 -08:00
Alex Dadgar 18739a4433 Merge pull request #1980 from dmexe/network-aliases
Add network_aliases for docker driver
2016-12-19 14:17:48 -08:00
Alex Dadgar 7cdf24f05f Fix Docker Logging Type interpolation
This PR fixes an issue that prevented Logging.Type from being
interpolated in the docker driver.
2016-12-19 13:42:58 -08:00
Alex Dadgar 2f3aeed2f8 Merge pull request #2063 from tmichaud314/fix-docker-driver-auth-interpolation
Fixes docker-driver Auth-config interpolation
2016-12-19 13:41:27 -08:00
Diptanu Choudhury e072961cea Added tests 2016-12-19 13:21:47 -08:00
Alex Dadgar 4e8035756b Fix test and prevent job with payload from being submitted 2016-12-18 16:32:14 -08:00
Alex Dadgar 072ff1c3ee ensure file doesn't escape 2016-12-18 15:48:30 -08:00
Diptanu Choudhury 36b5545d6b Making the gc allocator understand real disk usage 2016-12-16 18:34:59 -08:00
Alex Dadgar 159c819e08 Client writes payload to disk 2016-12-16 15:11:56 -08:00
Alex Dadgar b1883daae8 Use new combined meta data function in env 2016-12-16 10:45:09 -08:00
Alex Dadgar 7778339f03 Fix mapstructure tag formatting for lxc driver 2016-12-16 10:24:17 -08:00
Diptanu Choudhury 7aef9bcabe Added the stats collector to GC 2016-12-14 15:11:11 -08:00
Diptanu Choudhury e855cd587b Refactored hoststats collector 2016-12-14 15:07:42 -08:00
Diptanu Choudhury 0ffd92668d GC-ing before we start a new allocation 2016-12-14 15:04:06 -08:00
Diptanu Choudhury afdaa979f7 Added a garbage collector for allocations 2016-12-14 15:01:12 -08:00
Alex Dadgar 648ad2ebc5 Merge pull request #2096 from hashicorp/b-addAlloc
Fix race and remove panic
2016-12-13 13:50:17 -08:00
Diptanu Choudhury 53fb09023c cancelling waiting for remote allocation if the alloc doesn't need migration 2016-12-13 13:06:33 -08:00
Alex Dadgar 3cbd237512 Fix race and remove panic 2016-12-13 12:34:23 -08:00
Christoffer Kylvåg 6a1f32b8ba #1680: Continue after not being able to stat a mountpoint 2016-12-13 12:28:57 +01:00
Tom Michaud d0c01c8816 Fixes docker-driver Auth-config interpolation 2016-12-06 13:30:23 -07:00
Diptanu Choudhury cbf73908ff Setting the appropriate file permissions while un-archiving compressed alloc dir 2016-12-05 17:04:43 -08:00
Diptanu Choudhury bc17cacca0 Merge pull request #2017 from hashicorp/b-sticky
Not moving alloc data when sticky is turned off
2016-12-05 14:11:45 -08:00
Diptanu Choudhury 21f49564d3 Not moving alloc data when sticky is turned off 2016-12-05 14:00:01 -08:00
Michael Schurter 770ed703d0 Add Driver.Prestart method
The Driver.Prestart method currently does very little but lays the
foundation for where lifecycle plugins can interleave execution _after_
task environment setup but _before_ the task starts.

Currently Prestart does two things:

* Any driver specific task environment building
* Download Docker images

This change also attaches a TaskEvent emitter to Drivers, so they can
emit events during task initialization.
2016-12-02 11:03:48 -08:00
Michael Schurter 1c4195b985 Fix string formatting 2016-12-01 11:22:51 -08:00
Alex Dadgar 86ed1fb2e5 Disallow stale queries when deriving Vault tokens
This PR disallows stale queries when deriving a Vault token. Allowing
stale queries could result in the allocation not existing on the server
that is servicing the request.
2016-12-01 11:13:36 -08:00
Alex Dadgar 70396c464b Make errors starting a container recoverable
This PR makes errors starting a container recoverable and tries to
optimistically handle 500 errors.
2016-11-30 15:59:47 -08:00
Diptanu Choudhury 6c179d1695 Merge pull request #2045 from hashicorp/b-docker-create-container
Returning a container if it exists instead of creating a new one
2016-11-29 17:55:33 -08:00
Diptanu Choudhury 50452520bf Returning a container if it exists instead of creating a new one 2016-11-29 17:52:19 -08:00
Michael Schurter e1d63f6c0f Bump timeout on test 2016-11-29 16:19:40 -08:00
Alex Dadgar ec4d6936ff add debug panic 2016-11-29 15:57:40 -08:00
Alex Dadgar 712e18707b add debugging 2016-11-29 14:29:37 -08:00
Diptanu Choudhury f67217297c Ensuring allocs are not added multiple times to blocking queue 2016-11-29 11:19:37 -08:00
Diptanu Choudhury bff172939b Fixes an issue with purging containers with the same name Nomad is trying to start 2016-11-28 17:37:22 -08:00
Michael Schurter 1f0bfa00aa rkt: Support host and none dns options
Fixes #2025
2016-11-28 13:13:40 -08:00
Michael Schurter 44e4414490 Fix rkt volumes
I forgot to validate the volumes field!
2016-11-28 13:13:40 -08:00
Alex Dadgar 4f2a6eae8b Merge pull request #2029 from gliptak/dockerauth1
Log when lookup in docker.auth.config fails
2016-11-28 12:45:19 -08:00
Alex Dadgar d8048ad75d Merge pull request #2033 from hashicorp/b-docker-container-exists
Make container exist errors non-retriable
2016-11-28 12:38:52 -08:00
Michael Schurter b3ede6a5b7 Use net.JoinHostPort instead of fmt.Sprintf
Using fmt.Sprintf breaks IPv6 addresses.
2016-11-28 10:38:54 -08:00
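
To illustrate why the commit above matters, here is a small sketch comparing the two approaches. The helper names `naiveAddr` and `safeAddr` and the port value are made up for this example; `net.JoinHostPort` is the standard library function, which wraps a host containing a colon in square brackets.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// naiveAddr builds an address with plain string formatting.
// For an IPv6 literal the result is ambiguous, because the host
// itself contains colons.
func naiveAddr(host string, port int) string {
	return fmt.Sprintf("%s:%d", host, port)
}

// safeAddr uses net.JoinHostPort, which brackets IPv6 literals.
func safeAddr(host string, port int) string {
	return net.JoinHostPort(host, strconv.Itoa(port))
}

func main() {
	fmt.Println(naiveAddr("::1", 4646))      // ::1:4646
	fmt.Println(safeAddr("::1", 4646))       // [::1]:4646
	fmt.Println(safeAddr("127.0.0.1", 4646)) // 127.0.0.1:4646
}
```

For IPv4 addresses and hostnames both helpers produce the same string, so the difference only shows up with IPv6.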
Alex Dadgar 8a641a8672 Make container exist errors non-retriable
This change stops the task runner from retrying container-exists
errors and adds a sleep to the local retry.
2016-11-25 19:22:58 -08:00
Gábor Lipták 6268112e86 Log when lookup in docker.auth.config fails 2016-11-23 18:43:58 -05:00
Ranjib Dey 0b29ad8787 Fix error message. Pass on template args 2016-11-21 20:12:59 -08:00
Dmitry Galinsky 3ec7ebac9c Add network_aliases for docker driver 2016-11-16 11:16:07 +03:00
Alex Dadgar 0f426d219a Merge pull request #1993 from hashicorp/b-upgrade-path
Check for Ephemeral Disk being nil
2016-11-15 16:27:48 -08:00
Alex Dadgar c2697123a9 Merge pull request #1996 from hashicorp/t-failing-tests
Fix some failing tests
2016-11-15 16:27:19 -08:00
Alex Dadgar 3e5bfcdbc4 respond to comment 2016-11-15 16:27:07 -08:00
Alex Dadgar c47ebd508e Remove old TODOs 2016-11-15 16:23:37 -08:00
Alex Dadgar cb187ffce6 Fix TestRktDriver_PortsMapping and TestAgent_LoadKeyrings 2016-11-15 15:49:05 -08:00
Alex Dadgar 9497991590 Updated AWS speeds and network_speed now overrides
This PR:

* Makes AWS network speeds more granular
* Makes `network_speed` an override and not a default
* Adds a default of 1000 MBits if no network link speed is detected.

Fixes #1985
2016-11-15 13:55:51 -08:00
Alex Dadgar 88c7e04348 Check for Ephemeral Disk being nil 2016-11-15 10:03:06 -08:00
Alex Dadgar eba98da487 Merge pull request #1977 from hashicorp/b-volume-mount
Change relative path from joining against the alloc dir to the task's directory.
2016-11-10 15:20:49 -08:00
Alex Dadgar a11d66f639 Remove todo 2016-11-10 15:20:19 -08:00
Alex Dadgar 74a736155c Always disable renew_token for CT config
This PR makes Nomad always disable token renewal even if Vault is
disabled. The problem was when there was a vault token in the
environment variable and Nomad/Vault integration was disabled, the
template runner would still try to renew the token.
2016-11-10 15:16:08 -08:00
Alex Dadgar eea35626b7 Changes the relative path from joining against the alloc dir to the
task's directory.

This PR changes the behavior when given a relative host path when
mounting docker containers. Prior to this, the behavior was to mount by
joining against the alloc/ directory. This PR changes it to be against
the task/ directory.
2016-11-10 14:47:54 -08:00
Alex Dadgar e8d6227b20 Do not validate the command does not contain spaces.
This PR removes validation that the command string does not contain
spaces. This can cause issues where the path contains a folder that
includes a space ("C:\Program Files\Python35\python.exe").

Fixes #1737
2016-11-10 10:22:17 -08:00
Alex Dadgar ee921ccbb2 Merge pull request #1949 from carlpett/blacklist-fingerprints-and-drivers
Support blacklisting fingerprinters
2016-11-09 10:31:17 -08:00
Calle Pettersson 4304755c12 Address comments from PR 2016-11-09 11:50:16 +01:00
Alex Dadgar fe9a200979 Merge pull request #1952 from hashicorp/b-reserved-ports-aws
Run environmental fingerprinters after host fingerprinters and AWS overrides network
2016-11-08 15:35:46 -08:00
Alex Dadgar 20a5b6fa6b Merge pull request #1965 from hashicorp/b-docker-interpolate
Interpolate all docker driver configs that are strings
2016-11-08 15:35:27 -08:00
Alex Dadgar 3b33f49cde Merge pull request #1966 from hashicorp/b-service-interpolate
Interpolate all service/check fields
2016-11-08 15:35:19 -08:00
Alex Dadgar f1689bc7f9 Rkt env var 2016-11-08 15:14:04 -08:00
Alex Dadgar ddf101d7a2 Interpolate all check related variables 2016-11-08 14:43:46 -08:00
Alex Dadgar 691e09f863 remove debug 2016-11-08 14:21:37 -08:00
Alex Dadgar 9f2c0cb0c2 Interpolate everything that is a string 2016-11-08 14:20:51 -08:00
Diptanu Choudhury e4fdb849f9 Merge pull request #1960 from hashicorp/fix-perm-issues
Fixed permission issues on client
2016-11-08 12:57:18 -08:00
Diptanu Choudhury d9f8e3a75a Fixed comments 2016-11-08 12:55:15 -08:00
Alex Dadgar 742e11ddb4 Fix env vars relating to secretdir 2016-11-08 12:28:43 -08:00
Diptanu Choudhury 2132fbb68a Fixed permission issues on client 2016-11-08 10:57:29 -08:00
Alex Dadgar 79e55a9797 Merge pull request #1954 from hashicorp/b-secret-id
Add compatibility code for secret ID while upgrading cluster in both …
2016-11-08 09:39:52 -08:00
Calle Pettersson 8632696e2d Add blacklisting of drivers 2016-11-08 18:30:07 +01:00
Calle Pettersson b603bb007e Add blacklisting of fingerprinters 2016-11-08 18:29:44 +01:00
Alex Dadgar 9015e79aaa Add compatibility code for secret ID while upgrading cluster in both server/client mode on single nodes 2016-11-07 16:52:08 -08:00
Bastiaan Bakker 2c864172eb use snap.Alloc.TaskStates only after confirming snap.Alloc is not nil 2016-11-07 22:35:00 +01:00
Alex Dadgar 46893c7558 Merge pull request #1921 from hashicorp/f-abs-templ
Allow absolute paths for template sources
2016-11-07 12:28:49 -08:00
Alex Dadgar 92f526d902 Run environmental fingerprinters after host fingerprinters and do an override 2016-11-07 12:21:50 -08:00
Alex Dadgar 960424f086 Merge pull request #1941 from hashicorp/b-complete-transistion
Task state "dead" is terminal
2016-11-04 17:16:10 -07:00
Alex Dadgar 3643534531 Test fix 2016-11-04 17:15:58 -07:00
Alex Dadgar a9e9b61216 Merge pull request #1938 from hashicorp/b-docker-reattach
Fix Docker container creation and task runner updating
2016-11-04 17:14:40 -07:00