Commit Graph

3051 Commits

Author SHA1 Message Date
Chelsea Holland Komlo f5ea83da8d emit metrics using labels, add option for backwards compatibility 2017-09-05 14:12:57 +00:00
Chelsea Holland Komlo 0175f80775 add metrics options to client config 2017-09-05 14:12:57 +00:00
Armon Dadgar b8bf35f087 ACL RPCs allow stale reads for scalability 2017-09-04 13:07:44 -07:00
Armon Dadgar f31cd6a618 client: fixing policy resolution after ACL endpoint enforcement 2017-09-04 13:05:53 -07:00
Armon Dadgar ddcc5f89bc Add ErrPermissionDenied, rename TokenNotFound 2017-09-04 13:05:53 -07:00
Armon Dadgar 76a03f2d8e Address @dadgar feedback 2017-09-04 13:05:53 -07:00
Armon Dadgar e3f32ca6f1 client: adding token resolution logic 2017-09-04 13:05:36 -07:00
Armon Dadgar 688897561b client: adding token cache for ACL resolution 2017-09-04 13:05:36 -07:00
Armon Dadgar c2e72e8a9c client: create ACL and Policy cache 2017-09-04 13:05:35 -07:00
Armon Dadgar 792f176a44 agent: thread ACL config to client 2017-09-04 13:04:45 -07:00
Clint Armstrong b5c2636313 Always purge stopped containers 2017-08-31 14:28:48 -04:00
Clint Armstrong 7e35ab6abb fix logging re-init 2017-08-30 12:36:31 -04:00
Michael Schurter 78823d559b Squelch logspam when unable to get disk usage stats
To reproduce logspam:

```
$ docker plugin install --grant-all-permissions vieux/sshfs
$ nomad agent -dev
...
2017/08/25 17:09:03.282868 [WARN] client: error fetching host disk usage stats for /var/lib/docker/plugins/a8b4a69b07e5180f828d19e1e9e102ccc0e26f9c9939eaef85357260c30b20a7/rootfs/mnt/volumes: permission denied
... repeats every collection period ...
```
2017-08-28 12:04:32 -07:00
Alex Dadgar 876732833f Merge pull request #3073 from clinta/docker-500
Allow retry of 500 API errors to be handled by restart policies
2017-08-24 16:57:36 -07:00
Alex Dadgar fd7d614ae4 Handle interfaces that only have link-local addrs
This PR changes the fingerprint handling of network interfaces that only
contain link local addresses. The new behavior is to prefer globally
routable addresses and if none are detected, to fall back to link local
addresses if the operator hasn't disallowed it. This gives us pre-0.6
behavior for interfaces with only link local addresses but 0.6+ behavior
for IPv6 interfaces that will always have a link-local address.

Fixes https://github.com/hashicorp/nomad/issues/3005

/cc diptanuc
2017-08-23 15:32:22 -07:00
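The fallback order described in the commit above can be sketched in Go. This is a minimal illustration, not Nomad's fingerprint code: `pickAddrs` and its `allowLinkLocal` parameter are hypothetical names, and the only library calls used are the standard `net.ParseIP`, `IP.IsLinkLocalUnicast`, and `IP.IsGlobalUnicast`.

```go
package main

import (
	"fmt"
	"net"
)

// pickAddrs is a hypothetical helper. It prefers globally routable
// addresses and falls back to link-local ones only when no global address
// exists and the operator has not disallowed the fallback.
func pickAddrs(addrs []string, allowLinkLocal bool) []string {
	var global, linkLocal []string
	for _, s := range addrs {
		ip := net.ParseIP(s)
		if ip == nil {
			continue // skip strings that are not IP addresses
		}
		switch {
		case ip.IsLinkLocalUnicast():
			linkLocal = append(linkLocal, s)
		case ip.IsGlobalUnicast():
			global = append(global, s)
		}
	}
	if len(global) > 0 {
		return global
	}
	if allowLinkLocal {
		return linkLocal
	}
	return nil
}

func main() {
	// An IPv6 interface always has a link-local address; the global one wins.
	fmt.Println(pickAddrs([]string{"fe80::1", "2001:db8::1"}, true))
	// An interface with only a link-local address falls back to it.
	fmt.Println(pickAddrs([]string{"169.254.10.20"}, true))
	// Unless the operator disallowed link-local addresses.
	fmt.Println(pickAddrs([]string{"169.254.10.20"}, false))
}
```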
Alex Dadgar 211a793530 resolve feedback 2017-08-23 14:17:00 -07:00
Alex Dadgar 653733e093 Clean up docker mounts 2017-08-22 14:12:44 -07:00
Clint Armstrong ae230395ba Allow retry of 500 API errors to be handled by restart policies 2017-08-22 14:04:46 -04:00
Michael Schurter 51a27cc83d Merge pull request #3031 from hashicorp/f-2924-consul-headers
Add Header and Method support for HTTP checks
2017-08-18 13:35:08 -07:00
Michael Schurter 7ebd429a86 Merge mistake made go fmt fail 2017-08-18 13:19:44 -07:00
Michael Schurter 5c015da3cb Merge pull request #3021 from clinta/docker-mount2
Expose docker mount options
2017-08-17 16:57:09 -07:00
Michael Schurter ff3944a981 Update and test service/check interpolation 2017-08-17 16:49:14 -07:00
Michael Schurter b4813747d0 Merge pull request #3043 from hashicorp/f-2441-shutdown-delay
Add optional shutdown delay to tasks
2017-08-17 14:37:48 -07:00
Michael Schurter c709251ed6 Lower ShutdownDelay for non-Travis testing 2017-08-17 14:23:42 -07:00
Michael Schurter b33b2fb4c0 Lower shutdown delay in test 2017-08-17 13:57:22 -07:00
Michael Schurter 0726ca75e3 Make shutdown delay log DEBUG, not INFO 2017-08-17 11:28:33 -07:00
Clint Armstrong f0460156ae restrict mount to volume type 2017-08-17 09:52:13 -04:00
Michael Schurter d529b422b2 Add optional shutdown delay to tasks
Fixes #2441

Defaults to 0 (no delay) for backward compat and because this feature
should be opt-in.
2017-08-16 17:59:46 -07:00
Alex Dadgar d6187cd3e8 Fix tests 2017-08-16 16:26:52 -07:00
Alex Dadgar 1a86aecf55 Add version package
This PR adds a version package and consolidates version strings into a
Version struct.
2017-08-16 15:44:21 -07:00
Alex Dadgar 3d69961c3a Must be root for TestAllocDir_CreateDir 2017-08-16 10:46:14 -07:00
Alex Dadgar 7dd86b5dfe Merge pull request #3025 from hashicorp/f-health-events
Emit task events explaining alloc health
2017-08-15 12:23:46 -07:00
Alex Dadgar bb165b97ef comments 2017-08-15 12:23:29 -07:00
Michael Schurter 1126268a81 Fix formatting 2017-08-15 10:37:02 -07:00
Michael Schurter 74d5c272c6 Cleanup comments and return val 2017-08-14 16:59:03 -07:00
Michael Schurter 46b7fd45d7 spelling 2017-08-14 16:55:59 -07:00
Michael Schurter de8ea243b6 Return move errors from local Migrate like remote
Since alloc runner just logs these errors and continues there's no
reason not to return it.
2017-08-14 16:48:56 -07:00
Michael Schurter 7342e23669 Move migrating state into prevAllocWatcher 2017-08-14 16:02:28 -07:00
Alex Dadgar fdc0115427 test 2017-08-12 14:42:53 -07:00
Alex Dadgar 56801349eb Refactor health watcher and emit events 2017-08-12 14:23:36 -07:00
Michael Schurter 4601419d63 Soft fail on migration errors 2017-08-11 16:50:30 -07:00
Michael Schurter 3dbd764969 Exit if alloc listener closes
Add test for that case, add comments, remove debug logging
2017-08-11 16:22:02 -07:00
Michael Schurter b7915bdac7 Update tests for new blocking/migrating code 2017-08-11 16:21:57 -07:00
Michael Schurter ad6cec9e82 Set failed status instead of panic'ing
Fixup some TODOs and formatting left from new prevAllocWatcher code.
2017-08-11 16:21:35 -07:00
Michael Schurter e41a654917 switch from alloc blocker to new interface
interface has 3 implementations:

1. local for blocking and moving data locally
2. remote for blocking and moving data from another node
3. noop for allocs that don't need to block
2017-08-11 16:21:35 -07:00
Michael Schurter ee04717a0b initial attempt at refactoring blocked/migrating 2017-08-11 16:21:35 -07:00
Michael Schurter ec6e6e6c66 Only set alloc status if it's not already terminal 2017-08-11 16:21:35 -07:00
Alex Dadgar 0d5127d5fc Merge pull request #3011 from hashicorp/b-cv-fix-TestEnvAWSFingerprint_aws
Updated AWS fingerprint test for ami-id
2017-08-11 10:58:22 -07:00
Alex Dadgar 2fdfd9af4a Merge pull request #2992 from decoomanj/master
Added dnsoptions to the docker driver
2017-08-11 10:12:36 -07:00
Charlie Voiselle 507c75bd16 Updated AWS fingerprint test for ami-id
In https://github.com/hashicorp/nomad/pull/2999, I changed ami-id
to non-unique.  This updates the test to reflect that.
2017-08-11 12:54:27 -04:00
Jan De Cooman 8b88d56c01 updated message in test 2017-08-11 09:24:15 +02:00
Alex Dadgar 1b061b8f47 Unmount task directories when alloc is terminal
This PR unmounts directories from tasks when the alloc is terminal
rather than when it is garbage collected.

/cc @angrycub
2017-08-10 13:28:17 -07:00
Alex Dadgar 6e20acb503 Merge pull request #2984 from hashicorp/b-tags
Fix alloc health with checks using interpolation
2017-08-10 13:07:25 -07:00
Alex Dadgar 6b238edc22 Merge pull request #3001 from hashicorp/f-template-events
Template emits events explaining why it is blocked
2017-08-10 13:00:58 -07:00
Alex Dadgar bd9f63d20e address comments 2017-08-10 13:00:06 -07:00
Clint Armstrong 9063b500e0 expose mount options to nomad 2017-08-10 12:37:17 -04:00
Alex Dadgar 83ba2f1814 Template emits events explaining why it is blocked
This PR does the following:
* Adds a mechanism to emit events in the TaskRunner
* Vendors a new version of Consul-Template that allows extraction of
missing dependencies
* Adds logic to our consul_template.go to determine missing events and
emit them in a batched fashion.
* Refactors the consul_template code to split the run method and take in
a config struct rather than many parameters.

Fixes https://github.com/hashicorp/nomad/issues/2578
2017-08-09 18:01:27 -07:00
Charlie Voiselle ae466eaaa7 AMI ID is potentially non-unique
Changed the keys map to reflect that.
2017-08-09 12:53:54 -04:00
Jan De Cooman 633bcee661 fixed typo 2017-08-09 14:44:38 +02:00
Jan De Cooman 804fc0d06f added dnsoptions to the docker driver 2017-08-09 13:30:06 +02:00
Alex Dadgar aba107be99 Merge pull request #2979 from lfarnell/cleanup
Code cleanup
2017-08-08 10:21:15 -07:00
Alex Dadgar 4f6f6a13c8 Emit generic task events 2017-08-07 21:26:04 -07:00
Alex Dadgar 79d25b7db9 Merge pull request #2947 from hashicorp/f-vault-grace
Allow template to set Vault grace
2017-08-07 16:29:53 -07:00
Alex Dadgar 93b9a1bf20 Rename runnerConfig 2017-08-07 16:29:42 -07:00
Alex Dadgar d86b3977b9 Fix alloc health with checks using interpolation
Fixes an issue in which the allocation health watcher was checking
allocation health based on un-interpolated services and checks. Change
the interface for retrieving check information from Consul to retrieving
all registered services and checks by allocation. In the future this
will allow us to output nicer messages.

Fixes https://github.com/hashicorp/nomad/issues/2969
2017-08-07 16:27:08 -07:00
Luke Farnell f0ced87b95 fixed all spelling mistakes for goreport 2017-08-07 17:13:05 -04:00
Michael Schurter c76b3b54b9 Merge branch 'master' into fix-pending-state 2017-08-03 17:27:03 -07:00
Alex Dadgar 067a638478 Allow template to set Vault grace
This PR allows a template to specify the Vault grace duration.

Fixes https://github.com/hashicorp/nomad/issues/2922
2017-08-01 14:14:08 -07:00
Alex Dadgar 562ea52c8e vendor vault api 2017-08-01 09:30:55 -07:00
Michael Schurter 6243c9eb86 Merge pull request #2883 from kmalec/add-support-for-readonly-mount
rkt driver support for read-only volumes mounts
2017-07-31 10:58:22 -07:00
Alex Dadgar 010567dba8 Fix leaked plugin files for syslog server
This PR fixes a leak of the unix socket used when launching a syslog
server for the Docker driver.

Fixes https://github.com/hashicorp/nomad/issues/2844
2017-07-30 17:51:38 -07:00
Alex Dadgar a9c786a4fe Make test Vault pick random ports 2017-07-25 17:40:59 -07:00
Michael Schurter b01dd31f26 Don't attempt to restore tasks that never sync'd 2017-07-24 15:58:46 -07:00
Alex Dadgar 031da7a21c fix vet 2017-07-22 22:43:33 -07:00
Alex Dadgar 0f3f1ea68b travis check fixes 2017-07-22 21:01:22 -07:00
Alex Dadgar c1a72d24e6 fingerprinters 2017-07-22 20:38:03 -07:00
Alex Dadgar 62c55c8fc9 fix slow resolve on mac 2017-07-22 19:58:30 -07:00
Alex Dadgar 72d055aa9c drop rkt deadline 2017-07-22 19:54:06 -07:00
Alex Dadgar 219fecc705 Merge branch 'master' of github.com:hashicorp/nomad 2017-07-22 19:48:54 -07:00
Alex Dadgar d760e68774 darwin test fixes 2017-07-22 19:48:47 -07:00
Alex Dadgar 553bc91725 Parallel client tests (#2890)
* alloc_runner

* Random tests

* parallel task_runner and no exec compatible check

* Parallel client

* Fail fast and use random ports

* Fix docker port mapping

* Make concurrent pull less timing dependent

* up parallel

* Fixes

* don't build chroots in parallel on travis

* Reduce parallelism on travis with lxc/rkt

* make java test app not run forever

* drop parallelism a little

* use docker ports that are out of the os's ephemeral port range

* Limit even more on travis

* rkt deadline
2017-07-22 19:04:36 -07:00
Alex Dadgar b6f0782732 typo 2017-07-22 12:55:30 -07:00
Alex Dadgar 8cf9d15b01 typo 2017-07-22 12:33:07 -07:00
Alex Dadgar 9e9c20ca77 small fixes 2017-07-22 12:25:02 -07:00
Alex Dadgar 5a3df2ed89 Merge pull request #2888 from hashicorp/b-fix-allocrunner-test
Fix TestAllocRunner_TaskLeader_StopTG and unrelated races
2017-07-22 11:44:04 -07:00
Alex Dadgar 46c8bec9b0 faster vaultclient 2017-07-21 19:38:37 -07:00
Michael Schurter d840fc8c95 Fix tr race by not sharing alloc/task
prestart only needs the original alloc/task so pass their pointers in.
Task updates may concurrently replace the pointer on tr.
2017-07-21 16:17:42 -07:00
Michael Schurter a22cfa8387 Minor test race fix 2017-07-21 16:17:23 -07:00
Michael Schurter 9a7a1d8c13 Fix race by not accessing tr.task from ar 2017-07-21 16:16:53 -07:00
Michael Schurter 2e9a1e3fa6 Remove unneeded saveTaskRunnerState method
Collapse it into the one place it's called
2017-07-21 16:16:02 -07:00
Michael Schurter 996ce9286e Fix test race by locking around ar.tasks access 2017-07-21 14:25:51 -07:00
Michael Schurter 8d1d8eac46 Fix handle race 2017-07-21 14:00:32 -07:00
Michael Schurter 5f40901422 Fix more test races 2017-07-21 14:00:21 -07:00
Michael Schurter b9ba447399 Fixup a few more even rarer test races 2017-07-21 13:43:32 -07:00
Michael Schurter 38cb2021dd Always interpolate task before calling with Consul
Also switch to returning a copy of the task to avoid races between
altering the Task and persistence.
2017-07-21 13:37:16 -07:00
Michael Schurter 6e80a8ee39 Fix TestAllocRunner_TaskLeader_StopTG
Also make alloc runner tests less racy. Basically every alloc runner
test used to have races with `upd.{Count,Allocs}`
2017-07-21 13:37:16 -07:00
Alex Dadgar e509661cf9 executor and logging pkg 2017-07-21 12:14:54 -07:00
Alex Dadgar 7c433a1767 Parallel 2017-07-21 12:06:39 -07:00
Karel Malec 4b98f94a88 Allow rkt driver to mount volumes read-only 2017-07-21 13:05:15 +02:00
Alex Dadgar 56f9cf86df Speed up client startup 2017-07-20 22:34:24 -07:00
Michael Schurter 0d7f7e2b9d Merge pull request #2878 from hashicorp/b-save-state
Fix state handling on restart
2017-07-20 17:16:46 -07:00
Karel Malec cf985f011c Pass task group name as NOMAD_GROUP_NAME environment variable 2017-07-21 01:22:54 +02:00
Alex Dadgar 09c8ee621b Destroy tasks that are part of terminal alloc 2017-07-20 12:02:04 -07:00
Michael Schurter 9a7f649e56 Don't save task runner state if it is destroyed 2017-07-20 10:17:41 -07:00
Alex Dadgar 64776b1370 Should not persist state after alloc_runner is garbage collected 2017-07-19 17:31:30 -07:00
Michael Schurter c1b8bef813 Use broadcast send retry logic everywhere 2017-07-18 14:36:32 -07:00
Alex Dadgar d2381c9263 Merge pull request #2853 from hashicorp/b-watcher
Improve alloc health watcher
2017-07-18 14:12:28 -07:00
Alex Dadgar bd43bd509c Save deployment status 2017-07-18 12:37:52 -07:00
Alex Dadgar 41f67e3535 Small fixes 2017-07-18 12:19:57 -07:00
Michael Schurter c24e73ede7 Fix deadlock caused by syncing during destroy
When replacing an alloc the new alloc is blocked until the old alloc is
destroyed. This could cause a deadlock:

1. Destroying the old alloc includes a final sync of its status
2. Syncing status causes a GC
3. A GC looks for terminal allocs to clean up
4. The GC waits for an alloc to stop completely before GC'ing

If the GC chooses the currently-being-destroyed-alloc to GC, the GC
deadlocks. If `client.max_parallel` deadlocks happen the GC is wedged
until the Nomad process is restarted.

Performing the final sync asynchronously is an ugly hack but prevents
the deadlock by allowing the final sync to occur after the alloc runner
has shut down and been destroyed.
2017-07-18 11:12:56 -07:00
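The ordering problem in the commit above can be reduced to a small Go sketch. Everything here is hypothetical (`allocRunner`, `finalSync`, `finalSyncCompletes`); it only models a final sync that ends up waiting for the same runner to stop, and shows why running that sync in its own goroutine avoids the deadlock.

```go
package main

import (
	"fmt"
	"time"
)

// allocRunner is a hypothetical, stripped-down runner. Only the ordering of
// the final sync relative to shutdown mirrors the commit message.
type allocRunner struct {
	stopped   chan struct{} // closed once the runner has fully shut down
	finalSync func()        // final status sync; may trigger a GC
}

// destroy shuts the runner down. Calling r.finalSync() inline would block
// forever if the sync waits on r.stopped, so it runs asynchronously.
func (r *allocRunner) destroy() {
	go r.finalSync()
	close(r.stopped)
}

// finalSyncCompletes wires up a sync whose GC picks this very runner and
// waits for it to stop, then reports whether the sync finishes in time.
func finalSyncCompletes() bool {
	r := &allocRunner{stopped: make(chan struct{})}
	synced := make(chan struct{})
	r.finalSync = func() {
		<-r.stopped // the GC waits for the alloc to stop completely
		close(synced)
	}
	r.destroy()
	select {
	case <-synced:
		return true
	case <-time.After(time.Second):
		return false
	}
}

func main() {
	fmt.Println("final sync completed:", finalSyncCompletes())
}
```

As the commit message itself notes, this is a workaround rather than a clean design: the sync may now finish after the runner is gone.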
Michael Schurter 420be86e39 Test AllocDir.Copy 2017-07-17 15:46:54 -07:00
Michael Schurter cdb2e96d99 Add AllocRunner.allocID for ease-of-use
Since the AllocRunner.alloc struct can be mutated, most of AllocRunner
needs to acquire a lock to get the alloc's ID. Log lines always need to
include the alloc ID, so we often skipped acquiring a lock just to grab
the ID and accepted the race.

Let's make the race detector a little happier by storing the ID in a
single assignment field.
2017-07-17 15:46:54 -07:00
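A single-assignment ID field, as described in the commit above, might look like the following. This is a sketch under assumptions: the `alloc` and `allocRunner` types and the `logLine` helper are hypothetical and far smaller than anything in Nomad.

```go
package main

import (
	"fmt"
	"sync"
)

type alloc struct {
	ID     string
	Status string
}

// allocRunner is hypothetical. allocID is written once in the constructor
// and never again, so reading it needs no lock.
type allocRunner struct {
	allocID string

	mu    sync.Mutex
	alloc *alloc // replaced on updates; guarded by mu
}

func newAllocRunner(a *alloc) *allocRunner {
	return &allocRunner{allocID: a.ID, alloc: a}
}

// update swaps in a new alloc under the lock.
func (r *allocRunner) update(a *alloc) {
	r.mu.Lock()
	r.alloc = a
	r.mu.Unlock()
}

// logLine reads only the immutable allocID, so it is safe to call
// concurrently with update.
func (r *allocRunner) logLine(msg string) string {
	return fmt.Sprintf("client: alloc %q: %s", r.allocID, msg)
}

func main() {
	r := newAllocRunner(&alloc{ID: "a1b2", Status: "pending"})
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		r.update(&alloc{ID: "a1b2", Status: "running"})
	}()
	fmt.Println(r.logLine("starting"))
	wg.Wait()
}
```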
Michael Schurter 181fda825a Fix log level 2017-07-17 15:46:54 -07:00
Michael Schurter 98f6e7f10f Don't fail if task dirs don't exist on creation
Task dir metadata is created in AllocRunner.Run which may not run before
an alloc is sync'd and Nomad exits. There's no reason not to just create
task dir metadata on restore if it doesn't exist.
2017-07-17 15:46:54 -07:00
Michael Schurter 51515cbe0c Ensure allocDir is never nil and persisted safely
Fixes #2834
2017-07-17 15:46:54 -07:00
Alex Dadgar 0821ee67f5 Fix alloc broadcaster panic on double close 2017-07-17 14:09:05 -07:00
Michael Schurter 0a6bf87365 Fix nil panic in Docker error condition
Fixes #2835

Yet another bug caused by overwriting container and then trying to
reference container.ID in the err handling block. Did a quick audit of
docker.go and it seems to be the last offender. See #2804 for previous
bug.
2017-07-14 10:48:19 -07:00
Michael Schurter e9a416b731 Merge branch 'master' into fix-pending-state 2017-07-10 10:43:23 -07:00
unknown 26b16fa3ce #2563 fixed pending state for allocations with terminal status 2017-07-09 16:18:06 +03:00
Alex Dadgar 05894f4611 Small fixes 2017-07-07 17:34:50 -07:00
Michael Schurter fecb16cfb2 Merge pull request #2793 from hashicorp/b-2776-ct-vault-servername
Propagate vault.tls_server_name to consul-template
2017-07-07 16:44:19 -07:00
Michael Schurter 95a9a5da71 Merge pull request #2787 from hashicorp/f-docker-test-mac
Test #2652 - Docker MAC Address option
2017-07-07 16:22:10 -07:00
Michael Schurter 4be4df21c9 Merge pull request #2797 from hashicorp/f-2785-docker-bridge-ip
Add driver.docker.bridge_ip node attribute
2017-07-07 16:20:20 -07:00
Michael Schurter 94389c3ecc Remove debug logging 2017-07-07 16:19:42 -07:00
Michael Schurter 5e3e3818db Merge pull request #2804 from hashicorp/b-2802-docker-panic
Don't panic in container list/remove/inspect race
2017-07-07 15:35:51 -07:00
Michael Schurter 67a7b0eac9 Don't panic in container list/remove/inspect race
Fixes #2802

While it's hard to reproduce the theoretical race is:

1. This goroutine calls ListContainers()
2. Another goroutine removes a container X
3. This goroutine attempts to InspectContainer(X)

However, this bug could be hit in the much simpler case of
InspectContainer() timing out.

In those cases an error is returned and the old code attempted to wrap
the error with the now-nil container.ID. Storing the container ID fixes
that panic.
2017-07-07 15:10:59 -07:00
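The pattern behind the fix above, saving the ID before the variable is overwritten, can be shown in a few lines of Go. The `container` type and the `inspect` and `refresh` functions are hypothetical stand-ins, not the Docker client API; `inspect` always fails here to exercise the error path.

```go
package main

import (
	"errors"
	"fmt"
)

type container struct{ ID string }

// inspect is a hypothetical stand-in for a client call that returns
// (nil, err) on failure, for example when the container was removed by
// another goroutine or the call timed out.
func inspect(id string) (*container, error) {
	return nil, errors.New("no such container")
}

// refresh re-inspects a container. The ID is stored first because c is
// overwritten by the call and is nil on the error path.
func refresh(c *container) (*container, error) {
	id := c.ID
	c, err := inspect(id)
	if err != nil {
		// Using c.ID here would dereference a nil pointer and panic.
		return nil, fmt.Errorf("failed to inspect container %s: %v", id, err)
	}
	return c, nil
}

func main() {
	_, err := refresh(&container{ID: "abc123"})
	fmt.Println(err)
}
```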
Alex Dadgar bf97a2455c Vet and small improvement on watcher failure detection 2017-07-07 14:53:01 -07:00
Alex Dadgar 45712c6ca3 test fixes 2017-07-07 14:11:27 -07:00
Alex Dadgar ade9a7c768 @jippi Changed my mind! Good suggestion 2017-07-07 12:12:48 -07:00
Alex Dadgar c063eba836 Warn log 2017-07-07 12:10:04 -07:00
Alex Dadgar 067ed86a47 Client watches for allocation health using task state and Consul checks
This PR adds watching of allocation health at the client. The client can
watch for health based on the tasks running on time and also based on
the consul checks passing.
2017-07-07 12:10:04 -07:00
Alex Dadgar 001058227e watcher per alloc 2017-07-07 12:07:08 -07:00
Alex Dadgar 2e2fd26bed Update index 2017-07-07 12:07:08 -07:00
Alex Dadgar ecee5e370e initial watcher 2017-07-07 12:07:08 -07:00
Alex Dadgar c77944ed29 assign names 2017-07-07 12:03:11 -07:00
Michael Schurter 084dd384c1 Add driver.docker.bridge_ip node attribute
Fixes #2785
2017-07-07 10:14:10 -07:00
Michael Schurter d38d48151a Propagate vault.tls_server_name to consul-template
Fixes #2776
2017-07-06 16:56:50 -07:00
Michael Schurter 39edf23fd5 Merge pull request #2786 from hashicorp/f-docker-auth-soft-fail
Default to auth hard fail but optionally soft fail
2017-07-06 13:25:56 -07:00
Michael Schurter bae1b7db2d Test #2652
Also cleanup docker config opts docs
2017-07-06 12:46:25 -07:00
Michael Schurter 8f4353779a Merge branch 'master' into master 2017-07-06 12:09:36 -07:00
Michael Schurter 2900f941b5 Default to auth hard fail but optionally soft fail 2017-07-06 11:35:34 -07:00
Michael Schurter 08b452adf5 Merge pull request #2781 from hashicorp/f-2678-getter-mode
Add support for go-getter modes
2017-07-06 11:06:40 -07:00
Michael Schurter b000bb8598 Merge pull request #2744 from aep/master
Do not fail when no docker registry auth is available
2017-07-06 11:04:11 -07:00
Michael Schurter 0d3bdf7210 Add support for go-getter modes
Fixes #2678
2017-07-06 10:45:44 -07:00
Michael Schurter 644f0cfaa4 Consistently quote alloc ids in client logs 2017-07-06 10:24:52 -07:00
Michael Schurter 4fd9ef6a8c Tiny client race condition fix
Plus some logging improvements that may help with #2563
2017-07-05 16:15:19 -07:00
Michael Schurter 8e2e26c607 rkt: use %s instead of %q when interpolating env
Fixes #2686
2017-07-05 09:36:17 -07:00
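The commit above does not spell out the symptom, so the following only shows how the two `fmt` verbs differ when formatting an environment value. The helper names `quoted` and `plain` are made up for this sketch.

```go
package main

import "fmt"

// quoted formats the value with %q, which wraps it in double quotes and
// escapes special characters.
func quoted(key, value string) string {
	return fmt.Sprintf("%s=%q", key, value)
}

// plain formats the value with %s, which inserts it unchanged.
func plain(key, value string) string {
	return fmt.Sprintf("%s=%s", key, value)
}

func main() {
	fmt.Println(quoted("GREETING", "hello world")) // GREETING="hello world"
	fmt.Println(plain("GREETING", "hello world"))  // GREETING=hello world
}
```

With `%q` the quote characters become part of the string handed on to whatever consumes it, which is presumably not what an environment value should contain.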
Michael Schurter b2382f99f2 0 compute == error 2017-07-03 14:51:02 -07:00
Michael Schurter ecf090e980 Fix cpu_total_compute override 2017-07-03 14:51:02 -07:00
Michael Schurter 2d741c770b Merge pull request #2732 from hashicorp/b-persist-alloc-updates
Persist Alloc when EvalID changes
2017-07-03 14:46:43 -07:00
Michael Schurter 56a6f8ca8a Merge pull request #2763 from hashicorp/f-bad-state-help
Add more logging to restore state errors
2017-07-03 14:45:03 -07:00
Michael Schurter 9d4b0651ef Merge pull request #2753 from hashicorp/b-leader-dies-first
Destroy task group leader first
2017-07-03 14:38:04 -07:00
Michael Schurter 6e7cc3964e Merge pull request #2709 from hashicorp/f-advertise-docker-ips
Advertise driver-specific addresses
2017-07-03 14:04:12 -07:00
Michael Schurter 5ec52ec24a Destroy task group leader first
Before this commit all tasks in a task group were destroyed
concurrently. This meant logging sidecars might be stopped before the
leader task whose logs still need to be shipped.

This commit blocks on the leader shutting down before signalling to
followers to shut down.
2017-07-03 13:56:56 -07:00
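The shutdown ordering described above, block on the leader and only then signal the followers, can be sketched as follows. The `task` type, `destroyGroup`, and `stopOrder` are hypothetical; the `stop` callback stands in for whatever actually stops a task and is assumed to block until the task has exited.

```go
package main

import (
	"fmt"
	"sync"
)

// task is a hypothetical stand-in for a task runner.
type task struct {
	name   string
	leader bool
}

// destroyGroup stops the leader first and waits for it, then stops all
// followers concurrently.
func destroyGroup(tasks []task, stop func(task)) {
	for _, t := range tasks {
		if t.leader {
			stop(t) // returns only once the leader has shut down
		}
	}
	var wg sync.WaitGroup
	for _, t := range tasks {
		if t.leader {
			continue
		}
		wg.Add(1)
		go func(t task) {
			defer wg.Done()
			stop(t)
		}(t)
	}
	wg.Wait()
}

// stopOrder records the order in which tasks were stopped.
func stopOrder(tasks []task) []string {
	var mu sync.Mutex
	var order []string
	destroyGroup(tasks, func(t task) {
		mu.Lock()
		defer mu.Unlock()
		order = append(order, t.name)
	})
	return order
}

func main() {
	order := stopOrder([]task{
		{name: "log-shipper"},
		{name: "web", leader: true},
		{name: "metrics"},
	})
	fmt.Println("stopped first:", order[0])
}
```

The followers still stop in no particular order relative to each other; only the leader's position is fixed.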
Michael Schurter 596727230b Suggest wiping out alloc dir too 2017-07-03 12:29:21 -07:00
Michael Schurter 11f68bfca2 Add more logging to restore state errors 2017-07-03 11:58:41 -07:00
Arvid E. Picciani aa4f029f10 Do not fail when no docker registry auth is available
this amends the behaviour introduced with #2651
and allows pulling public images when docker.auth.helper is set
2017-06-27 11:11:18 +02:00
Michael Schurter 8fcf866a7d Fix some tests still expecting reverted behavior 2017-06-23 16:51:38 -07:00
Michael Schurter e81252ba45 Default no_host_uuid to true instead of false
The host UUID isn't unique in many virtualized cases and of dubious
value even when it is universally unique. Default to a random UUID.
2017-06-23 16:23:01 -07:00
Michael Schurter 5a274e6683 Style and comments 2017-06-23 15:20:04 -07:00
Michael Schurter cff8546035 Fix spelling & re-add immutable state struct 2017-06-23 13:01:39 -07:00
Michael Schurter d359d3b554 Rename immutable -> alloc
meh; naming is hard
2017-06-23 10:58:36 -07:00
Michael Schurter af2fc0f1bc Persist Alloc when EvalID changes 2017-06-22 17:33:12 -07:00
Michael Schurter f3a6ddc57d Remove DRIVER env vars
Also make NOMAD_ADDR_* use host ip:port for consistency. NOMAD_PORT_*
varies based on port map and the driver IP isn't exposed as an env var
as the only place it can be used is in script checks anyway.
2017-06-21 17:19:08 -07:00
Michael Schurter 0633d0c286 Have Qemu return PortMap 2017-06-21 17:19:08 -07:00
Michael Schurter 38a0695687 Simplify Docker Networks processing 2017-06-21 17:19:08 -07:00
Michael Schurter fec83b271a Bump error log level 2017-06-21 17:19:08 -07:00
Michael Schurter 8d677bc6b9 Fix lxc tests 2017-06-21 17:19:08 -07:00
Michael Schurter 8d440b1675 Skip DRIVER env vars for labels without a port mapping 2017-06-21 17:19:08 -07:00
Michael Schurter c0eff81383 Fix Service.AddressMode changes during task updates 2017-06-21 17:19:08 -07:00
Michael Schurter 67d154a274 Test driver network advertisement and checks 2017-06-21 17:19:08 -07:00
Michael Schurter b9bfb84b53 Implement DriverNetwork and Service.AddressMode
Ideally DriverNetwork would be fully populated in Driver.Prestart, but
Docker doesn't assign the container's IP until you start the container.

However, it's important to set up the port env vars before calling
Driver.Start, so Prestart should populate that.
2017-06-21 17:19:08 -07:00
Hynek Schlawack 59ab34c264 Fix typos 2017-06-16 16:10:12 +02:00
Michael Schurter b69e060071 Log PID when sending signals 2017-06-12 11:11:36 -07:00
Michael Schurter ffb417a300 Merge pull request #2697 from hashicorp/b-port-map
Fix port map interpolation for docker
2017-06-09 13:29:36 -07:00
Michael Schurter a3827d2cc6 Fix bad merge conflict resolution 2017-06-09 10:40:47 -07:00
Michael Schurter eabd6759c6 Merge branch 'master' into add-no-overlay-option 2017-06-09 09:59:35 -07:00
Alex Dadgar 5ba2662b30 Merge pull request #2687 from mmickan/issue-2685
Include symlinks in snapshots when migrating disks
2017-06-08 13:35:46 -07:00
Michael Schurter 784d69789e Merge branch 'master' into add-no-overlay-option 2017-06-08 13:15:56 -07:00
Alex Dadgar 7695e636d5 Fix port map interpolation for docker
This PR fixes an issue in which the value of the portmap could not be
interpolated.

Fixes https://github.com/hashicorp/nomad/issues/2680
2017-06-08 13:12:32 -07:00
Karel Malec b55f4bf601 Fix backticks in docs; refine --debug comment 2017-06-07 21:11:22 +02:00
Karel Malec a258a803f2 Added insecure_options config list 2017-06-07 09:58:42 +02:00
Karel Malec 1957e9dfa6 Add a no_overlay option for the rkt task config. 2017-06-07 00:17:33 +02:00
Mark Mickan c196d320f8 Add tests for migrating symlinks in alloc and local directories 2017-06-04 15:56:22 +09:30
Mark Mickan 236f24c9a4 Include symlinks in snapshots when migrating disks
Fixes #2685
2017-06-04 00:36:18 +09:30
Michael Schurter d1dd380890 Switch to hashicorp/go-envparse 2017-06-02 15:58:52 -07:00
Michael Schurter a552bcdb55 Move env file parsing to a library 2017-06-02 15:03:27 -07:00
Alex Dadgar 3b46fe136f small cleanup 2017-05-31 15:56:54 -07:00
Alex Dadgar 8d6e28ace8 Merge branch 'master' into feature/2334 2017-05-31 14:27:07 -07:00
Alex Dadgar 044f1da5ff Merge pull request #2681 from hashicorp/b-deadlock
Fix a deadlock relating to blocked allocations
2017-05-31 14:26:54 -07:00
Alex Dadgar ec9cb2c751 Merge pull request #2672 from eyberg/master
don't throw away errors in log rotation
2017-05-31 14:14:22 -07:00
Alex Dadgar b1eea2269a Fix deadlock 2017-05-31 14:05:47 -07:00
Michael Schurter cb568a5cf6 Cleanup lots of leaked alloc runners in tests 2017-05-31 11:39:50 -07:00
Ulrik Mikaelsson 6138564f00 Implement support for docker-credential-helpers
Solves: #2334
2017-05-31 12:45:02 +02:00
Michael Schurter ffc2b36dc7 Merge pull request #2636 from hashicorp/f-gc-alloc-limit
Add new gc_max_allocs tuneable
2017-05-30 16:14:09 -07:00
Michael Schurter dd51aa1cb9 Merge pull request #2654 from hashicorp/f-env-consul
Add envconsul-like support and refactor environment handling
2017-05-30 14:40:14 -07:00
Michael Schurter e1a7c2d6d7 Fix Error -> Errorf 2017-05-30 12:08:59 -07:00
Michael Schurter 53d713bacb Fix getter tests
And use an interface for ReplaceEnv since it's all getter needs.
2017-05-26 16:52:47 -07:00
Michael Schurter 51d8231911 Fix executor tests 2017-05-26 16:46:03 -07:00
Michael Schurter 3184616936 Always use PATH-only env for rkt commands 2017-05-26 15:41:26 -07:00
Michael Schurter 83543b1d80 Use custom TaskEnv to provide PATH for rkt 2017-05-26 15:24:14 -07:00
Michael Schurter 3afade9675 Let's pretend I never committed this 2017-05-26 15:06:59 -07:00
Michael Schurter 3eb0827c19 Fix and test multi-env-template loading 2017-05-25 17:13:33 -07:00
Alex Dadgar d279c1f5e6 Merge pull request #2675 from hashicorp/b-perms
Fix perms to just set exec bit
2017-05-25 14:45:08 -07:00
Alex Dadgar 28aef447e9 Fix perms to just set exec bit 2017-05-25 14:44:13 -07:00
Ian Eyberg 3fbc58997b don't throw away errors in log rotation 2017-05-25 11:49:33 -07:00
Michael Schurter 6d67d4dfbf Fix formatting 2017-05-25 09:38:49 -07:00
Michael Schurter 796d3250a5 Comment and correct formatting 2017-05-25 09:30:58 -07:00
Michael Schurter 5f9cb4c514 Switch tests to mock_driver 2017-05-25 09:28:10 -07:00
Michael Schurter ce8c4fa520 Fail fast on env template failures 2017-05-24 13:44:52 -07:00
Michael Schurter 4963cf07d7 Add env file test 2017-05-23 17:07:25 -07:00
Michael Schurter a233b0401d Add env testing 2017-05-23 16:46:29 -07:00
Michael Schurter d793dde4e9 Shrink chroot to avoid timing test failure 2017-05-23 16:11:24 -07:00
Michael Schurter 15ef740ab6 Add env.Builder.UpdateTask for alloc updates 2017-05-23 16:00:57 -07:00
Michael Schurter f324ca4683 Fix test data 2017-05-23 13:53:34 -07:00
Michael Schurter e7db2c9b0e Handle Driver.Prestart returning nil, nil 2017-05-23 13:53:34 -07:00
Michael Schurter fd9bef768f Move task env into execcontext
Also inject PATH into rkt commands since we're no longer appending host
env vars for it.
2017-05-23 13:53:34 -07:00
Michael Schurter 59ad200d1a Fix env var interpolation and env tests 2017-05-23 13:53:34 -07:00
Michael Schurter 37b148fb60 Add PortMap to struct returned by Driver.Prestart
Moves env.Builder out of drivers entirely so one less thing to worry
about when implementing driver plugins.
2017-05-23 13:53:34 -07:00
Michael Schurter 83c641ced0 Move env template handling into consul_template.go 2017-05-23 13:53:34 -07:00
Michael Schurter a2eb7d17c9 Improve PortMap handling and simplify Builder creation 2017-05-23 13:53:34 -07:00
Michael Schurter 8da110855e Move path building to task dir initialization 2017-05-23 13:53:34 -07:00
Michael Schurter d2c08ff24b Refactor TaskEnvironment into Builder and TaskEnv 2017-05-23 13:53:33 -07:00
Michael Schurter 10b6610e56 Functional consul template env file support 2017-05-23 13:45:14 -07:00
Brandon Fulljames 6d7caa5f6b Fix for test 2017-05-19 16:59:07 -07:00
Brandon Fulljames 6b2d5bd4fd Add SecurityOpt as a config field in Docker driver 2017-05-19 16:18:49 -07:00
Alex Dadgar 6d76fb2f17 Merge branch 'master' of github.com:hashicorp/nomad 2017-05-17 14:46:03 -07:00
Alex Dadgar ee8dd84965 Fix nil job on allocation
The way the copying was happening on the alloc_runner was by temporarily
setting the alloc.Job to nil, copying and then restoring it. This
created an issue in which when the alloc was shared (which it is in
server/client mode and between alloc_runner/task_runner) there were race
conditions that could create a panic.

Fixes https://github.com/hashicorp/nomad/issues/2605
2017-05-17 14:07:06 -04:00
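One way to copy an allocation without temporarily mutating the shared original is sketched below. The `job` and `alloc` types and the `copySkipJob` method are hypothetical and much smaller than Nomad's structs; the commit does not show the actual fix, so this only illustrates the general idea of never writing to the value being copied.

```go
package main

import "fmt"

type job struct{ Name string }

type alloc struct {
	ID    string
	Job   *job
	Tasks map[string]string
}

// copySkipJob returns a copy of a that shares the Job pointer instead of
// deep-copying it. The receiver is only read, never written, so other
// goroutines holding the same alloc never observe a nil Job.
func (a *alloc) copySkipJob() *alloc {
	c := *a // shallow copy: c.Job aliases a.Job
	c.Tasks = make(map[string]string, len(a.Tasks))
	for k, v := range a.Tasks {
		c.Tasks[k] = v
	}
	return &c
}

func main() {
	a := &alloc{
		ID:    "a1b2",
		Job:   &job{Name: "example"},
		Tasks: map[string]string{"web": "running"},
	}
	c := a.copySkipJob()
	c.Tasks["web"] = "dead"
	fmt.Println(a.Job == c.Job, a.Tasks["web"], c.Tasks["web"])
}
```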
Michael Schurter 06f937bf28 Merge pull request #2591 from hashicorp/b-2180-script-updates
Properly interpolate services on updated tasks
2017-05-17 09:09:01 -07:00
johannesa 323a0a78f3 Fixed wrong newline 2017-05-17 16:51:22 +02:00
johannesa ea644237cf mac address pinning in docker driver
This commit adds mac address pinning to the docker driver.
2017-05-17 16:41:00 +02:00
Michael Schurter 3841692138 gc_max_allocs should include blocked & migrating 2017-05-12 16:03:22 -07:00
Michael Schurter 49ce86ee0a Lower default gc_max_allocs to 50 2017-05-12 15:57:27 -07:00
Michael Schurter 0453c2709c Add new gc_max_allocs tuneable
More than gc_max_allocs may be running on a node, but terminal allocs
will be garbage collected to try to keep the total number below the
limit.
2017-05-11 17:18:02 -07:00
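The behavior described for `gc_max_allocs` can be modelled with a short Go function. This is a sketch, not the client's garbage collector: `alloc` and `gcToLimit` are hypothetical names, and the selection order (first terminal alloc found) is an assumption.

```go
package main

import "fmt"

type alloc struct {
	id       string
	terminal bool
}

// gcToLimit returns the IDs of terminal allocs to collect so that the total
// moves toward maxAllocs. Non-terminal allocs are never collected, so the
// node may still hold more than maxAllocs afterwards.
func gcToLimit(allocs []alloc, maxAllocs int) []string {
	excess := len(allocs) - maxAllocs
	var collect []string
	for _, a := range allocs {
		if excess <= 0 {
			break
		}
		if a.terminal {
			collect = append(collect, a.id)
			excess--
		}
	}
	return collect
}

func main() {
	allocs := []alloc{
		{id: "r1"},
		{id: "t1", terminal: true},
		{id: "r2"},
		{id: "t2", terminal: true},
	}
	fmt.Println(gcToLimit(allocs, 3)) // one terminal alloc is enough
	fmt.Println(gcToLimit(allocs, 1)) // both terminal allocs; two running remain
}
```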
Alex Dadgar 68c3a2bd98 Fix vet errors 2017-05-11 13:08:08 -07:00
Alex Dadgar 23685d65d0 Merge pull request #2610 from hashicorp/f-bolt-db
Client persist state using bolt-db and more efficient write patterns
2017-05-09 13:01:36 -07:00
Alex Dadgar 3cd7e06fba Fix test 2017-05-09 11:35:48 -07:00
Michael Schurter 3b15db1ac4 Ignore Consul deregister errors on executors
Errors here only occur if Consul is not running when Nomad is restarted.

Errors here are only an issue if:
 * Consul is being used but is down or misbehaving
 * The executor is old (<0.6)
 * The task has services
 * The services hit a pre-0.6 consul.Syncer bug

If all of those conditions are met the pre-0.6 bugs will persist for
this task until Nomad is restarted.
2017-05-09 11:28:27 -07:00
Alex Dadgar ba70cc4f01 Merge branch 'master' into f-bolt-db 2017-05-09 11:11:55 -07:00
Alex Dadgar 843bc26e5d Respond to comments 2017-05-09 10:50:24 -07:00
Michael Schurter 85210eb92f Update consul/api to support unix socket addrs
Fixes #2594
2017-05-08 11:57:04 -07:00
Michael Schurter 5b8415df2c Merge pull request #2585 from hashicorp/b-2554-container-exec
Execute exec/java script checks in containers
2017-05-05 10:31:18 -07:00
Michael Schurter 28e17b7a09 Test pre06ScriptCheck 2017-05-04 16:49:00 -07:00
Michael Schurter b11d1ad1ce Oops, remove dev logging 2017-05-04 16:27:04 -07:00
Michael Schurter 897b516117 Reuse ExecScript implementation 2017-05-04 16:21:40 -07:00
Michael Schurter f286b6b798 Build new env from new alloc before interpolating 2017-05-04 15:06:15 -07:00
Alex Dadgar 2d54ee2925 Fix tests 2017-05-03 15:14:19 -07:00
Kate Taggart 706e09748b fix test. 2017-05-03 12:45:59 -07:00
Kate Taggart 2fb6301b37 responding to feedback on PR: remove Region from Node struct, some grammatical niceties. 2017-05-03 12:45:59 -07:00
Kate Taggart af22cb722e I think I did it. 2017-05-03 12:45:59 -07:00
Kate Taggart 277d5ddb36 allow region to also be interpolated, like datacenter. 2017-05-03 12:45:59 -07:00
Kate Taggart 9fa97c38a0 just found out dc and region are going to be a bit different, as Region didn't previously exist in the Node struct. 2017-05-03 12:45:59 -07:00
Kate Taggart 1a5ebace03 add helper functions to set and clear dc, region names. 2017-05-03 12:45:59 -07:00
Kate Taggart 2062d5d1dc add dc and region to task env building function. flipped order of a couple functions to be consistent with the ordering of the rest of that section. 2017-05-03 12:45:59 -07:00
Kate Taggart e572ec8997 starting the jelly bean trail for two new env vars 2017-05-03 12:45:59 -07:00
Alex Dadgar 730e49a598 Helpful comment 2017-05-03 11:27:33 -07:00
Alex Dadgar 1d8444bc1e Fix tests 2017-05-03 11:15:30 -07:00
Alex Dadgar e00f9c9413 Restore state + upgrade path 2017-05-02 18:21:49 -07:00
Alex Dadgar ec101b4760 Revert "metrics"
This reverts commit 4d6a012c6fb6f1fba6c62985d091b1a20c3198e7.
2017-05-02 09:28:11 -07:00
Alex Dadgar 8e516b5dc2 Async and sync saving of client state 2017-05-01 16:16:53 -07:00
Alex Dadgar a7fd08d42a perf 2017-05-01 16:01:50 -07:00
Alex Dadgar e010fdf8c0 metrics 2017-05-01 14:51:27 -07:00
Alex Dadgar d779defe65 Use batching 2017-05-01 14:50:34 -07:00
Alex Dadgar b94f855326 boltDB database for client state 2017-05-01 14:50:34 -07:00
Alex Dadgar bddedd7aba Don't deepcopy job when retrieving copy of Alloc
This PR removes deepcopying of the job attached to the allocation in the
alloc runner. This operation is called very often, so removing reflect
from the code path and the potentially large number of mallocs needed to
create a job reduces memory and CPU pressure.
2017-05-01 14:50:34 -07:00
Pete Wildsmith 6a09d5300f address feedback 2017-04-28 10:27:37 +01:00
Pete Wildsmith 418d59fd00 clean up consul earlier when destroying a task 2017-04-27 23:29:30 +01:00
Michael Schurter cafefa049b Properly interpolate services on updated tasks
Previously was interpolating the original task's services again.

Fixes #2180

Also fixes a slight memory leak in the new consul agent. Script check
handles weren't being deleted after cancellation.
2017-04-26 11:22:01 -07:00
Michael Schurter 20322a5e92 Test pre-0.6 script check upgrade path 2017-04-25 11:41:03 -07:00
Michael Schurter b965708c51 Test env+cgroups for exec driver checks 2017-04-25 11:13:06 -07:00
Michael Schurter c069ca0bfe Change raw_exec to use simplified exec wrapper 2017-04-21 16:50:20 -07:00
Michael Schurter 095d2ee340 Switch java/exec to use Exec in Executor 2017-04-21 16:25:49 -07:00
Michael Schurter a305b68159 Restart tasks on upgrade with script checks and old executors 2017-04-21 16:25:49 -07:00
Michael Schurter a63162c7ff Fix Windows build. 2017-04-19 13:16:48 -07:00
Michael Schurter 83f9591d75 Thanks go vet! 2017-04-19 13:05:41 -07:00
Michael Schurter c8d3e869c6 Unregister from Consul when waiting for restart 2017-04-19 12:42:48 -07:00
Michael Schurter 16ac08ac8c Remove stale comment 2017-04-19 12:42:47 -07:00
Michael Schurter a3fc157233 Fix circular test imports 2017-04-19 12:42:47 -07:00
Michael Schurter caf317e3f2 Use a DriverAbility to expose Exec functionality 2017-04-19 12:42:47 -07:00
Michael Schurter 0a59982936 Move removal from Consul into TaskRunner cleanup 2017-04-19 12:42:47 -07:00
Michael Schurter 745ad9521f Move ScriptExecutor to driver 2017-04-19 12:42:47 -07:00
Michael Schurter 244251490a Add UpdateTask method instead of Remove/Add 2017-04-19 12:42:47 -07:00
Michael Schurter e204a287ed Refactor Consul Syncer into new ServiceClient
Fixes #2478 #2474 #1995 #2294

The new client only handles agent and task service advertisement. Server
discovery is mostly unchanged.

The Nomad client agent now handles all Consul operations instead of the
executor handling task related operations. When upgrading from an
earlier version of Nomad existing executors will be told to deregister
from Consul so that the Nomad agent can re-register the task's services
and checks.

Drivers - other than qemu - now support an Exec method for executing
arbitrary commands in a task's environment. This is used to implement
script checks.

Interfaces are used extensively to avoid interacting with Consul in
tests that don't assert any Consul related behavior.
2017-04-19 12:42:47 -07:00
Michael Schurter 4b18f916d9 Set ownership on directories in chroot
Also support getOwner on all Unixes as they all have `Stat_t.{U,G}id`
2017-04-17 12:41:33 -07:00
Michael Schurter 83fb374d0b Don't disable hardlinking! 2017-04-17 11:03:15 -07:00
Michael Schurter 547d5a81b4 Chown files when copying into chroot
Fixes #2552

Not needed when hardlinking. Only adds Linux support but other OS's may
be easy.
2017-04-17 11:03:15 -07:00
Alex Dadgar 61f4a2dac6 Sync allocation state before waiting for a destroy
This change ensures that the client syncs allocation state with the
servers before entering its wait loop for the allocation to be
destroyed.

Fixes https://github.com/hashicorp/nomad/issues/2563
2017-04-14 13:09:54 -07:00
Alex Dadgar 67d29f684d Merge pull request #2519 from hashicorp/b-sticky-tmp
Add sticky bit to temp directory
2017-04-12 14:59:14 -07:00
Alex Dadgar ad5fea9875 Merge pull request #2541 from hashicorp/f-stable-distributed-id
Hash host ID so its stable and well distributed
2017-04-11 11:27:53 -07:00
Alex Dadgar cdd624ff5b Add ExtraHosts to Docker driver
This PR allows job submitters to add extra hosts to the container's
/etc/hosts file.

Fixes https://github.com/hashicorp/nomad/issues/2546
2017-04-11 10:52:41 -07:00
Diptanu Choudhury a96018fccc Fixed typo 2017-04-10 11:45:11 -07:00
Alex Dadgar 2321e8a4a0 Hash host ID so its stable and well distributed
This PR takes the host ID and runs it through a hash so that it is well
distributed. This makes it so that machines that report similar host IDs
are easily distinguished.

Instances of similar IDs occur on EC2 where the ID is prefixed and on
motherboards created in the same batch.

Fixes https://github.com/hashicorp/nomad/issues/2534
2017-04-10 11:44:51 -07:00
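A minimal sketch of the idea in the commit above, not the code from that commit: hashing the host ID makes near-identical inputs produce unrelated outputs while staying reproducible. The helper name `hashedNodeID`, the choice of SHA-256, and the UUID-style layout are all assumptions made for illustration.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// hashedNodeID is a hypothetical helper. It hashes the raw host ID and
// formats the first 16 bytes of the digest in the usual 8-4-4-4-12 layout.
// The same input always yields the same ID; similar inputs diverge widely.
func hashedNodeID(hostID string) string {
	sum := sha256.Sum256([]byte(hostID))
	return fmt.Sprintf("%x-%x-%x-%x-%x",
		sum[0:4], sum[4:6], sum[6:8], sum[8:10], sum[10:16])
}

func main() {
	// Two made-up host IDs that share a long prefix, as on EC2.
	fmt.Println(hashedNodeID("ec2-aaaa0001"))
	fmt.Println(hashedNodeID("ec2-aaaa0002"))
}
```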
Diptanu Choudhury 45ad1298e6 Added a test 2017-04-08 13:44:21 -07:00
Alex Dadgar eed4a9124e Merge branch 'f-ipv6-fingerprint' of github.com:hashicorp/nomad into f-ipv6-fingerprint 2017-04-07 18:37:19 -07:00
Diptanu Choudhury 353f193983 Removed redundant code 2017-04-07 18:28:22 -07:00
Alex Dadgar 4e3e008864 Merge branch 'master' into f-ipv6-fingerprint 2017-04-07 17:11:24 -07:00
Diptanu Choudhury 963b9baf14 Ignoring link local addresses 2017-04-07 16:04:36 -07:00
Alex Dadgar 81188906a5 Merge pull request #2535 from clinta/docker-ip
Allow specifying container IP with docker driver
2017-04-07 12:59:12 -07:00
Clint Armstrong d71ddcb756 Allow specifying container IP with docker driver 2017-04-07 11:56:07 -04:00
Michael Schurter aede1478db Create AssertUntil helper func 2017-04-06 17:05:09 -07:00
Diptanu Choudhury b9e71f4349 Finding the appropriate cidr block 2017-04-06 16:04:59 -07:00
Michael Schurter 521354ee99 Improve test timings
1234ms was far longer than needed and not sleeping between iterations
was just mean.
2017-04-06 11:10:36 -07:00
Michael Schurter a81c387adf Require TLS for server RPC when enabled
Fixes #2525

We used to be checking a RequireTLS field that was never set. Instead we
can just check the TLSConfig.EnableRPC field and require TLS if it's
enabled.

Added a few unfortunately slow integration tests to assert the intended
behavior of misconfigured RPC TLS.

Also disable a lot of noisy test logging when -v isn't specified.
2017-04-06 09:34:36 -07:00
Diptanu Choudhury 0944d9d640 Making the fingerprinter support ipv6 ips 2017-04-06 01:06:32 -07:00
Alex Dadgar d0da30324e Split driver tests 2017-04-04 15:12:14 -07:00
Alex Dadgar f95af65c75 Split task runner tests 2017-04-04 15:02:09 -07:00
Alex Dadgar f2ba6db607 Add sticky bit to temp directory
Fixes an issue where the Ruby runtime expects the sticky bit to be set
on the temp directory. The sticky bit is commonly set on the temp
directory since it is usually shared by many users. This change brings
ours in line with that assumption.
2017-04-04 10:48:29 -07:00
Alex Dadgar c52000f792 FinishedAt only records when the task has actually started 2017-03-31 17:06:05 -07:00
Alex Dadgar 81b78f77e1 Track task start/finish time & improve logs errors
This PR adds tracking to when a task starts and finishes and the logs
API takes advantage of this and returns better errors when asking for
logs that do not exist.
2017-03-31 16:14:11 -07:00
Alex Dadgar 6bee23047a Fix variable capture and add tests
This PR fixes token revocation and adds tests to make sure it is
working. The 0.5.6 RC1's token revocation does not work because the
token's value is captured at the instantiation of the deferred
stoprenewal statement rather than at its execution.
2017-03-29 13:17:50 -07:00
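The capture behavior described in the commit above can be sketched as follows. Go evaluates the arguments of a deferred call when the `defer` statement runs, not when the deferred function executes. The names `eagerCapture`, `lateCapture`, and `stop` are hypothetical and stand in for the real renewal code, which is not shown in this log.

```go
package main

import "fmt"

// eagerCapture defers stop(token) before the token is set, so the deferred
// call receives the value token had at the defer statement: the empty string.
func eagerCapture() string {
	var revoked string
	stop := func(tok string) { revoked = tok }
	token := ""
	func() {
		defer stop(token) // argument evaluated here
		token = "derived-token"
	}()
	return revoked
}

// lateCapture wraps the call in a closure, so token is read only when the
// deferred function actually runs.
func lateCapture() string {
	var revoked string
	stop := func(tok string) { revoked = tok }
	token := ""
	func() {
		defer func() { stop(token) }() // token read at execution
		token = "derived-token"
	}()
	return revoked
}

func main() {
	fmt.Printf("eager=%q late=%q\n", eagerCapture(), lateCapture())
}
```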
Robert Neumayer 804f83f1fb Fix typo and be consistent in naming 2017-03-29 09:33:54 +02:00
Michael Schurter ae3810052d Merge pull request #2482 from hashicorp/f-2289-better-artifact-err
Improve artifact download error message
2017-03-28 12:48:22 -07:00
Michael Schurter 2d07b75308 Recoverable struct field must be exported 2017-03-28 12:18:57 -07:00
Michael Schurter 1653b42400 Merge branch 'master' into b-2491-task-env 2017-03-28 11:28:00 -07:00
Alex Dadgar 1e95ae7e6a Merge pull request #2495 from hashicorp/b-vault-stop-renew
Stop Vault token renew on task exit
2017-03-28 11:14:18 -07:00
Alex Dadgar 5e7e19de4b Merge pull request #2461 from hashicorp/b-groups
Various fixes for setting user/group of task
2017-03-28 11:13:27 -07:00
Alex Dadgar 3b493b29b0 Merge pull request #2488 from hashicorp/f-node-values-template
Access Node Meta and Attrs in template
2017-03-28 11:10:29 -07:00
Alex Dadgar 5b36a90723 feedback 2017-03-28 11:10:11 -07:00
Michael Schurter d58bcde183 Make test more accurate and add changelog entry 2017-03-28 10:53:26 -07:00
Alex Dadgar d1645f47b1 Stop Vault token renew on task exit
This PR fixes an oversight in which the client would attempt to renew a
token even after the task exits.

Fixes https://github.com/hashicorp/nomad/issues/2475
2017-03-28 10:53:15 -07:00
Michael Schurter 9a0c66b55b Don't take a reference a var in a loop
Fixes #2491
2017-03-28 10:16:27 -07:00
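A sketch of the general pitfall named in the commit above, assuming it concerns a `range` loop: before Go 1.22 the loop variable was reused across iterations, so storing its address made every stored pointer refer to the same variable. Indexing into the slice avoids this on any Go version. The `task` type and `taskPointers` helper are hypothetical.

```go
package main

import "fmt"

// task is a stand-in type for illustration only.
type task struct{ Name string }

// taskPointers returns a pointer to each slice element by indexing,
// instead of taking the address of the range variable.
func taskPointers(tasks []task) []*task {
	out := make([]*task, 0, len(tasks))
	for i := range tasks {
		out = append(out, &tasks[i])
	}
	return out
}

func main() {
	for _, p := range taskPointers([]task{{"web"}, {"db"}}) {
		fmt.Println(p.Name)
	}
}
```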
Michael Schurter 41dcc72033 Recoverable is a method now, not a field 2017-03-27 15:41:35 -07:00
Michael Schurter 507862ade3 Add WrapRecoverable helper 2017-03-27 15:37:15 -07:00
Alex Dadgar 23562d77c9 Access Node Meta and Attrs in template
This PR allows accessing the Node's attributes and metadata in a
template.

```
template {
    data = "{{ env \"attr.unique.network.ip-address\" }}"
    destination = "local/out"
}
```
2017-03-27 15:15:33 -07:00
Alex Dadgar 4ecebe7d8c Proper reference counting through task restarts
This PR fixes an issue in which the reference count on a Docker image
would become inflated through task restarts.
2017-03-25 17:05:53 -07:00
Michael Schurter 0e6c564406 Improve artifact download error message
Fixes #2289

Unfortunately took more RecoverableError hijinx than I would have liked.
There might be a better way.
2017-03-24 15:26:05 -07:00
Alex Dadgar 7e6c08191d Fix vet 2017-03-24 12:24:47 -07:00
Alex Dadgar c3551c761e Fix panic when restarting non-running task
This PR fixes an issue that is hit when running templates with restart
mode in which the client could panic when the handle is not running.

Fixes https://github.com/hashicorp/nomad/issues/2479
2017-03-24 12:04:22 -07:00
Alex Dadgar a171a014b3 Various fixes for setting user/group of task
This PR fixes two issues:
* Folder permissions in -dev mode were incorrect and not suitable for
running as a particular user.
* Was not setting the group membership properly for the launched
process.

Fixes https://github.com/hashicorp/nomad/issues/2160
2017-03-20 14:21:13 -07:00
Michael Schurter 39a8ce6cdb Fix test deadlock due to alloc runner not running
Don't actually call AllocRunner.Run() because that executes so much
unneeded code for this test. Manually close the waitCh to simulate Run()
exiting.
2017-03-16 13:22:08 -07:00
Michael Schurter 9f8067e340 Fix test by adding new env vars 2017-03-14 17:15:12 -07:00
Alex Dadgar d220d9405b return the recoverable err 2017-03-14 16:33:36 -07:00
Alex Dadgar 177bd14718 rename cpu_total_compute and docs 2017-03-14 14:15:49 -07:00
Alex Dadgar 5f7bf577cd Docker doesn't need to init the stats helper 2017-03-14 13:41:57 -07:00
Alex Dadgar a1a7941dec Various fixes
This PR:
* Uses Go 1.8 executable lookup
* Stores any err message from stats init method
* Allows overriding of Cpu Compute for hosts where it can't be detected
2017-03-14 12:56:31 -07:00
Alex Dadgar 2a105c4eaa Merge pull request #2427 from hashicorp/f-gc-limit
Limit parallelism during garbage collection
2017-03-14 10:45:29 -07:00
Alex Dadgar 7700a3447e Review fixes 2017-03-14 10:45:15 -07:00
Michael Schurter 047b41f469 Merge pull request #2408 from hashicorp/b-2358-rkt-volume-name
Replace `_` with `-` in task names for rkt volumes
2017-03-14 10:10:00 -07:00
Michael Schurter a116d6441d Merge pull request #2426 from hashicorp/f-tg-port-ip-env-var
Add NOMAD_{IP,PORT}_<task>_<label> env vars
2017-03-13 12:00:59 -07:00
Alex Dadgar b5e53652aa Handle git ssh artifacts
This PR adds handling for downloading git artifacts using ssh with the
format git@github.com:hashicorp/go-getter.git

Fixes https://github.com/hashicorp/nomad/issues/2430
2017-03-11 15:12:41 -08:00
Alex Dadgar 3f68aae7ab TODO 2017-03-11 13:56:19 -08:00
Alex Dadgar fd8d6d0d89 Merge pull request #2424 from jen20/solaris-alloc-dirs
client/allocdir: Add missing functions on Solaris
2017-03-11 13:55:38 -08:00
Alex Dadgar 5716eee753 remove escape 2017-03-10 16:46:22 -08:00
Alex Dadgar 70e4feb045 Limit parallelism during garbage collection
This PR introduces a parallelism limit during garbage collection. This
is used to avoid large resource usage spikes if garbage collecting many
allocations at once.
2017-03-10 16:27:00 -08:00
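One common way to bound parallelism is a buffered channel used as a semaphore. The sketch below illustrates that technique only; whether the commit above uses it is not stated in this log, and `destroyAll` with its parameters is a hypothetical helper.

```go
package main

import (
	"fmt"
	"sync"
)

// destroyAll calls destroy for every ID with at most limit calls in flight
// and returns the highest number of concurrent calls it observed.
func destroyAll(ids []string, limit int, destroy func(id string)) int {
	if limit < 1 {
		limit = 1 // a zero-capacity semaphore would block forever
	}
	sem := make(chan struct{}, limit)
	var (
		wg            sync.WaitGroup
		mu            sync.Mutex
		running, peak int
	)
	for _, id := range ids {
		sem <- struct{}{} // blocks while limit destroys are running
		wg.Add(1)
		go func(id string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot

			mu.Lock()
			running++
			if running > peak {
				peak = running
			}
			mu.Unlock()

			destroy(id)

			mu.Lock()
			running--
			mu.Unlock()
		}(id)
	}
	wg.Wait()
	return peak
}

func main() {
	ids := []string{"a1", "a2", "a3", "a4", "a5"}
	peak := destroyAll(ids, 2, func(id string) { fmt.Println("destroying", id) })
	fmt.Println("peak concurrency:", peak)
}
```

Acquiring the slot before starting the goroutine keeps the number of live goroutines bounded as well, not just the number of active destroys.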
Michael Schurter 71783e8e41 Add NOMAD_{IP,PORT}_<task>_<label> env vars 2017-03-10 12:17:43 -08:00
Alex Dadgar a60127b72e Merge pull request #2425 from hashicorp/f-client-metrics
Add metrics to show allocations on the client
2017-03-09 21:05:53 -08:00
Alex Dadgar 8238a8601e Address comment 2017-03-09 21:05:34 -08:00
Alex Dadgar 9011a7984c Add metrics to show allocations on the client
This PR adds the following metrics to the client:
client.allocations.migrating
client.allocations.blocked
client.allocations.pending
client.allocations.running
client.allocations.terminal

Also adds some missing fields to the API version of the evaluation.
2017-03-09 12:37:41 -08:00
James Nugent b3d1ef0af6 client/allocdir: Add missing functions on Solaris
This commit adds Solaris versions of the following functions:

- `linkDir`
- `unlinkDir`
- `createSecretDir`
- `removeSecretDir`

I believe this requires Go 1.8 in order to compile, as the unlink
syscall was previously missing.
2017-03-09 13:49:14 -05:00
Michael Schurter 16adc44358 Round two of env var cleaning
Should bring us into conformance with IEEE Std 1003.1, 2004 Edition:
http://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap08.html

1 alloc/op and ~80ns/op on my machine.
2017-03-08 16:46:13 -08:00
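A rough sketch of the kind of key cleaning the commit above describes, assuming the goal is to keep letters, digits, and underscores and map everything else to `_`. `cleanEnvKey` is a hypothetical name, not the function from the commit, and no claim is made about matching its allocation or timing figures.

```go
package main

import (
	"fmt"
	"strings"
)

// cleanEnvKey replaces every character outside [A-Za-z0-9_] with '_'.
// Case is left untouched.
func cleanEnvKey(key string) string {
	return strings.Map(func(r rune) rune {
		switch {
		case r >= 'a' && r <= 'z',
			r >= 'A' && r <= 'Z',
			r >= '0' && r <= '9',
			r == '_':
			return r
		default:
			return '_'
		}
	}, key)
}

func main() {
	fmt.Println(cleanEnvKey("NOMAD_META_my-key.name"))
}
```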
Michael Schurter a724779720 Replace `_` with `-` in task names for rkt volumes
Fixes #2358
2017-03-07 14:34:57 -08:00
Michael Schurter 518e4f75a6 Replace "-" in env var keys with "_"
Fixes #2405
2017-03-06 19:57:44 -08:00
Alex Dadgar 36f3788cb7 Fix lint errors 2017-03-02 21:03:05 -08:00
Alex Dadgar 3fb285f7d3 Fix TestAllocRunner_SaveRestoreState 2017-03-02 20:45:46 -08:00
Alex Dadgar 4cba170e4b Mkdir -> MkdirAll to avoid error when folder already exists 2017-03-02 19:35:31 -08:00
Alex Dadgar 5cd43e837a Merge branch 'master' into b-remount 2017-03-02 19:23:13 -08:00
Michael Schurter 200d31e2e6 Fix tests broken by API change 2017-03-02 15:54:12 -08:00
Michael Schurter 8ac632c3ff Fix API breaks in tests 2017-03-02 15:50:18 -08:00
Michael Schurter d12949fbbd Cleanup comments/names 2017-03-02 15:44:52 -08:00
Michael Schurter e03f64ea6a Safely ensure {dev,proc,alloc} are mounted
If they're unmounted by a reboot they'll be properly remounted.
2017-03-02 13:21:34 -08:00
Michael Schurter d5b7c0c302 unlinkDir should not error if already unlinked 2017-03-02 13:20:47 -08:00
Michael Schurter a9f9c485b4 Fix typos 2017-03-02 13:20:05 -08:00
Michael Schurter f5ce13598b Merge pull request #2391 from hashicorp/b-idempotent-umount
Make sure unmounting the secrets dir is idempotent
2017-03-02 11:24:01 -08:00
Michael Schurter 1cfe9f88b8 Make sure unmounting the secrets dir is idempotent 2017-03-01 17:19:39 -08:00
Alex Dadgar 2aa962a357 Merge pull request #2389 from tmichaud314/2388-docker-driver-auth-file
Fixes docker-driver docker.auth.config processing
2017-03-01 15:39:35 -08:00
Tom Michaud b4db9d33f9 Fixes docker-driver docker.auth.config processing 2017-03-01 16:16:11 -07:00
Alex Dadgar af4e400b36 Update go-getter and add support for git and hg
Fixes https://github.com/hashicorp/nomad/issues/2042
2017-03-01 14:46:04 -08:00
Dave Walker (Daviey) 4958154640 Enable kernel.version on Windows
Previously, this value was guarded against running on Windows
because it called the `uname` command which is unlikely to
be there.

This change now sets the value from gopsutil, which might
well be an empty string.

Signed-off-by: Dave Walker (Daviey) <email@daviey.com>
2017-03-01 18:48:23 +00:00
Dave Walker (Daviey) 090e760928 Don't exec uname for node attribute kernel.version
Previously with client fingerprinting, os/exec's Command
function was being used to execute `uname -r` and the return
string processed into the kernel.version node attribute.

This change uses gopsutil/host KernelVersion function
instead.  This means we can drop the os/exec, strings and
fmt imports... and not execute an external binary.

Signed-off-by: Dave Walker (Daviey) <email@daviey.com>
2017-03-01 09:45:56 +00:00
Alex Dadgar d2193048ce Fix caching of pull future 2017-02-28 18:19:13 -08:00
Alex Dadgar fa853c9696 Fix two issues during client restore state
This PR fixes two issues:

1) A close of a nil stopCollection channel when restoring and prestart
fails. The failure will cause the killCh to be triggered which will
close collection before it has been initialized.

2) Fixes a deadlock in which the handleWaitCh is never triggered since
it is not initialized when there is an error in prestart and the killCh
is triggered.

Both fixes are by maintaining the loop invariant that the two channels
are valid after there is a handle.
2017-02-28 10:29:12 -08:00
Alex Dadgar 48ab2a6c6a Delete on increment and add a safety guard 2017-02-27 20:23:21 -08:00
Alex Dadgar 3dc0e38feb Merge branch 'master' of github.com:hashicorp/nomad 2017-02-27 19:09:50 -08:00
Alex Dadgar 254b0e4ca6 Fix delete future 2017-02-27 19:09:13 -08:00
Alex Dadgar 5be806a3df Fix vet script and fix vet problems
This PR fixes our vet script and fixes all the missed vet changes.

It also fixes pointers being printed in `nomad stop <job>` and `nomad
node-status <node>`.
2017-02-27 16:00:19 -08:00
Alex Dadgar c423626484 Docker open stores image/image ID for next ID 2017-02-27 13:49:54 -08:00
Alex Dadgar 6910678c21 Allow random UUID 2017-02-27 13:42:37 -08:00
Alex Dadgar aa05ecc37c Merge pull request #2361 from hashicorp/f-docker-coordination
Docker Image Coordinator and caching
2017-02-24 14:18:21 -08:00
Alex Dadgar b5d4f39734 Docker Image Coordinator
This PR introduces a coordinator for doing CRUD on a Docker image. It
should fix racy deletion of images. The issue before was images would be
deleted between prestart and start causing an error.
2017-02-24 13:20:40 -08:00
Michael Schurter f61b7975a3 Put access to Docker volume drivers behind flag 2017-02-23 15:47:36 -08:00
Alex Dadgar 2239503e6e Fix noisy log 2017-02-23 14:40:24 -08:00
Pietro Menna 2f88d6cb68 Docker Volume Drivers
This commit adds the functionality to use Docker Volume Drivers.
2017-02-23 14:36:32 -03:00
Alex Dadgar c7e004b8c5 Merge pull request #2344 from hashicorp/b-fix-allocdir-move-test
Fix allocdir Move test and make code more defensive
2017-02-22 18:43:10 -08:00
Alex Dadgar 6936c81720 Merge pull request #2302 from hashicorp/f-task-state-setup
Adding a task event for setup
2017-02-22 18:42:57 -08:00
Alex Dadgar d2cbbac76e Merge pull request #2347 from hashicorp/b-parse
Remove defaulting from parse and fix tests
2017-02-22 18:42:43 -08:00
Alex Dadgar cef7882827 Fix tests and docs 2017-02-22 18:28:07 -08:00
Diptanu Choudhury 98921575af Adding a task event for setup 2017-02-22 18:28:07 -08:00
Alex Dadgar b5995c6cd7 Vendor + test fixes 2017-02-22 15:55:33 -08:00
Michael Schurter e5a29b82db Fix allocdir Move test and make code more defensive
A change in the behavior of `os.Rename` in Go 1.8 brought to light a
difference in the logic between `{Alloc,Task}Runner` and this test:

AllocRunner builds the alloc dir, moves dirs if necessary, and then lets
TaskRunner call TaskDir.Build().

This test called `TaskDir.Build` *before* `AllocDir.Move`, so in Go 1.8
it failed to `os.Rename` over the empty {data,local} dirs.

I updated the test to behave like the real code, but I defensively added
`os.Remove` calls as a subtle change in call order shouldn't break this
code. `os.Remove` won't remove a non-empty directory, so it's still
safe.
2017-02-21 17:22:10 -08:00
Michael Schurter 12c549d6a0 Skip setting MemorySwap on Windows
Windows doesn't support this Docker setting.

Fixes #2193
2017-02-21 13:21:42 -08:00
Alex Dadgar d0fd0e277a Don't force uppercase meta keys in env vars 2017-02-20 19:51:24 -08:00
Alex Dadgar f9323f03df Drivers log during fingerprinting
This PR fixes a regression in which some drivers did not log during
fingerprinting.
2017-02-20 19:35:51 -08:00
Alex Dadgar b49fceb491 rebase 2017-02-20 16:43:28 -08:00
Alex Dadgar c586cb653b Merge pull request #2318 from hashicorp/f-vault-debug
Server side Vault telemetry
2017-02-17 11:34:37 -08:00
Alex Dadgar 4223eb25eb Docs 2017-02-17 10:16:15 -08:00
Alex Dadgar 7203dee7ab Add allocated/unallocated metrics to client 2017-02-16 18:28:11 -08:00
Alex Dadgar c62cd5cc55 Revendor docker client 2017-02-14 17:34:05 -08:00
Alex Dadgar 0940fc08af Better derive token logging 2017-02-14 16:46:54 -08:00
Alex Dadgar 238b4bcafd Add Leader support to client 2017-02-10 17:55:19 -08:00
Michael Schurter fb59e42473 Fix test to assert new never-nil behavior 2017-02-10 10:11:15 -08:00
Michael Schurter 51e4fe9915 Use getters & setters with nil guards 2017-02-09 17:44:58 -08:00
Michael Schurter 37e7e7a3e5 Fix upgrade path for created resources
This *might* be a fix for #2295 -- I've been unable to reproduce the
bug. However, this guard seems wise regardless. I should never be
overwriting initialized created resources with a nil.
2017-02-09 13:54:33 -08:00
Alex Dadgar b51ba01d54 Merge pull request #2293 from hashicorp/f-vendor-memdb
Vendor MemDB
2017-02-08 14:51:01 -08:00
Alex Dadgar 04862ca10e Tests compile 2017-02-07 21:30:57 -08:00
Sean Chittenden c4c321c770 Unconditionally lowercase the node ID read from disk. 2017-02-06 16:20:17 -08:00
Sean Chittenden adb5be23ef Add better verification of a host's HostID. 2017-02-02 16:24:32 -08:00
Alex Dadgar d1f8301e73 Merge pull request #2277 from hashicorp/f-durable-uuid
Reproducible Node ID
2017-02-02 11:54:49 -08:00
Alex Dadgar 1562a7aeee Merge pull request #2262 from hashicorp/f-permissions
Template destination file permissions.
2017-02-02 11:28:44 -08:00
Sean Chittenden bb4347e277 Slight mis-merge: secret-id in dev mode is random and needs to be returned. 2017-02-01 22:20:52 -08:00
Sean Chittenden bb422a2258 Generate a durable NodeID if possible, otherwise fall back to a random HostID. 2017-02-01 22:11:33 -08:00
Alex Dadgar b40f284bd1 Merge branch 'master' into f-permissions 2017-02-01 16:57:46 -08:00
Alex Dadgar ec7bc783a4 Merge branch 'master' into b-vault-race 2017-02-01 16:57:26 -08:00
Alex Dadgar 9987a235a5 Fix race condition with Deriving vault tokens
This PR fixes a race condition in which the client was not locked while
deriving Vault tokens. This allowed the token to be set which would
cause subsequent Vault requests to fail with permission denied because
the incorrect Vault token was being used.

Further this PR makes the unsetting and unlocking of the client atomic
to avoid an even harder to hit race condition (not sure it was ever hit
but was still incorrect).
2017-02-01 16:25:59 -08:00
Diptanu Choudhury 90ac3e2774 Checking in updated gc tests 2017-02-01 13:12:46 -08:00
Alex Dadgar 6b7efa1ba6 Template destination file permissions.
This PR allows setting the file permissions of the rendered template.
2017-01-31 20:10:01 -08:00
Diptanu Choudhury 11d7cb1230 Making the GC related fields tunable 2017-01-31 15:51:20 -08:00
Michael Schurter aef3c2e380 Handle createdResources=nil
Combined with b522c472fdf this fixes #2256

Without these two commits in place, upgrades to 0.5.3 panic.
2017-01-31 10:51:32 -08:00
Alex Dadgar 44d63318a4 Add nil guard 2017-01-31 09:19:59 -08:00
Alex Dadgar 91f485a189 slightly more verbose test 2017-01-30 16:16:26 -08:00
Alex Dadgar 7fa99037a4 up timing 2017-01-26 13:25:37 -08:00
Alex Dadgar 8196a58c4c Rename dispatch_input to dispatch_payload 2017-01-25 21:27:44 -08:00
Alex Dadgar 0933bcadf9 Fix import 2017-01-25 21:27:36 -08:00
Alex Dadgar b7e8bae128 Fix unreachable function in tests 2017-01-25 20:58:24 -08:00
Alex Dadgar 4f99243fa3 More timing fixes 2017-01-25 11:57:38 -08:00
Alex Dadgar 68fd18f5cf small env fix 2017-01-25 10:42:11 -08:00
Alex Dadgar 1e3ffec95f Fix flaky test 2017-01-24 09:40:13 -08:00
Alex Dadgar 75e3f2c996 Fix fsm/env test 2017-01-24 09:36:22 -08:00
Alex Dadgar 4e5de453c7 Fix template test 2017-01-23 16:42:02 -08:00
Alex Dadgar 985598d4b5 Fix test 2017-01-23 16:33:35 -08:00
Diptanu Choudhury 264a0c46cf Merge pull request #2223 from hashicorp/f-env-addr
Setting the Addrs of sibling tasks in the env
2017-01-23 15:05:48 -08:00
Diptanu Choudhury fda3a5250c Merge pull request #2222 from hashicorp/b-docker-image-not-found
Making the docker driver retry when an image is not found
2017-01-23 14:58:00 -08:00
Diptanu Choudhury dfd1f03ec8 Added a comment 2017-01-23 14:57:30 -08:00
Alex Dadgar 5fc629d4b7 Merge pull request #2190 from hashicorp/b-docker-auth
Better Docker Auth lookup
2017-01-23 14:50:50 -08:00
Alex Dadgar e160b2b38f Add test and better logs
This commit adds a test for retrieving auth configurations, using either
the auth block in the config or the one specified via the agent config,
and adds a log if lookup fails.
2017-01-23 14:48:02 -08:00
Alex Dadgar 6b02229eb0 fix flaky test 2017-01-23 14:12:38 -08:00
Alex Dadgar 43a17b7d74 Close file 2017-01-23 11:37:45 -08:00
Alex Dadgar ee5296ba70 Deprecation notice 2017-01-23 11:32:51 -08:00
Alex Dadgar 3365688571 Remove SSL 2017-01-23 11:32:51 -08:00
Alex Dadgar 4a627b02ca undo 2017-01-23 11:32:51 -08:00
Alex Dadgar 08733b68d2 vendor 2017-01-23 11:32:51 -08:00
Alex Dadgar b1823a5fdb Better auth lookup 2017-01-23 11:32:51 -08:00
Alex Dadgar 3d5b31fc96 WIP 2017-01-23 11:32:51 -08:00
Alex Dadgar 625f738a6f fix test build 2017-01-23 11:26:34 -08:00
Alex Dadgar 024e9ec02a remove xxx 2017-01-23 11:04:44 -08:00
Alex Dadgar 66909bb7fc Finalize 2017-01-23 11:03:38 -08:00
Alex Dadgar 55fdf82513 test nomad env 2017-01-23 11:00:12 -08:00
Alex Dadgar 0db766872c fix lookup 2017-01-23 11:00:12 -08:00
Alex Dadgar f93f2249e9 Clear SSL 2017-01-23 11:00:12 -08:00
Alex Dadgar b71ec08b80 rebase 2017-01-23 11:00:12 -08:00
Diptanu Choudhury 84a491f85a Locking appropriately before closing the channel to indicate migration 2017-01-23 10:46:57 -08:00
Alex Dadgar ec6b6e69d4 remove reference to entrypoint 2017-01-22 14:04:41 -08:00
Alex Dadgar d051065229 Actually randomize the splay 2017-01-22 13:50:33 -08:00
Michael Schurter 054ee8df59 Fix index we get allocs by 2017-01-20 16:30:40 -08:00
Diptanu Choudhury fe4ae25eea Setting the Addrs of sibling tasks in the env 2017-01-20 14:15:53 -08:00
Michael Schurter de7dbf5ff9 Fix java tests 2017-01-20 14:12:30 -08:00
Michael Schurter c621ad3f75 Merge pull request #2216 from hashicorp/b-rootless-raw-exec
Don't link shared alloc dir into task dir for raw_exec
2017-01-20 13:18:54 -08:00
Michael Schurter a28fc4798a Merge pull request #2215 from hashicorp/b-2211-host-env
Fix image based drivers having host env vars set
2017-01-20 13:16:44 -08:00
Michael Schurter c2c1b0a049 Merge pull request #2198 from hashicorp/b-rootless-docker
Stop trying to use mount for image based drivers
2017-01-20 13:03:56 -08:00
Michael Schurter 295c2bf74a Merge pull request #2186 from hashicorp/f-driver-cleanup
Add Cleanup method to Driver interface
2017-01-20 13:02:14 -08:00
Michael Schurter 1f7b5b4b47 Rename Constructor -> Parameterized Job 2017-01-20 12:43:10 -08:00
Diptanu Choudhury be7081568c Making the docker driver retry when an image is not found 2017-01-20 12:38:04 -08:00
Michael Schurter c5f222e4a6 Update created resources before exiting cleanup 2017-01-19 16:48:23 -08:00
Michael Schurter a93d43a9cf Exit early when cleanup succeeds 2017-01-19 15:07:01 -08:00
Michael Schurter 85f68aa00c Fix incorrect lock usage 2017-01-19 11:39:18 -08:00
Michael Schurter 010a7709f1 Bump unknown resource to ERR 2017-01-19 09:48:07 -08:00
Diptanu Choudhury 1999b7eebb Merge pull request #2159 from hashicorp/b-consul-config
Fixed merging consul config
2017-01-18 16:14:54 -08:00
Diptanu Choudhury e927de02d2 Moved functions to helper from structs 2017-01-18 15:55:14 -08:00
Michael Schurter d8b4372904 Don't link shared alloc dir into task dir for raw_exec
Fixes running raw_exec tasks when nomad isn't root.
2017-01-18 11:28:34 -08:00
Michael Schurter f94210b4bc Fix image based drivers having host env vars set
Add detailed tests for GetTaskEnv to avoid this issue happening again!

Fixes #2211
2017-01-18 10:27:03 -08:00
Michael Schurter 578272b7f2 Add CreatedResources.Remove and use it 2017-01-17 16:41:59 -08:00
Michael Schurter 1bcf7cdbfe Remove outdated comment 2017-01-17 16:23:29 -08:00
Michael Schurter 82b49d4547 Updated CreatedResources as images are cleaned 2017-01-17 16:13:40 -08:00
Michael Schurter beed31ff6f Remove outdated comment 2017-01-17 16:05:21 -08:00
Michael Schurter b9d6d2c8d6 Return error from Prestart 2017-01-17 16:04:09 -08:00
Michael Schurter ea87091e58 Prevent race between alloc runners
Block ar1's periodic syncing which could recreate the state file ar2 was
destroying.
2017-01-17 13:10:20 -08:00
Michael Schurter 15952e5d17 Try to get test passing in Travis 2017-01-17 12:51:19 -08:00
Michael Schurter 255698e8af Use Image ID instead of Image Name 2017-01-13 16:53:58 -08:00
Michael Schurter a3a3656dbb Switch to use recoverable errors from Cleanup
TaskRunner handles retrying but Cleanup handles all of CreatedResources.
2017-01-13 16:46:08 -08:00
Alex Dadgar 78deb8b292 Support setting class_path and class name.
This PR enhances the java driver to allow setting the class path and
class name to run. It also fixes an issue that would make the Java
driver attempt to chroot regardless of operating system (this never
affected a released version of Nomad).
2017-01-13 16:03:11 -08:00
Michael Schurter c90cd0d874 Stop trying to use mount for image based drivers
Fixes #2178 and allows using Docker and other image based drivers even
when nomad is run as a non-root user.

`client/allocdir` tests can be run as a non-root user to ensure this
behavior and tests that rely on root or non-root users properly detect
their effective user and skip instead of fail.
2017-01-13 13:04:12 -08:00
Michael Schurter 25bf266606 Add ID to output 2017-01-13 12:46:55 -08:00
Michael Schurter dc68aa1a5a Return errors from cleanup and let TaskRunner retry 2017-01-12 17:21:54 -08:00
Diptanu Choudhury 6809a4b104 Added executorconfig 2017-01-12 15:47:58 -08:00
Diptanu Choudhury b1d0078db5 Filter executor log messages 2017-01-12 11:54:19 -08:00
Michael Schurter ec81325ddc Stop being so confusing 2017-01-12 11:17:35 -08:00
Michael Schurter 4d081490e6 Add Cleanup method to Driver interface
Cleanup can be used for cleaning up resources created by drivers to run
a task. Initially the Docker driver is the only user (to remove
downloaded images).
2017-01-11 17:23:33 -08:00
Alex Dadgar aafb9ca8b2 Merge pull request #2177 from hashicorp/b-blocking-getallocs
GetAllocs uses a blocking query
2017-01-11 13:24:32 -08:00
Alex Dadgar 5d2b56b387 Random wait 2017-01-11 13:24:23 -08:00
Cameron Davison c910f9b304 using new ctx instead of getting both params back 2017-01-10 16:54:01 -06:00
Cameron Davison 7ccbd8a000 fixing typo in comment 2017-01-10 16:54:01 -06:00
Cameron Davison 88a462d5b9 add force_pull to docker driver 2017-01-10 16:54:01 -06:00
Alex Dadgar cdc08bbd22 Merge pull request #2179 from hashicorp/b-panic
Fix nil dereference
2017-01-10 14:15:17 -08:00
Alex Dadgar bb329977a4 Fix nil dereference 2017-01-10 14:14:58 -08:00
Michael Schurter 0b7f8163d2 Merge pull request #2174 from hashicorp/b-fix-executor-test
Switch to a less timing dependent test command
2017-01-10 13:45:12 -08:00
Michael Schurter 1347a941b0 Fix missing value in test failure message 2017-01-10 13:39:05 -08:00
Alex Dadgar 4127a3ea9d Merge pull request #2173 from hashicorp/b-stats
Don't retrieve Driver Stats if unsupported
2017-01-10 13:32:03 -08:00
Alex Dadgar c19985244a GetAllocs uses a blocking query
This PR makes GetAllocs use a blocking query as well as adding a sanity
check to the client's watchAllocation code to ensure it gets the correct
allocations.

This PR fixes https://github.com/hashicorp/nomad/issues/2119 and
https://github.com/hashicorp/nomad/issues/2153.

The issue was that the client was talking to two different servers, one
to check which allocations to pull and the other to pull those
allocations.  However the latter call was not with a blocking query and
thus the client would not retrieve the allocations it requested.

The logging has been improved to make the problem more clear as well.
2017-01-10 13:30:35 -08:00
Michael Schurter 7462379086 Switch to a less timing dependent test command
`/usr/bin/yes` could produce output very quickly (100s of MBps on my
laptop) and therefore could cause log files to roll over.

A bash loop with a sleep avoids that issue. The test is slower but
should be much more resilient to the massive timing differences between
workstations and Travis.
2017-01-09 15:40:53 -08:00
Alex Dadgar 2be221d664 Don't retrieve Driver Stats if unsupported
This PR makes us only try to collect stats once if the Driver doesn't
support collecting stats.

Fixes https://github.com/hashicorp/nomad/issues/1986
2017-01-09 13:47:06 -08:00
Alex Dadgar 26e2c5bb74 Merge pull request #2164 from hashicorp/b-dispatch
Create Task directory structure in the Run method
2017-01-09 11:24:46 -08:00
Alex Dadgar 2a5fd85e3b Move to Run() 2017-01-08 13:55:12 -08:00
Alex Dadgar 2affef2972 Create task directory during Prestart() 2017-01-08 13:55:12 -08:00
Alex Dadgar 4ffd9a69e5 Send Driver events to servers immediately
This PR causes driver events to be sent to the server immediately rather
than waiting for Prestart() to finish.
2017-01-08 13:54:43 -08:00
Alex Dadgar 724edb6659 Fix fingerprint tests 2017-01-08 13:53:27 -08:00
Diptanu Choudhury eb123416c5 Fixed namespacing for the cpu arch 2017-01-06 14:23:22 -08:00
Michael Schurter 65fb580216 Fix inconsistent task env setting
Consolidate task environment building in GetTaskEnv since it can
determine what kind of filesystem isolation is used.

This means drivers no longer have to manipulate task environment paths.
2017-01-06 12:19:32 -08:00
Michael Schurter d3270799f0 Fix executor tests 2017-01-06 11:39:18 -08:00
Michael Schurter acd11f678d Add COMPAT comment 2017-01-06 11:39:17 -08:00
Michael Schurter e203928d64 Driver is now required in test tasks 2017-01-06 11:39:17 -08:00
Michael Schurter 90f6ac7490 Fix tests post rebase 2017-01-06 11:39:13 -08:00
Michael Schurter 579f378bee Remove debug logging 2017-01-05 16:31:56 -08:00
Michael Schurter baf6f078d6 Remove task name prefix from executor logs 2017-01-05 16:31:56 -08:00
Michael Schurter 86fcf96f72 Put a logger in AllocDir/TaskDir 2017-01-05 16:31:56 -08:00
Michael Schurter f43d3f074a Add comments to TaskDir 2017-01-05 16:31:55 -08:00
Michael Schurter 5a6bd19eb7 Fix upgrade path for #2132
AllocRunner's state dropped the Context struct which needs to be
converted to the new AllocDir+TaskDir structs in RestoreState.

TaskRunner added a TaskDirBuilt flag, but it's safe to just let that
default to `false` and rebuild all task dirs once on upgrade.
2017-01-05 16:31:55 -08:00
Michael Schurter 774afd8800 Fail fast on taskdir errors 2017-01-05 16:31:55 -08:00
Michael Schurter 7260d0bca3 Test tasks now require driver name 2017-01-05 16:31:55 -08:00
Michael Schurter 3ea09ba16a Move chroot building into TaskRunner
* Refactor AllocDir to have a TaskDir struct per task.
* Drivers expose filesystem isolation preference
* Fix lxc mounting of `secrets/`
2017-01-05 16:31:49 -08:00
Alex Dadgar 8d5f0fea69 Merge pull request #2128 from hashicorp/f-dispatch
Nomad Constructor Jobs and Dispatch
2017-01-06 05:22:49 +08:00
Alex Dadgar 34fc25757e Merge pull request #2157 from hashicorp/t-client-tests
Fix client tests deadlocking
2017-01-06 05:21:05 +08:00
Alex Dadgar a29f253a12 use helper 2017-01-05 13:19:01 -08:00
Diptanu Choudhury 247bda9a88 Unlocking if we return before adding a new alloc runner 2017-01-05 13:18:48 -08:00
Alex Dadgar ee523062d1 Fix TestClient_BlockedAllocations 2017-01-05 13:15:08 -08:00
Diptanu Choudhury 9721a1ab04 Fixed how alloc lock is held 2017-01-05 13:06:56 -08:00
Alex Dadgar 205caf341f Fix SaveRestoreState 2017-01-05 12:32:44 -08:00
Michael Schurter 13064768ac Fix race when shutting down in dev mode
Client.Shutdown holds the allocLock when destroying alloc runners in dev
mode.

Client.updateAllocStatus can be called during AllocRunner shutdown and
calls getAllocRunners which tries to acquire allocLock.RLock. This
deadlocks since Client.Shutdown already has the write lock.

Switching Client.Shutdown to use getAllocRunners and not hold a lock
during AllocRunner shutdown is the solution.
2017-01-03 17:21:50 -08:00
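The locking pattern described in 13064768ac can be sketched roughly as follows. This is a minimal illustration, not Nomad's code: the `client` and `runner` types, the `onUpdate` callback, and `newClient` are hypothetical stand-ins, and only the names `allocLock`, `getAllocRunners`, and `updateAllocStatus` come from the commit message. The point is that shutdown copies the runners under a short read lock and then destroys them with no lock held, so a callback that takes the read lock during destruction cannot deadlock against a held write lock.

```go
package main

import (
	"fmt"
	"sync"
)

// runner is a hypothetical stand-in for an alloc runner.
type runner struct {
	id       string
	onUpdate func() // invoked while the runner is being destroyed
}

// Destroy reports a final status update, as a runner might during shutdown.
func (r *runner) Destroy() { r.onUpdate() }

// client is a hypothetical stand-in for the Nomad client.
type client struct {
	allocLock sync.RWMutex
	allocs    map[string]*runner
}

// getAllocRunners copies the current runners while holding only the read lock.
func (c *client) getAllocRunners() []*runner {
	c.allocLock.RLock()
	defer c.allocLock.RUnlock()
	out := make([]*runner, 0, len(c.allocs))
	for _, r := range c.allocs {
		out = append(out, r)
	}
	return out
}

// updateAllocStatus acquires the read lock via getAllocRunners. If shutdown
// held the write lock while calling Destroy, this call would block forever.
func (c *client) updateAllocStatus() int {
	return len(c.getAllocRunners())
}

// shutdown destroys every runner without holding allocLock during Destroy.
func (c *client) shutdown() int {
	destroyed := 0
	for _, r := range c.getAllocRunners() {
		r.Destroy()
		destroyed++
	}
	return destroyed
}

// newClient builds a client whose runners call back into it on Destroy.
func newClient(ids ...string) *client {
	c := &client{allocs: make(map[string]*runner)}
	for _, id := range ids {
		c.allocs[id] = &runner{id: id, onUpdate: func() { c.updateAllocStatus() }}
	}
	return c
}

func main() {
	c := newClient("a1", "a2")
	fmt.Println("destroyed:", c.shutdown())
}
```

Holding `allocLock.Lock()` across the loop in `shutdown` would reproduce the deadlock, because `sync.RWMutex` does not let a read lock be acquired while the write lock is held.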
Michael Schurter 4a9a574d9d Merge pull request #2054 from hashicorp/f-prestart
Add Driver.Prestart method
2016-12-20 16:18:56 -08:00
Michael Schurter 8e1ae14feb Remove unneeded env building 2016-12-20 16:14:42 -08:00
Michael Schurter 39f587a2af Fix tests broken by TaskEnv change 2016-12-20 14:37:35 -08:00
Michael Schurter 0d90e96925 lxc: Set image local env vars 2016-12-20 14:37:18 -08:00
Michael Schurter 05b49008eb Remove unneeded waitClient field 2016-12-20 14:29:57 -08:00
Michael Schurter ea92cd102a Append host env vars on every task env 2016-12-20 12:24:24 -08:00
Michael Schurter 458c2ed5f1 Fix formatting of downloading image message 2016-12-20 11:57:26 -08:00
Michael Schurter e34d1e5d23 Use startContainer wrapper 2016-12-20 11:55:40 -08:00
Diptanu Choudhury 93091f7902 Fixed a test 2016-12-20 11:53:37 -08:00
Michael Schurter 2aa235f8f2 Rename InitializationMessage to DriverMessage 2016-12-20 11:51:09 -08:00
Michael Schurter 85b0cecff2 Emit "Downloading image" event 2016-12-20 11:40:34 -08:00
Diptanu Choudhury 6c11f38cb0 Merge pull request #2081 from hashicorp/f-gc
Garbage collector for allocations
2016-12-20 11:19:32 -08:00
Diptanu Choudhury b6120e2fc8 Removing the alloc runner from GC if it is destroyed by the server 2016-12-20 11:14:22 -08:00
Diptanu Choudhury 6e6e0d364a Added comments 2016-12-20 10:49:48 -08:00
Alex Dadgar 746d4c7ee3 Small cleanups 2016-12-19 14:22:08 -08:00
Alex Dadgar 18739a4433 Merge pull request #1980 from dmexe/network-aliases
Add network_aliases for docker driver
2016-12-19 14:17:48 -08:00
Alex Dadgar 7cdf24f05f Fix Docker Logging Type interpolation
This PR fixes an issue that made Logging.Type un-interpretable in the
docker driver.
2016-12-19 13:42:58 -08:00
Alex Dadgar 2f3aeed2f8 Merge pull request #2063 from tmichaud314/fix-docker-driver-auth-interpolation
Fixes docker-driver Auth-config interpolation
2016-12-19 13:41:27 -08:00
Diptanu Choudhury e072961cea Added tests 2016-12-19 13:21:47 -08:00
Alex Dadgar 4e8035756b Fix test and prevent job with payload from being submitted 2016-12-18 16:32:14 -08:00
Alex Dadgar 072ff1c3ee ensure file doesn't escape 2016-12-18 15:48:30 -08:00
Diptanu Choudhury 36b5545d6b Making the gc allocator understand real disk usage 2016-12-16 18:34:59 -08:00
Alex Dadgar 159c819e08 Client writes payload to disk 2016-12-16 15:11:56 -08:00
Alex Dadgar b1883daae8 Use new combined meta data function in env 2016-12-16 10:45:09 -08:00
Alex Dadgar 7778339f03 Fix mapstructure tag formatting for lxc driver 2016-12-16 10:24:17 -08:00
Diptanu Choudhury 7aef9bcabe Added the stats collector to GC 2016-12-14 15:11:11 -08:00
Diptanu Choudhury e855cd587b Refactored hoststats collector 2016-12-14 15:07:42 -08:00
Diptanu Choudhury 0ffd92668d GC-ing before we start a new allocation 2016-12-14 15:04:06 -08:00
Diptanu Choudhury afdaa979f7 Added a garbage collector for allocations 2016-12-14 15:01:12 -08:00
Alex Dadgar 648ad2ebc5 Merge pull request #2096 from hashicorp/b-addAlloc
Fix race and remove panic
2016-12-13 13:50:17 -08:00
Diptanu Choudhury 53fb09023c cancelling waiting for remote allocation if the alloc doesn't need migration 2016-12-13 13:06:33 -08:00
Alex Dadgar 3cbd237512 Fix race and remove panic 2016-12-13 12:34:23 -08:00
Christoffer Kylvåg 6a1f32b8ba #1680: Continue after not being able to stat a mountpoint 2016-12-13 12:28:57 +01:00
Tom Michaud d0c01c8816 Fixes docker-driver Auth-config interpolation 2016-12-06 13:30:23 -07:00
Diptanu Choudhury cbf73908ff Setting the appropriate file permissions while un-archiving compressed alloc dir 2016-12-05 17:04:43 -08:00
Diptanu Choudhury bc17cacca0 Merge pull request #2017 from hashicorp/b-sticky
Not moving alloc data when sticky is turned off
2016-12-05 14:11:45 -08:00
Diptanu Choudhury 21f49564d3 Not moving alloc data when sticky is turned off 2016-12-05 14:00:01 -08:00
Michael Schurter 770ed703d0 Add Driver.Prestart method
The Driver.Prestart method currently does very little but lays the
foundation for where lifecycle plugins can interleave execution _after_
task environment setup but _before_ the task starts.

Currently Prestart does two things:

* Any driver specific task environment building
* Download Docker images

This change also attaches a TaskEvent emitter to Drivers, so they can
emit events during task initialization.
2016-12-02 11:03:48 -08:00
Michael Schurter 1c4195b985 Fix string formatting 2016-12-01 11:22:51 -08:00
Alex Dadgar 86ed1fb2e5 Disallow stale queries when deriving Vault tokens
This PR disallows stale queries when deriving a Vault token. Allowing
stale queries could result in the allocation not existing on the server
that is servicing the request.
2016-12-01 11:13:36 -08:00
Alex Dadgar 70396c464b Make errors starting a container recoverable
This PR makes errors starting a container recoverable and tries to
optimistically handle 500 errors.
2016-11-30 15:59:47 -08:00
Diptanu Choudhury 6c179d1695 Merge pull request #2045 from hashicorp/b-docker-create-container
Returning a container if it exists instead of creating a new one
2016-11-29 17:55:33 -08:00
Diptanu Choudhury 50452520bf Returning a container if it exists instead of creating a new one 2016-11-29 17:52:19 -08:00
Michael Schurter e1d63f6c0f Bump timeout on test 2016-11-29 16:19:40 -08:00
Alex Dadgar ec4d6936ff add debug panic 2016-11-29 15:57:40 -08:00
Alex Dadgar 712e18707b add debugging 2016-11-29 14:29:37 -08:00
Diptanu Choudhury f67217297c Ensuring allocs are not added multiple times to blocking queue 2016-11-29 11:19:37 -08:00
Diptanu Choudhury bff172939b Fixes an issue with purging containers with the same name Nomad is trying to start 2016-11-28 17:37:22 -08:00
Michael Schurter 1f0bfa00aa rkt: Support host and none dns options
Fixes #2025
2016-11-28 13:13:40 -08:00
Michael Schurter 44e4414490 Fix rkt volumes
I forgot to validate the volumes field!
2016-11-28 13:13:40 -08:00
Alex Dadgar 4f2a6eae8b Merge pull request #2029 from gliptak/dockerauth1
Log when lookup in docker.auth.config fails
2016-11-28 12:45:19 -08:00
Alex Dadgar d8048ad75d Merge pull request #2033 from hashicorp/b-docker-container-exists
Make container exist errors non-retriable
2016-11-28 12:38:52 -08:00
Michael Schurter b3ede6a5b7 Use net.JoinHostPort instead of fmt.Sprintf
Using fmt.Sprintf breaks IPv6 addresses.
2016-11-28 10:38:54 -08:00
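A short illustration of why b3ede6a5b7 matters: formatting a host and port with `fmt.Sprintf` leaves an IPv6 literal unbracketed, while `net.JoinHostPort` adds the brackets. The helper names `sprintfAddr` and `joinAddr` and the port number are made up for this sketch and do not come from the Nomad source.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// sprintfAddr mirrors the naive approach: plain string formatting.
func sprintfAddr(host string, port int) string {
	return fmt.Sprintf("%s:%d", host, port)
}

// joinAddr uses net.JoinHostPort, which brackets hosts containing a colon.
func joinAddr(host string, port int) string {
	return net.JoinHostPort(host, strconv.Itoa(port))
}

func main() {
	fmt.Println(sprintfAddr("::1", 4646))    // ::1:4646 (ambiguous, not a valid address)
	fmt.Println(joinAddr("::1", 4646))       // [::1]:4646
	fmt.Println(joinAddr("127.0.0.1", 4646)) // 127.0.0.1:4646
}
```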
Alex Dadgar 8a641a8672 Make container exist errors non-retriable
This change stops the task runner from retrying container-exists
errors and adds a sleep on the local retry.
2016-11-25 19:22:58 -08:00
Gábor Lipták 6268112e86 Log when lookup in docker.auth.config fails 2016-11-23 18:43:58 -05:00
Ranjib Dey 0b29ad8787 Fix error message. Pass on template args 2016-11-21 20:12:59 -08:00
Dmitry Galinsky 3ec7ebac9c Add network_aliases for docker driver 2016-11-16 11:16:07 +03:00
Alex Dadgar 0f426d219a Merge pull request #1993 from hashicorp/b-upgrade-path
Check for Ephemeral Disk being nil
2016-11-15 16:27:48 -08:00
Alex Dadgar c2697123a9 Merge pull request #1996 from hashicorp/t-failing-tests
Fix some failing tests
2016-11-15 16:27:19 -08:00
Alex Dadgar 3e5bfcdbc4 respond to comment 2016-11-15 16:27:07 -08:00
Alex Dadgar c47ebd508e Remove old TODOs 2016-11-15 16:23:37 -08:00
Alex Dadgar cb187ffce6 Fix TestRktDriver_PortsMapping and TestAgent_LoadKeyrings 2016-11-15 15:49:05 -08:00
Alex Dadgar 9497991590 Updated AWS speeds and network_speed now overrides
This PR:

* Makes AWS network speeds more granular
* Makes `network_speed` an override and not a default
* Adds a default of 1000 MBits if no network link speed is detected.

Fixes #1985
2016-11-15 13:55:51 -08:00
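The precedence described in 9497991590 might look something like the sketch below. It assumes a hypothetical `linkSpeed` helper and treats zero as "not set" or "not detected"; the real fingerprinter's function names and types are not shown in this log. Only the 1000 MBits fallback and the override behavior come from the commit message.

```go
package main

import "fmt"

// defaultMBits is the fallback named in the commit message.
const defaultMBits = 1000

// linkSpeed is a hypothetical helper: a configured network_speed overrides
// whatever was detected, and the default applies only when nothing is known.
// A value of zero means "not configured" or "not detected" in this sketch.
func linkSpeed(configured, detected int) int {
	if configured > 0 {
		return configured // operator override always wins
	}
	if detected > 0 {
		return detected // otherwise trust the fingerprinted speed
	}
	return defaultMBits // nothing configured or detected
}

func main() {
	fmt.Println(linkSpeed(100, 10000)) // 100
	fmt.Println(linkSpeed(0, 10000))   // 10000
	fmt.Println(linkSpeed(0, 0))       // 1000
}
```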
Alex Dadgar 88c7e04348 Check for Ephemeral Disk being nil 2016-11-15 10:03:06 -08:00
Alex Dadgar eba98da487 Merge pull request #1977 from hashicorp/b-volume-mount
Change relative path from joining against the alloc dir to the task's directory.
2016-11-10 15:20:49 -08:00
Alex Dadgar a11d66f639 Remove todo 2016-11-10 15:20:19 -08:00
Alex Dadgar 74a736155c Always disable renew_token for CT config
This PR makes Nomad always disable token renewal even if Vault is
disabled. The problem was when there was a vault token in the
environment variable and Nomad/Vault integration was disabled, the
template runner would still try to renew the token.
2016-11-10 15:16:08 -08:00
Alex Dadgar eea35626b7 Changes the relative path from joining against the alloc dir to the
task's directory.

This PR changes the behavior when given a relative host path when
mounting docker containers. Prior to this, the behavior was to mount by
joining against the alloc/ directory. This PR changes it to be against
the task/ directory.
2016-11-10 14:47:54 -08:00
Alex Dadgar e8d6227b20 Do not validate the command does not contain spaces.
This PR removes validation that the command string does not contain
spaces. This can cause issues where the path contains a folder that
includes a space ("C:\Program Files\Python35\python.exe").

Fixes #1737
2016-11-10 10:22:17 -08:00
Alex Dadgar ee921ccbb2 Merge pull request #1949 from carlpett/blacklist-fingerprints-and-drivers
Support blacklisting fingerprinters
2016-11-09 10:31:17 -08:00
Calle Pettersson 4304755c12 Address comments from PR 2016-11-09 11:50:16 +01:00
Alex Dadgar fe9a200979 Merge pull request #1952 from hashicorp/b-reserved-ports-aws
Run environmental fingerprinters after host fingerprinters and AWS overrides network
2016-11-08 15:35:46 -08:00
Alex Dadgar 20a5b6fa6b Merge pull request #1965 from hashicorp/b-docker-interpolate
Interpolate all docker driver configs that are strings
2016-11-08 15:35:27 -08:00
Alex Dadgar 3b33f49cde Merge pull request #1966 from hashicorp/b-service-interpolate
Interpolate all service/check fields
2016-11-08 15:35:19 -08:00