Commit Graph

2230 Commits

Author SHA1 Message Date
Diptanu Choudhury 9d3cdded9a Merge pull request #1699 from hashicorp/f-lxc-driver
LXC Support
2016-10-25 15:17:44 -07:00
Diptanu Choudhury b5cc153d54 Added lxc related dependencies 2016-10-25 15:17:02 -07:00
Alex Dadgar 9bba036f13 Merge pull request #1852 from hashicorp/b-service-validation
Interpolate and then validate services
2016-10-25 14:28:32 -07:00
Alex Dadgar 4082732d3a Interpolate and then validate services 2016-10-25 14:27:49 -07:00
Alex Dadgar bf0981363f Merge pull request #1850 from hashicorp/f-fs-secret
Disallow fs to read secret directory
2016-10-25 11:35:08 -07:00
Alex Dadgar 8e07c2750e Merge pull request #1839 from hashicorp/f-signal-constraints
Signal creates an auto-constraints
2016-10-25 11:09:33 -07:00
Alex Dadgar f8419fdd6e Add CaPath to Vault config in consul-template 2016-10-25 11:01:50 -07:00
Diptanu Choudhury eefc8db3b3 Enabling TLS on cli 2016-10-25 10:39:17 -07:00
Michael Schurter 4f45aece4b Fingerprint rkt volume support and make periodic
Fix rkt docs and custom volume mounting
2016-10-25 09:46:49 -07:00
Michael Schurter 5d358c7eba Allow mounting alloc-dir-relative paths in docker 2016-10-25 09:46:49 -07:00
Michael Schurter 49ed6da0ad Enable rkt and docker volume mounting by default 2016-10-25 09:46:49 -07:00
Michael Schurter f075bda9b9 Make volume name unique 2016-10-25 09:46:49 -07:00
Michael Schurter 83a11fc93b Bump minimum required rkt version; update docs
Make section names match between docker and rkt
2016-10-25 09:46:49 -07:00
Michael Schurter edf657b58a Fix docker reference in rkt test 2016-10-25 09:46:49 -07:00
Michael Schurter 02ed35bd1c Add arbitrary volume support to rkt 2016-10-25 09:46:49 -07:00
Michael Schurter 473c28824c Fix standard mounts in rkt and tests 2016-10-25 09:46:49 -07:00
Alex Dadgar da8b05ba17 Fix merge 2016-10-24 17:04:10 -07:00
Alex Dadgar 03eba049ed Merge pull request #1848 from hashicorp/f-vault-error
Thread through whether DeriveToken error is recoverable or not
2016-10-24 15:01:18 -07:00
Diptanu Choudhury e03927bb5c Changed the way TLS config is parsed 2016-10-24 13:56:19 -07:00
Alex Dadgar e85d0ebace Merge pull request #1840 from hashicorp/f-kill-fail
Change how we mark tasks as failed and allow consul-template to fail tasks
2016-10-24 13:40:52 -07:00
Alex Dadgar 4ae735c8ba Disallow fs to read secret directory 2016-10-24 11:14:05 -07:00
Alex Dadgar 692a809919 Merge pull request #1842 from hashicorp/f-version-and-id
Print the version and client node ID
2016-10-24 10:13:33 -07:00
Diptanu Choudhury 2e3118e69c Implemented TLS support for http and rpc 2016-10-23 22:22:00 -07:00
Alex Dadgar 5577e53b20 Merge pull request #1845 from hashicorp/f-remove-disk-usage-acct
Remove disk usage enforcement
2016-10-22 19:01:51 -07:00
Alex Dadgar e85a67a49a Fix signal test for docker 2016-10-22 18:32:48 -07:00
Alex Dadgar ede3a814ba Small fixes 2016-10-22 18:20:50 -07:00
Alex Dadgar 0070178741 Thread through whether DeriveToken error is recoverable or not 2016-10-22 18:08:30 -07:00
Michael Schurter d937d3aede Fix comment form 2016-10-21 16:56:33 -07:00
Michael Schurter 285e80ac0f Remove disk usage enforcement
Many thanks to @iverberk for the original PR (#1609), but we ended up
not wanting to ship this implementation with 0.5.

We'll come back to it after 0.5 and hopefully find a way to leverage
filesystem accounting and quotas, so we can skip the expensive polling.
2016-10-21 13:55:51 -07:00
Alex Dadgar aa0d8d0d8d Print the version and client node ID 2016-10-20 17:46:04 -07:00
Alex Dadgar 46a7d1a0d7 Change how we mark tasks as failed and allow consul-template to fail tasks 2016-10-20 17:27:16 -07:00
Alex Dadgar 41b5679015 Advertise signalling abilities 2016-10-19 15:06:23 -07:00
Alex Dadgar ae1ea0e5ba Actually mount the local directory 2016-10-18 15:57:12 -07:00
Alex Dadgar b384bff053 Feedback 2016-10-18 15:01:04 -07:00
Alex Dadgar ba0b3963ef Comments 2016-10-18 11:36:04 -07:00
Alex Dadgar 4f8bfd7b18 Tests 2016-10-18 11:24:20 -07:00
Alex Dadgar 36cfe6e89e Large refactor of task runner and Vault token rehandling 2016-10-18 11:24:20 -07:00
Alex Dadgar 53eeec9bc1 Merge pull request #1801 from hashicorp/f-signals
Consul-template signal change mode
2016-10-18 11:23:47 -07:00
Michael Schurter 34f7cbd10f Disable lxc by default 2016-10-13 13:22:12 -07:00
Michael Schurter 1dbb2b7164 Cleanup comments/whitespace 2016-10-13 13:05:55 -07:00
Michael Schurter 38b2020291 Mount secret dir 2016-10-13 12:45:33 -07:00
Diptanu Choudhury 4e86a5f906 throwing an error if stats line can't be converted to k/v pair 2016-10-12 17:18:58 -07:00
Diptanu Choudhury ea5d9d959a Bind mounting alloc dir into container 2016-10-12 17:18:58 -07:00
Diptanu Choudhury 6312ea3f8f Setting the network type 2016-10-12 17:18:58 -07:00
Diptanu Choudhury ce334e0e04 Adding cpu resource limits 2016-10-12 17:18:58 -07:00
Diptanu Choudhury bb2a580ef1 Implemented an LXC Driver 2016-10-12 17:18:58 -07:00
Evan Phoenix e7a98d5500 Make EvalSymlink errors more verbose 2016-10-12 17:07:21 -07:00
Evan Phoenix 8864a506aa Disable the syslog logging system on Docker For Mac
The syslog logging system depends on the ability for a unix socket to be
accessed by the docker daemon in the $TMPDIR of the host. This doesn't
work on Docker For Mac because the docker daemon is running inside a VM,
and while /tmp is accessible, the filesystem used to share them doesn't
support unix socket files, and thus it doesn't work.
2016-10-12 17:07:21 -07:00
Evan Phoenix f8a65a3b9d Resolve alloc/state directories to make Docker For Mac happy
* In -dev mode, `ioutil.TempDir` is used for the alloc and state
directories.
* `TempDir` uses `$TMPDIR`, which on OS X contains a per-user
directory under `/var/folders`.
* `/var` is actually a symlink to `/private/var`
* Docker For Mac validates the directories that are passed to bind
against a whitelist on OS X. That whitelist contains `/private`, but not `/var`. It does not
expand the path, and so any paths in `$TMPDIR` fail the whitelist check.

And thusly, by expanding the alloc/state directories the value passed
for binding does contain `/private` and Docker For Mac is happy.
2016-10-12 17:06:25 -07:00
Alex Dadgar eec1a154ec add plugin kill 2016-10-12 13:24:22 -07:00
Alex Dadgar 86238387e7 Send Executor Ctx separately 2016-10-12 11:35:29 -07:00
Alex Dadgar db4e676d73 Merge pull request #1803 from hashicorp/b-vault-parse
Fix Vault Config parsing of booleans
2016-10-11 13:47:46 -07:00
Michael Schurter ed5ff3a104 Merge pull request #1804 from hashicorp/f-job-env-var
Add NOMAD_JOB_NAME to task environment
2016-10-11 13:34:54 -07:00
Alex Dadgar 82960c46d8 Tests 2016-10-11 13:28:18 -07:00
Ben Barnard 83f647ed84 Replace "the the" with "the" in documentation and comments 2016-10-11 15:31:40 -04:00
Michael Schurter ca5439eca1 Add NOMAD_JOB_NAME to environment 2016-10-11 11:20:42 -07:00
Alex Dadgar 751aa114bf Fix Vault parsing of booleans 2016-10-10 18:04:39 -07:00
Alex Dadgar bc35eaee21 Task runner sends signals 2016-10-10 15:09:00 -07:00
Alex Dadgar 00a1234c55 Executor + Java/Raw Exec/Exec 2016-10-10 11:47:04 -07:00
Alex Dadgar 5b01b1be1b Rkt 2016-10-10 11:47:04 -07:00
Alex Dadgar 280af8f4d1 Docker + Qemu 2016-10-10 11:47:04 -07:00
Alex Dadgar 08eeef0140 Merge pull request #1796 from hashicorp/f-task-runner
Task runner integrates with TaskTemplateManager
2016-10-07 15:13:14 -07:00
Michael Schurter d8f8048d85 Merge pull request #1767 from hashicorp/f-docker-volumes-logging
Support Docker Volumes and Logging
2016-10-07 12:10:59 -07:00
Michael Schurter f0d04bd798 Add comment and fix log line code style 2016-10-07 11:58:21 -07:00
Michael Schurter 523dbfcc81 Remove VolumesFrom feature
Since containers are named with alloc ids it's difficult to use safely.
Not to mention task scheduling ordering issues could break it as well.
2016-10-07 11:58:13 -07:00
Alex Dadgar e2d49eb4a2 Comments 2016-10-06 15:21:59 -07:00
Alex Dadgar 68c5fe78f8 Tests 2016-10-06 15:17:34 -07:00
Alex Dadgar 8fb07bb083 Fix handling of restart in TaskEvents 2016-10-06 15:06:54 -07:00
Alex Dadgar 8eb7fa91cf Start of integration 2016-10-06 15:05:49 -07:00
Alex Dadgar 3693de99d8 Merge pull request #1783 from hashicorp/f-consul-template
Consul template manager
2016-10-06 15:05:01 -07:00
Alex Dadgar c7f76ea78d Make tests channel based 2016-10-06 14:51:54 -07:00
Alex Dadgar 19a6aefd68 more vendoring 2016-10-06 12:36:44 -07:00
Michael Schurter f777faba00 Add comments to config key constants 2016-10-03 16:04:33 -07:00
Michael Schurter 0d66b8aef0 Only launch syslog server if container uses syslog 2016-10-03 15:22:10 -07:00
Michael Schurter 44219cc083 Put docker volume support behind conf flag
Also add tests and fix bug with logging driver configuration.
2016-10-03 15:02:50 -07:00
Jan-Hendrik Lendholt a26a501120 Fixed a bug when specifying a logging driver other than syslog.
Before this commit, if the logging config did not contain the logging option "syslog-address", Nomad would always insert it.
If you then chose a log driver other than syslog, Docker would fail because it received an option that is invalid for the selected driver.
Now Nomad only inserts the syslog address when no logging options are given at all, which preserves the default Nomad settings.
2016-10-03 15:02:50 -07:00
Jan-Hendrik Lendholt 6c7cbe5fcb Added support to mount host folders into container. For example if you don't want to bake certificates into the container, you can mount them into the directory directly.
Furthermore, I added support for volumes-from.

Currently, there is no support to move the data from one container to another, hence: If a container spawns on another host, it is very likely, that the data will not be found.
2016-10-03 15:02:49 -07:00
Jan-Hendrik Lendholt ac5cde4641 Added logging options support for docker driver 2016-10-03 15:02:49 -07:00
Alex Dadgar d2837dec44 Do not allow path to escape the alloc dir for the FS commands 2016-10-03 14:58:44 -07:00
Michael Schurter 6dea6df919 Restore lost chan inits 2016-10-03 14:56:50 -07:00
Alex Dadgar 4eaabd675c Consul Template Manager 2016-10-03 12:59:31 -07:00
Diptanu Choudhury d50c395421 Getting snapshot of allocation from remote node (#1741)
* Added the alloc dir move

* Moving allocdirs when starting allocations

* Added the migrate flag to ephemeral disk

* Stopping migration if the allocation doesn't need migration any more

* Added the GetAllocDir method

* refactored code

* Added a test for alloc runner

* Incorporated review comments
2016-10-03 09:59:57 -07:00
Michael Schurter b117725dc9 Only log consul errors once since last successful run 2016-09-28 17:18:45 -07:00
Alex Dadgar 9beee9c891 Merge pull request #1762 from hashicorp/b-scan-pids
Constant size space tracking of pids
2016-09-27 17:28:35 -07:00
Alex Dadgar 320b89d57a Constant time space tracking of pids 2016-09-27 16:57:26 -07:00
Michael Schurter 80085ddda5 Merge pull request #1735 from hashicorp/b-bootstrap-flapping
Retry all servers on RPC call failure
2016-09-27 16:33:15 -07:00
Michael Schurter d486de3804 Remove unused const 2016-09-27 16:04:01 -07:00
Diptanu Choudhury 2b1d214b0d Avoiding copying files if they are already present in chroot (#1753) 2016-09-27 11:43:27 -07:00
Michael Schurter 2e696c5e61 Fix lies found in comments by fact checkers 2016-09-26 16:51:53 -07:00
Michael Schurter 11cf9686a6 No need to put reaper ticker on the struct 2016-09-26 16:15:19 -07:00
Michael Schurter 2eb0062959 Drop clumsy timeout on discovery notifications
It's better to just let goroutines fall back to their longer retry
intervals than try to be clever here.
2016-09-26 16:05:21 -07:00
Michael Schurter 307e674eca Flip disco chan; clarify method names/comments 2016-09-26 15:52:40 -07:00
Michael Schurter 888ee21270 Return csv of servers from Stats, not just count 2016-09-26 15:40:26 -07:00
Alex Dadgar b28e817b1d Test fix 2016-09-26 15:35:59 -07:00
Michael Schurter 7dc0079dd2 doDisco -> triggerDiscoveryCh; discovered -> serversDiscoveredCh
Also fix log line formatting
2016-09-26 15:21:28 -07:00
Michael Schurter 434e4be97c noServers -> noServersErr 2016-09-26 15:12:35 -07:00
Michael Schurter b2ddb85a78 consul -> Consul 2016-09-26 15:06:57 -07:00
Diptanu Choudhury 12c7873db2 Closing files when files are removed 2016-09-23 22:17:53 -07:00
Michael Schurter 37cfb2769c Replace periodic handlers with event driven disco
Remove use of periodic consul handlers in the client and just use
goroutines. Consul Discovery is now triggered with a chan instead of
using a timer and deadline to trigger.

Once discovery is complete a chan is ticked so all goroutines waiting
for servers will run.

Should speed up bootstrapping and recovery while decreasing spinning on
timers.
2016-09-23 17:02:48 -07:00
Michael Schurter 2ab5264595 Retry all servers on RPC call failure
rpcproxy is refactored into serverlist which prioritizes good servers
over servers in a remote DC or who have had a failure.

Registration, heartbeating, and alloc status updating will retry faster
when new servers are discovered.

Consul discovery will be retried more quickly when no servers are
available (eg on startup or an outage).
2016-09-23 11:44:48 -07:00
Diptanu Choudhury 589356fd55 Adding a snapshot endpoint on the client (#1730) 2016-09-21 21:28:12 -07:00
Alex Dadgar 12de69a66f Struct and parse 2016-09-21 11:31:09 -07:00
Alex Dadgar 50efdb00e9 Merge pull request #1713 from hashicorp/f-alloc-runner-vault
Vault integration in client
2016-09-20 16:15:55 -07:00
Alex Dadgar 64de46432a Merge pull request #1677 from hashicorp/f-vault-implicit-constraint
Vault implicit Task Group constraint + allow root tokens
2016-09-20 16:15:32 -07:00
Diptanu Choudhury f7a9b39e8c Ensuring that we are not emitting stats when handle is nil (#1723)
* Ensuring that we are not emitting stats when handle is nil

* Updated the changelog
2016-09-20 11:29:34 -07:00
Alex Dadgar 83905075e5 Fix comment 2016-09-17 11:31:17 -07:00
Alex Dadgar 40fc1d1dfd Task runner test 2016-09-15 17:39:08 -07:00
Alex Dadgar 0fefaef008 Alloc runner tests 2016-09-15 17:24:09 -07:00
Alex Dadgar 0f40bd41a3 Handle recovery failure 2016-09-15 12:50:44 -07:00
Alex Dadgar 688e616200 Fix token renewal 2016-09-15 11:20:51 -07:00
Alex Dadgar ec152a6d12 Clean up vault client 2016-09-14 18:10:56 -07:00
Alex Dadgar 6702a29071 Vault token threaded 2016-09-14 13:30:01 -07:00
Robert Neumayer 8dc19dbd10 Log adding of servers at INFO level 2016-09-14 22:24:17 +02:00
Michael Schurter 9a2a17b48f Merge pull request #1682 from hashicorp/b-sanity-check-state-file
Prevent state file corruption and check state file sanity on save
2016-09-13 11:24:34 -07:00
Michael Schurter cd8606b9e3 Revert "A nil context isn't an error"
This reverts commit fe9fe4c26259c1ad3bd7e94bd711418aaf819b20.
2016-09-12 12:56:12 -07:00
Diptanu Choudhury ab6a9a5120 Fixing alloc runner tests 2016-09-04 19:09:08 -07:00
Diptanu Choudhury 0f77d81f7d Merge pull request #1683 from mwieczorek/enable-syslog-for-windows
Enable syslog for windows
2016-09-04 17:07:25 -07:00
Michal Wieczorek 94071fc294 Enable syslog server and universal collector for windows 2016-09-05 00:26:36 +02:00
Michael Schurter 8a57913a44 A nil context isn't an error 2016-09-02 16:24:53 -07:00
Michael Schurter f601361d58 Don't serialize task states twice in state files 2016-09-02 16:07:06 -07:00
Michael Schurter 6cb6d9cdf1 Lock around saving state
Prevent interleaving state syncs as it could conceivably lead to
empty state files as per #1367
2016-09-02 16:07:06 -07:00
Michael Schurter e7dd443447 Add sanity check to SaveState
Also just reuse the task states snapshot taken by `Alloc()` instead of
doing a redundant copy.
2016-09-02 16:07:06 -07:00
Alex Dadgar eecef73302 syscall error 2016-09-02 15:00:46 -07:00
Alex Dadgar eef786dd9d Secret dir materialized in alloc/task directory 2016-09-02 12:44:05 -07:00
Alex Dadgar 2c8dd8bbd3 Revert "Introduce a Secret/ directory" 2016-09-01 17:23:15 -07:00
Alex Dadgar 4a8fba5cf7 small fixes 2016-09-01 13:38:31 -07:00
Alex Dadgar b0adaa5301 Allow root token 2016-09-01 12:05:08 -07:00
Alex Dadgar 8ca3a16825 Fingerprint 2016-09-01 11:10:14 -07:00
Alex Dadgar 1ed454dd60 Merge pull request #1671 from hashicorp/f-secret-dir2
Introduce a Secret/ directory
2016-09-01 09:56:17 -07:00
Alex Dadgar 9fa23e3536 Symlink on windows 2016-08-31 21:41:44 -07:00
Alex Dadgar 5d3b47e648 Address comments and reserve 2016-08-31 18:11:02 -07:00
Michael Schurter cbb0b8e31e Merge pull request #1668 from hashicorp/b-fix-consul-updates
Fix old services not getting removed from consul on update
2016-08-31 17:17:09 -07:00
Alex Dadgar 0626eb9619 environment variables 2016-08-31 13:56:11 -07:00
Alex Dadgar d59e14eed4 Interface + tests 2016-08-30 21:40:32 -07:00
Vishal Nayak b6b73545ea Merge pull request #1606 from hashicorp/f-vault-client
VaultClient for Nomad client's interactions with Vault
2016-08-30 13:13:54 -04:00
vishalnayak d0ad1603c3 Print debug message only when error is non-nil 2016-08-30 13:14:34 -04:00
vishalnayak 55a6f06e15 Addressed review feedback 2016-08-30 13:08:13 -04:00
vishalnayak 3808dd0ff8 Return only fatal error to renewal error channel 2016-08-30 12:46:59 -04:00
vishalnayak a0dbfe25b3 Fix tests 2016-08-29 21:30:06 -04:00
vishalnayak 82f6209e97 tokenDeriver function pointer to derive tokens.
Remove rpc*, connPool, node and region from vaultclient.
2016-08-29 20:32:05 -04:00
Alex Dadgar 14b7126511 Secret dir, hello world 2016-08-29 15:41:52 -07:00
vishalnayak f35bb409b6 Use Job.LookupTaskGroup 2016-08-29 16:34:39 -04:00
vishalnayak 160ba48eb4 Address review feedback 2016-08-29 12:47:33 -04:00
Alex Dadgar aaca0bdaf4 Make maxSize exported so that it is serialized 2016-08-28 17:48:35 -07:00
Michael Schurter 32626ac608 Remove unused embedded lock on ExecContext 2016-08-26 11:58:03 -07:00
Michael Schurter d31f373a5b Merge pull request #1653 from hashicorp/b-fix-artifact-retry
Don't fail other tasks when retrying artifact get
2016-08-26 09:53:39 -07:00
Michael Schurter 592b898f43 Make sure bad doesn't fail before web runs 2016-08-25 17:25:51 -07:00
Michael Schurter 139d2e7939 Ensure bad task reaches terminal state within test 2016-08-25 16:05:19 -07:00
Michael Schurter a739058fa3 Ensure web task exited successfully
Web task should run to completion successfully while the `bad` task is
retrying artifact downloads.
2016-08-25 14:42:50 -07:00
Michael Schurter 5ce26f82fe Don't fail other tasks when retrying artifact get
The artifact fetching may be retried and succeed, so don't set the task
as dead.

Fixes #1558
2016-08-25 13:16:41 -07:00
Ivo Verberk d89dc40fe2 Small comment fix. 2016-08-25 20:50:11 +02:00
Ivo Verberk 57012e8d8c Monitor the complete alloc directory, not just the shared part. 2016-08-25 20:48:19 +02:00
Ivo Verberk 9113244131 Don't duplicate TaskKilled event and check for TaskSiblingFailed. 2016-08-25 20:11:10 +02:00
Diptanu Choudhury e9b27a528c Fixed the raw_exec fingerprint test 2016-08-24 13:38:43 -05:00
Diptanu Choudhury 10e10b2fe1 fixed the qemu fingerprinter test 2016-08-24 12:25:02 -05:00
vishalnayak 56e42cf03d Employ DeriveVaultToken API and flesh-up DeriveToken 2016-08-24 12:29:59 -04:00
vishalnayak 6002e596c4 VaultClient for Nomad Client 2016-08-24 09:43:45 -04:00
Diptanu Choudhury 9309247f9d Fixed a java test 2016-08-23 16:51:09 -05:00
Diptanu Choudhury 05fe72e89e fixed the exec fingerprinter test 2016-08-23 16:40:56 -05:00
Diptanu Choudhury 588a4802c1 removing driver_mock 2016-08-23 16:05:03 -05:00
Alex Dadgar 0f3ec9c759 Fix TestDockerDriver_Fingerprint 2016-08-23 10:39:40 -07:00
Alex Dadgar 1da8566322 Merge pull request #1580 from hashicorp/f-disk-usage-monitoring
Monitor and enforce shared allocation directory disk usage
2016-08-23 09:49:53 -07:00
Diptanu Choudhury 1e1eef56a1 Putting the mock driver behind a build flag 2016-08-22 15:02:28 -05:00
Diptanu Choudhury 4ca623bcfe blocking chained allocations until previous allocation has terminated 2016-08-22 11:34:24 -05:00
Kenjiro Nakayama b06c6d9311 driver.docker: tiny: debug messages output task name instead of image name 2016-08-21 19:51:32 +09:00
Alex Dadgar a90dafe9ab handle the upgrade case 2016-08-18 19:01:24 -07:00
Ivo Verberk 2a17895a83 Disk resource monitoring and enforcement 2016-08-18 07:59:03 +02:00
Alex Dadgar 895c31f605 Nodes generate Secret ID and used for retrieving allocations and registering 2016-08-17 16:31:47 -07:00
vishalnayak ff9bb5b08b Disable Vault instead of supplying test token 2016-08-17 16:25:38 -07:00
vishalnayak 1c1457b01b Fix a few client tests 2016-08-17 16:25:38 -07:00
Alex Dadgar 7d899b6c60 Pass Vault config to client 2016-08-17 16:23:29 -07:00
Diptanu Choudhury dc0e395982 re-using copyimage 2016-08-17 15:25:03 -07:00
Diptanu Choudhury 0d7cd53c63 Fixed docker tests 2016-08-17 15:25:03 -07:00
Diptanu Choudhury ab7f8847c1 changing error statement 2016-08-17 13:48:31 -07:00
Diptanu Choudhury 2968ce9399 Fixed the docker script check 2016-08-17 13:23:48 -07:00
Alex Dadgar 3b9188fcf0 Merge pull request #1598 from nak3/rkt-fix5
driver.rkt: Remove unnecessary job validation
2016-08-16 15:12:45 -07:00
Alex Dadgar 423da3e99d Merge pull request #1593 from hashicorp/f-rereg
Reregister Client on failed heartbeat
2016-08-16 13:29:34 -07:00
Kenjiro Nakayama c97beb8deb driver.rkt: Remove unnecessary job validation 2016-08-16 23:33:34 +09:00
ramukima 4f18963a97 go fmt performed code when copied from another directory got messed up again ? Ok, ran go fmt again 2016-08-16 09:20:13 -04:00
ramukima 17b902f20f issue-1588 : Allow extra driver config args as a passthrough for qemu executable from a task specification 2016-08-15 23:36:13 -04:00
Alex Dadgar 84820db86f If the client detects that a heartbeat has failed because it is not registered, reregister 2016-08-15 17:24:09 -07:00
Alex Dadgar 64b86bed0d Merge pull request #1581 from nak3/fix-rkt2
Set host environment variables to taskEnv of rkt driver
2016-08-15 10:31:10 -07:00
Alex Dadgar a2b603bd82 Merge pull request #1586 from nak3/rkt-fix3
tiny: Catch error returned from SyncServices in rkt driver
2016-08-15 10:29:28 -07:00
Kenjiro Nakayama d0ccdacd08 rkt.driver: Fix wrong MB calculation 2016-08-14 14:27:42 +09:00
Kenjiro Nakayama 906526fbe0 tiny: Catch error returned from SyncServices in rkt driver 2016-08-14 13:11:17 +09:00
Kenjiro Nakayama 0efc0e0231 Set host environment variables to taskEnv of rkt driver 2016-08-14 00:42:53 +09:00
Diptanu Choudhury 8c8d00dd5b Merge pull request #1572 from hashicorp/fix-docker-test
Fixed docker tests
2016-08-12 13:04:33 -07:00
Diptanu Choudhury f3420c65e3 Updated the busybox images 2016-08-12 11:39:58 -07:00
Diptanu Choudhury dd7e69006e Not running tests parallel 2016-08-11 21:53:27 -07:00
Alex Dadgar b34d65f1a5 Update test.sh 2016-08-11 21:35:50 -07:00
Diptanu Choudhury 839ecd1df6 Fixed docker tests 2016-08-11 19:28:41 -07:00
Alex Dadgar ed23dff23c Merge pull request #1571 from hashicorp/t-logs-tests
Fixing logs tests on travis
2016-08-11 19:24:46 -07:00
Alex Dadgar b3dca54a1a Wrap file rotator tests in wait for 2016-08-11 19:23:03 -07:00
Alex Dadgar 8323b6a0b5 only use polling 2016-08-11 18:59:48 -07:00
Diptanu Choudhury b71f687f62 Merge pull request #1413 from hashicorp/b-tests
Fix flaky tests
2016-08-11 18:15:29 -07:00
Alex Dadgar 2614b0f06e Update task_runner_test.go 2016-08-11 18:01:27 -07:00
Alex Dadgar ac7749f70e Fix task runner test 2016-08-11 13:16:17 -07:00
Kenjiro Nakayama fe13453012 Update after the review 2016-08-11 10:53:33 +09:00
Kenjiro Nakayama c3b871e90d Return error when client failed to collect host stats 2016-08-11 09:38:28 +09:00
Diptanu Choudhury 9a75052d2c Merge pull request #1518 from pubnub/feature/chroot-map-rebase
Add config field to specify chroot mapping for exec driver
2016-08-10 17:00:03 -07:00
Diptanu Choudhury 28b3f511e0 Fixed some error messages 2016-08-10 15:17:32 -07:00
Diptanu Choudhury 1b74f863fe Merge pull request #1533 from nak3/fix-error-in-client
tiny: Return fmt.Errorf instead of duplicated error messages
2016-08-10 15:13:47 -07:00
Kenjiro Nakayama 6ec6c27cb4 Update debug option from string to bool 2016-08-09 16:51:00 +09:00
Kenjiro Nakayama 11a8a7218e Add debug option to rkt task config 2016-08-09 09:01:05 +09:00
Kenjiro Nakayama 6a810e6f1e Update after review 2016-08-09 08:57:26 +09:00
Kenjiro Nakayama 5c621b74e5 tiny: Return fmt.Errorf instead of duplicated error messages 2016-08-09 08:57:26 +09:00
Jay Oster 09113ffbc8 Fix Linux executor isolation test
- Properly expects the hard-coded mounts (alloc, dev, and proc) and hardcoded local directories (local and tmp)
- Also verifies that etc contains only the requested paths
2016-08-08 14:04:09 -07:00
Diptanu Choudhury c0ec1b2101 Merge pull request #1532 from nak3/fix-fingerprint-cpu-log
tiny: Fix duplicated error message in CPU fingerprint
2016-08-08 13:44:49 -04:00
Diptanu Choudhury 70d2f8ef1d Merge pull request #1534 from nak3/fix-intask_runner
tiny: print task name and error message for SaveState error
2016-08-08 13:37:25 -04:00
Kenjiro Nakayama e7863ea8ee tiny: print task name and error message for the SaveState error in task_runner 2016-08-07 13:33:58 +09:00
Kenjiro Nakayama 71371fc592 tiny: Fix duplicated error message in CPU fingerprint 2016-08-07 12:49:40 +09:00
Kenjiro Nakayama 60b58eed84 Update GetArtifact by removing unused logger 2016-08-06 23:37:32 +09:00
Diptanu Choudhury fb178a1f5f Merge pull request #1477 from nak3/add-syslog_server-test
Add Syslog server start shutdown test
2016-08-05 12:01:24 -07:00
Alex Dadgar ddd8adce96 changelog + use driver config 2016-08-05 10:55:20 -07:00
Alex Dadgar f8e4bc73c9 Merge pull request #1493 from nak3/rkt-fix1
Pass command and trust_prefix to the validation of rkt task configuration
2016-08-05 10:52:02 -07:00
Alex Dadgar 096956257d changelog 2016-08-05 10:47:44 -07:00
Kenjiro Nakayama fc1195b7ea Add Syslog server start shutdown test 2016-08-06 02:01:33 +09:00
Michal Wieczorek b688261a99 Set windows containers default network mode to 'nat' 2016-08-05 06:01:26 +02:00
Jay Oster 2ae059b41d Address review comments
- Simplify map length check in Linux Executor
- Added a `chroot_env` test for config parser
- Moved `ChrootEnv` field from ExecutorCommand to ExecutorContext
- Added a test for `chroot_env` functionality
2016-08-04 15:33:06 -07:00
Diptanu Choudhury 5ff750db96 Merge pull request #1501 from hashicorp/f-stats-opt-in
Allow operators to opt into publishing node and alloc metrics
2016-08-04 13:33:56 -07:00
Diptanu Choudhury 531b619ce4 Merge pull request #1475 from mwieczorek/windows-hostIp-portBindings
Empty host ip for windows containers port bindings
2016-08-04 13:30:43 -07:00
Alex Dadgar 1fe4158097 Merge pull request #1519 from vrenjith/master
Remove docker volumes while removing container
2016-08-04 12:54:00 -07:00
Jay Oster 24e8f752ab Add chroot_env to Java driver (which uses the exec driver internally) 2016-08-04 11:15:35 -07:00
Kenjiro Nakayama cd12645e4c Add TestRktTaskValidate 2016-08-04 23:15:13 +09:00
Kenjiro Nakayama bf27963903 Add TestRktTrustPrefix 2016-08-04 17:26:10 +09:00
Kenjiro Nakayama 1176fd123e Pass command and trust_prefix to the validation of rkt task configuration 2016-08-04 17:24:56 +09:00
vrenjith 41cf7cc623 Update docker.go
Remove container volumes
2016-08-04 11:43:50 +05:30
vrenjith 4e603e1306 Update checks_test.go
Remove docker volumes while exiting container
2016-08-04 11:42:47 +05:30
Jay Oster 7df692226a Add config field to specify chroot mapping for exec driver
- Same format as used by the internal chroot mapping
- Map: source_path -> dest_path
- Example HCL:

client {
  chroot_env {
    "/etc" = "/etc"
    "/lib" = "/lib"
    "/opt/projects/foo/bin" = "/usr/bin"
  }
}
2016-08-03 17:17:17 -07:00
Mathias Lafeldt 0727db4ca0 Test configuration of Docker working directory 2016-08-03 16:35:49 +02:00
Mathias Lafeldt d91f7dbdf8 Docker driver: allow to configure working directory 2016-08-03 16:18:15 +02:00
Alex Dadgar 47f5c8f523 use privilege of the config 2016-08-02 16:10:15 -07:00
Alex Dadgar cec6d8a1eb remove gating of ipc, user ns and pidmode based on hosts privilege mode config 2016-08-02 16:02:34 -07:00
Mathias Lafeldt acbee08a0a Fix typo: atttempts 2016-08-02 18:11:03 +02:00
Diptanu Choudhury 41b540fbc8 Allow operators to opt into publishing node and alloc metrics 2016-08-01 19:52:20 -07:00
Kenjiro Nakayama e8ce8408a4 Fix gofmt in restarts_test.go 2016-07-30 21:11:06 +09:00
Cameron Davison 777bdf4a1e fix setup consul syncer error message 2016-07-28 22:14:52 -05:00
Alex Dadgar 2999c12ef1 disable swap 2016-07-28 12:17:00 -07:00
Michal Wieczorek 4b82b6c3d4 Empty host ip for windows containers port bindings 2016-07-28 00:00:57 +02:00
Diptanu Choudhury 50842b88c7 Fixed some bugs 2016-07-25 17:26:38 -07:00
Alex Dadgar 42df093939 Merge pull request #1456 from hashicorp/b-system-job
Node Register handles transitioning to ready and creating evals
2016-07-25 12:46:35 -07:00
Alex Dadgar 3ea95bb91c initial log api impl 2016-07-25 11:16:01 -07:00
Alex Dadgar 84c3711989 Merge pull request #1457 from hashicorp/f-kill-event
Add killing event and mark task as not running when killed
2016-07-22 17:33:18 -07:00
Alex Dadgar 3ec9cf3e0d Merge pull request #1454 from hashicorp/b-blocking-stats
Driver Kill() does not block Stats()
2016-07-22 17:29:52 -07:00
Michal Wieczorek b6b3e24541 Link speed for windows network fingerprinting - tests 2016-07-22 22:49:03 +02:00
Alex Dadgar 90748cedad Add killing event and mark task as not running when killed 2016-07-21 15:49:54 -07:00
Alex Dadgar ebac5cb283 Node.Register handles the case of transitioning to ready and creating evals 2016-07-21 15:22:02 -07:00
Alex Dadgar 898435d372 Retrieve task runners in helper 2016-07-21 13:41:01 -07:00
Michal Wieczorek 679fefc155 Link speed for windows network fingerprinting 2016-07-20 22:13:50 +02:00
Diptanu Choudhury 22af229cef Merge pull request #1321 from mwieczorek/f-windows-binds
Volume binds for windows containers
2016-07-18 10:20:44 -06:00
Alex Dadgar c8e7b909c7 Merge pull request #1404 from hashicorp/f-streaming
Implement a streaming API and tail in the fs command
2016-07-12 17:23:04 -06:00
Alex Dadgar 661d100f2f address comments 2016-07-12 17:01:33 -06:00
Alex Dadgar 807bf3cf6c tests wait for the container to start 2016-07-12 11:36:06 -06:00
Alex Dadgar caa0e48841 Debug timeout 2016-07-12 11:07:05 -06:00
Alex Dadgar e1d68f3e9b Docker host network test 2016-07-12 10:59:34 -06:00
Alex Dadgar 9085b5ca0d Merge pull request #1405 from novilabs/delay-on-startup-failure
do not fail for multiple startup failures, delay instead
2016-07-12 09:51:40 -06:00
Cameron Davison d7fe56ecf3 test policy delay for startup error 2016-07-11 20:54:36 -05:00
Cameron Davison dd314ea06e if policy mode is delay, do not fail for multiple startup failures, delay instead 2016-07-11 20:40:53 -05:00
Diptanu Choudhury ed3be34105 Introduced an env var for rkt tests 2016-07-11 15:48:16 -06:00
Diptanu Choudhury 4e0b621ffa Skipping travis tests and not installing rkt on travis 2016-07-11 15:10:09 -06:00
Sean Chittenden a9b3f5e552 Alpha-sort the build platforms 2016-07-11 12:23:46 -07:00
Sean Chittenden 267198742f Merge branch 'master' into f-resource-isolation-fingerprinter 2016-07-11 12:23:09 -07:00
Sean Chittenden d309649ada Darwin currently has allocdir support.
Pointed out by: @dadgar
2016-07-11 12:19:17 -07:00
Sean Chittenden 20d87f1782 Remove cgroup fingerprinter from non-linux systems.
If someone wants to extend or reuse Cgroup detection in the future they
can move `cgroup_linux.go` to `cgroup.go` and add the relevant build
tags.

Requested by: @dadgar
2016-07-11 12:16:56 -07:00
Diptanu Choudhury e2909db9ef Merge pull request #1388 from novilabs/support-docker-syslog-unixformat-and-defaultformat
Support docker syslog unixformat and defaultformat
2016-07-11 11:17:30 -07:00
Alex Dadgar f11b1ce079 Get windows to build 2016-07-11 11:52:41 -06:00
Alex Dadgar e9ffadfdc6 initial comments 2016-07-11 10:58:18 -06:00
Sean Chittenden 9966169596 Merge branch 'f-resource-isolation-cleanup' into f-resource-isolation-fingerprinter 2016-07-11 00:10:21 -07:00
Sean Chittenden d20c5fc327 Merge pull request #1402 from hashicorp/f-resource-isolation-cleanup
Resource isolation cleanup
2016-07-11 02:09:35 -05:00
Sean Chittenden be272168c7
Rename resourceContainer{,Context} and resCon{,Ctx}. 2016-07-11 00:02:55 -07:00
Sean Chittenden 1c14e01ac0
Add a comment describing IsolationConfig 2016-07-10 23:45:44 -07:00
Sean Chittenden 5e2e5b6ccc
Merge branch 'b-exec-cleanup' into f-resource-isolation-cleanup 2016-07-10 23:41:04 -07:00
Sean Chittenden cf6b97ec6c Merge pull request #1400 from hashicorp/b-exec-cleanup
Initialize the list of available fingerprinters per platform.
2016-07-10 23:22:02 -07:00
Sean Chittenden f39e84b672
Improve readability: use of a switch vs two if's 2016-07-10 20:18:57 -07:00
Sean Chittenden 2ffbeee06c
Skip the network fingerprinter test when offline.
Conditionalize the network fingerprinter test so that it works when a
user is offline.  Similarly, when the network fingerprint test fails in
the future pass a HINT to the user to set an env var to allow the test
to be skipped in the future.
2016-07-10 20:16:06 -07:00
Sean Chittenden 2983bd6fce
Fix test for non-Linux platforms.
The following tests now check a whitelist to determine whether their
driver is present or whether the OS is supported.

* `TestAllocDir_MountSharedAlloc`
* `TestClient_Drivers_InWhitelist` (`exec` driver)
* `TestClient_Drivers` (`exec` driver)
* `TestJavaDriver_Fingerprint` (`java` driver)
2016-07-10 15:19:49 -07:00
Sean Chittenden 710173e9cb
Build the Cgroup fingerprinter on only Linux.
Change the logic from `!linux` to an empty build tag so that *if*
another platform picks up Cgroups support they can add themselves to the
necessary build tags for this fingerprinter and be on their way.
Because this technology isn't inherently Linux-specific and isn't
mutually exclusive of other resource isolation containers, resist the
urge to rename the Cgroup fingerprinter to something generic like the
ResourceContainerFingerprinter.
2016-07-10 13:55:06 -07:00
Alex Dadgar 51ae7ace25 initial tail impl 2016-07-10 13:57:04 -04:00
Sean Chittenden d4fe69ddf9
Update comments and pushdown a lock into the resource container 2016-07-10 00:12:59 -07:00
Sean Chittenden fc9cd8d4af
Push down the Linux-specific bits into resourceContainer 2016-07-10 00:06:53 -07:00
Sean Chittenden 6fc269d2a6
Move unit tests around into per-platform where appropriate. 2016-07-09 23:56:31 -07:00
Sean Chittenden a5dc6c2da9
Push the Client's cleanup of Cgroups down 2016-07-09 23:45:33 -07:00
Sean Chittenden 5dbc0bf382
Rename resourceContainer.cleanup() to executorCleanup()
Not to be confused with the imminent ClientCleanup().
2016-07-09 23:25:33 -07:00
Sean Chittenden 5f8f0a50ac
Begin cgroup pushdown into platform specific files 2016-07-09 23:01:14 -07:00
Sean Chittenden bdd7022fdc
Centralize the fingerprinters.
Add platform specific fingerprinters per platform.

Requested by: @diptanu
2016-07-09 22:31:14 -07:00
Sean Chittenden 1e2e0ca050
Initialize the list of available fingerprinters per platform. 2016-07-09 00:22:42 -07:00
Diptanu Choudhury 5b39a5db40 Fixed a debug message 2016-07-09 00:12:53 -07:00
Diptanu Choudhury e5310b76a6 Merge pull request #1399 from hashicorp/b-exec-cleanup
WIP: Cleanup exec driver
2016-07-09 00:08:43 -07:00
Sean Chittenden 03c571c61b
Consolidate fingerprinters into a single `map`. 2016-07-08 23:37:14 -07:00
Sean Chittenden 7530b27014 Move all non-Linux Fingerprinter items to the default exec driver 2016-07-08 18:35:46 -07:00
Diptanu Choudhury 5d61fa01f1 Fixed tests 2016-07-08 18:27:51 -07:00
Diptanu Choudhury 3c4002c48b Fixed the client tests 2016-07-08 17:49:58 -07:00
Diptanu Choudhury 1784d98182 Fixed the host port environment variable 2016-07-08 15:37:44 -07:00
Cameron Davison 921a6c889c remove the expected leading space, after the colon in syslog 2016-07-06 11:08:24 -05:00
Cameron Davison 07a9e15560 get into the hour minute second part of the time before looking for spaces, and then looking for the : separator 2016-07-06 11:08:24 -05:00
Alex Dadgar c35b1be845 Set running when restoring 2016-06-28 13:47:59 -07:00
Wojciech Bederski a73422b4ff Fix docker driver lockup during nomad boot
Unit mismatch caused docker driver to wait almost indefinitely during boot
(when one or more containers were a bit uncooperative during StopContainer())
This should fix problems described in #1202
2016-06-28 14:26:47 +02:00
Cameron Davison d1e7d9c50f
write state to temp file and then rename 2016-06-27 12:29:33 -05:00
Jake Champlin f094969c7b
Update failing tests 2016-06-23 11:28:17 -04:00
Alex Dadgar 14e950f882 Treat float as int 2016-06-22 15:09:39 -07:00
Alex Dadgar 4ff8edd2da Floor CPU MHz and total compute and mark hostname as unique 2016-06-22 15:01:36 -07:00
Diptanu Choudhury 0a10873aa6 Merge pull request #1335 from hashicorp/f-set-docker-timeout
Setting a timeout in the docker client
2016-06-21 17:00:14 -07:00
Diptanu Choudhury 2837d3395d Setting a timeout in the docker client 2016-06-21 16:58:21 -07:00
Diptanu Choudhury 1d5c5b18f3 Making SSL default 2016-06-21 16:41:14 -07:00
Sean Chittenden 8bdb38d016
Code golf
Pointed out by: @dadgar
2016-06-21 14:26:01 -07:00
Sean Chittenden df4fe2e502
Fix the shuffling of remote datacenters.
Pointed out by: @ryanuber
2016-06-21 13:37:22 -07:00
Diptanu Choudhury 88ac1b33a4 Not emitting per-pid stats and added the total ticks consumed by a Task 2016-06-20 17:30:25 -07:00
Alex Dadgar 19024f4da0 Merge pull request #1322 from hashicorp/b-docker-logs-splicing
Make line copy to avoid being overridden by subsequent scans
2016-06-20 13:17:49 -07:00
Alex Dadgar 661dc200f3 Make line copy to avoid being overridden by subsequent scans 2016-06-20 13:14:43 -07:00
Michal Wieczorek 67a04bb1cc Volume binds for windows containers 2016-06-20 21:46:33 +02:00
Alex Dadgar 3cd9c9590b guard against NaN 2016-06-20 10:29:46 -07:00
Alex Dadgar 7b83503596 finer grain locking 2016-06-20 10:19:06 -07:00
Alex Dadgar 744270590b Guard against bad restore 2016-06-17 14:58:53 -07:00
Alex Dadgar c9f7467ccb Driver tests use client default config 2016-06-17 14:24:49 -07:00
Sean Chittenden 7b9961f09b
Initialize the stats helpers before accessing them for the first time 2016-06-17 13:23:30 -07:00
Sean Chittenden 9cb649d247 Merge pull request #1307 from hashicorp/f-fingerprint-cpus
F fingerprint cpus
2016-06-17 12:23:40 -07:00
Sean Chittenden 9e287858de Merge pull request #1310 from hashicorp/b-logger
Create and pass only one `logger` object around per Agent
2016-06-17 12:16:35 -07:00
Sean Chittenden 21b84fc3e6
Memoize the CPU stats. Error if CPU fingerprinting fails. 2016-06-17 12:13:53 -07:00
Alex Dadgar 27c6398639 debug message when stopping container 2016-06-17 11:52:44 -07:00
Sean Chittenden 686c125fea
Record and use only the first MHz from the CPU fingerprinter.
Assume all cores are the same speed.
2016-06-17 11:06:57 -07:00
Sean Chittenden 46e2d54acf
Provide `nomad.Config` with a default `LogOutput` of `os.Stderr` 2016-06-17 06:44:10 -07:00
Sean Chittenden 9a60999100
Pass a logger arg to `NewClient` and `NewServer` 2016-06-16 23:29:23 -07:00
Sean Chittenden 4cc90753f8
In the debug log, split the unit from the measurement
awk(1) friendly is UNIX(tm) friendly.
2016-06-16 23:07:13 -07:00
Sean Chittenden 2dcb591cd8
Warn when we're unable to fingerprint the CPU MHz 2016-06-16 23:07:13 -07:00
Sean Chittenden b8e63411c0
Explicitly call `cpu.Counts()` to determine the CPU core count
Much safer than counting the number of InfoStat structs returned.
2016-06-16 23:07:13 -07:00
Sean Chittenden d17af396ca
Create config.DefaultConsulConfig() 2016-06-16 20:41:05 -07:00
Sean Chittenden fd18eb7fdb
Only register the Client services reaper when `consul.auto_advertise` is enabled 2016-06-16 18:24:58 -07:00
Sean Chittenden 1ce2cc6141
`conf` -> `config` 2016-06-16 17:05:29 -07:00
Sean Chittenden 015248a67f
Fix tests for NewServer() in client mode.
Pointy-hat: sean-

I'm positive I've done this before.
2016-06-16 16:34:22 -07:00
Sean Chittenden 952b6ce7b5
Only auto-join clients if `client_auto_join` is true 2016-06-16 14:47:21 -07:00
Sean Chittenden ec77a1869e
Test for errors 2016-06-16 14:43:46 -07:00
Sean Chittenden af55b74114 Merge pull request #1276 from hashicorp/f-consul-server-autojoin
Teach Nomad servers how to fall back to Consul.
2016-06-16 14:40:45 -07:00
Diptanu Choudhury ed67f1a347 Merge pull request #1285 from hashicorp/fix-selinux-options
Added a client option for setting selinux options
2016-06-16 22:45:24 +02:00
Diptanu Choudhury 266c417ac8 Changed the client options for docker volume selinux labels 2016-06-16 21:41:02 +01:00
Alex Dadgar fe588a2469 Guard against restoring a nil task in task_runner 2016-06-16 11:55:40 -07:00
Sean Chittenden 008d75184b
Use the `%+q` verb in log messages (vs `%q`). 2016-06-16 11:03:51 -07:00
Alex Dadgar 7375d828e1 remove trace 2016-06-15 15:47:59 -07:00
Sean Chittenden 5e0ced2ae7
Shuffle all datacenters vs only the nearest N datacenters.
Per discussion, we want to be aggressive about fanning out vs possibly
fixating on only local DCs.  With RPC forwarding in place, a random walk
may be less optimal from a network latency perspective, but it is guaranteed
to eventually result in a converged state because all DCs are candidates
during the bootstrapping process.
2016-06-15 12:40:51 -07:00
Sean Chittenden 2123460cf0
Bump various Consul search limits
Client: Search limit increased from 4 random DCs to 8 random DCs, plus nearest.
Server: Search factor increased from 3 to 5 times the bootstrap_expect.

This should allow for faster convergence in large environments (e.g.
sub-5min for 10K Consul DCs).
2016-06-15 12:40:51 -07:00
Alex Dadgar cf99fc3173 Use Status.Peers instead of Status.Ping 2016-06-15 12:00:20 -07:00
Diptanu Choudhury fa216199ce Added documentation 2016-06-15 02:42:15 +02:00
Diptanu Choudhury e08083acfe Added a client option for setting selinux options 2016-06-15 02:33:09 +02:00
Sean Chittenden 6e22b680ce
Disambiguate `auto_join` from `auto_register`, rename reg to `auto_advertise`.
Provide an option that describes the value to the user vs the
operation performed by the software.  Momentarily introducing
`auto_join`
2016-06-14 12:11:38 -07:00
Alex Dadgar 4b04e503f3 address comments 2016-06-13 17:32:18 -07:00
Alex Dadgar 8bbf4a55e5 Fix IDs and domain scoping 2016-06-13 16:30:58 -07:00
Diptanu Choudhury d019d8ef8e implemented reconciliation of unwanted services 2016-06-13 14:52:26 +02:00
Alex Dadgar 232654cdee register checks 2016-06-12 21:28:56 -07:00
Alex Dadgar a82c2bb058 Do not reconcile in client and cleanup executor a bit 2016-06-12 18:22:07 -07:00
Alex Dadgar 8e231fa382 Rename ConsulService back to Service 2016-06-12 16:36:49 -07:00
Alex Dadgar e931b42473 unify cli output 2016-06-12 13:16:07 -07:00
Alex Dadgar 6ef4bfd6bf skip docker test if no docker found 2016-06-12 11:28:43 -07:00
Alex Dadgar 2628827b7c Merge pull request #1262 from hashicorp/remove-artifact-check
Removing artifact check for java and qemu drivers
2016-06-12 11:21:18 -07:00
Alex Dadgar c4a819528a Merge pull request #1260 from hashicorp/f-alloc-stats-struct
Allocation resources returned in a struct
2016-06-12 11:18:57 -07:00
Alex Dadgar b671cd4134 Test fixes 2016-06-12 11:14:17 -07:00
Alex Dadgar fdda90229f only support latest and remove ring buffer 2016-06-12 09:32:38 -07:00
Diptanu Choudhury 34f85baab0 Fix the calculation of total ticks for docker and exec 2016-06-12 18:08:35 +02:00
Diptanu Choudhury beb362e202 Setting a flag to indicate whether fs isolation is indeed happening 2016-06-12 15:43:24 +02:00
Diptanu Choudhury 641cf50682 Not converting the abs path relative to task dir for drivers which enforce FS isolation only in linux 2016-06-12 13:54:30 +02:00
Alex Dadgar e952540f6f Allocation resources returned in a struct 2016-06-11 21:04:10 -07:00
Diptanu Choudhury a5e81ebc3a Removing un-used code 2016-06-12 01:23:49 +02:00
Diptanu Choudhury 86e4f295da Fixed the calculation of the host node ticks 2016-06-12 01:14:51 +02:00
Sean Chittenden 2f036231e5 Merge pull request #1201 from hashicorp/f-dyn-server-list
Dynamic Server Lists/Client Bootstrapping via consul.
2016-06-11 18:58:25 -04:00
Sean Chittenden 92e2cfb0ad
Walk the DCs from nearest to most remote. 2016-06-11 18:52:21 -04:00
Sean Chittenden 2968545201
Walk the DCs from nearest to most remote, no limit on the search. 2016-06-11 18:23:06 -04:00
Sean Chittenden 917766a3df
Prefer `%+q` over `%q` in log messages. 2016-06-11 18:17:20 -04:00
Sean Chittenden 445783889b
Remove default values and use nil for the executor. Much better. 2016-06-11 17:52:09 -04:00
Diptanu Choudhury 0fad17a1a9 Merge pull request #1258 from hashicorp/fix-statsd-metric-type
Emitting client resource usage metrics as gauges instead of k/v pairs
2016-06-11 13:18:52 -07:00
Diptanu Choudhury fd60cfd585 Emitting client resource usage metrics as gauges instead of k/v pairs 2016-06-11 22:17:32 +02:00
Diptanu Choudhury 19f4adbcf1 Using a different client for collecting stats and waiting on containers 2016-06-11 20:37:29 +02:00
Diptanu Choudhury 7fb507e810 Moving the clkspeed code to helper 2016-06-11 17:31:49 +02:00
Sean Chittenden f891fa0ec8
Perform a nil-check for Executor's consulServices.
Executors can `Shutdown()` before calling `SyncServices()`.
2016-06-10 23:43:54 -04:00
Sean Chittenden bbd8dfa798
golint(1) compliance pass (e.g. Rpc* -> RPC) 2016-06-10 23:38:28 -04:00
Sean Chittenden bc771d35df
Query for the Nomad service across multiple Consul datacenters. 2016-06-10 23:05:14 -04:00
Sean Chittenden d433001f04
Expose rpcproxy's `ServerEndpoint()` constructor, `newServer()` as `NewServerEndpoint()` 2016-06-10 22:14:03 -04:00
Diptanu Choudhury 59540c3e93 Extracted a method for getting clock speed 2016-06-11 02:07:28 +02:00
Diptanu Choudhury f94f89b6d7 Pruning out pids which are no longer present 2016-06-11 01:40:52 +02:00
Diptanu Choudhury 0a9a3918d6 Not resetting the list of pids if they don't change 2016-06-11 01:19:50 +02:00
Diptanu Choudhury c38a6fb3c5 Implementing the total ticks per task for the docker driver 2016-06-10 23:33:25 +02:00
Diptanu Choudhury 01054db4fa Calculating total ticks consumed in the nomad client 2016-06-10 23:14:33 +02:00
Sean Chittenden 5b0c2bf282
Restore old behavior and have AddPrimaryServer() return a pointer to the existing server (vs nil when the server already exists). 2016-06-10 16:46:49 -04:00
Diptanu Choudhury 2d3798b076 Calculating the cpu ticks in nomad client 2016-06-10 22:22:32 +02:00
Sean Chittenden df896c6ef2
Prevent duplicate servers being added in AddPrimaryServer.
This logic was already present elsewhere and was missed in this one
place.
2016-06-10 15:55:27 -04:00
Sean Chittenden d99467ef5e
Always create a consul.Syncer. Use a default Consul Config if necessary. 2016-06-10 15:55:27 -04:00
Sean Chittenden f6a0459ae5
Always create a consul.Syncer. Use a default Consul Config if necessary. 2016-06-10 15:55:27 -04:00
Sean Chittenden 03846fb754
Rename listLock to activatedListLock 2016-06-10 15:54:39 -04:00
Sean Chittenden 4c9067310b
Nomad does not use Serf at the client level. Use a hard lock. 2016-06-10 15:54:39 -04:00
Sean Chittenden 26b1e826d7
golint(1) police 2016-06-10 15:54:39 -04:00
Sean Chittenden 974c9927c7
Formatting nit: remove brackets 2016-06-10 15:54:39 -04:00
Sean Chittenden 1371ef85f5
Prefix all log entries in client/rpcproxy with client.rpcproxy 2016-06-10 15:54:39 -04:00
Sean Chittenden 009495c18a
Style nit: remove `var` block 2016-06-10 15:54:39 -04:00
Sean Chittenden f139d0c68b
Properly guard consulPullHeartbeatDeadline behind heartbeatLock 2016-06-10 15:54:39 -04:00
Sean Chittenden 8dc16ad5e3
Move RPCProxy.New() adjacent to its struct definition 2016-06-10 15:54:39 -04:00
Sean Chittenden cc6e8792e0
Create a consulContext using a client's consul config.
This is wrong and should be the Agent's Consul Config.  This is a
step in the right direction, so committing to mark the necessary
future change.
2016-06-10 15:54:39 -04:00
Sean Chittenden ed29946f5e
Populate the RPC Proxy's server list if heartbeat did not include a leader.
It's possible that a Nomad Client is heartbeating with a Nomad server that
has become isolated from the quorum of Nomad Servers.  When 3x the
heartbeatTTL has been exceeded, append the Consul server list to the
primary server list.  When the next RPCProxy rebalance occurs, there is a
chance one of the servers discovered from Consul will be in the majority.
When the client reattaches to a Nomad Server in the majority, it will include
a heartbeat and will reset the TTLs *AND* will clear the primary server list
to include only values from the heartbeat.
2016-06-10 15:54:39 -04:00
Sean Chittenden 9a223936bb
Generate and sync Consul ServiceIDs consistently 2016-06-10 15:54:39 -04:00
Sean Chittenden 3892a0433e
Move the start of the UniversalExecutor's consulSyncer to initialize once
This should be handled via a sync.Once primitive, but I don't want to
unpack that atm.
2016-06-10 15:54:39 -04:00
Sean Chittenden 7956eb0c80
Rename structs.Task's `Service` attribute to `ConsulService` 2016-06-10 15:54:39 -04:00
Sean Chittenden 8c813630e6
Move package client/consul/sync to command/agent/consul.
This has been done to allow the Server and Client to reuse the same
Syncer because the Agent may be running Client, Server, or both
simultaneously and we only want one Syncer object alive in the agent.
2016-06-10 15:54:39 -04:00
Sean Chittenden d6ef97911b
Rename Syncer.SetServiceIdentifier to SetServiceRegPrefix()
This attribute isn't actually an identifier because it can represent
a collection of services.  Rename `serviceIdentifier` to
`serviceRegPrefix`, which more accurately conveys the intention of this
Syncer attribute.

While here, also rename `SetServiceIdentifier()` to `SetServiceRegPrefix()`
and `GenerateServiceIdentifier()` to `GenerateServicePrefix()`.
2016-06-10 15:54:39 -04:00
Sean Chittenden 6b126ce488
Change the API signature of Syncer.SyncServices().
SyncServices() immediately attempts to sync whatever information
the process has with Consul.  Previously this method would take an
argument of the exclusive list of services that should exist,
however this is not conducive to having a Nomad Client and Nomad
Server share the same consul.Syncer.
2016-06-10 15:54:39 -04:00
Sean Chittenden fda03c5c9e
Change the signature of the PeriodicCallback to return an error
I *KNEW* I should have done this when I wrote it, but didn't want to
go back and audit the handlers to include the appropriate return
handling, but now that the code is taking shape, make this change.
2016-06-10 15:54:39 -04:00
Sean Chittenden 4973ec32bb
Rename structs.Services to structs.ConsulServices 2016-06-10 15:54:39 -04:00
Sean Chittenden 6430190d6a
Rename createCheck() to createDelegatedCheck() for clarity 2016-06-10 15:54:39 -04:00
Sean Chittenden 555f4fe135
Change client/consul.NewSyncer() to accept a shutdown channel
In addition to the API changing, consul.Syncer can now be signaled
to shut down via the Shutdown() method, which will cause the Run()'ing
sync task to exit gracefully.
2016-06-10 15:54:39 -04:00
Sean Chittenden 439baa0f8b
Remove named return parameters 2016-06-10 15:50:11 -04:00
Sean Chittenden ea0b35e303
Collapse server_endpoint_internal_test.go into server_endpoint_test.go
Requested by: @dadgar
2016-06-10 15:50:11 -04:00
Sean Chittenden 6e71bbc26b
Collapse rpcproxy_internal_test.go into rpcproxy_test.go
Requested by: @dadgar
2016-06-10 15:50:11 -04:00
Sean Chittenden 4b021b79b0
Bump the default Consul client timeout from 500ms to 5s.
Requested by: @dadgar
2016-06-10 15:50:11 -04:00
Sean Chittenden 747a794c77
Move `const` block to the top of the file.
Requested by: @dadgar
2016-06-10 15:50:11 -04:00
Sean Chittenden 484816f5e0
Ensure that all accesses to Client.alloc are wrapped by allocLock. 2016-06-10 15:50:11 -04:00
Sean Chittenden 08cab4fdfa
Use client.getAllocRunners() where appropriate. 2016-06-10 15:50:11 -04:00
Sean Chittenden f9d0b9da32
Line wrap long line. 2016-06-10 15:50:11 -04:00
Sean Chittenden 0d201631a3
Rename rpcproxy.UpdateFromNodeUpdateResponse to RefreshServerLists
While breaking the API within this PR, break out the individual
arguments to RefreshServerLists.  The servers parameter is reusing
`structs.NodeServerInfo` for the time being, but this can be revisited
if the needs of the structure diverge in the future.
2016-06-10 15:50:11 -04:00
Sean Chittenden 0997fb1669
Fix up the comments
Pointed out by: @dadgar
2016-06-10 15:50:11 -04:00
Sean Chittenden aaa7d6bf40
Make the locking protocol more explicit in client.NewClient
With an overabundance of caution, prevent future copy/pasta by
using the right locks when bootstrapping a Client.  Strictly speaking
this is not necessary, but it makes explicit the locking semantics
and guards against future concurrent or parallel initialization.
2016-06-10 15:50:11 -04:00
Sean Chittenden 525554c008
Use the client configCopy and lock appropriately. 2016-06-10 15:50:11 -04:00
Sean Chittenden 3060d6b33c
Flesh out the comment re: the client.rpcproxy.Run() task.
Requested by: Alex
2016-06-10 15:50:11 -04:00
Sean Chittenden a6a4345f27
Clean up various comments 2016-06-10 15:50:11 -04:00
Sean Chittenden 56a9981a13
Nuke the last of the explicit types in favor of using language idioms 2016-06-10 15:50:11 -04:00
Sean Chittenden b1ee131db8
Rename `backupServerDeadline` to `consulPullHeartbeatDeadline`
Suggested by: @alex
2016-06-10 15:50:11 -04:00
Sean Chittenden a8d2af692c
Don't clobber the default consul config in tests 2016-06-10 15:50:11 -04:00
Sean Chittenden b9adfcecf5
Remove unused variable 2016-06-10 15:50:11 -04:00
Sean Chittenden f15eeb8f27
Clean up some docs and comments to be more accurate 2016-06-10 15:50:11 -04:00
Sean Chittenden c78b0a6567
Remove unused constants 2016-06-10 15:50:11 -04:00
Sean Chittenden bcbec34937
Only actively test Consul when env `CONSUL_HTTP_ADDR` is set 2016-06-10 15:50:11 -04:00
Sean Chittenden 04c697c610
Update godoc for newServer to reflect DNS and IP-based inputs
Requested by: alex
2016-06-10 15:50:11 -04:00
Sean Chittenden 27413076fb
Pick the right `DefaultConfig` from the right package.
Overly zealous search && replace at work here.
2016-06-10 15:50:11 -04:00
Sean Chittenden 7167b7a357
Add a quick set of client/rpcproxy.ServerEndpoint equality tests 2016-06-10 15:50:11 -04:00
Sean Chittenden 837b387dcb
Fix building tests that used `DefaultConfig()` but didn't pickup the package move. 2016-06-10 15:50:11 -04:00
Sean Chittenden 3b5db4e390
Fix the client/rpcproxy unit tests. 2016-06-10 15:50:11 -04:00
Sean Chittenden 39fb0f2469
Change the endpoint for `/v1/agent/servers` and fix tests.
When an agent is running a server, the list of servers includes the
Raft peers.  When the agent is running a client (which is always the
case?), include a list of the servers found in the Client's RpcProxy.
Dedupe and provide a unique list back to the caller.
2016-06-10 15:50:11 -04:00
Sean Chittenden 6fdf9135cb
Provide a default ConsulConfig for client/config.DefaultConfig()
Change the unit test to only test if the consul link exists, not the
value of the link.  The old test was hostname specific and therefore
would always be different based on the environment running the tests.
2016-06-10 15:50:11 -04:00
Sean Chittenden cb80e93a6b
Move client.DefaultConfig() to client/config.DefaultConfig()
Resolves an import cycle in testing and is more appropriate because
the default should reside next to its struct definition.
2016-06-10 15:50:11 -04:00
Sean Chittenden d87085e040
Fix a comment to be more correct 2016-06-10 15:50:11 -04:00
Sean Chittenden c7e1879c4d
Unused code wasn't as unused as I thought. Restore. 2016-06-10 15:50:11 -04:00
Sean Chittenden bff57a0dce
Reconcile, clean up, and centralize API version numbers (major and minor).
Reduce future confusion by introducing a minor version that is gossiped out
via the `mvn` Serf tag (Minor Version Number; `vsn` is already being used
to communicate the `Major Version Number`).

Background: hashicorp/consul/issues/1346#issuecomment-151663152
2016-06-10 15:50:11 -04:00
Sean Chittenden b3fd455b1f
Register the serf service with the Nomad server service.
This will be unused in this PR.
2016-06-10 15:50:11 -04:00
Sean Chittenden 82d537fbd9
Update the `nomad_server_service` default from `nomad-server` to just `nomad`. 2016-06-10 15:50:11 -04:00
Sean Chittenden 6d9ec364e5
Change the constants used to match the struct definitions 2016-06-10 15:50:11 -04:00
Sean Chittenden c013957d49
When clearing the backup servers, set the length to zero. 2016-06-10 15:50:11 -04:00
Sean Chittenden 36340c654d
Improve language re: fingerprinting 2016-06-10 15:50:11 -04:00
Sean Chittenden bff82e4890
Remove unused function. 2016-06-10 15:50:11 -04:00
Sean Chittenden 498f21cdec
Clear the backup server list when a Nomad heartbeat arrives with servers
If Nomad is heartbeating during a transition from using backup servers
to Nomad servers, make Nomad the canonical source of servers and flush
the list of servers populated from Consul.
2016-06-10 15:50:11 -04:00
Sean Chittenden 1fe979a5e4
Remove types.ShutdownChannel and replace with `chan struct{}` 2016-06-10 15:50:11 -04:00
Sean Chittenden 2395aa481c
Fix unit tests 2016-06-10 15:50:11 -04:00
Sean Chittenden 438becb28b
Pass the datacenter name in the heartbeat
Servers that are part of a different datacenter are added as backup
servers instead of primary servers.
2016-06-10 15:50:11 -04:00
Sean Chittenden d914f262f9
Consolidate all consul sync periodic go routines to handlers.
Only one pump and periodic loop now.
2016-06-10 15:50:11 -04:00
Sean Chittenden 9fb0104def
Teach Client to reuse an Agent's consulSyncer.
"There can be only one."
2016-06-10 15:50:11 -04:00
Sean Chittenden 47891fb559
Register two services each for clients and servers, http and rpc.
In order to give clients a fighting chance to talk to the right port,
differentiate RPC services from HTTP services by registering two
services with different tags.  This yields
`rpc.nomad-server.service.consul` and
`http.nomad-server.service.consul` which is immensely more useful to
clients attempting to bootstrap their world.
2016-06-10 15:50:11 -04:00
Sean Chittenden d2f9848348
Bump the cluster test minimums to 10min.
These ranges aren't too useful with the default 600s rebalance, but
will be useful if that default ever changes in the future.
2016-06-10 15:50:11 -04:00