Commit graph

349 commits

Author SHA1 Message Date
Michael Schurter 4a9a574d9d Merge pull request #2054 from hashicorp/f-prestart
Add Driver.Prestart method
2016-12-20 16:18:56 -08:00
Diptanu Choudhury b6120e2fc8 Removing the alloc runner from GC if it is destroyed by the server 2016-12-20 11:14:22 -08:00
Diptanu Choudhury 6e6e0d364a Added comments 2016-12-20 10:49:48 -08:00
Diptanu Choudhury 36b5545d6b Making the gc allocator understand real disk usage 2016-12-16 18:34:59 -08:00
Diptanu Choudhury 7aef9bcabe Added the stats collector to GC 2016-12-14 15:11:11 -08:00
Diptanu Choudhury e855cd587b Refactored hoststats collector 2016-12-14 15:07:42 -08:00
Diptanu Choudhury 0ffd92668d GC-ing before we start a new allocation 2016-12-14 15:04:06 -08:00
Diptanu Choudhury afdaa979f7 Added a garbage collector for allocations 2016-12-14 15:01:12 -08:00
Alex Dadgar 648ad2ebc5 Merge pull request #2096 from hashicorp/b-addAlloc
Fix race and remove panic
2016-12-13 13:50:17 -08:00
Diptanu Choudhury 53fb09023c cancelling waiting for remote allocation if the alloc doesn't need migration 2016-12-13 13:06:33 -08:00
Alex Dadgar 3cbd237512 Fix race and remove panic 2016-12-13 12:34:23 -08:00
Christoffer Kylvåg 6a1f32b8ba #1680: Continue after not being able to stat a mountpoint 2016-12-13 12:28:57 +01:00
Diptanu Choudhury cbf73908ff Setting the appropriate file permissions which un-archiving compressed alloc dir 2016-12-05 17:04:43 -08:00
Diptanu Choudhury bc17cacca0 Merge pull request #2017 from hashicorp/b-sticky
Not moving alloc data when sticky is turned off
2016-12-05 14:11:45 -08:00
Diptanu Choudhury 21f49564d3 Not moving alloc data when sticky is turned off 2016-12-05 14:00:01 -08:00
Michael Schurter 770ed703d0 Add Driver.Prestart method
The Driver.Prestart method currently does very little but lays the
foundation for where lifecycle plugins can interleave execution _after_
task environment setup but _before_ the task starts.

Currently Prestart does two things:

* Any driver specific task environment building
* Download Docker images

This change also attaches a TaskEvent emitter to Drivers, so they can
emit events during task initialization.
2016-12-02 11:03:48 -08:00
Alex Dadgar 86ed1fb2e5 Disallow stale queries when deriving Vault tokens
This PR disallows stale queries when deriving a Vault token. Allowing
stale queries could result in the allocation not existing on the server
that is servicing the request.
2016-12-01 11:13:36 -08:00
Alex Dadgar ec4d6936ff add debug panic 2016-11-29 15:57:40 -08:00
Diptanu Choudhury f67217297c Ensuring allocs are not added multiple times to blocking queue 2016-11-29 11:19:37 -08:00
Alex Dadgar 88c7e04348 Check for Ephemeral Disk being nil 2016-11-15 10:03:06 -08:00
Alex Dadgar ee921ccbb2 Merge pull request #1949 from carlpett/blacklist-fingerprints-and-drivers
Support blacklisting fingerprinters
2016-11-09 10:31:17 -08:00
Calle Pettersson 4304755c12 Address comments from PR 2016-11-09 11:50:16 +01:00
Calle Pettersson 8632696e2d Add blacklisting of drivers 2016-11-08 18:30:07 +01:00
Calle Pettersson b603bb007e Add blacklisting of fingerprinters 2016-11-08 18:29:44 +01:00
Alex Dadgar 9015e79aaa Add compatibility code for secret ID while upgrading cluster in both server/client mode on single nodes 2016-11-07 16:52:08 -08:00
Diptanu Choudhury 1a8fa8c8d5 Making Nomad TLS configs region aware 2016-11-01 11:55:29 -07:00
Diptanu Choudhury 4079545a92 Making the client use tls if the node from which migration has to be made has enabled tls 2016-10-31 10:20:04 -07:00
Michael Schurter cc115fe984 Swap log line classifiers to be consistent 2016-10-28 14:59:48 -07:00
Diptanu Choudhury 3182d0454f Adding the alloc if we can't find the TG 2016-10-27 15:45:10 -07:00
Diptanu Choudhury 0682a1a113 Not blocking for remote alloc if the alloc is not sticky 2016-10-27 12:04:55 -07:00
Alex Dadgar 150b678a6b Merge pull request #1806 from hashicorp/f-docker4mac-fixes
A couple fixes to make Docker For Mac work
2016-10-27 09:29:40 -07:00
Diptanu Choudhury 50ca5e1e9d Merge pull request #1853 from hashicorp/f-rpc-http-tls
TLS support for http and RPC
2016-10-25 16:14:43 -07:00
Diptanu Choudhury 7c61e115bd Moved tlsutil into helpers 2016-10-25 16:05:37 -07:00
Diptanu Choudhury 353e7fc7f1 Moving the certs into tlsutil package 2016-10-25 16:01:53 -07:00
Diptanu Choudhury cf35aeac84 Moving the TLSConfig to structs 2016-10-25 15:57:38 -07:00
Alex Dadgar 03eba049ed Merge pull request #1848 from hashicorp/f-vault-error
Thread through whether DeriveToken error is recoverable or not
2016-10-24 15:01:18 -07:00
Alex Dadgar 692a809919 Merge pull request #1842 from hashicorp/f-version-and-id
Print the version and client node ID
2016-10-24 10:13:33 -07:00
Diptanu Choudhury 2e3118e69c Implemented TLS support for http and rpc 2016-10-23 22:22:00 -07:00
Alex Dadgar ede3a814ba Small fixes 2016-10-22 18:20:50 -07:00
Alex Dadgar 0070178741 Thread through whether DeriveToken error is recoverable or not 2016-10-22 18:08:30 -07:00
Michael Schurter 285e80ac0f Remove disk usage enforcement
Many thanks to @iverberk for the original PR (#1609), but we ended up
not wanting to ship this implementation with 0.5.

We'll come back to it after 0.5 and hopefully find a way to leverage
filesystem accounting and quotas, so we can skip the expensive polling.
2016-10-21 13:55:51 -07:00
Alex Dadgar aa0d8d0d8d Print the version and client node ID 2016-10-20 17:46:04 -07:00
Evan Phoenix e7a98d5500 Make EvalSymlink errors more verbose 2016-10-12 17:07:21 -07:00
Evan Phoenix f8a65a3b9d Resolve alloc/state directories to make Docker For Mac happy
* In -dev mode, `ioutil.TempDir` is used for the alloc and state
directories.
* `TempDir` uses `$TMPDIR`, which os OS X contains a per user
directory which is under `/var/folder`.
* `/var` is actually a symlink to `/private/var`
* Docker For Mac validates the directories that are passed to bind and on
OS X. That whitelist contains `/private`, but not `/var`. It does not
expand the path, and so any paths in `$TMPDIR` fail the whitelist check.

And thusly, by expanding the alloc/state directories the value passed
for binding does contain `/private` and Docker For Mac is happy.
2016-10-12 17:06:25 -07:00
Michael Schurter 6dea6df919 Restore lost chan inits 2016-10-03 14:56:50 -07:00
Diptanu Choudhury d50c395421 Getting snapshot of allocation from remote node (#1741)
* Added the alloc dir move

* Moving allocdirs when starting allocations

* Added the migrate flag to ephemeral disk

* Stopping migration if the allocation doesn't need migration any more

* Added the GetAllocDir method

* refactored code

* Added a test for alloc runner

* Incorporated review comments
2016-10-03 09:59:57 -07:00
Michael Schurter b117725dc9 Only log consul errors once since last succesful run 2016-09-28 17:18:45 -07:00
Michael Schurter d486de3804 Remove unused const 2016-09-27 16:04:01 -07:00
Michael Schurter 2e696c5e61 Fix lies found in comments by fact checkers 2016-09-26 16:51:53 -07:00
Michael Schurter 11cf9686a6 No need to put reaper ticker on the struct 2016-09-26 16:15:19 -07:00
Michael Schurter 2eb0062959 Drop clumsy timeout on discovery notifications
It's better to just let goroutines fallback to their longer retry
intervals then try to be clever here.
2016-09-26 16:05:21 -07:00
Michael Schurter 307e674eca Flip disco chan; clarify method names/comments 2016-09-26 15:52:40 -07:00
Michael Schurter 888ee21270 Return csv of servers from Stats, not just count 2016-09-26 15:40:26 -07:00
Michael Schurter 7dc0079dd2 doDisco -> triggerDiscoveryCh; discovered -> serversDiscoveredCh
Also fix log line formatting
2016-09-26 15:21:28 -07:00
Michael Schurter 434e4be97c noServers -> noServersErr 2016-09-26 15:12:35 -07:00
Michael Schurter b2ddb85a78 consul -> Consul 2016-09-26 15:06:57 -07:00
Michael Schurter 37cfb2769c Replace periodic handlers with event driven disco
Remove use of periodic consul handlers in the client and just use
goroutines. Consul Discovery is now triggered with a chan instead of
using a timer and deadline to trigger.

Once discovery is complete a chan is ticked so all goroutines waiting
for servers will run.

Should speed up bootstraping and recovery while decreasing spinning on
timers.
2016-09-23 17:02:48 -07:00
Michael Schurter 2ab5264595 Retry all servers on RPC call failure
rpcproxy is refactored into serverlist which prioritizes good servers
over servers in a remote DC or who have had a failure.

Registration, heartbeating, and alloc status updating will retry faster
when new servers are discovered.

Consul discovery will be retried more quickly when no servers are
available (eg on startup or an outage).
2016-09-23 11:44:48 -07:00
Alex Dadgar 50efdb00e9 Merge pull request #1713 from hashicorp/f-alloc-runner-vault
Vault integration in client
2016-09-20 16:15:55 -07:00
Alex Dadgar 64de46432a Merge pull request #1677 from hashicorp/f-vault-implicit-constraint
Vault implicit Task Group constraint + allow root tokens
2016-09-20 16:15:32 -07:00
Alex Dadgar ec152a6d12 Clean up vault client 2016-09-14 18:10:56 -07:00
Alex Dadgar 6702a29071 Vault token threaded 2016-09-14 13:30:01 -07:00
Robert Neumayer 8dc19dbd10 Log adding of servers at INFO level 2016-09-14 22:24:17 +02:00
Alex Dadgar 2c8dd8bbd3 Revert "Introduce a Secret/ directory" 2016-09-01 17:23:15 -07:00
Alex Dadgar b0adaa5301 Allow root token 2016-09-01 12:05:08 -07:00
Alex Dadgar 1ed454dd60 Merge pull request #1671 from hashicorp/f-secret-dir2
Introduce a Secret/ directory
2016-09-01 09:56:17 -07:00
Alex Dadgar 9fa23e3536 Symlink on windows 2016-08-31 21:41:44 -07:00
Alex Dadgar 5d3b47e648 Address comments and reserve 2016-08-31 18:11:02 -07:00
vishalnayak 55a6f06e15 Addressed review feedback 2016-08-30 13:08:13 -04:00
vishalnayak 3808dd0ff8 Return only fatal error to renewal error channel 2016-08-30 12:46:59 -04:00
vishalnayak a0dbfe25b3 Fix tests 2016-08-29 21:30:06 -04:00
vishalnayak 82f6209e97 tokenDeriver function pointer to derive tokens.
Remove rpc*, connPool, node and region from vaultclient.
2016-08-29 20:32:05 -04:00
Alex Dadgar 14b7126511 Secret dir, hello world 2016-08-29 15:41:52 -07:00
vishalnayak 56e42cf03d Employ DeriveVaultToken API and flesh-up DeriveToken 2016-08-24 12:29:59 -04:00
vishalnayak 6002e596c4 VaultClient for Nomad Client 2016-08-24 09:43:45 -04:00
Diptanu Choudhury 1e1eef56a1 Putting the mock driver behind a build flag 2016-08-22 15:02:28 -05:00
Diptanu Choudhury 4ca623bcfe blocking chained allocations until previous allocation hasn't terminated 2016-08-22 11:34:24 -05:00
Alex Dadgar a90dafe9ab handle the upgrade case 2016-08-18 19:01:24 -07:00
Alex Dadgar 895c31f605 Nodes generate Secret ID and used for retrieving allocations and registering 2016-08-17 16:31:47 -07:00
Alex Dadgar 84820db86f If the client detects that a heartbeat has failed because it is not registered, reregister 2016-08-15 17:24:09 -07:00
Diptanu Choudhury 28b3f511e0 Fixed some error messages 2016-08-10 15:17:32 -07:00
Kenjiro Nakayama 6a810e6f1e Update after review 2016-08-09 08:57:26 +09:00
Kenjiro Nakayama 5c621b74e5 tiny: Return fmt.Errorf instead of duplicated error messages 2016-08-09 08:57:26 +09:00
Diptanu Choudhury 41b540fbc8 Allow operators to opt into publishing node and alloc metrics 2016-08-01 19:52:20 -07:00
Cameron Davison 777bdf4a1e
fix setup consul syncer error message 2016-07-28 22:14:52 -05:00
Alex Dadgar ebac5cb283 Node.Register handles the case of transistioning to ready and creating evals 2016-07-21 15:22:02 -07:00
Diptanu Choudhury 5b39a5db40 Fixed a debug message 2016-07-09 00:12:53 -07:00
Sean Chittenden 03c571c61b
Consolidate fingerprinters into a single map. 2016-07-08 23:37:14 -07:00
Sean Chittenden 8bdb38d016
Code golf
Pointed out by: @dadgar
2016-06-21 14:26:01 -07:00
Sean Chittenden df4fe2e502
Fix the shuffling of remote datacenters.
Pointed out by: @ryanuber
2016-06-21 13:37:22 -07:00
Sean Chittenden 9a60999100
Pass a logger arg to NewClient and NewServer 2016-06-16 23:29:23 -07:00
Sean Chittenden fd18eb7fdb
Only register the Client services reaper when consul.auto_advertise is enabled 2016-06-16 18:24:58 -07:00
Sean Chittenden 952b6ce7b5
Only auto-join clients if client_auto_join is true 2016-06-16 14:47:21 -07:00
Sean Chittenden af55b74114 Merge pull request #1276 from hashicorp/f-consul-server-autojoin
Teach Nomad servers how to fall back to Consul.
2016-06-16 14:40:45 -07:00
Sean Chittenden 008d75184b
Use the %+q verb in log messages (vs %q). 2016-06-16 11:03:51 -07:00
Alex Dadgar 7375d828e1 remove trace 2016-06-15 15:47:59 -07:00
Sean Chittenden 5e0ced2ae7
Shuffle all datacenters vs only the nearest N datacenters.
Per discussion, we want to be aggressive about fanning out vs possibly
fixating on only local DCs.  With RPC forwarding in place, a random walk
may be less optimal from a network latency perspective, but it is guaranteed
to eventually result in a converged state because all DCs are candidates
during the bootstrapping process.
2016-06-15 12:40:51 -07:00
Sean Chittenden 2123460cf0
Bump various Consul search limits
Client: Search limit increased from 4 random DCs to 8 random DCs, plus nearest.
Server: Search factor increased from 3 to 5 times the bootstrap_expect.

This should allow for faster convergence in large environments (e.g.
sub-5min for 10K Consul DCs).
2016-06-15 12:40:51 -07:00
Alex Dadgar cf99fc3173 Use Status.Peers instead of Status.Ping 2016-06-15 12:00:20 -07:00
Alex Dadgar 4b04e503f3 address comments 2016-06-13 17:32:18 -07:00