Alex Dadgar
e1d68f3e9b
Docker host network test
2016-07-12 10:59:34 -06:00
Alex Dadgar
9085b5ca0d
Merge pull request #1405 from novilabs/delay-on-startup-failure
...
do not fail for multiple startup failures, delay instead
2016-07-12 09:51:40 -06:00
Cameron Davison
d7fe56ecf3
test policy delay for startup error
2016-07-11 20:54:36 -05:00
Cameron Davison
dd314ea06e
if policy mode is delay, do not fail for multiple startup failures, delay instead
2016-07-11 20:40:53 -05:00
Diptanu Choudhury
ed3be34105
Introduced an env var for rkt tests
2016-07-11 15:48:16 -06:00
Diptanu Choudhury
4e0b621ffa
Skipping travis tests and not installing rkt on travis
2016-07-11 15:10:09 -06:00
Sean Chittenden
a9b3f5e552
Alpha-sort the build platforms
2016-07-11 12:23:46 -07:00
Sean Chittenden
267198742f
Merge branch 'master' into f-resource-isolation-fingerprinter
2016-07-11 12:23:09 -07:00
Sean Chittenden
d309649ada
Darwin currently has allocdir support.
...
Pointed out by: @dadgar
2016-07-11 12:19:17 -07:00
Sean Chittenden
20d87f1782
Remove cgroup fingerprinter from non-linux systems.
...
If someone wants to extend or reuse Cgroup detenction in the future they
can move `cgroup_linux.go` to `cgroup.go` and add the relevant build
tags.
Requested by: @dadgar
2016-07-11 12:16:56 -07:00
Diptanu Choudhury
e2909db9ef
Merge pull request #1388 from novilabs/support-docker-syslog-unixformat-and-defaultformat
...
Support docker syslog unixformat and defaultformat
2016-07-11 11:17:30 -07:00
Alex Dadgar
f11b1ce079
Get windows to build
2016-07-11 11:52:41 -06:00
Alex Dadgar
e9ffadfdc6
initial comments
2016-07-11 10:58:18 -06:00
Sean Chittenden
9966169596
Merge branch 'f-resource-isolation-cleanup' into f-resource-isolation-fingerprinter
2016-07-11 00:10:21 -07:00
Sean Chittenden
d20c5fc327
Merge pull request #1402 from hashicorp/f-resource-isolation-cleanup
...
Resource isolation cleanup
2016-07-11 02:09:35 -05:00
Sean Chittenden
be272168c7
Rename resourceContainer{,Context} and resCon{,Ctx}.
2016-07-11 00:02:55 -07:00
Sean Chittenden
1c14e01ac0
Add a comment describing IsolationConfig
2016-07-10 23:45:44 -07:00
Sean Chittenden
5e2e5b6ccc
Merge branch 'b-exec-cleanup' into f-resource-isolation-cleanup
2016-07-10 23:41:04 -07:00
Sean Chittenden
cf6b97ec6c
Merge pull request #1400 from hashicorp/b-exec-cleanup
...
Initialize the list of available fingerprinters per platform.
2016-07-10 23:22:02 -07:00
Sean Chittenden
f39e84b672
Improve readability: use of a switch vs two if's
2016-07-10 20:18:57 -07:00
Sean Chittenden
2ffbeee06c
Skip the network fingerprinter test when offline.
...
Conditionalize the network fingerprinter test so that it works when a
user is offline. Similarly, when the network fingerprint test fails in
the future pass a HINT to the user to set an env var to allow the test
to be skipped in the future.
2016-07-10 20:16:06 -07:00
Sean Chittenden
2983bd6fce
Fix test for non-Linux platforms.
...
The following tests now check a whitelist for whether or not their
driver is present or not, or if the OS is supported or not.
* `TestAllocDir_MountSharedAlloc`
* `TestClient_Drivers_InWhitelist` (`exec` driver)
* `TestClient_Drivers` (`exec` driver)
* `TestJavaDriver_Fingerprint` (`java` driver)
2016-07-10 15:19:49 -07:00
Sean Chittenden
710173e9cb
Build the Cgroup fingerprinter on only Linux.
...
Change the logic from `!linux` to an empty build tag so that *if*
another platform picks up Cgroups support they can add themselves to the
necessary build tags for this fingerprinter and be on their way.
Because this technology isn't inherently Linux-specific and isn't
mutually exclusive of other resource isolation containers, resist the
urge to rename the Cgroup fingerprinter to something generic like the
ResourceContainerFingerprinter.
2016-07-10 13:55:06 -07:00
Alex Dadgar
51ae7ace25
initial tail impl
2016-07-10 13:57:04 -04:00
Sean Chittenden
d4fe69ddf9
Update comments and pushdown a lock into the resource container
2016-07-10 00:12:59 -07:00
Sean Chittenden
fc9cd8d4af
Push down the Linux-specific bits into resourceContainer
2016-07-10 00:06:53 -07:00
Sean Chittenden
6fc269d2a6
Move unit tests around into per-platform where appropriate.
2016-07-09 23:56:31 -07:00
Sean Chittenden
a5dc6c2da9
Push the Client's cleanup of Cgroups down
2016-07-09 23:45:33 -07:00
Sean Chittenden
5dbc0bf382
Rename resourceContainer.cleanup() to executorCleanup()
...
Not to be confused with the imminent ClientCleanup().
2016-07-09 23:25:33 -07:00
Sean Chittenden
5f8f0a50ac
Begin cgroup pushdown into platform specific files
2016-07-09 23:01:14 -07:00
Sean Chittenden
bdd7022fdc
Centralize the fingerprintrs.
...
Add platform specific fingerprinters per platform.
Requested by: @diptanu
2016-07-09 22:31:14 -07:00
Sean Chittenden
1e2e0ca050
Initialize the list of available fingerprinters per platform.
2016-07-09 00:22:42 -07:00
Diptanu Choudhury
5b39a5db40
Fixed a debug message
2016-07-09 00:12:53 -07:00
Diptanu Choudhury
e5310b76a6
Merge pull request #1399 from hashicorp/b-exec-cleanup
...
WIP: Cleanup exec driver
2016-07-09 00:08:43 -07:00
Sean Chittenden
03c571c61b
Consolidate fingerprinters into a single `map`.
2016-07-08 23:37:14 -07:00
Sean Chittenden
7530b27014
Move all non-Linux Fingerprinter items to the default exec driver
2016-07-08 18:35:46 -07:00
Diptanu Choudhury
5d61fa01f1
Fixed tests
2016-07-08 18:27:51 -07:00
Diptanu Choudhury
3c4002c48b
Fixed the client tests
2016-07-08 17:49:58 -07:00
Diptanu Choudhury
1784d98182
Fixed the host port environment variable
2016-07-08 15:37:44 -07:00
Cameron Davison
921a6c889c
remove the expected leading space, after the colon in syslog
2016-07-06 11:08:24 -05:00
Cameron Davison
07a9e15560
get into the hour minute second part of the time before looking for spaces, and then looking for the : seperator
2016-07-06 11:08:24 -05:00
Alex Dadgar
c35b1be845
Set running when restoring
2016-06-28 13:47:59 -07:00
Wojciech Bederski
a73422b4ff
Fix docker driver lockup during nomad boot
...
Unit mismatch caused docker driver to wait almost indefinitely during boot
(when one or more containers were a bit uncooperative during StopContainer())
This should fix problems described in #1202
2016-06-28 14:26:47 +02:00
Cameron Davison
d1e7d9c50f
write state to temp file and then rename
2016-06-27 12:29:33 -05:00
Jake Champlin
f094969c7b
Update failing tests
2016-06-23 11:28:17 -04:00
Alex Dadgar
14e950f882
Treat float as int
2016-06-22 15:09:39 -07:00
Alex Dadgar
4ff8edd2da
Floor CPU MHz and total compute and mark hostname as unique
2016-06-22 15:01:36 -07:00
Diptanu Choudhury
0a10873aa6
Merge pull request #1335 from hashicorp/f-set-docker-timeout
...
Setting a timeout in the docker client
2016-06-21 17:00:14 -07:00
Diptanu Choudhury
2837d3395d
Setting a timeout in the docker client
2016-06-21 16:58:21 -07:00
Diptanu Choudhury
1d5c5b18f3
Making SSL default
2016-06-21 16:41:14 -07:00
Sean Chittenden
8bdb38d016
Code golf
...
Pointed out by: @dadgar
2016-06-21 14:26:01 -07:00
Sean Chittenden
df4fe2e502
Fix the shuffling of remote datacenters.
...
Pointed out by: @ryanuber
2016-06-21 13:37:22 -07:00
Diptanu Choudhury
88ac1b33a4
Not emitting per-pid stats and added the total ticks consumed by a Task
2016-06-20 17:30:25 -07:00
Alex Dadgar
19024f4da0
Merge pull request #1322 from hashicorp/b-docker-logs-splicing
...
Make line copy to avoid being overriden by subsequent scans
2016-06-20 13:17:49 -07:00
Alex Dadgar
661dc200f3
Make line copy to avoid being overriden by subsequent scans
2016-06-20 13:14:43 -07:00
Michal Wieczorek
67a04bb1cc
Volume binds for windows containers
2016-06-20 21:46:33 +02:00
Alex Dadgar
3cd9c9590b
guard against NaN
2016-06-20 10:29:46 -07:00
Alex Dadgar
7b83503596
finer grain locking
2016-06-20 10:19:06 -07:00
Alex Dadgar
744270590b
Guard against bad restore
2016-06-17 14:58:53 -07:00
Alex Dadgar
c9f7467ccb
Driver tests use client default config
2016-06-17 14:24:49 -07:00
Sean Chittenden
7b9961f09b
Initialize the stats helpers before accessing them for the first time
2016-06-17 13:23:30 -07:00
Sean Chittenden
9cb649d247
Merge pull request #1307 from hashicorp/f-fingerprint-cpus
...
F fingerprint cpus
2016-06-17 12:23:40 -07:00
Sean Chittenden
9e287858de
Merge pull request #1310 from hashicorp/b-logger
...
Create and pass only one `logger` object around per Agent
2016-06-17 12:16:35 -07:00
Sean Chittenden
21b84fc3e6
Memoize the CPU stats. Error if CPU fingerprinting fails.
2016-06-17 12:13:53 -07:00
Alex Dadgar
27c6398639
debug message when stopping container
2016-06-17 11:52:44 -07:00
Sean Chittenden
686c125fea
Record and use only the first Mhz from the CPU fingerprinter.
...
Assume all cores are the same speed.
2016-06-17 11:06:57 -07:00
Sean Chittenden
46e2d54acf
Provide `nomad.Config` with a default `LogOutput` of `os.StdErr`
2016-06-17 06:44:10 -07:00
Sean Chittenden
9a60999100
Pass a logger arg to `NewClient` and `NewServer`
2016-06-16 23:29:23 -07:00
Sean Chittenden
4cc90753f8
In the debug log, split the unit from the measurement
...
awk(1) friendly is UNIX(tm) friendly.
2016-06-16 23:07:13 -07:00
Sean Chittenden
2dcb591cd8
Warn when we're unable to fingerprint the CPU Mhz
2016-06-16 23:07:13 -07:00
Sean Chittenden
b8e63411c0
Explicitly call `cpu.Counts()` to determine the CPU core count
...
Much safer than counting the number of InfoStat structs returned.
2016-06-16 23:07:13 -07:00
Sean Chittenden
d17af396ca
Create config.DefaultConsulConfig()
2016-06-16 20:41:05 -07:00
Sean Chittenden
fd18eb7fdb
Only register the Client services reaper when `consul.auto_advertise` is enabled
2016-06-16 18:24:58 -07:00
Sean Chittenden
1ce2cc6141
`conf` -> `config`
2016-06-16 17:05:29 -07:00
Sean Chittenden
015248a67f
Fix tests for NewServer() in client mode.
...
Pointy-hat: sean-
I'm positive I've done this before.
2016-06-16 16:34:22 -07:00
Sean Chittenden
952b6ce7b5
Only auto-join clients if `client_auto_join` is true
2016-06-16 14:47:21 -07:00
Sean Chittenden
ec77a1869e
Test for errors
2016-06-16 14:43:46 -07:00
Sean Chittenden
af55b74114
Merge pull request #1276 from hashicorp/f-consul-server-autojoin
...
Teach Nomad servers how to fall back to Consul.
2016-06-16 14:40:45 -07:00
Diptanu Choudhury
ed67f1a347
Merge pull request #1285 from hashicorp/fix-selinux-options
...
Added a client options for setting selinux options
2016-06-16 22:45:24 +02:00
Diptanu Choudhury
266c417ac8
Changed the client options for docker volume selinux labels
2016-06-16 21:41:02 +01:00
Alex Dadgar
fe588a2469
Guard against restoring a nil task in task_runner
2016-06-16 11:55:40 -07:00
Sean Chittenden
008d75184b
Use the `%+q` verb in log messages (vs `%q`).
2016-06-16 11:03:51 -07:00
Alex Dadgar
7375d828e1
remove trace
2016-06-15 15:47:59 -07:00
Sean Chittenden
5e0ced2ae7
Shuffle all datacenters vs only the nearest N datacenters.
...
Per discussion, we want to be aggressive about fanning out vs possibly
fixating on only local DCs. With RPC forwarding in place, a random walk
may be less optimal from a network latency perspective, but it is guaranteed
to eventually result in a converged state because all DCs are candidates
during the bootstrapping process.
2016-06-15 12:40:51 -07:00
Sean Chittenden
2123460cf0
Bump various Consul search limits
...
Client: Search limit increased from 4 random DCs to 8 random DCs, plus nearest.
Server: Search factor increased from 3 to 5 times the bootstrap_expect.
This should allow for faster convergence in large environments (e.g.
sub-5min for 10K Consul DCs).
2016-06-15 12:40:51 -07:00
Alex Dadgar
cf99fc3173
Use Status.Peers instead of Status.Ping
2016-06-15 12:00:20 -07:00
Diptanu Choudhury
fa216199ce
Added documentation
2016-06-15 02:42:15 +02:00
Diptanu Choudhury
e08083acfe
Added a client options for setting selinux options
2016-06-15 02:33:09 +02:00
Sean Chittenden
6e22b680ce
Disambiguate `auto_join` from `auto_register`, rename reg to `auto_advertise`.
...
Provide an option that describes the value to the user vs the
operation performed by the software. Momentarily introducing
`auto_join`
2016-06-14 12:11:38 -07:00
Alex Dadgar
4b04e503f3
address comments
2016-06-13 17:32:18 -07:00
Alex Dadgar
8bbf4a55e5
Fix IDs and domain scoping
2016-06-13 16:30:58 -07:00
Diptanu Choudhury
d019d8ef8e
implemented reconciliation of unwanted services
2016-06-13 14:52:26 +02:00
Alex Dadgar
232654cdee
register checks
2016-06-12 21:28:56 -07:00
Alex Dadgar
a82c2bb058
Do not reconcile in client and cleanup executor a bit
2016-06-12 18:22:07 -07:00
Alex Dadgar
8e231fa382
Rename ConsulService back to Service
2016-06-12 16:36:49 -07:00
Alex Dadgar
e931b42473
unify cli output
2016-06-12 13:16:07 -07:00
Alex Dadgar
6ef4bfd6bf
skip docker test if no docker found
2016-06-12 11:28:43 -07:00
Alex Dadgar
2628827b7c
Merge pull request #1262 from hashicorp/remove-artifact-check
...
Removing artifact check for java and qemu drivers
2016-06-12 11:21:18 -07:00
Alex Dadgar
c4a819528a
Merge pull request #1260 from hashicorp/f-alloc-stats-struct
...
Allocation resources returned in a struct
2016-06-12 11:18:57 -07:00
Alex Dadgar
b671cd4134
Test fixes
2016-06-12 11:14:17 -07:00
Alex Dadgar
fdda90229f
only support latest and remove ring buffer
2016-06-12 09:32:38 -07:00
Diptanu Choudhury
34f85baab0
Fix the calculation of total ticks for docker and exec
2016-06-12 18:08:35 +02:00
Diptanu Choudhury
beb362e202
Setting a flag to indicate whether fs isolation is indeed happening
2016-06-12 15:43:24 +02:00
Diptanu Choudhury
641cf50682
Not converting the abs path relative to task dir for drivers which enforce FS isolation only in linux
2016-06-12 13:54:30 +02:00
Alex Dadgar
e952540f6f
Allocation resources returned in a struct
2016-06-11 21:04:10 -07:00
Diptanu Choudhury
a5e81ebc3a
Removing un-used code
2016-06-12 01:23:49 +02:00
Diptanu Choudhury
86e4f295da
Fixed the calculation of the host node ticks
2016-06-12 01:14:51 +02:00
Sean Chittenden
2f036231e5
Merge pull request #1201 from hashicorp/f-dyn-server-list
...
Dynamic Server Lists/Client Bootstrapping via consul.
2016-06-11 18:58:25 -04:00
Sean Chittenden
92e2cfb0ad
Walk the DCs from nearest to most remote.
2016-06-11 18:52:21 -04:00
Sean Chittenden
2968545201
Walk the DCs from nearest to most remote, no limit on the search.
2016-06-11 18:23:06 -04:00
Sean Chittenden
917766a3df
Prefer `%+q` over `%q` in log messages.
2016-06-11 18:17:20 -04:00
Sean Chittenden
445783889b
Remove default values and use nil for the executor. Much better.
2016-06-11 17:52:09 -04:00
Diptanu Choudhury
0fad17a1a9
Merge pull request #1258 from hashicorp/fix-statsd-metric-type
...
Emitting client resource usage metrics as guages instead of k/v pairs
2016-06-11 13:18:52 -07:00
Diptanu Choudhury
fd60cfd585
Emitting client resource usage metrics as guages instead of k/v pairs
2016-06-11 22:17:32 +02:00
Diptanu Choudhury
19f4adbcf1
Using a different client for collecting stats and waiting on containers
2016-06-11 20:37:29 +02:00
Diptanu Choudhury
7fb507e810
Moving the clkspeed code to helper
2016-06-11 17:31:49 +02:00
Sean Chittenden
f891fa0ec8
Perform a nil-check for Executor's consulServices.
...
Executors can `Shutdown()` before calling `SyncServices()`.
2016-06-10 23:43:54 -04:00
Sean Chittenden
bbd8dfa798
goling(1) compliance pass (e.g. Rpc* -> RPC)
2016-06-10 23:38:28 -04:00
Sean Chittenden
bc771d35df
Query for the Nomad service across multiple Consul datacenters.
2016-06-10 23:05:14 -04:00
Sean Chittenden
d433001f04
Expose rpcproxy's `ServerEndpoint()` constructor, `newServer()` as `NewServerEndpoint()`
2016-06-10 22:14:03 -04:00
Diptanu Choudhury
59540c3e93
Extracted a method for getting clock speed
2016-06-11 02:07:28 +02:00
Diptanu Choudhury
f94f89b6d7
Pruning out pids which are no longer present
2016-06-11 01:40:52 +02:00
Diptanu Choudhury
0a9a3918d6
Not reset-ing the list of pids if they don't change
2016-06-11 01:19:50 +02:00
Diptanu Choudhury
c38a6fb3c5
Implementing the total ticks per task for the docker driver
2016-06-10 23:33:25 +02:00
Diptanu Choudhury
01054db4fa
Calculating total ticks consumed in the nomad client
2016-06-10 23:14:33 +02:00
Sean Chittenden
5b0c2bf282
Restore old behavior and have AddPrimaryServer() return a pointer to the existing server (vs nil when the server already exists).
2016-06-10 16:46:49 -04:00
Diptanu Choudhury
2d3798b076
Calculating the cpu ticks in nomad client
2016-06-10 22:22:32 +02:00
Sean Chittenden
df896c6ef2
Prevent duplicate servers being added in AddPrimaryServer.
...
This logic was already present elsewhere and was missed in this one
place.
2016-06-10 15:55:27 -04:00
Sean Chittenden
d99467ef5e
Always create a consul.Syncer. Use a default Consul Config if necessary.
2016-06-10 15:55:27 -04:00
Sean Chittenden
f6a0459ae5
Always create a consul.Syncer. Use a default Consul Config if necessary.
2016-06-10 15:55:27 -04:00
Sean Chittenden
03846fb754
Rename listLock to activatedListLock
2016-06-10 15:54:39 -04:00
Sean Chittenden
4c9067310b
Nomad does not use Serf at the client level. Use a hard lock.
2016-06-10 15:54:39 -04:00
Sean Chittenden
26b1e826d7
golint(1) police
2016-06-10 15:54:39 -04:00
Sean Chittenden
974c9927c7
Formatting nit: remove brackets
2016-06-10 15:54:39 -04:00
Sean Chittenden
1371ef85f5
Prefix all log entries in client/rpcproxy with client.rpcproxy
2016-06-10 15:54:39 -04:00
Sean Chittenden
009495c18a
Style nit: remove `var` block
2016-06-10 15:54:39 -04:00
Sean Chittenden
f139d0c68b
Properly guard consulPullHeartbeatDeadline behind heartbeatLock
2016-06-10 15:54:39 -04:00
Sean Chittenden
8dc16ad5e3
Move RPCProxy.New() adjacent to its struct definition
2016-06-10 15:54:39 -04:00
Sean Chittenden
cc6e8792e0
Create a consulContext using a client's consul config.
...
This is wrong and should be the Agent's Consul Config. This is a
step in the right direction, so committing to mark the necessary
future change.
2016-06-10 15:54:39 -04:00
Sean Chittenden
ed29946f5e
Populate the RPC Proxy's server list if heartbeat did not include a leader.
...
It's possible that a Nomad Client is heartbeating with a Nomad server that
has become issolated from the quorum of Nomad Servers. When 3x the
heartbeatTTL has been exceeded, append the Consul server list to the primary
primary server list. When the next RPCProxy rebalance occurs, there is a
chance one of the servers discovered from Consul will be in the majority.
When client reattaches to a Nomad Server in the majority, it will include
a heartbeat and will reset the TTLs *AND* will clear the primary server list
to include only values from the heartbeat.
2016-06-10 15:54:39 -04:00
Sean Chittenden
9a223936bb
Generate and sync Consul ServiceIDs consistently
2016-06-10 15:54:39 -04:00
Sean Chittenden
3892a0433e
Move the start of the UniversalExecutor's consulSyncer to initialize once
...
This should be handled via a sync.Once primative, but I don't want to
unpack that atm.
2016-06-10 15:54:39 -04:00
Sean Chittenden
7956eb0c80
Rename structs.Task's `Service` attribute to `ConsulService`
2016-06-10 15:54:39 -04:00
Sean Chittenden
8c813630e6
Move package client/consul/sync to command/agent/consul.
...
This has been done to allow the Server and Client to reuse the same
Syncer because the Agent may be running Client, Server, or both
simultaneously and we only want one Syncer object alive in the agent.
2016-06-10 15:54:39 -04:00
Sean Chittenden
d6ef97911b
Rename Syncer.SetServiceIdentifier to SetServiceRegPrefix()
...
This attribute isn't actually an identifier because it can represent
a collection of services. Rename `serviceIdentifier` to
`serviceRegPrefix which more accurately conveys the intention of this
Syncer attribute.
While here, also rename `SetServiceIdentifier()` to `SetServiceRegPrefix()`
and `GenerateServiceIdentifier()` to `GenerateServicePrefix()`.
2016-06-10 15:54:39 -04:00
Sean Chittenden
6b126ce488
Change the API signature of Syncer.SyncServices().
...
SyncServices() immediately attempts to sync whatever information
the process has with Consul. Previously this method would take an
argument of the exclusive list of services that should exist,
however this is not condusive to having a Nomad Client and Nomad
Server share the same consul.Syncer.
2016-06-10 15:54:39 -04:00
Sean Chittenden
fda03c5c9e
Change the signature of the PeriodicCallback to return an error
...
I *KNEW* I should have done this when I wrote it, but didn't want to
go back and audit the handlers to include the appropriate return
handling, but now that the code is taking shape, make this change.
2016-06-10 15:54:39 -04:00
Sean Chittenden
4973ec32bb
Rename structs.Services to structs.ConsulServices
2016-06-10 15:54:39 -04:00
Sean Chittenden
6430190d6a
Rename createCheck() to createDelegatedCheck() for clarity
2016-06-10 15:54:39 -04:00
Sean Chittenden
555f4fe135
Change client/consul.NewSyncer() to accept a shutdown channel
...
In addition to the API changing, consul.Syncer can now be signaled
to shutdown via the Shutdown() method, which will call the Run()'ing
sync task to exit gracefully.
2016-06-10 15:54:39 -04:00