Nick Ethier
047fad2953
client: simplify driver plugin logic from review comments
2018-10-16 16:56:56 -07:00
Nick Ethier
9686e1b258
client: fix broken tests from refactoring
2018-10-16 16:56:56 -07:00
Nick Ethier
3183b33d24
client: review comments and fixup/skip tests
2018-10-16 16:56:56 -07:00
Nick Ethier
f192c3752a
client: refactor post allocrunnerv2 finalization
2018-10-16 16:56:56 -07:00
Nick Ethier
4a4c7dbbfc
client: begin driver plugin integration
...
client: fingerprint driver plugins
2018-10-16 16:56:56 -07:00
Alex Dadgar
7946a14aa8
Fix lints
2018-10-16 16:56:56 -07:00
Alex Dadgar
89dafaaea9
compile on windows
2018-10-16 16:56:56 -07:00
Alex Dadgar
ad4fac526c
more test fixes
2018-10-16 16:56:56 -07:00
Alex Dadgar
45e41cca03
allocrunnerv2 -> allocrunner
2018-10-16 16:56:56 -07:00
Alex Dadgar
9baa7402ef
fix test compiling
2018-10-16 16:56:55 -07:00
Alex Dadgar
7d9c069f09
skip building deprecated files
2018-10-16 16:56:55 -07:00
Alex Dadgar
6c9d9d5173
move files around
2018-10-16 16:56:55 -07:00
Michael Schurter
5f696608a6
tests: fix missing logger caused by bad merge
2018-10-16 16:56:55 -07:00
Michael Schurter
048510b13e
tr: properly comment handle fields
2018-10-16 16:56:55 -07:00
Michael Schurter
9e49ed3464
ar: AllocState should not mutate ar.state
...
If ar.state.TaskStates has not been set, set it on the copy of ar.state.
That keeps ar.state manipulations in one location and allows AllocState
to only acquire read-locks.
2018-10-16 16:56:55 -07:00
Michael Schurter
f279b1d1b1
tests: test logs endpoint against pending task
...
Although the really exciting change is making WaitForRunning return the
allocations that it started. This should cut down test boilerplate
significantly.
2018-10-16 16:56:55 -07:00
Michael Schurter
dd4227f84a
tests: make a test client/config easier to generate
...
Sadly can't move the fingerprint timeout tweak into the helper due to
circular imports.
2018-10-16 16:56:55 -07:00
Michael Schurter
1d747048ea
tests: ensure task state is initialized in NewAR
...
Also expose NoopDB for use in tests.
2018-10-16 16:56:55 -07:00
Michael Schurter
960f3be76c
client: expose task state to client
...
The interesting decision in this commit was to expose AR's state and not
a fully materialized Allocation struct. AR.clientAlloc builds an Alloc
that contains the task state, so I considered simply memoizing and
exposing that method.
However, that would lead to AR having two awkwardly similar methods:
- Alloc() - which returns the server-sent alloc
- ClientAlloc() - which returns the fully materialized client alloc
Since ClientAlloc() could be memoized it would be just as cheap to call
as Alloc(), so why not replace Alloc() entirely?
Replacing Alloc() entirely would require Update() to immediately
materialize the task states on server-sent Allocs as there may have been
local task state changes since the server received an Alloc update.
This quickly becomes difficult to reason about: should Update hooks use
the TaskStates? Are state changes caused by TR Update hooks immediately
reflected in the Alloc? Should AR persist its copy of the Alloc? If so,
are its TaskStates canonical or the TaskStates on TR?
So! Forget that. Let's separate the static Allocation from the dynamic
AR & TR state!
- AR.Alloc() is for static Allocation access (often for the Job)
- AR.AllocState() is for the dynamic AR & TR runtime state (deployment
status, task states, etc).
If code needs to know the status of a task: AllocState()
If code needs to know the names of tasks: Alloc()
It should be very easy for a developer to reason about which method they
should call and what they can do with the return values.
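A minimal sketch of the split described above, not the actual Nomad code: the `Allocation`, `State`, and `allocRunner` types here are simplified stand-ins, and `setTaskState` is a hypothetical helper. It shows `Alloc()` returning the static allocation and `AllocState()` returning a copy of the runtime state under a read lock.

```go
package main

import (
	"fmt"
	"sync"
)

// Allocation is a simplified stand-in for the static, server-sent allocation.
type Allocation struct {
	ID    string
	Tasks []string
}

// State is a simplified stand-in for the dynamic AR & TR runtime state.
type State struct {
	TaskStates map[string]string
}

// allocRunner keeps the static allocation and the runtime state apart.
type allocRunner struct {
	mu    sync.RWMutex
	alloc *Allocation
	state *State
}

// Alloc returns the static allocation; it carries no task states.
func (ar *allocRunner) Alloc() *Allocation {
	ar.mu.RLock()
	defer ar.mu.RUnlock()
	return ar.alloc
}

// AllocState returns a copy of the runtime state, so callers cannot
// mutate the runner's own state and only a read lock is needed.
func (ar *allocRunner) AllocState() *State {
	ar.mu.RLock()
	defer ar.mu.RUnlock()
	out := &State{TaskStates: make(map[string]string, len(ar.state.TaskStates))}
	for name, s := range ar.state.TaskStates {
		out.TaskStates[name] = s
	}
	return out
}

// setTaskState is a hypothetical helper standing in for task runner updates.
func (ar *allocRunner) setTaskState(name, s string) {
	ar.mu.Lock()
	defer ar.mu.Unlock()
	ar.state.TaskStates[name] = s
}

func main() {
	ar := &allocRunner{
		alloc: &Allocation{ID: "a1", Tasks: []string{"redis"}},
		state: &State{TaskStates: map[string]string{}},
	}
	ar.setTaskState("redis", "running")
	fmt.Println("task names:", ar.Alloc().Tasks)
	fmt.Println("task status:", ar.AllocState().TaskStates["redis"])
}
```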
2018-10-16 16:56:55 -07:00
Michael Schurter
fb4aa74153
client: add comment
2018-10-16 16:56:55 -07:00
Michael Schurter
9a7e6be2b6
client: fix potentially dropped streaming errors
2018-10-16 16:56:55 -07:00
Michael Schurter
4b44b9039b
tr: remove unneeded lock; chan synchronizes access
2018-10-16 16:56:55 -07:00
Michael Schurter
211b96bb5c
tr: fix shutdown/destroy/WaitResult handling
...
Multiple receivers raced for the WaitResult when killing tasks which
could lead to a deadlock if the "wrong" receiver won.
Wrap handlers in an ugly little proxy to avoid this. At first I wanted
to push this into drivers, but the result is tied to the TR's handle
lifecycle -- not the lifecycle of an alloc or task.
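One way such a proxy could look, as a sketch under assumptions rather than the code from this commit: `WaitResult` is reduced to a single field and `resultProxy` is a hypothetical name. A single goroutine receives the result once, and every caller of `Wait` then observes the same value instead of racing for one channel send.

```go
package main

import "fmt"

// WaitResult is a simplified stand-in for a driver's wait result.
type WaitResult struct {
	ExitCode int
}

// resultProxy (hypothetical) receives the single result exactly once and
// lets any number of callers read it afterwards.
type resultProxy struct {
	done   chan struct{}
	result *WaitResult
}

// newResultProxy starts the only goroutine that receives from waitCh.
func newResultProxy(waitCh <-chan *WaitResult) *resultProxy {
	p := &resultProxy{done: make(chan struct{})}
	go func() {
		p.result = <-waitCh // the write happens before close(done)
		close(p.done)
	}()
	return p
}

// Wait blocks until the result has arrived and returns it. It is safe to
// call from multiple goroutines and more than once.
func (p *resultProxy) Wait() *WaitResult {
	<-p.done
	return p.result
}

func main() {
	waitCh := make(chan *WaitResult, 1)
	p := newResultProxy(waitCh)
	waitCh <- &WaitResult{ExitCode: 0}

	// Both the kill path and the main run loop can wait without racing.
	fmt.Println(p.Wait().ExitCode, p.Wait().ExitCode)
}
```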
2018-10-16 16:56:55 -07:00
Michael Schurter
951ed17436
client: do not inspect task state to follow logs
...
"Ask forgiveness, not permission."
Instead of peeking at TaskStates (which are no longer updated on the
AR.Alloc() view of the world) to only read logs for running tasks, just
try to read the logs and improve the error handling if they don't exist.
This should make log streaming less dependent on AR/TR behavior.
Also fixed a race where the log streamer could exit before reading an
error. This caused no logs or errors to be displayed sometimes when an
error occurred.
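The "just try it" approach can be sketched as below. This is an illustration only: `openTaskLog` and the path in `main` are hypothetical, not names from the Nomad code base. The point is that the open is attempted directly and a missing file becomes a descriptive error rather than being guarded by a task state check.

```go
package main

import (
	"fmt"
	"os"
)

// openTaskLog (hypothetical) tries to open the log file directly instead of
// first checking whether the task is running.
func openTaskLog(path string) (*os.File, error) {
	f, err := os.Open(path)
	if err != nil {
		if os.IsNotExist(err) {
			return nil, fmt.Errorf("log file %q not found (task may not have started): %w", path, err)
		}
		return nil, fmt.Errorf("failed to open log file %q: %w", path, err)
	}
	return f, nil
}

func main() {
	// The path is made up for the example.
	f, err := openTaskLog("/tmp/hypothetical-alloc/redis.stdout.0")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer f.Close()
	fmt.Println("opened", f.Name())
}
```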
2018-10-16 16:56:55 -07:00
Michael Schurter
2325348053
mock_driver: close waitCh after exiting
...
mock_driver wasn't behaving like other driver handles.
2018-10-16 16:56:55 -07:00
Michael Schurter
8d1419c62b
client: fix accessing alloc runners
...
* GetClientAlloc() gains nothing from using allAllocs()
* getAllocatedResources was calling getAllocRunners() twice
2018-10-16 16:56:55 -07:00
Michael Schurter
55ab491801
tr: remove wip comments
2018-10-16 16:56:55 -07:00
Michael Schurter
3ccc091a72
ar: lock around accessing tasks
...
Specify that Alloc() does not return updated task states.
2018-10-16 16:56:55 -07:00
Alex Dadgar
6f0ed6184b
Fix client reloading and pass the plugin loaders to server and client
2018-10-16 16:56:55 -07:00
Nick Ethier
352c05cdf4
plugin/drivers: plumb in stdout/stderr paths
2018-10-16 16:53:31 -07:00
Nick Ethier
0e3f85222a
driver/raw_exec: port existing raw_exec tests and add some testing utilities
2018-10-16 16:53:31 -07:00
Nick Ethier
d9628ff394
driver/raw_exec: more tests and bug fixes
...
added wrapper struct for plugin.ReattachConfig to better handle serialization
2018-10-16 16:53:31 -07:00
Nick Ethier
bcc5c4a8bd
clientv2: base driver plugin ( #4671 )
...
Driver plugin framework to facilitate development of driver plugins.
Implementing plugins only need to implement the DriverPlugin interface.
The framework proxies this interface to the go-plugin GRPC interface generated
from the driver.proto spec.
A testing harness is provided to allow implementing drivers to test the full
lifecycle of the driver plugin. An example use:
func TestMyDriver(t *testing.T) {
    harness := NewDriverHarness(t, &MyDriverPlugin{})
    // The harness implements the DriverPlugin interface and can be used as such
    taskHandle, err := harness.StartTask(...)
}
2018-10-16 16:53:31 -07:00
Michael Schurter
62c1285afc
tr: add comments and cleanup call signature
...
From review comments on #4649 left post-merge.
2018-10-16 16:53:31 -07:00
Nick Ethier
5dee1141d1
executor v2 ( #4656 )
...
* client/executor: refactor client to remove interpolation
* executor: POC libcontainer based executor
* vendor: use hashicorp libcontainer fork
* vendor: add libcontainer/nsenter dep
* executor: updated executor interface to simplify operations
* executor: implement logging pipe
* logmon: new logmon plugin to manage task logs
* driver/executor: use logmon for log management
* executor: fix tests and windows build
* executor: fix logging key names
* executor: fix test failures
* executor: add config field to toggle between using libcontainer and standard executors
* logmon: use discover utility to discover nomad executable
* executor: only call libcontainer-shim on main in linux
* logmon: use separate path configs for stdout/stderr fifos
* executor: windows fixes
* executor: created reusable pid stats collection utility that can be used in an executor
* executor: update fifo.Open calls
* executor: fix build
* remove executor from docker driver
* executor: Shutdown func to kill and cleanup executor and its children
* executor: move linux specific universal executor funcs to separate file
* move logmon initialization to a task runner hook
* client: doc fixes and renaming from code review
* taskrunner: use shared config struct for logmon fifo fields
* taskrunner: logmon only needs to be started once per task
2018-10-16 16:53:31 -07:00
Michael Schurter
e6e2930a00
tr: implement stats collection hook
...
Tested except for the net/rpc specific error case which may need
changing in the gRPC world.
2018-10-16 16:53:31 -07:00
Michael Schurter
86bd329539
fix build errors post merges
2018-10-16 16:53:31 -07:00
Michael Schurter
a977e22028
test: cleanup mock consul service client
...
Updated to hclog.
It exposed fields that required an unexported lock to access. Created a
getter method instead. Only old allocrunner currently used this
feature.
2018-10-16 16:53:31 -07:00
Michael Schurter
6f92b04226
health_hook: simplify locking; test thoroughly
...
Use doneCh like @dadgar suggested in the original PR.
Thoroughly test hook as concurrent Update calls make for a tricky
concurrency problem.
2018-10-16 16:53:30 -07:00
Alex Dadgar
cebfead6bc
add logger back
2018-10-16 16:53:30 -07:00
Nick Ethier
03422aa529
fifo: add new fifo package for named pipes ( #4665 )
...
* fifo: add new fifo package for named pipes
2018-10-16 16:53:30 -07:00
Alex Dadgar
8504505c0d
client uses passed logger and fix fingerprinters
2018-10-16 16:53:30 -07:00
Nick Ethier
66ff12e5f7
Update runc/libcontainer and friends ( #4655 )
...
* vendor: bump libcontainer and docker to remove Sirupsen imports
* vendor: fix bad vendoring of archive package
* vendor: fix api changes to cgroups in executor
* vendor: fix docker api changes
* vendor: update github.com/Azure/go-ansiterm to use non capitalized logrus import
2018-10-16 16:53:30 -07:00
Michael Schurter
195b8127fb
health_hook: fix panic and add tests
...
Still more testing to do, but I want to get this panic fixed ASAP.
All new tests pass with -race
2018-10-16 16:53:30 -07:00
Michael Schurter
64efc3d301
Emit events before long operations
...
Append when there's nothing blocking between appending and sending an
update to the server.
2018-10-16 16:53:30 -07:00
Michael Schurter
a2b696c4cf
Use a semaphore to block until watcher exits
2018-10-16 16:53:30 -07:00
Michael Schurter
a73162c977
ar: use multierror in update hook loop
...
Make it match TaskRunner update hook behavior
2018-10-16 16:53:30 -07:00
Michael Schurter
a7b427718c
tr: refactor EmitEvents into Emit+Append
...
* UpdateState: set state, append event, persist, update servers
* EmitEvent: append event, persist, update servers
* AppendEvent: append event, persist
AppendEvent may not even have to persist, but for the sake of
correctness I'm going with that for now.
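The layering of the three methods listed above might be sketched like this. The struct fields, the counters, and the `persist`/`updateServers` helpers are assumptions made for the example; only the method names and their listed responsibilities come from the commit message.

```go
package main

import "fmt"

// taskRunner is a simplified stand-in; the counters only make the
// side effects observable in this sketch.
type taskRunner struct {
	state         string
	events        []string
	persisted     int
	serverUpdates int
}

func (tr *taskRunner) persist()       { tr.persisted++ }
func (tr *taskRunner) updateServers() { tr.serverUpdates++ }

// AppendEvent: append event, persist.
func (tr *taskRunner) AppendEvent(ev string) {
	tr.events = append(tr.events, ev)
	tr.persist()
}

// EmitEvent: append event, persist, update servers.
func (tr *taskRunner) EmitEvent(ev string) {
	tr.AppendEvent(ev)
	tr.updateServers()
}

// UpdateState: set state, append event, persist, update servers.
func (tr *taskRunner) UpdateState(state, ev string) {
	tr.state = state
	tr.EmitEvent(ev)
}

func main() {
	tr := &taskRunner{}
	tr.AppendEvent("Downloading Artifacts")
	tr.UpdateState("running", "Started")
	fmt.Println(tr.state, len(tr.events), tr.persisted, tr.serverUpdates)
}
```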
2018-10-16 16:53:30 -07:00
Michael Schurter
93f3ac9ed6
ar: create health setting shim for health watcher
2018-10-16 16:53:30 -07:00
Michael Schurter
4d5aaac6d2
fix detection of task transitioning to running
2018-10-16 16:53:30 -07:00
Michael Schurter
4136e59f79
arv2: implement alloc health watching
...
Also remove initial alloc from broadcaster as it just caused useless
extra processing.
2018-10-16 16:53:30 -07:00
Michael Schurter
5c5c6dc41b
refactor ar hooks into their own files
...
minimize passed dependencies to ease testing
2018-10-16 16:53:30 -07:00
Michael Schurter
0bbf3a93ee
make AllocBroadcaster easier to use
...
And test thoroughly.
2018-10-16 16:53:30 -07:00
Michael Schurter
9d1ea3b228
client: hclog-ify most of the client
...
Leaving fingerprinters in case that interface changes with plugins.
2018-10-16 16:53:30 -07:00
Michael Schurter
e42154fc46
implement stopping, destroying, and disk migration
...
* Stopping an alloc is implemented via Updates but update hooks are
*not* run.
* Destroying an alloc is a best effort cleanup.
* AllocRunner destroy hooks implemented.
* Disk migration and blocking on a previous allocation exiting moved to
its own package to avoid cycles. Now only depends on alloc broadcaster
instead of also using a waitch.
* AllocBroadcaster now only drops stale allocations and always keeps the
latest version.
* Made AllocDir safe for concurrent use
Lots of internal contexts that are currently unused. Unsure if they
should be used or removed.
2018-10-16 16:53:30 -07:00
Michael Schurter
4236255686
lots of comment/log fixes
2018-10-16 16:53:30 -07:00
Michael Schurter
5749ede04e
keep forgetting lxc
2018-10-16 16:53:30 -07:00
Michael Schurter
357641c364
persist alloc state on changes, not periodically
...
Allow alloc and task runners to persist their own state when something
changes instead of periodically syncing all state.
2018-10-16 16:53:30 -07:00
Michael Schurter
820af27171
wrap boltdb in a write deduplicator
...
Saves a tiny bit of cpu and some IO. Sadly doesn't prevent all IO on
duplicate writes as the transactions are still created and committed.
$ go test -bench=. -benchmem
goos: linux
goarch: amd64
pkg: github.com/hashicorp/nomad/helper/boltdd
BenchmarkWriteDeduplication_On-4 500 4059591 ns/op 23736 B/op 56 allocs/op
BenchmarkWriteDeduplication_Off-4 300 4115319 ns/op 25942 B/op 55 allocs/op
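The deduplication idea can be illustrated with a small wrapper that remembers a hash of the last value written per key and skips identical writes. This is a sketch only, not the boltdd implementation: the `store` interface, `memStore`, and `dedupStore` are hypothetical, and the in-memory store stands in for boltdb.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// store is a hypothetical minimal key/value write interface.
type store interface {
	Put(key string, val []byte) error
}

// memStore is an in-memory stand-in for the real database; it counts writes.
type memStore struct {
	writes int
	data   map[string][]byte
}

func (m *memStore) Put(key string, val []byte) error {
	m.writes++
	m.data[key] = append([]byte(nil), val...)
	return nil
}

// dedupStore skips a write when the value's hash matches the last value
// successfully written for the same key.
type dedupStore struct {
	inner store
	last  map[string][sha256.Size]byte
}

func newDedupStore(inner store) *dedupStore {
	return &dedupStore{inner: inner, last: make(map[string][sha256.Size]byte)}
}

func (d *dedupStore) Put(key string, val []byte) error {
	sum := sha256.Sum256(val)
	if prev, ok := d.last[key]; ok && prev == sum {
		return nil // identical to the previous write: nothing to do
	}
	if err := d.inner.Put(key, val); err != nil {
		return err
	}
	d.last[key] = sum
	return nil
}

func main() {
	inner := &memStore{data: map[string][]byte{}}
	d := newDedupStore(inner)
	for _, v := range []string{"pending", "pending", "running"} {
		if err := d.Put("task-state", []byte(v)); err != nil {
			fmt.Println("put failed:", err)
		}
	}
	fmt.Println("underlying writes:", inner.writes)
}
```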
2018-10-16 16:53:30 -07:00
Michael Schurter
990228a6e2
wip wrap boltdb to get path information
...
finished but doesn't handle deleting deeply nested buckets
2018-10-16 16:53:30 -07:00
Michael Schurter
a3fe0510d1
Move all encoding and put deduping into state db
...
Still WIP as it does not handle deletions.
2018-10-16 16:53:30 -07:00
Michael Schurter
533bc93b3a
implement all boltdb interactions behind StateDB
2018-10-16 16:53:30 -07:00
Michael Schurter
d890de036a
tr: persist hook state whenever it changes
2018-10-16 16:53:30 -07:00
Michael Schurter
fae5e89a0e
artifacts: don't emit event when there's no artifacts
2018-10-16 16:53:30 -07:00
Michael Schurter
5383d20505
removing old restoration path before api change
2018-10-16 16:53:30 -07:00
Michael Schurter
a5d3e3fb0a
Implement alloc updates in arv2
...
Updates are applied asynchronously but sequentially
2018-10-16 16:53:30 -07:00
Michael Schurter
39b3f3a85b
call handle.Network() instead of storing it
2018-10-16 16:53:30 -07:00
Michael Schurter
7132b67c1e
Add Network method to Handle interface
...
Should probably be moved to an Inspect method in the Driver Plugin world
2018-10-16 16:53:30 -07:00
Michael Schurter
a4b4d7b266
consul service hook
...
Deregistration works but difficult to test due to terminal updates not
being fully implemented in the new client/ar/tr.
2018-10-16 16:53:29 -07:00
Michael Schurter
5be982e674
restore vault client
2018-10-16 16:53:29 -07:00
Michael Schurter
ce04915c9f
log before killing tasks
2018-10-16 16:53:29 -07:00
Michael Schurter
a2bf851805
no need for TaskStateUpdated to return an error
...
also updated comments
2018-10-16 16:53:29 -07:00
Alex Dadgar
fd3bc1bd39
Update state with server
2018-10-16 16:53:29 -07:00
Alex Dadgar
bc905cc61d
Define and thread through state updating interface
2018-10-16 16:53:29 -07:00
Michael Schurter
9a63d6103d
tr: add validate task hook
2018-10-16 16:53:29 -07:00
Michael Schurter
7f4ec50906
missed locking around c.allocs access
2018-10-16 16:53:29 -07:00
Alex Dadgar
c93cfc89c0
wip
2018-10-16 16:53:29 -07:00
Alex Dadgar
7ddc0eb65c
Fix deadlock
2018-10-16 16:53:29 -07:00
Alex Dadgar
3779077052
Remove SetState from interface
2018-10-16 16:53:29 -07:00
Alex Dadgar
e1ba73b515
compile
2018-10-16 16:53:29 -07:00
Michael Schurter
6ebdf532ea
wip split event emitting and state transitions
2018-10-16 16:53:29 -07:00
Michael Schurter
516d641db0
client: implement all-or-nothing alloc restoration
...
Restoring calls NewAR -> Restore -> Run
NewAR now calls NewTR
AR.Restore calls TR.Restore
AR.Run calls TR.Run
2018-10-16 16:53:29 -07:00
Alex Dadgar
e401c660e7
Implement lifecycle hooks on the task runner
2018-10-16 16:53:29 -07:00
Alex Dadgar
89b4ba9cc8
comments
2018-10-16 16:53:29 -07:00
Alex Dadgar
86e81947b4
Hook renames
2018-10-16 16:53:29 -07:00
Alex Dadgar
2599cf9d74
remove comment
2018-10-16 16:53:29 -07:00
Alex Dadgar
88aa0299a9
Template hook
2018-10-16 16:53:29 -07:00
Alex Dadgar
c9765deff1
address comments
2018-10-16 16:53:29 -07:00
Alex Dadgar
80f6ce50c0
vault hook
2018-10-16 16:53:29 -07:00
Michael Schurter
30d377eba4
tr: improve skip log line
2018-10-16 16:53:29 -07:00
Michael Schurter
ef213b864b
tr: pass context to hooks
2018-10-16 16:53:29 -07:00
Michael Schurter
3a4f387fd3
tr: fix setting done in existing hooks
2018-10-16 16:53:29 -07:00
Michael Schurter
b360f6f96e
fix hclog level
2018-10-16 16:53:29 -07:00
Michael Schurter
ae89b7da95
reimplement success state for tr hooks and state persistence
...
splits apart local and remote persistence
removes some locking *for now*
2018-10-16 16:53:29 -07:00
Michael Schurter
4f43ff5c51
pass statedb into allocrunnerv2
2018-10-16 16:53:29 -07:00
Michael Schurter
582c76a420
remove unused allocrunner shim
2018-10-16 16:53:29 -07:00
Michael Schurter
c5504bd939
tr: cleanup main loop and shutdown hook impl
2018-10-16 16:53:29 -07:00
Michael Schurter
561260d6fe
tr: skip error/success saving
...
All hooks only need to be run once.
Since only one hook can fail per run there's no need to
track errors on a per hook basis.
2018-10-16 16:53:29 -07:00
Michael Schurter
67874e761f
tr: don't lock for immutable fields
2018-10-16 16:53:29 -07:00
Michael Schurter
f473cd03d6
tr: start update/shutdown logic
2018-10-16 16:53:29 -07:00
Michael Schurter
637ef264ae
Copy TR.Config vals to TR
...
I think I like this pattern better as some Config vals are mutable
(Alloc) and some aren't and some are used to derive other values and
never used directly.
Promoting them onto the TR struct is a little more work but is hopefully
more clear as to how each value is used.
2018-10-16 16:53:29 -07:00
Michael Schurter
0f7dcfdc9a
example redis job "runs" on arv2! see below
...
Tons left to do and lots of churn:
1. No state saving
2. No shutdown or gc
3. Removed AR factory *for now*
4. Made all "Config" structs local to the package they configure
5. Added allocID to GC to avoid a lookup
Really hating how many things use *structs.Allocation. It's not bad
without state saving, but if AllocRunner starts updating its copy things
get racy fast.
2018-10-16 16:53:29 -07:00
Michael Schurter
9a6aa38b0f
begin adding AllocRunner.Update
2018-10-16 16:53:29 -07:00
Michael Schurter
eae54e2954
artifact task hook
2018-10-16 16:53:29 -07:00
Alex Dadgar
b9bed81e6e
Initial V2 alloc runner
2018-10-16 16:53:28 -07:00
Alex Dadgar
a78cefec18
use int64
2018-10-16 15:34:32 -07:00
Preetha Appan
7c0d8c646c
Change CPU/Disk/MemoryMB to int everywhere in new resource structs
2018-10-16 16:21:42 -05:00
Christian Winther
0c5154100c
fix: increase log rotator line scan limit
...
In cases where gelf/json logging is used, it's fairly easy to exceed the 16k limit, resulting in JSON output being cut up into multiple strings.
The result is invalid JSON lines, which can create all kinds of badness in the logging server.
This fixes https://github.com/hashicorp/nomad/issues/4699
Signed-off-by: Christian Winther <jippignu@gmail.com>
2018-10-09 18:57:18 +02:00
Alex Dadgar
01f8e5b95f
renames
2018-10-04 14:57:25 -07:00
Alex Dadgar
52f9cd7637
fixing tests
2018-10-04 14:26:19 -07:00
Alex Dadgar
bac5cb1e8b
Scheduler uses allocated resources
2018-10-02 17:08:25 -07:00
Alex Dadgar
5c8697667e
Node reserved resources
2018-09-29 18:44:55 -07:00
Alex Dadgar
3183153315
Node resources on client
2018-09-29 17:23:41 -07:00
Alex Dadgar
9971b3393f
yamux
2018-09-17 14:22:40 -07:00
Alex Dadgar
ca28afa3b2
small fixes
2018-09-15 16:42:38 -07:00
Alex Dadgar
7739ef51ce
agent + consul
2018-09-13 10:43:40 -07:00
Michael Schurter
08862fc177
fix race around error handling
2018-09-05 17:34:17 -07:00
Michael Schurter
6def5bc4f9
client: set host name when migrating over tls
...
Not setting the host name led the Go HTTP client to expect a certificate
with a DNS-resolvable name. Since Nomad uses `${role}.${region}.nomad`
names ephemeral dir migrations were broken when TLS was enabled.
Added an e2e test to ensure this doesn't break again as it's very
difficult to test and the TLS configuration is very easy to get wrong.
2018-09-05 17:24:17 -07:00
Alex Dadgar
c6576ddac1
Fix make check errors
2018-09-04 16:03:52 -07:00
Alex Dadgar
089b533047
Fix kill timeout exceeding 5m on Docker driver
...
Fixes an issue where the Docker API client would timeout before the kill
timeout was hit.
2018-08-17 16:01:09 -07:00
Alex Dadgar
49a1ba9297
Merge pull request #4535 from hashicorp/f-keep-docker-container-0.8.4
...
Option to prevent removal of container on exit
2018-07-26 11:11:22 -07:00
Charlie Voiselle
f319a149cd
Option to prevent removal of container on exit
2018-07-26 11:10:48 -07:00
Michael Schurter
ddf948001e
Merge pull request #4462 from omame/omame/cpu_cfs_period
...
Add support for specifying cpu_cfs_period in the Docker driver
2018-07-25 09:34:38 -07:00
Daniele Valeriani
b0a14caca2
Add test for cpu_cfs_period
2018-07-16 22:43:34 +02:00
Michael Schurter
91588cb861
rkt: revert to redis 3.2 to favor stability
2018-07-09 16:15:32 -07:00
Michael Schurter
c56f899ee9
rkt: speed up tests
...
Disable networking when it's not needed and improve failure message for
UserGroup test by including the full ps output on failure.
2018-07-09 14:02:27 -07:00
Michael Schurter
a1d4f77ce0
rkt: skip retrieving network information when net=none
...
Even when net=none we would attempt to retrieve network information from
rkt which would spew useless log lines such as:
```
testlog.go:30: 20:37:31.409209 [DEBUG] driver.rkt: failed getting network info for pod UUID 8303cfe6-0c10-4288-84f5-cb79ad6dbf1c attempt 2: no networks found. Sleeping for 970ms
```
It would also delay tests for ~60s during the network information retry
period.
So skip this when net=none. It's unlikely anyone actually uses net=none
outside of tests, so I doubt anyone will notice this change.
Official docs:
https://coreos.com/rkt/docs/latest/networking/overview.html#no-loopback-only-networking
2018-07-09 13:44:43 -07:00
Michael Schurter
0fbc84b81d
tests: make alloc id consistent in helper
...
It worked, but the old code used a different alloc id for the path than
the actual alloc! Use the same alloc id everywhere to prevent confusing
test output.
2018-07-09 13:37:35 -07:00
Michael Schurter
f3b8815c96
rkt: fix failing TestRktDriver_UserGroup test
...
Started failing due to the docker redis image switching from Debian
jessie to stretch:
53f8680550 (diff-acff46b161a3b7d6ed01ba79a032acc9)
Switched from Debian based image to Alpine to get a working `ps` command
again (albeit busybox's stripped down implementation)
2018-07-09 12:19:02 -07:00
Daniele Valeriani
748f6afd89
Validate the value of cpu_cfs_period
2018-07-02 22:30:22 +02:00
Daniele Valeriani
9364446a03
Remove an unnecessary conversion
2018-07-02 17:47:23 +02:00
Daniele Valeriani
906952a2c8
Add support for specifying cpu_cfs_period in the Docker driver
2018-07-02 16:37:04 +02:00
Preetha
b567750824
Merge pull request #4392 from burdandrei/telemetry-parametrized-jobs
...
Parametrized/periodic jobs per child tagged metric emission
2018-06-21 17:13:36 -05:00
Preetha
043f4c208b
Merge pull request #3882 from burdandrei/telemetry-add-node-class-tag
...
Added node class to tagged metrics
2018-06-21 17:04:35 -05:00
Andrei Burd
444ee45aff
Parametrized/periodic jobs per child tagged metric emission
2018-06-21 10:40:56 +03:00
James Rasell
75f95ccf09
Merge branch 'master' into f_gh_4381
2018-06-19 17:51:57 +02:00
Alex Dadgar
b61051b3cd
Merge pull request #4409 from hashicorp/r-client-packages
...
Refactor client packages
2018-06-13 17:32:25 -07:00
Alex Dadgar
22757d964e
lint
2018-06-13 16:06:39 -07:00
Alex Dadgar
af558df94c
Fix test using a lot of memory
2018-06-13 15:52:25 -07:00
Alex Dadgar
300b1a7a15
Tests only use testlog package logger
2018-06-13 15:40:56 -07:00
Chelsea Komlo
03075b603a
Merge pull request #4399 from hashicorp/r-reload-refactor
...
Refactor logic for dynamic reloading
2018-06-13 13:35:12 -04:00
Alex Dadgar
9bab9edf27
test fixes
2018-06-12 17:45:39 -07:00
Alex Dadgar
90c2108bfb
Fix gc tests + parallel destroy + small test fixes
2018-06-12 10:23:45 -07:00
Alex Dadgar
f5ff509fa5
Refactor - wip
2018-06-12 10:23:45 -07:00
Alex Dadgar
ff2ab8f58e
Fix vault template test
2018-06-12 09:57:28 -07:00
Alex Dadgar
d0043691fb
remove structs + bump version
2018-06-11 13:52:19 -07:00
Alex Dadgar
af5753d2cd
bump version + generated files
2018-06-11 13:39:42 -07:00
Nick Ethier
f36eb14360
Merge pull request #4403 from hashicorp/b-fix-dispatched-optional-meta
...
Fix dispatched optional meta correctly
2018-06-11 16:17:14 -04:00
Nick Ethier
e75e3ae665
nomad: use require pkg for tests
2018-06-11 13:50:50 -04:00
Nick Ethier
3aa6241b5c
client/driver/env: fix optional meta test
2018-06-11 12:29:13 -04:00
Nick Ethier
c65882cafd
client/driver/env: use 'job.Dispatch' to trigger optional meta logic
2018-06-11 12:15:19 -04:00
Nick Ethier
ccb5372813
Revert "Revert "client/driver/env: interpolate empty optional meta params as empty strings""
...
This reverts commit c17e0fc9dc5fd288935ab2b68fb441b4d25ac189.
2018-06-11 11:59:23 -04:00
Michael Schurter
c198cfd8ea
executor: fix log line formatting
2018-06-08 14:55:39 -07:00
Michael Schurter
d1a60e700e
executor: fix Windows blocking on pipe close
...
Sending the Ctrl-Break signal to PowerShell <6 causes it to drop into
debug mode. Closing its output pipe at that point will block
indefinitely and prevent the process from being killed by Nomad.
See the upstream powershell issue for details:
https://github.com/PowerShell/PowerShell/issues/4254
2018-06-08 14:48:05 -07:00
Chelsea Holland Komlo
f74e74b22d
add client logic to determine whether TLS RPC connections should reload
2018-06-08 14:38:58 -04:00
James Rasell
b9009c419c
Add 'nomad.advertise.address' to client meta via NomadFingerPrint
...
This change removes the addition of the advertise address to the
exported task env vars and instead moves this work into the
NomadFingerprint.Fingerprint which adds this value to the client
attrs. This can then be used within a Nomad job like
${attr.nomad.advertise.address}.
2018-06-08 09:44:10 +02:00
Alex Dadgar
d9b35fab52
Revert "client/driver/env: interpolate empty optional meta params as empty strings"
...
This reverts commit 84926f759a63a90be7bbcf0fad78deb3f02af23d.
2018-06-07 16:27:47 -07:00
Nick Ethier
b3c767fae0
client/driver: drop docker pull progress estimate if it's < 0
2018-06-07 15:23:31 -04:00
James Rasell
367a8b5152
Add the local client's advertise address to interpolation env vars
...
This commit adds the Nomad local client advertise address in the
form host:port to the environment variables passed to each task.
2018-06-07 09:45:15 +02:00
Alex Dadgar
98705824ed
Merge pull request #4185 from jesusvazquez/add-counter-metric-for-oom-killer-events
...
Add driver.docker counter metric for OOM Killer events
2018-06-04 15:12:51 -07:00
Alex Dadgar
23cd56dc78
remove generated structs
2018-06-01 16:11:28 -07:00
Alex Dadgar
bf5b5747ab
fix test message
2018-06-01 15:51:54 -07:00
Alex Dadgar
3e3d3c7445
Disable Exec on non-linux platforms
...
This PR disables exec on non-linux platforms
2018-06-01 15:48:14 -07:00
Alex Dadgar
c0386819b3
bump version/lint/generated files
2018-06-01 15:23:10 -07:00
Preetha Appan
ce6d4a8d7a
Fix tests and move isClient to constructor
2018-06-01 15:59:53 -05:00
Alex Dadgar
a62dd2aadb
Merge pull request #4350 from hashicorp/b-raw-exec-cgroups
...
Raw exec can use cgroups to manage PIDs
2018-06-01 17:37:49 +00:00
Alex Dadgar
8da42940c9
wait for result
2018-06-01 10:14:53 -07:00
Alex Dadgar
40fec81315
Merge pull request #4277 from hashicorp/f-retry-join-clients
...
Add go-discover support to Nomad clients
2018-06-01 16:57:40 +00:00
Alex Dadgar
460ecb8705
Comments
2018-05-31 18:05:03 -07:00
Alex Dadgar
de98774f2c
Add test and docs
2018-05-31 18:05:03 -07:00
Alex Dadgar
ff28b04c46
Use more appropriate name than cgroup
2018-05-31 18:05:03 -07:00
Alex Dadgar
37e900b1d3
Only use freezer/devices when in the basic cgroup only
2018-05-31 18:05:03 -07:00
Alex Dadgar
ffd9270f2f
Use cgroup when possible
2018-05-31 18:05:03 -07:00
Alex Dadgar
0ff0ed290d
Fix TestDockerDriver_StartNVersions
2018-05-31 17:14:59 -07:00
Alex Dadgar
7e6dd498c9
Remove debug logging
2018-05-31 15:52:42 -07:00
Alex Dadgar
b1b908527f
spelling
2018-05-31 15:29:55 -07:00
Alex Dadgar
a3b29553a5
Force close stdout/stderr after grace
...
This commit changes the force closing of the stdout/stderr file
descriptor from closing immediately to being closed after a grace
period. This allows the created process to close its own file and allows
copying of the data.
2018-05-31 15:21:36 -07:00
Alex Dadgar
5e787e2d72
test build
2018-05-31 12:22:31 -07:00
Alex Dadgar
ead1b7f423
Log more info for TestExecutor_IsolationAndConstraints
2018-05-31 11:57:44 -07:00
Alex Dadgar
b05740ad13
Merge pull request #4341 from hashicorp/f-docker-pids
...
Support Docker Pids Limit
2018-05-31 17:59:29 +00:00
Chelsea Holland Komlo
064b5481e0
add server join info to server and client
2018-05-31 10:50:03 -07:00
Alex Dadgar
f4d4bbdc97
test pid limit
2018-05-30 12:55:24 -07:00
Chelsea Holland Komlo
94d510e969
Support Docker Pids Limit
2018-05-25 19:54:14 -04:00
Alex Dadgar
1685c8ebe4
cleanup
2018-05-24 16:25:20 -07:00
Alex Dadgar
2eacdb6bd6
Force closing of pipe to child process
2018-05-24 16:03:48 -07:00
Chelsea Holland Komlo
38f611a7f2
refactor NewTLSConfiguration to pass in verifyIncoming/verifyOutgoing
...
add missing fields to TLS merge method
2018-05-23 18:35:30 -04:00
Preetha
9084bb025e
Merge pull request #4303 from hashicorp/b-docker-client-nil-panic
...
Add nil check before setting timeout on docker client
2018-05-21 19:34:44 -07:00
Jesus Vazquez
23d959e42c
Add job, task, taskgroup to open method
2018-05-21 20:37:18 +02:00
Jesus Vazquez
0a062a04c7
Remove allocID from dockerhandle struct
2018-05-21 20:33:01 +02:00
Jesus Vazquez
e5a81815bb
Rename labels job, task_group and task
2018-05-21 20:32:50 +02:00
Jesus Vazquez
ffe1b1a1b6
Remove allocid label from driver.docker.oom counter metric
2018-05-21 20:30:56 +02:00
Alex Dadgar
38762d9bde
Merge pull request #4282 from hashicorp/f-rotator
...
Avoid splitting log line across two files
2018-05-21 17:52:13 +00:00
Alex Dadgar
d95698e2c5
Merge pull request #4298 from justenwalker/docker-driver-digest-tags
...
driver/docker: pull image with digest
2018-05-21 17:46:14 +00:00
Nick Ethier
6392009dd6
client/driver: use correct repo address when using docker-credential helper ( #4266 )
2018-05-15 17:39:48 -04:00
Justen Walker
a8989f33bb
driver/docker: add test for dockerImageRef
2018-05-14 14:24:03 -04:00
Justen Walker
194b2231d6
driver/docker: fix up TestParseDockerImage
2018-05-14 14:23:48 -04:00
Justen Walker
25b2807ce3
driver/docker: fix TestDockerDriver_ForcePull_RepoDigest
2018-05-14 14:23:02 -04:00
Nick Ethier
c4d07a2200
client/driver: guard authHelper test from running on windows
2018-05-14 13:46:57 -04:00
Justen Walker
b23ca7574c
driver/docker: cleanup parseDockerImage
2018-05-14 11:11:51 -04:00
Justen Walker
60f7f1aa08
driver/docker: pull image with digest
...
GH #4290
Add digest support to the docker driver image config. This commit
factors out some common code to print the repo:tag (dockerImageRef) for
events/logs as well as parsing the image to retrieve the repo,tag
(parseDockerImage) so that the results are consistent/sane for both
repo:tag and repo@sha256:... references.
When pulling an image with a digest, the tag is blank and the repo
contains the digest. See:
https://github.com/fsouza/go-dockerclient/blob/master/image_test.go#L471
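The behaviour described above can be approximated with the standard library alone. This is a simplified sketch, not the driver's code: `splitImageRef` and `imageRef` are hypothetical stand-ins for parseDockerImage and dockerImageRef, and the sketch leaves the tag empty when none is given.

```go
package main

import (
	"fmt"
	"strings"
)

// splitImageRef (hypothetical) splits an image reference into repo and tag.
// For digest references the tag is blank and the repo keeps the digest.
func splitImageRef(image string) (repo, tag string) {
	if strings.Contains(image, "@") {
		return image, ""
	}
	lastSlash := strings.LastIndex(image, "/")
	lastColon := strings.LastIndex(image, ":")
	// A colon before the last slash belongs to a registry port, not a tag.
	if lastColon > lastSlash {
		return image[:lastColon], image[lastColon+1:]
	}
	return image, ""
}

// imageRef (hypothetical) renders repo and tag for events and logs.
func imageRef(repo, tag string) string {
	if tag == "" {
		return repo
	}
	return repo + ":" + tag
}

func main() {
	for _, image := range []string{
		"redis:3.2",
		"localhost:5000/redis",
		"redis@sha256:0123abcd",
	} {
		repo, tag := splitImageRef(image)
		fmt.Printf("repo=%q tag=%q ref=%q\n", repo, tag, imageRef(repo, tag))
	}
}
```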
2018-05-14 10:42:58 -04:00
Preetha Appan
de66ec7394
Add nil check before setting timeout on docker client
2018-05-11 17:09:26 -05:00
Alex Dadgar
7ad5c76734
Add new line test
2018-05-11 10:52:09 -07:00
Alex Dadgar
3671ed139d
Avoid splitting log line across two files
...
We attempt to avoid splitting a log line between two files by detecting
if we are near the file size limit and scanning for new lines and only
flushing those.
BenchmarkRotator/1KB-8 300000 5613 ns/op
BenchmarkRotator/2KB-8 200000 8384 ns/op
BenchmarkRotator/4KB-8 100000 14604 ns/op
BenchmarkRotator/8KB-8 50000 25002 ns/op
BenchmarkRotator/16KB-8 30000 47572 ns/op
BenchmarkRotator/32KB-8 20000 92080 ns/op
BenchmarkRotator/64KB-8 10000 165883 ns/op
BenchmarkRotator/128KB-8 5000 294405 ns/op
BenchmarkRotator/256KB-8 2000 572374 ns/op
2018-05-10 15:11:01 -07:00
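The newline scan described in the commit above can be illustrated with a minimal sketch. This is not the rotator's actual code: `flushLen` and its parameters are hypothetical names, and returning 0 when no newline fits is an assumption made for this example.

```go
package main

import "bytes"

// flushLen is a hypothetical helper: given a pending buffer and the number of
// bytes still available in the current log file, it returns how many bytes
// should be written to the current file. The rest belongs in the next file.
func flushLen(buf []byte, remaining int) int {
	if remaining < 0 {
		remaining = 0
	}
	// Everything fits, so there is no risk of splitting a line.
	if len(buf) <= remaining {
		return len(buf)
	}
	// Near the size limit: flush only up to the last complete line that fits.
	if i := bytes.LastIndexByte(buf[:remaining], '\n'); i >= 0 {
		return i + 1
	}
	// No newline fits (assumption for this sketch): write nothing here and
	// let the caller rotate first. A caller must still handle a single line
	// that is longer than a whole file.
	return 0
}
```

A caller would write `buf[:n]` to the current file, rotate, and carry `buf[n:]` over to the new file.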
Alex Dadgar
f5d91b5338
Benchmark for rotator
...
BenchmarkRotator/1KB-8 200000 5572 ns/op
BenchmarkRotator/2KB-8 200000 8338 ns/op
BenchmarkRotator/4KB-8 100000 14246 ns/op
BenchmarkRotator/8KB-8 50000 25279 ns/op
BenchmarkRotator/16KB-8 30000 48602 ns/op
BenchmarkRotator/32KB-8 20000 92159 ns/op
BenchmarkRotator/64KB-8 10000 154766 ns/op
BenchmarkRotator/128KB-8 5000 296872 ns/op
BenchmarkRotator/256KB-8 3000 551793 ns/op
2018-05-10 14:15:15 -07:00
Nick Ethier
91603a377e
client/driver: parse repo instead of attempting to pull repo info
2018-05-09 22:34:25 -04:00
Nick Ethier
38a33f9c75
client/driver: add test for docker auth helper
2018-05-09 22:33:56 -04:00
Alex Dadgar
e067a9ae06
naming of constants
2018-05-09 16:46:52 -07:00
Chelsea Holland Komlo
796bae6f1b
allow configurable cipher suites
...
disallow 3DES and RC4 ciphers
add documentation for tls_cipher_suites
2018-05-09 17:15:31 -04:00
Alex Dadgar
0e79e1a46e
Keep stream and logs in sync for detecting closed pipe
2018-05-09 11:22:52 -07:00
Preetha
e7ae6e98d9
Merge pull request #4259 from hashicorp/f-deployment-improvements
2018-05-08 16:37:10 -05:00
Nick Ethier
3598925ca4
client/driver: use correct repo address when using docker-credential helper
2018-05-08 15:17:28 -04:00
Nick Ethier
54c86a0292
client/driver/env: interpolate empty optional meta params as empty strings
2018-05-07 20:19:51 -04:00
Nick Ethier
016ab7a105
client/driver: remove unused const 'dockerPullProgressEmitInterval'
2018-05-07 16:24:48 -04:00
Michael Schurter
f1d13683e6
consul: remove services with/without canary tags
...
Guard against Canary being set to false at the same time as an
allocation is being stopped: this could cause RemoveTask to be called
with the wrong Canary value and leak a service.
Deleting both Canary values is the safest route.
2018-05-07 14:55:01 -05:00
Michael Schurter
50e04c976e
consul: support canary tags for services
...
Also refactor Consul ServiceClient to take a struct instead of a massive
set of arguments. Meant updating a lot of code but it should be far
easier to extend in the future as you will only need to update a single
struct instead of every single call site.
Adds an e2e test for canary tags.
2018-05-07 14:55:01 -05:00
Alex Dadgar
df8fce4347
Ensure canaries tags are interpolated
2018-05-07 14:50:01 -05:00
Alex Dadgar
552604451c
rework where time gets set
2018-05-07 14:50:01 -05:00
Alex Dadgar
ee50789c22
Initial implementation
2018-05-07 14:50:01 -05:00
Nick Ethier
d8de354dbf
client/driver: add waiting layer status count to pull progress status msg
2018-05-07 12:18:20 -04:00
Nick Ethier
77af17efbc
client/driver: add separate handler for emitting pull progress
2018-05-07 12:17:34 -04:00
Nick Ethier
0bdd976b7d
client/driver: remove pull timeout due to race condition that can lead to unexpected timeouts
...
If two jobs are pulling the same image simultaneously, whichever starts the pull first will set the pull timeout.
This can lead to a poor UX where the first job requested a short timeout while the second job requested a longer timeout,
causing the pull to potentially time out much sooner than expected by the second job.
2018-05-07 12:18:11 -04:00
Nick Ethier
7c5821d7c6
client/driver: do accounting on layer pull progress
2018-05-07 12:17:53 -04:00
Nick Ethier
8efda7dc6c
client/driver: emit progress to all allocs pulling same image
2018-05-07 12:17:34 -04:00
Nick Ethier
e35948ab91
client/driver: add image pull progress monitoring
2018-05-07 12:17:38 -04:00
Michael Schurter
0d534d30d6
Merge pull request #4251 from hashicorp/f-grpc-checks
...
Support Consul gRPC Health Checks
2018-05-04 14:55:16 -07:00
Michael Schurter
f6a4713141
consul: make grpc checks more like http checks
2018-05-04 11:08:11 -07:00
Michael Schurter
382caec1e1
consul: initial grpc implementation
...
Needs to be more like http.
2018-05-04 11:08:11 -07:00
Jesus Vazquez
08a390448b
Update counter driver.docker.oom labels
2018-05-04 14:02:34 +08:00
Jesus Vazquez
4f6db56283
Initialize dockerhandle with jobname, taskgroupname, taskname and allocid
2018-05-04 14:02:19 +08:00
Jesus Vazquez
127b764dfb
Add Job, taskgroupname, taskname, and allocid to the DockerHandle struct
2018-05-04 14:01:26 +08:00
Jesus Vazquez
fd1ff1a0cf
Run goimports
2018-05-04 13:46:36 +08:00
Jesus Vazquez
5dd4059527
Add driver.docker counter metric for OOM Killer events
2018-05-04 13:46:36 +08:00
Michael Schurter
526af6a246
framer: fix early exit/truncation in framer
2018-05-02 10:46:16 -07:00
Michael Schurter
f1a6aa103a
framer: fix race and remove unused error var
...
In the old code `sending` in the `send()` method shared the Data slice's
underlying backing array with its caller. Clearing StreamFrame.Data
didn't break the reference from the sent frame to the StreamFramer's
data slice.
2018-05-02 10:46:16 -07:00
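The aliasing problem described in the framer commit above is a general property of Go slices. The sketch below shows it in isolation; the `frame` type and both helper functions are hypothetical and are not taken from the StreamFramer code.

```go
package main

// frame is a hypothetical stand-in for a stream frame carrying a payload.
type frame struct {
	Data []byte
}

// shareData builds a frame whose Data still points at buf's backing array,
// so later writes through buf are visible in the frame.
func shareData(buf []byte) frame {
	return frame{Data: buf}
}

// copyData builds a frame with its own copy of the payload, which breaks the
// reference to the caller's backing array.
func copyData(buf []byte) frame {
	out := make([]byte, len(buf))
	copy(out, buf)
	return frame{Data: out}
}
```

Resetting the caller's slice with `buf = buf[:0]` and appending again reuses the same backing array, so a frame built with `shareData` observes the new bytes while one built with `copyData` does not.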
Michael Schurter
7360fe3a6d
client: squelch errors on cleanly closed pipes
2018-05-02 10:46:16 -07:00
Michael Schurter
ffff97e25f
client: don't spin on read errors
2018-05-02 10:46:16 -07:00
Michael Schurter
5ef0a82e6e
client: reset encoders between uses
...
According to go/codec's docs, Reset(...) should be called on
Decoders/Encoders before reuse:
https://godoc.org/github.com/ugorji/go/codec
I could find no evidence that *not* calling Reset() caused bugs, but
might as well do what the docs say?
2018-05-02 10:46:16 -07:00
Alex Dadgar
de4af37249
version bump and remove generated
2018-04-27 11:10:00 -07:00
Alex Dadgar
845a43864a
generated files
2018-04-27 10:45:40 -07:00
Alex Dadgar
35e06ddb31
Remove generated and version bump
2018-04-26 16:49:19 -07:00
Alex Dadgar
43192cefae
generated files
2018-04-26 16:28:58 -07:00
Michael Schurter
0e602d4779
Merge pull request #4188 from hashicorp/f-rkt-stats
...
rkt: create parent cgroup to enable stats
2018-04-24 14:54:36 -07:00
Michael Schurter
d687761ebf
rkt: test Stats() and always run tests
...
Remove the NOMAD_TEST_RKT flag as a guard for rkt tests. Still require
Linux, root, and rkt to be installed. Only check for rkt installation
once in hopes of speeding up rkt tests a bit.
2018-04-24 11:05:42 -07:00
Javier Palomo Almena
3e6c01ffa1
docker tests: Fix usage of NewDriverContext
2018-04-23 22:51:06 +02:00
Javier Palomo Almena
74d3c5df07
DriverContext: Add the TaskGroup and the Job name
...
Adding these fields to the DriverContext object will allow us to pass
them to the drivers.
A use case for this would be to emit tagged metrics in the drivers
that contain all relevant information:
- Job
- TaskGroup
- Task
- ...
Ref: https://github.com/hashicorp/nomad/pull/4185
2018-04-23 00:15:29 +02:00
Michael Schurter
4cee6cca6c
rkt: create parent cgroup to enable stats
...
Having the Nomad executor create parent cgroups that rkt is launched
within allows the stats collection code used for the exec driver to Just
Work. The only downside is that now the Nomad executor's resource
utilization counts against the cgroups resource limits just as it does
for the exec driver.
2018-04-19 15:14:56 -07:00
Michael Schurter
1a85d0c990
run goimports
2018-04-19 11:16:28 -07:00
Michael Schurter
d77c265d1f
Merge pull request #4168 from ninoles/b-2117-windows-group-process
...
B 2117 windows group process
2018-04-19 11:10:51 -07:00
Michael Schurter
fdbcbd4e5b
Merge pull request #4058 from hashicorp/f-mock-by-default
...
[Post-0.8] test: build with mock_driver by default
2018-04-18 15:57:00 -07:00
Michael Schurter
d3650fb2cd
test: build with mock_driver by default
...
`make release` and `make prerelease` set a `release` tag to disable
enabling the `mock_driver`
2018-04-18 14:45:33 -07:00
Michael Schurter
a991923389
tests: fix race in alloc_runner_test.go
...
I could not reproduce the failure locally even with `stress -cpu ...`
eating all the cpu it could on my machine.
But I think the race was in one of two places:
* The task could restart which could create new events
* I think there could be a race between the updater's version of events
and alloc runners as updates are async
I fixed both. Here's hoping that fixes this flaky test.
2018-04-17 17:14:59 -07:00
Fabien Ninoles
c81bec48c9
Merge branch 'master' into b-2117-windows-group-process
2018-04-17 13:47:25 -04:00
Fabien Ninoles
35cf641416
Update based on PR request.
2018-04-17 13:43:04 -04:00
Alex Dadgar
c4ad76091d
Merge pull request #4166 from hashicorp/b-panic-fix-update
...
Fixes races accessing node and updating it during fingerprinting
2018-04-17 10:02:19 -07:00
Chelsea Holland Komlo
9b8a079558
fix up comments
2018-04-17 11:53:08 -04:00
Alex Dadgar
9d612c8cb0
Cleanup
2018-04-16 15:48:34 -07:00
Alex Dadgar
32adaf9dfc
Copy the config given to the alloc runner
2018-04-16 15:45:52 -07:00
Alex Dadgar
3ff2d4d795
fix race node access
2018-04-16 15:45:51 -07:00
Alex Dadgar
4f2a7b6949
Fix copying drivers
2018-04-16 15:45:51 -07:00
Alex Dadgar
0b799822ff
Operate on copy
2018-04-16 15:45:49 -07:00
Fabien Ninoles
27cf4995ce
- Clean up for windows compilation.
...
- Set CREATE_NEW_PROCESS_GROUP for Windows subprocess.
- Ensure we only kill the processes that actually need to be killed.
2018-04-14 13:58:42 -04:00
Michael Schurter
3836b8a335
Merge pull request #3572 from emate/master
...
Create new process group on process startup.
2018-04-13 11:56:38 -07:00
Alex Dadgar
adaf4fa7e0
Remove generated structs
2018-04-12 16:35:31 -07:00
Alex Dadgar
663c4d0433
Version bump and generated files
2018-04-12 16:21:50 -07:00
Alex Dadgar
ff1a1a63e8
Move where attribute for driver detection is set
2018-04-12 15:50:25 -07:00
Chelsea Holland Komlo
5291788b40
delete driver name from only health check attributes
2018-04-12 18:24:41 -04:00
Alex Dadgar
3d53d380f7
Fix tests
2018-04-12 14:29:30 -07:00
Alex Dadgar
f24ce2c50c
Driver health detection cleanups
...
This PR does:
1. Health message based on detection has format "Driver XXX detected"
and "Driver XXX not detected"
2. Set initial health description based on detection status and don't
wait for the first health check.
3. Combine updating attributes on the node, fingerprint and health
checking update for drivers into a single call back.
4. Condensed driver info in `node status` only shows detected drivers
and make the output less wide by removing spaces.
2018-04-12 12:46:40 -07:00
Charlie Voiselle
ba88f00ccb
Changed "til" to "until"
...
Should be "till" or "until"; chose "until" because it is unambiguous as to meaning.
2018-04-11 12:36:28 -05:00
Andrei Burd
502d17fa90
Added node class to tagged metrics
2018-04-11 12:20:59 +03:00
Chelsea Komlo
eb5aac16e6
Merge pull request #4111 from hashicorp/b-undetected-set-health-to-false
...
Immediately set driver health status to false when driver moves to undetected
2018-04-10 18:30:31 -04:00
Chelsea Holland Komlo
d58b3e473c
update comment for when the fingerprinter setting health status
2018-04-10 16:53:00 -04:00
Chelsea Holland Komlo
f7ef13cc64
fingerprinter should set health check status if health check is not periodic
2018-04-10 15:29:51 -04:00
Chelsea Holland Komlo
ede4f518bd
add setters for access to the fingerprint manager's node
...
refactor extracting driver info
2018-04-10 15:29:51 -04:00
Chelsea Holland Komlo
f479da19f5
guard against overwriting health status
2018-04-10 15:29:51 -04:00
Chelsea Holland Komlo
ece1618815
immediately set healthy to false when driver moves to undetected
2018-04-10 15:29:51 -04:00
Alex Dadgar
3d367d6fd7
Fix client uptime metric missing client prefix
2018-04-10 10:39:36 -07:00
Seth Vargo
df4fe7e76c
Set user-agent when talking to GCE metadata
2018-04-10 10:36:46 -04:00
Chelsea Komlo
d3bd8fb96e
Merge pull request #4109 from hashicorp/f-shorten-docker-health-timeout
...
Shorten docker health timeout
2018-04-09 15:38:39 -04:00
Chelsea Holland Komlo
ea4b65dd41
only initialize docker clients if they are nil
2018-04-09 14:13:07 -04:00
Chelsea Holland Komlo
288c7a33a1
refactoring simplification from code review
2018-04-09 10:34:17 -04:00
Chelsea Holland Komlo
6e3b056c37
only run health check if driver moves from undetected to detected
2018-04-09 10:10:43 -04:00
Alex Dadgar
ae1f76477e
Start rebalance after discovering new servers
2018-04-05 15:41:59 -07:00
Alex Dadgar
929b6823a3
Merge pull request #4106 from hashicorp/b-servers
...
Improved Client handling of failed RPCs
2018-04-05 13:48:50 -07:00
Alex Dadgar
be2513e0f9
more jitter
2018-04-05 13:48:33 -07:00
Chelsea Holland Komlo
d3637825ef
group similar functions; update comments
...
health check timeout should be 1 minute
2018-04-05 16:19:02 -04:00
Chelsea Holland Komlo
e8743f1f7b
remove do once block when creating a new docker client
...
only set cached connections upon no error
2018-04-05 16:19:02 -04:00
Chelsea Holland Komlo
d0d793fc23
use client with shorter timeouts for health checks
2018-04-05 16:19:02 -04:00
Chelsea Holland Komlo
5d1b2b77cb
refactor docker clients method to be able to extend to creating new clients
2018-04-05 16:19:02 -04:00
Alex Dadgar
bd3345942c
Handle no leader and faster retries near limit
...
Handle the ErrNoLeader case and apply slower retries. Also when we have
missed the heartbeat retry aggressively, backing off after we have
missed for more than 30 seconds.
2018-04-05 11:22:47 -07:00
Alex Dadgar
279b5c22e5
Scale heartbeat retrying based on remaining heartbeat time
2018-04-05 10:58:13 -07:00
Alex Dadgar
7941f4eb2d
Fire retry only when consul discovers new servers
2018-04-05 10:40:17 -07:00
Preetha
6254d75eee
Merge pull request #4101 from hashicorp/b-rescheduling-edge-fixes
...
Fixes edge cases around timing/ task finish time being set more than once
2018-04-04 16:18:21 -05:00
Preetha Appan
12ba4c45da
remove outdated commented out test code
2018-04-04 15:03:24 -05:00
Preetha Appan
6363a6fb4d
Remove old comment
2018-04-04 15:01:48 -05:00
Preetha Appan
5e4525bd30
Moves setting finishedAt to the right place and adds two unit tests.
2018-04-04 14:38:15 -05:00
Alex Dadgar
86c32358d4
Spelling error
2018-04-03 18:30:01 -07:00
Alex Dadgar
01a6beafbf
RPC Retry Watcher
2018-04-03 18:05:28 -07:00
Preetha Appan
e6bbce3fa0
Add comment
2018-04-03 19:49:03 -05:00
Alex Dadgar
ec844f19d9
randomize servers
2018-04-03 17:46:13 -07:00
Preetha Appan
00537c739b
Fixes edge cases around timing and task finish time being set more than once
2018-04-03 16:34:59 -05:00
Alex Dadgar
58a3ec3fb2
Improve Vault error handling
2018-04-03 14:29:22 -07:00
Alex Dadgar
86f9044676
remove generated files
2018-03-30 16:52:49 -07:00
Alex Dadgar
af81349dbe
Generated files
2018-03-30 16:14:40 -07:00
Michael Schurter
257ba5937d
test: don't rely on alloc runner update count
...
We were incorrectly relying on the count of alloc updates in a number of
tests. Since alloc updates are async, their number is non-deterministic
and largely meaningless.
This should fix quite a few flaky tests in Travis and prevent future
mistaken assumptions in tests.
2018-03-30 09:34:33 -07:00
Michael Schurter
62e9553333
Merge pull request #4069 from hashicorp/f-hashealth
...
add HasHealth helper for nil checks
2018-03-29 17:03:20 -07:00
Alex Dadgar
beee130a6e
Always capture the finish time
2018-03-29 11:27:22 -07:00
Michael Schurter
91b5bb58d9
add HasHealth helper for nil checks
...
We performed the DeploymentStatus nil checks a couple different ways, so
hopefully this helper will consolidate them and make it more clear what
the code is doing.
2018-03-29 09:29:19 -07:00
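A nil-safe helper of the kind this commit describes might look like the following sketch. The `deploymentStatus` type and its single field are simplified, hypothetical stand-ins rather than the real struct.

```go
package main

// deploymentStatus is a hypothetical, simplified stand-in for an allocation's
// deployment status. Healthy stays nil until health has been determined.
type deploymentStatus struct {
	Healthy *bool
}

// HasHealth reports whether a health value has been set. It is safe to call
// on a nil receiver, so callers do not need their own nil checks.
func (d *deploymentStatus) HasHealth() bool {
	return d != nil && d.Healthy != nil
}
```

Because Go allows methods on nil pointer receivers, call sites can write `status.HasHealth()` without first checking `status != nil`.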
Chelsea Komlo
4338360da9
Merge pull request #4065 from hashicorp/emit-node-event-on-first-health-change
...
Emit first node event after initialization on health status change
2018-03-29 11:23:25 -04:00
Chelsea Holland Komlo
2174ede6b9
add clarifying comment
2018-03-29 10:58:39 -04:00
Michael Schurter
3a79c32677
Merge pull request #4059 from hashicorp/b-drain-health-svc-only
...
only service allocs should have health watched
2018-03-28 16:49:22 -07:00
Michael Schurter
5eb0cb7176
only service allocs should have health watched
2018-03-28 16:20:11 -07:00
Chelsea Holland Komlo
e3319afee1
emit first node event
2018-03-28 17:26:53 -04:00
Chelsea Komlo
7812ac5abf
Merge pull request #4057 from hashicorp/specify-docker-msg
...
Specify docker name in driver health messages
2018-03-28 13:32:36 -04:00
Preetha
177d2d6010
Merge pull request #4052 from hashicorp/f-specify-total-memory
...
Allow to specify total memory on agent configuration
2018-03-28 12:28:41 -05:00
Chelsea Holland Komlo
efc03e252c
specify driver health messages
2018-03-28 11:35:21 -04:00
Preetha Appan
329428b49f
Code review feedback and unit test
2018-03-28 10:07:15 -05:00
Charlie Voiselle
ea10588227
rkt: logging enhancements ( #4044 )
...
* Added extra debug logging; extended timeout; added jitter.
* small log changes
* increase timeout
* remove unnecessary uuid
2018-03-27 17:30:06 -07:00
Michael Schurter
fcaee471a0
client: always mark exited sys/svc allocs as failed
...
When restarts.attempts=0 was set in a jobspec a system or service alloc
that exited with 0 status would be marked as `completed` instead of
`failed`. Since system and service jobs are intended to run until
stopped or updated, they should always be marked as failed when they
exit even in cases where the exit code is 0.
2018-03-27 14:30:19 -07:00
Mildred Ki'Lya
1017cbe8ab
Allow to specify total memory on agent configuration
...
Allow to set the total memory of an agent in its configuration file. This
can be used in case the automatic detection doesn't work or in specific
environments when memory overcommit (using swap for example) can be
desirable.
2018-03-27 15:46:18 -05:00
Chelsea Holland Komlo
003bc209b9
use time.Time for node events for compatibility
2018-03-27 15:43:57 -04:00
Alex Dadgar
432784dae3
Fix alloc watcher snapshot streaming
2018-03-27 11:14:53 -07:00
Alex Dadgar
05449fea09
drop stats fetching log
2018-03-23 12:01:50 -07:00
Chelsea Komlo
5f0c382021
Merge pull request #4030 from hashicorp/health-check-ux
...
UX improvements to driver health checks
2018-03-23 09:46:50 -04:00
Alex Dadgar
da27fc3880
Driver Info output
2018-03-22 17:18:32 -07:00
Chelsea Holland Komlo
e9005d8cfb
ux improvements to driver health checks
2018-03-22 18:38:29 -04:00
Michael Schurter
a318684738
Merge pull request #4022 from hashicorp/f-more-executor-logging
...
executor: increase level for helpful log lines
2018-03-22 15:21:20 -07:00
Michael Schurter
a4f346abeb
remove spurious TODOs and FIXMEs
2018-03-21 16:55:22 -07:00
Michael Schurter
8b346c6176
test: try to prevent flakiness on travis
2018-03-21 16:51:45 -07:00
Michael Schurter
1b7ac447e9
alloc_runner: watch health for deployed batch jobs
2018-03-21 16:51:45 -07:00
Michael Schurter
62960ed7bd
client: don't monitor health of non-service jobs
...
Also fix system job draining; won't work without deadline fixes
2018-03-21 16:51:44 -07:00
Alex Dadgar
a37329189a
Improve DeadlineTime helper
2018-03-21 16:51:44 -07:00
Alex Dadgar
db4a634072
RPC, FSM, State Store for marking DesiredTransition
...
fix build tag
2018-03-21 16:49:48 -07:00
Michael Schurter
bb0ff44fb4
mock_driver: improve Kill() logging
2018-03-21 16:49:48 -07:00
Michael Schurter
c0542474db
drain: initial drainv2 structs and impl
2018-03-21 16:49:48 -07:00
Chelsea Holland Komlo
f329e45e03
always set initial health status for every driver
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
bbaffe3eca
set driver to unhealthy once if it cannot be detected in periodic check
2018-03-21 15:15:26 -04:00
Alex Dadgar
5df4b3728d
Docker driver doesn't return errors but injects into the DriverInfo
2018-03-21 15:15:26 -04:00
Alex Dadgar
4365bb7f59
Only run health check if driver is detected
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
f801709a0a
fix issue when updating node events
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
285729aee2
function rename and re-arrange functions in fingerprint_manager
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
60f12d206f
improve comments; update watchDriver
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
739784736a
remove unused function
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
d92703617c
simplify logic
...
bump log level
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
86b7b3d2d9
fix up health check logic comparison; add node events to client driver checks
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
53a5bc2bb3
Code review feedback
2018-03-21 15:15:26 -04:00
Alex Dadgar
34dc58421c
notes from walk through
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
44b6951dda
improve tests
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
d740a6a46e
refresh driver information for non-health checking drivers periodically
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
d8f68e5ef8
fix up codereview feedback
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
d5f6c940c4
fix up racy tests
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
0425be8f48
updating comments; locking concurrent node access
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
c50d02ae93
go style; update comments
2018-03-21 15:15:25 -04:00
Chelsea Holland Komlo
3aa726baab
fix scheduler driver name; create node structs file
2018-03-21 15:15:25 -04:00
Chelsea Holland Komlo
3cba95e8a7
allow nomad to schedule based on the status of a client driver health check
...
Slight updates for go style
2018-03-21 15:15:25 -04:00
Chelsea Holland Komlo
0bde357731
add concept of health checks to fingerprinters and nodes
...
fix up feedback from code review
add driver info for all drivers to node
2018-03-21 15:15:25 -04:00
Michael Schurter
1022170bf3
executor: increase level for helpful log lines
...
Should help with debugging issues like #3971
2018-03-21 11:53:58 -07:00
Marcin Matlaszek
6019a88824
Make raw_exec processes cleanup function more precise.
2018-03-20 13:40:21 +01:00
Marcin Matlaszek
bb36c122e2
Fix errors when trying to kill whole process group.
2018-03-20 13:40:21 +01:00
Marcin Matlaszek
86d650d7b0
Make starting & cleaning process group Windows compatible.
2018-03-20 13:40:21 +01:00
Marcin Matlaszek
79c139f2ef
Create new process group on process startup.
...
Clean up by sending SIGKILL to the whole process group.
2018-03-20 13:40:21 +01:00
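On Unix-like systems, the process-group approach named in this commit can be sketched as below. This is an illustration under stated assumptions, not the executor's code: `startInGroup` and `killGroup` are hypothetical names, and the sketch does not cover Windows, which needs a different mechanism.

```go
package main

import (
	"os/exec"
	"syscall"
)

// startInGroup starts the command in a new process group whose ID equals the
// child's PID (Unix only).
func startInGroup(name string, args ...string) (*exec.Cmd, error) {
	cmd := exec.Command(name, args...)
	cmd.SysProcAttr = &syscall.SysProcAttr{Setpgid: true}
	if err := cmd.Start(); err != nil {
		return nil, err
	}
	return cmd, nil
}

// killGroup sends SIGKILL to every process in the child's process group.
// A negative PID tells kill(2) to signal the whole group.
func killGroup(cmd *exec.Cmd) error {
	return syscall.Kill(-cmd.Process.Pid, syscall.SIGKILL)
}
```

Signalling the group rather than the single PID also reaches any grandchildren that stayed in the same group. The caller should still call `cmd.Wait()` afterwards to reap the child.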
Michael Schurter
1044bc0feb
Merge pull request #3984 from hashicorp/f-loosen-consul-skipverify
...
Replace Consul TLSSkipVerify handling
2018-03-16 11:21:28 -07:00
Michael Schurter
32ee5e0d53
Merge pull request #3990 from hashicorp/f-rkt-groups
...
rkt: allow specifying --group
2018-03-16 11:19:53 -07:00
Michael Schurter
bd78cfb039
rkt: allow specifying --group
2018-03-16 11:08:22 -07:00
Michael Schurter
fb10ec9c01
docker: make volume errors recoverable
...
The interface+mock just to test this one little error handling may seem
like overkill but there was just no other way to write an automated test
around this logic as there's no way to simulate this error with stock
Docker.
2018-03-15 17:52:43 -07:00
Michael Schurter
0971114f0c
Replace Consul TLSSkipVerify handling
...
Instead of checking Consul's version on startup to see if it supports
TLSSkipVerify, assume that it does and only log in the job service
handler if we discover Consul does not support TLSSkipVerify.
The old code would break TLSSkipVerify support if Nomad started before
Consul (such as on system boot) as TLSSkipVerify would default to false
if Consul wasn't running. Since TLSSkipVerify has been supported since
Consul 0.7.2, it's safe to relax our handling.
2018-03-14 17:43:06 -07:00
Preetha Appan
3c38eededd
Fix spelling in comment
2018-03-14 15:54:25 -05:00
Alex Dadgar
bef4a8ee09
fix clearing node events
2018-03-14 09:48:59 -07:00
Chelsea Komlo
810eedfa2a
Merge pull request #3945 from hashicorp/f-add-node-events
...
Add node events
2018-03-14 08:42:55 -04:00
Preetha
360d6e5a92
Merge pull request #3968 from hashicorp/f-nicer-vault-error
...
Make server side error messages from vault clearer
2018-03-13 20:49:39 -05:00
Alex Dadgar
de6ebb6e6c
small cleanup
2018-03-13 18:08:22 -07:00
Chelsea Holland Komlo
b41501e442
code review feedback
2018-03-13 18:08:21 -07:00
Chelsea Holland Komlo
1488b076d1
code review feedback
2018-03-13 18:08:21 -07:00
Chelsea Holland Komlo
a8655320fd
fix up go check warnings
2018-03-13 18:08:21 -07:00
Chelsea Holland Komlo
0934769b04
add client side emitting of node events
...
Changelog
2018-03-13 18:08:21 -07:00
Preetha Appan
914eaed64f
Address some code review comments
2018-03-13 18:19:16 -05:00
Preetha Appan
09c231ce43
Return the err from server correctly
2018-03-13 18:10:14 -05:00
Preetha Appan
9618f52746
Remove error wrapping and make vault connection server side errors clearer.
2018-03-13 17:09:03 -05:00
Michael Schurter
79df90acb0
Merge pull request #3958 from simplesurance/swappiness
...
fix: disable swap for executor_linux allocations
2018-03-13 10:10:22 -07:00
Fabian Holler
e6af051c93
fix: disable swap for executor_linux allocations
...
A comment in the nomad source code states that swapping for
executor_linux allocations is disabled but it wasn't.
Nomad wrote -1 to the memsw.limit_in_bytes cgroup file to disable
swapping.
This has the following problems:
1.) Writing -1 to the file does not disable swapping. It sets
the limit for memory and swap to unlimited.
2.) On common Linux distributions like Ubuntu 16.04 LTS the
memsw.limit_in_bytes cgroup file does not exist by default.
The memsw.limit_in_bytes file only exists if the Linux kernel is
built with CONFIG_MEMCG_SWAP=yes and either
CONFIG_MEMCG_SWAP_ENABLED=yes or when the kernel parameter
swapaccount=1 is passed during boot.
Most Linux distributions disable swap accounting by default because
of higher memory usage.
Nomad silently ignores a failure to write to the memsw.limit_in_bytes
file. The allocation succeeds and no message is logged to notify the
user.
To ensure that disabling swap works on common Linux kernels, disable
swapping by writing 0 to the memory.swappiness file.
Using the memory.swappiness file only requires that the kernel is
compiled with CONFIG_MEMCG=yes. This is the default in common Linux
kernels.
2018-03-13 10:52:50 +01:00
Alex Dadgar
4844317cc2
Merge pull request #3890 from hashicorp/b-heartbeat
...
Heartbeat improvements and handling failures during establishing leadership
2018-03-12 14:41:59 -07:00
Michael Schurter
7dd7fbcda2
non-Existent -> nonexistent
...
Reverting from #3963
https://www.merriam-webster.com/dictionary/existent
2018-03-12 11:59:33 -07:00
Josh Soref
18c5659474
spelling: version
2018-03-11 19:13:25 +00:00
Josh Soref
6222bd564e
spelling: verify
2018-03-11 19:13:32 +00:00
Josh Soref
1359fd2c3d
spelling: unexpected
2018-03-11 19:08:07 +00:00
Josh Soref
173ce63fe9
spelling: transition
2018-03-11 19:06:05 +00:00
Josh Soref
782c704de6
spelling: thresholds
2018-03-11 19:03:47 +00:00
Josh Soref
ac6d3767da
spelling: terminated
2018-03-11 19:01:49 +00:00
Josh Soref
2dda6abab9
spelling: templates
2018-03-11 19:01:39 +00:00
Josh Soref
8978caea28
spelling: shutdown
2018-03-11 18:55:49 +00:00
Josh Soref
8d191c9273
spelling: severity
2018-03-11 18:53:52 +00:00
Josh Soref
a79eccaa58
spelling: service
2018-03-11 18:53:47 +00:00
Josh Soref
8149694f3a
spelling: server
2018-03-11 18:55:30 +00:00
Josh Soref
3787d8141e
spelling: serialize
2018-03-11 18:53:39 +00:00
Josh Soref
e37626561c
spelling: semantics
2018-03-11 19:00:26 +00:00
Josh Soref
e4639ac62f
spelling: secrets
2018-03-11 18:53:26 +00:00
Josh Soref
cec45c6bc8
spelling: safety
2018-03-11 18:52:54 +00:00
Josh Soref
de9d0c7180
spelling: retrieved
2018-03-11 18:51:40 +00:00
Josh Soref
e949d23e1b
spelling: resource
2018-03-11 18:51:03 +00:00
Josh Soref
82221f9a2b
spelling: represents
2018-03-11 18:42:29 +00:00
Josh Soref
1c3b60ae70
spelling: replace
2018-03-11 18:41:53 +00:00
Josh Soref
b47ab9ab8c
spelling: removes
2018-03-11 18:41:43 +00:00
Josh Soref
db166c6cf6
spelling: remnants
2018-03-11 18:41:26 +00:00
Josh Soref
258d76ec13
spelling: registry
2018-03-11 18:41:13 +00:00
Josh Soref
7ad77f568b
spelling: purposes
2018-03-11 18:39:35 +00:00
Josh Soref
6fa892a463
spelling: propagated
2018-03-11 18:39:26 +00:00
Josh Soref
1a8204fa11
spelling: previous
2018-03-11 18:38:23 +00:00
Josh Soref
f764e5552a
spelling: periodically
2018-03-11 18:36:59 +00:00
Josh Soref
96e47bd4c1
spelling: parallelism
2018-03-11 18:35:54 +00:00
Josh Soref
3c1ce6d16d
spelling: otherwise
2018-03-11 18:34:27 +00:00
Josh Soref
96dba3e267
spelling: mount
2018-03-11 18:27:18 +00:00
Josh Soref
13e5fb8221
spelling: malicious
2018-03-11 18:26:25 +00:00
Josh Soref
1ef6d6319e
spelling: labels
2018-03-11 18:21:44 +00:00
Josh Soref
b6ec60fb5f
spelling: isolation
2018-03-11 18:19:02 +00:00
Josh Soref
337ac13f0a
spelling: interpolation
2018-03-11 18:16:36 +00:00
Josh Soref
75d1240446
spelling: interface
2018-03-11 18:15:37 +00:00
Josh Soref
c1a0ae3161
spelling: inspect
2018-03-11 18:15:27 +00:00
Josh Soref
2a1cf2f216
spelling: initialization
2018-03-11 18:18:37 +00:00
Josh Soref
b293b48287
spelling: idempotent
2018-03-11 18:14:50 +00:00
Josh Soref
52b83328fc
spelling: heartbeating
2018-03-11 18:12:19 +00:00
Josh Soref
3ad579930e
spelling: fingerprint
2018-03-11 18:07:37 +00:00
Josh Soref
7f6e4012a0
spelling: existent
2018-03-11 18:30:37 +00:00
Josh Soref
7cd95f6eb3
spelling: executor
2018-03-11 18:05:31 +00:00
Josh Soref
b9ce8b9e37
spelling: each
2018-03-11 17:56:19 +00:00
Josh Soref
0fc23b0ba3
spelling: down
2018-03-11 17:55:47 +00:00
Josh Soref
e8478c4065
spelling: documentation
2018-03-11 17:55:21 +00:00
Josh Soref
4241ffc5ab
spelling: disable
2018-03-11 17:55:12 +00:00
Josh Soref
858b9e809f
spelling: directory
2018-03-11 17:55:06 +00:00
Josh Soref
09970343b5
spelling: destruction
2018-03-11 17:54:39 +00:00
Josh Soref
2f135f0ed7
spelling: destroy
2018-03-11 17:54:13 +00:00
Josh Soref
97dc9a00c0
spelling: default
2018-03-11 17:52:58 +00:00
Josh Soref
aaa6e104ed
spelling: could
2018-03-11 17:51:47 +00:00
Josh Soref
c9b86bbc2f
spelling: controls
2018-03-11 17:50:39 +00:00
Josh Soref
f2a7c95379
spelling: constraints
2018-03-11 17:50:28 +00:00
Josh Soref
cb1303e47a
spelling: conjunction
2018-03-11 17:48:37 +00:00
Josh Soref
42fa13bbc6
spelling: cancelled
2018-03-11 17:45:47 +00:00
Josh Soref
7077386916
spelling: cancelable
2018-03-11 17:45:34 +00:00
Josh Soref
a70fe97556
spelling: assert
2018-03-11 17:41:33 +00:00
Josh Soref
58b794875f
spelling: artifact
2018-03-11 17:41:02 +00:00
Josh Soref
e78cf9c81a
spelling: already
2018-03-11 17:39:04 +00:00
Josh Soref
b8b46d3f74
spelling: allocation
2018-03-11 17:37:22 +00:00
Josh Soref
e87b0a4d86
spelling: alloc
2018-03-11 17:36:34 +00:00
Josh Soref
b67449796a
spelling: added
2018-03-11 17:34:28 +00:00
Chelsea Komlo
bd88877249
Merge pull request #3909 from hashicorp/b-node-attributes-concurrent-access
...
Fingerprinters accessing node information should be thread safe
2018-03-06 11:57:46 -05:00
Chelsea Komlo
7c7e2f4d0b
Merge pull request #3873 from hashicorp/r-edge-trigger-node-watcher
...
Edge trigger node updates
2018-03-01 15:18:59 -05:00
Chelsea Holland Komlo
122d1c4e4a
simplify retry logic
2018-03-01 09:48:26 -05:00
Michael Schurter
557a70f78d
Merge pull request #3917 from jaininshah9/master
...
changing the formula to correctly pass the CPUQuota to docker
2018-02-28 20:00:37 -08:00
Jainin Shah
39e1fc06e5
adding comments to the change
2018-02-28 16:19:51 -08:00
Preetha Appan
eaedffc7f7
Fix go vet errors
2018-02-28 12:21:27 -06:00
Chelsea Holland Komlo
355805db56
reset timer after updating node copy
2018-02-27 17:18:10 -05:00
Jainin Shah
6eb7da002f
changing the formula to correctly pass the CPUQuota to docker
2018-02-27 12:32:23 -08:00
Chelsea Holland Komlo
a72aaaf47f
add network resources equal method, use time ticker
...
remove impossible test case
2018-02-27 12:42:53 -05:00
Chelsea Holland Komlo
e736e31820
use time ticker, update how network resources are compared
2018-02-26 18:47:11 -05:00
Chelsea Holland Komlo
5059065b52
improved testing; node networks comparison
2018-02-26 15:55:38 -05:00
Chelsea Holland Komlo
b7bcd0b59f
fingerprinters accessing node information should be thread safe
2018-02-26 15:25:54 -05:00
Chelsea Holland Komlo
1f31b39fe8
code review fixups
2018-02-26 12:36:30 -05:00
Chelsea Holland Komlo
ed8c8afbcd
edge trigger node update
...
test update config copy trigger
2018-02-26 12:36:04 -05:00
Alex Dadgar
49a47483d1
Registering back to initializing
...
Fix a bug in which if the node attributes/meta changed, we would
re-register the node in status initializing. This would incorrectly
trigger the client to log that it missed its heartbeat.
It would change the status of the Node to initializing until the next
heartbeat occurred.
2018-02-16 17:49:31 -08:00
Alex Dadgar
eff4455c68
Fix original client server list behavior
2018-02-15 16:04:53 -08:00
Alex Dadgar
0ebf7f3b7f
remove tmp file
2018-02-15 15:51:27 -08:00
Alex Dadgar
f9cf642436
Client tls
2018-02-15 15:22:57 -08:00
Alex Dadgar
0e85ae77b4
fix flaky gc tests
2018-02-15 13:59:03 -08:00
Alex Dadgar
38b695b69c
feedback and rebasing
2018-02-15 13:59:03 -08:00
Alex Dadgar
9117ef4650
HTTP agent
2018-02-15 13:59:03 -08:00
Alex Dadgar
d7029965ca
Server side impl + touch ups
2018-02-15 13:59:02 -08:00
Alex Dadgar
ce0caccad2
client implementation of alloc gc and stats
2018-02-15 13:59:02 -08:00
Alex Dadgar
e685211892
Code review feedback
2018-02-15 13:59:02 -08:00
Alex Dadgar
a9c4f8a4c8
clarify force
2018-02-15 13:59:02 -08:00
Alex Dadgar
dc75501c69
Respond to comments
2018-02-15 13:59:02 -08:00
Alex Dadgar
cea77df6a7
Add Streaming RPC ack
...
This PR introduces an ack allowing the receiving end of the streaming
RPC to return any error that may have occurred during the establishment
of the streaming RPC.
2018-02-15 13:59:02 -08:00
Alex Dadgar
2f9d33f479
vet
2018-02-15 13:59:02 -08:00
Alex Dadgar
f5f43218f5
HTTP and tests
2018-02-15 13:59:02 -08:00
Alex Dadgar
6546b43a17
Client implementation of stream
2018-02-15 13:59:02 -08:00
Alex Dadgar
9a5569678c
Client Stat/List impl
2018-02-15 13:59:02 -08:00
Alex Dadgar
8854b35b34
Agent logs
2018-02-15 13:59:02 -08:00
Alex Dadgar
857b0ab6c7
client tests
2018-02-15 13:59:02 -08:00
Alex Dadgar
69def2ff22
Server tests of logs
2018-02-15 13:59:02 -08:00
Alex Dadgar
9479cb7f25
Remove logging
2018-02-15 13:59:01 -08:00
Alex Dadgar
14f57024b7
test stream framer
2018-02-15 13:59:01 -08:00
Alex Dadgar
ddd67f5f11
Server streaming
2018-02-15 13:59:01 -08:00
Alex Dadgar
ca9379be09
Logs over RPC w/ lots to touch up
2018-02-15 13:59:01 -08:00
Alex Dadgar
2c0ad26374
New RPC Modes and basic setup for streaming RPC handlers
2018-02-15 13:59:01 -08:00
Alex Dadgar
fea0e69d4f
wip fs endpoint
2018-02-15 13:59:01 -08:00
Alex Dadgar
b5037f20db
Remove circular dependency
2018-02-15 13:59:01 -08:00
Alex Dadgar
9bc75f0ad4
Fix manager tests and make testagent recover from port conflicts
2018-02-15 13:59:01 -08:00
Alex Dadgar
feb943c873
Fix lint/comments
2018-02-15 13:59:01 -08:00
Alex Dadgar
ac67da3b06
Unjankify the pkg
2018-02-15 13:59:01 -08:00
Alex Dadgar
3f1f8604bb
initial round of comment review
2018-02-15 13:59:01 -08:00
Alex Dadgar
e03b074650
Plumb config
2018-02-15 13:59:01 -08:00
Alex Dadgar
05c4fe8675
Change defaults for min use duration
2018-02-15 13:59:01 -08:00
Alex Dadgar
c8c1284bc3
SetServer command actually returns an error if given an invalid server
2018-02-15 13:59:01 -08:00
Alex Dadgar
3f786b904b
use server manager
2018-02-15 13:59:01 -08:00
Alex Dadgar
b24b05e025
Remove testing
2018-02-15 13:59:01 -08:00
Alex Dadgar
4e1cb1d96e
Test RPC from server
2018-02-15 13:59:00 -08:00
Alex Dadgar
6dd1c9f49d
Refactor
2018-02-15 13:59:00 -08:00
Alex Dadgar
a6dfffa4fa
Add testing interfaces
2018-02-15 13:59:00 -08:00
Alex Dadgar
d918f9bd5c
RPC Listener
2018-02-15 13:59:00 -08:00
Alex Dadgar
1472b943d6
Stats Endpoint
2018-02-15 13:59:00 -08:00
Chelsea Komlo
0c0b56a1a4
Merge pull request #3807 from hashicorp/f-client-add-fingerprint-manager
...
Add fingerprint manager to manage fingerprinting node
2018-02-13 11:22:50 -05:00
Chelsea Holland Komlo
b321287712
extract test helper
...
lock concurrent accesses to node
comment exported method
2018-02-12 18:30:10 -05:00