Chelsea Holland Komlo
e9005d8cfb
ux improvments to driver health checks
2018-03-22 18:38:29 -04:00
Michael Schurter
a318684738
Merge pull request #4022 from hashicorp/f-more-executor-logging
...
executor: increase level for helpful log lines
2018-03-22 15:21:20 -07:00
Michael Schurter
a4f346abeb
remove spurious TODOs and FIXMEs
2018-03-21 16:55:22 -07:00
Michael Schurter
8b346c6176
test: try to prevent flakiness on travis
2018-03-21 16:51:45 -07:00
Michael Schurter
1b7ac447e9
alloc_runner: watch health for deployed batch jobs
2018-03-21 16:51:45 -07:00
Michael Schurter
62960ed7bd
client: don't monitor health of non-service jobs
...
Also fix system job draining; won't work without deadline fixes
2018-03-21 16:51:44 -07:00
Alex Dadgar
a37329189a
Improve DeadlineTime helper
2018-03-21 16:51:44 -07:00
Alex Dadgar
db4a634072
RPC, FSM, State Store for marking DesiredTransistion
...
fix build tag
2018-03-21 16:49:48 -07:00
Michael Schurter
bb0ff44fb4
mock_driver: improve Kill() logging
2018-03-21 16:49:48 -07:00
Michael Schurter
c0542474db
drain: initial drainv2 structs and impl
2018-03-21 16:49:48 -07:00
Chelsea Holland Komlo
f329e45e03
always set initial health status for every driver
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
bbaffe3eca
set driver to unhealthy once if it cannot be detected in periodic check
2018-03-21 15:15:26 -04:00
Alex Dadgar
5df4b3728d
Docker driver doesn't return errors but injects into the DriverInfo
2018-03-21 15:15:26 -04:00
Alex Dadgar
4365bb7f59
Only run health check if driver is detected
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
f801709a0a
fix issue when updating node events
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
285729aee2
function rename and re-arrange functions in fingerprint_manager
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
60f12d206f
improve comments; update watchDriver
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
739784736a
remove unused function
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
d92703617c
simplify logic
...
bump log level
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
86b7b3d2d9
fix up health check logic comparison; add node events to client driver checks
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
53a5bc2bb3
Code review feedback
2018-03-21 15:15:26 -04:00
Alex Dadgar
34dc58421c
notes from walk through
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
44b6951dda
improve tests
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
d740a6a46e
refresh driver information for non-health checking drivers periodically
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
d8f68e5ef8
fix up codereview feedback
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
d5f6c940c4
fix up racy tests
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
0425be8f48
updating comments; locking concurrent node access
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
c50d02ae93
go style; update comments
2018-03-21 15:15:25 -04:00
Chelsea Holland Komlo
3aa726baab
fix scheduler driver name; create node structs file
2018-03-21 15:15:25 -04:00
Chelsea Holland Komlo
3cba95e8a7
allow nomad to schedule based on the status of a client driver health check
...
Slight updates for go style
2018-03-21 15:15:25 -04:00
Chelsea Holland Komlo
0bde357731
add concept of health checks to fingerprinters and nodes
...
fix up feedback from code review
add driver info for all drivers to node
2018-03-21 15:15:25 -04:00
Michael Schurter
1022170bf3
executor: increase level for helpful log lines
...
Should help with debugging issues like #3971
2018-03-21 11:53:58 -07:00
Marcin Matlaszek
6019a88824
Make raw_exec processes cleanup function more precise.
2018-03-20 13:40:21 +01:00
Marcin Matlaszek
bb36c122e2
Fix errors when trying to kill whole process group.
2018-03-20 13:40:21 +01:00
Marcin Matlaszek
86d650d7b0
Make starting & cleaning process group Windows compatible.
2018-03-20 13:40:21 +01:00
Marcin Matlaszek
79c139f2ef
Create new process group on process startup.
...
Clean up by sending SIGKILL to the whole process group.
2018-03-20 13:40:21 +01:00
Michael Schurter
1044bc0feb
Merge pull request #3984 from hashicorp/f-loosen-consul-skipverify
...
Replace Consul TLSSkipVerify handling
2018-03-16 11:21:28 -07:00
Michael Schurter
32ee5e0d53
Merge pull request #3990 from hashicorp/f-rkt-groups
...
rkt: allow specifying --group
2018-03-16 11:19:53 -07:00
Michael Schurter
bd78cfb039
rkt: allow specifying --group
2018-03-16 11:08:22 -07:00
Michael Schurter
fb10ec9c01
docker: make volume errors recoverable
...
The interface+mock just to test this one little error handling may seem
like overkill but there was just no other way to write an automated test
around this logic as there's no way to simluate this error with stock
Docker.
2018-03-15 17:52:43 -07:00
Michael Schurter
0971114f0c
Replace Consul TLSSkipVerify handling
...
Instead of checking Consul's version on startup to see if it supports
TLSSkipVerify, assume that it does and only log in the job service
handler if we discover Consul does not support TLSSkipVerify.
The old code would break TLSSkipVerify support if Nomad started before
Consul (such as on system boot) as TLSSkipVerify would default to false
if Consul wasn't running. Since TLSSkipVerify has been supported since
Consul 0.7.2, it's safe to relax our handling.
2018-03-14 17:43:06 -07:00
Preetha Appan
3c38eededd
Fix spelling in comment
2018-03-14 15:54:25 -05:00
Alex Dadgar
bef4a8ee09
fix clearing node events
2018-03-14 09:48:59 -07:00
Chelsea Komlo
810eedfa2a
Merge pull request #3945 from hashicorp/f-add-node-events
...
Add node events
2018-03-14 08:42:55 -04:00
Preetha
360d6e5a92
Merge pull request #3968 from hashicorp/f-nicer-vault-error
...
Make server side error messages from vault more clearer
2018-03-13 20:49:39 -05:00
Alex Dadgar
de6ebb6e6c
small cleanup
2018-03-13 18:08:22 -07:00
Chelsea Holland Komlo
b41501e442
code review feedback
2018-03-13 18:08:21 -07:00
Chelsea Holland Komlo
1488b076d1
code review feedback
2018-03-13 18:08:21 -07:00
Chelsea Holland Komlo
a8655320fd
fix up go check warnings
2018-03-13 18:08:21 -07:00
Chelsea Holland Komlo
0934769b04
add client side emitting of node events
...
Changelog
2018-03-13 18:08:21 -07:00
Preetha Appan
914eaed64f
Address some code review comments
2018-03-13 18:19:16 -05:00
Preetha Appan
09c231ce43
Return the err from server correctly
2018-03-13 18:10:14 -05:00
Preetha Appan
9618f52746
Remove error wrapping and make vault connection server side errors clearer.
2018-03-13 17:09:03 -05:00
Michael Schurter
79df90acb0
Merge pull request #3958 from simplesurance/swappiness
...
fix: disable swap for executor_linux allocations
2018-03-13 10:10:22 -07:00
Fabian Holler
e6af051c93
fix: disable swap for executor_linux allocations
...
A comment in the nomad source code states that swapping for
executor_linux allocations is disabled but it wasn't.
Nomad wrote -1 to the memsw.limit_in_bytes cgroup file to disable
swapping.
This has the following problems:
1.) Writing -1 to the file does not disable swapping. It sets
the limit for memory and swap to unlimited.
2.) On common Linux distributions like Ubuntu 16.04 LTS the
memsw.limit_in_bytes cgroup file does not exist by default.
The memsw.limit_in_bytes file only exist if the Linux kernel is
build with CONFIG_MEMCG_SWAP=yes and either
CONFIG_MEMCG_SWAP_ENABLED=yes or when the kernel parameter
swapaccount=1 is passed during boot.
Most Linux distributions disable swap accounting by default because
of higher memory usage.
Nomad silently ignores if writing to the memsw.limit_in_bytes file
fails. The allocation succeeds, no message is logged to notify the
user.
To ensure that disabling swap works on common Linux kernels, disable
swapping by writing 0 to the memory.swappiness file.
Using the memory.swappiness file only requires that the kernel is
compiled with CONFIG_MEMCG=yes. This is the default in common Linux
kernels.
2018-03-13 10:52:50 +01:00
Alex Dadgar
4844317cc2
Merge pull request #3890 from hashicorp/b-heartbeat
...
Heartbeat improvements and handling failures during establishing leadership
2018-03-12 14:41:59 -07:00
Michael Schurter
7dd7fbcda2
non-Existent -> nonexistent
...
Reverting from #3963
https://www.merriam-webster.com/dictionary/existent
2018-03-12 11:59:33 -07:00
Josh Soref
18c5659474
spelling: version
2018-03-11 19:13:25 +00:00
Josh Soref
6222bd564e
spelling: verify
2018-03-11 19:13:32 +00:00
Josh Soref
1359fd2c3d
spelling: unexpected
2018-03-11 19:08:07 +00:00
Josh Soref
173ce63fe9
spelling: transition
2018-03-11 19:06:05 +00:00
Josh Soref
782c704de6
spelling: thresholds
2018-03-11 19:03:47 +00:00
Josh Soref
ac6d3767da
spelling: terminated
2018-03-11 19:01:49 +00:00
Josh Soref
2dda6abab9
spelling: templates
2018-03-11 19:01:39 +00:00
Josh Soref
8978caea28
spelling: shutdown
2018-03-11 18:55:49 +00:00
Josh Soref
8d191c9273
spelling: severity
2018-03-11 18:53:52 +00:00
Josh Soref
a79eccaa58
spelling: service
2018-03-11 18:53:47 +00:00
Josh Soref
8149694f3a
spelling: server
2018-03-11 18:55:30 +00:00
Josh Soref
3787d8141e
spelling: serialize
2018-03-11 18:53:39 +00:00
Josh Soref
e37626561c
spelling: semantics
2018-03-11 19:00:26 +00:00
Josh Soref
e4639ac62f
spelling: secrets
2018-03-11 18:53:26 +00:00
Josh Soref
cec45c6bc8
spelling: safety
2018-03-11 18:52:54 +00:00
Josh Soref
de9d0c7180
spelling: retrieved
2018-03-11 18:51:40 +00:00
Josh Soref
e949d23e1b
spelling: resource
2018-03-11 18:51:03 +00:00
Josh Soref
82221f9a2b
spelling: represents
2018-03-11 18:42:29 +00:00
Josh Soref
1c3b60ae70
spelling: replace
2018-03-11 18:41:53 +00:00
Josh Soref
b47ab9ab8c
spelling: removes
2018-03-11 18:41:43 +00:00
Josh Soref
db166c6cf6
spelling: remnants
2018-03-11 18:41:26 +00:00
Josh Soref
258d76ec13
spelling: registry
2018-03-11 18:41:13 +00:00
Josh Soref
7ad77f568b
spelling: purposes
2018-03-11 18:39:35 +00:00
Josh Soref
6fa892a463
spelling: propagated
2018-03-11 18:39:26 +00:00
Josh Soref
1a8204fa11
spelling: previous
2018-03-11 18:38:23 +00:00
Josh Soref
f764e5552a
spelling: periodically
2018-03-11 18:36:59 +00:00
Josh Soref
96e47bd4c1
spelling: parallelism
2018-03-11 18:35:54 +00:00
Josh Soref
3c1ce6d16d
spelling: otherwise
2018-03-11 18:34:27 +00:00
Josh Soref
96dba3e267
spelling: mount
2018-03-11 18:27:18 +00:00
Josh Soref
13e5fb8221
spelling: malicious
2018-03-11 18:26:25 +00:00
Josh Soref
1ef6d6319e
spelling: labels
2018-03-11 18:21:44 +00:00
Josh Soref
b6ec60fb5f
spelling: isolation
2018-03-11 18:19:02 +00:00
Josh Soref
337ac13f0a
spelling: interpolation
2018-03-11 18:16:36 +00:00
Josh Soref
75d1240446
spelling: interface
2018-03-11 18:15:37 +00:00
Josh Soref
c1a0ae3161
spelling: inspect
2018-03-11 18:15:27 +00:00
Josh Soref
2a1cf2f216
spelling: initialization
2018-03-11 18:18:37 +00:00
Josh Soref
b293b48287
spelling: idempotent
2018-03-11 18:14:50 +00:00
Josh Soref
52b83328fc
spelling: heartbeating
2018-03-11 18:12:19 +00:00
Josh Soref
3ad579930e
spelling: fingerprint
2018-03-11 18:07:37 +00:00
Josh Soref
7f6e4012a0
spelling: existent
2018-03-11 18:30:37 +00:00
Josh Soref
7cd95f6eb3
spelling: executor
2018-03-11 18:05:31 +00:00
Josh Soref
b9ce8b9e37
spelling: each
2018-03-11 17:56:19 +00:00
Josh Soref
0fc23b0ba3
spelling: down
2018-03-11 17:55:47 +00:00
Josh Soref
e8478c4065
spelling: documentation
2018-03-11 17:55:21 +00:00
Josh Soref
4241ffc5ab
spelling: disable
2018-03-11 17:55:12 +00:00
Josh Soref
858b9e809f
spelling: directory
2018-03-11 17:55:06 +00:00
Josh Soref
09970343b5
spelling: destruction
2018-03-11 17:54:39 +00:00
Josh Soref
2f135f0ed7
spelling: destroy
2018-03-11 17:54:13 +00:00
Josh Soref
97dc9a00c0
spelling: default
2018-03-11 17:52:58 +00:00
Josh Soref
aaa6e104ed
spelling: could
2018-03-11 17:51:47 +00:00
Josh Soref
c9b86bbc2f
spelling: controls
2018-03-11 17:50:39 +00:00
Josh Soref
f2a7c95379
spelling: constraints
2018-03-11 17:50:28 +00:00
Josh Soref
cb1303e47a
spelling: conjunction
2018-03-11 17:48:37 +00:00
Josh Soref
42fa13bbc6
spelling: cancelled
2018-03-11 17:45:47 +00:00
Josh Soref
7077386916
spelling: cancelable
2018-03-11 17:45:34 +00:00
Josh Soref
a70fe97556
spelling: assert
2018-03-11 17:41:33 +00:00
Josh Soref
58b794875f
spelling: artifact
2018-03-11 17:41:02 +00:00
Josh Soref
e78cf9c81a
spelling: already
2018-03-11 17:39:04 +00:00
Josh Soref
b8b46d3f74
spelling: allocation
2018-03-11 17:37:22 +00:00
Josh Soref
e87b0a4d86
spelling: alloc
2018-03-11 17:36:34 +00:00
Josh Soref
b67449796a
spelling: added
2018-03-11 17:34:28 +00:00
Chelsea Komlo
bd88877249
Merge pull request #3909 from hashicorp/b-node-attributes-concurrent-access
...
Fingerprinters accessing node information should be thread safe
2018-03-06 11:57:46 -05:00
Chelsea Komlo
7c7e2f4d0b
Merge pull request #3873 from hashicorp/r-edge-trigger-node-watcher
...
Edge trigger node updates
2018-03-01 15:18:59 -05:00
Chelsea Holland Komlo
122d1c4e4a
simplify retry logic
2018-03-01 09:48:26 -05:00
Michael Schurter
557a70f78d
Merge pull request #3917 from jaininshah9/master
...
changing the formula to correctly pass the CPUQota to docker
2018-02-28 20:00:37 -08:00
Jainin Shah
39e1fc06e5
adding comments to the change
2018-02-28 16:19:51 -08:00
Preetha Appan
eaedffc7f7
Fix go vet errors
2018-02-28 12:21:27 -06:00
Chelsea Holland Komlo
355805db56
reset timer after updating node copy
2018-02-27 17:18:10 -05:00
Jainin Shah
6eb7da002f
changing the formula to correctly pass the CPUQota to docker
2018-02-27 12:32:23 -08:00
Chelsea Holland Komlo
a72aaaf47f
add network resources equal method, use time ticker
...
remove impossible test case
2018-02-27 12:42:53 -05:00
Chelsea Holland Komlo
e736e31820
use time ticker, update how network resources are compared
2018-02-26 18:47:11 -05:00
Chelsea Holland Komlo
5059065b52
improved testing; node networks comparison
2018-02-26 15:55:38 -05:00
Chelsea Holland Komlo
b7bcd0b59f
fingerprinters accessing node information should be thread safe
2018-02-26 15:25:54 -05:00
Chelsea Holland Komlo
1f31b39fe8
code review fixups
2018-02-26 12:36:30 -05:00
Chelsea Holland Komlo
ed8c8afbcd
edge trigger node update
...
test update config copy trigger
2018-02-26 12:36:04 -05:00
Alex Dadgar
49a47483d1
Registering back to initializing
...
Fix a bug in which if the node attributes/meta changed, we would
re-register the node in status initializing. This would incorrectly
trigger the client to log that it missed its heartbeat.
It would change the status of the Node to initializing until the next
heartbeat occured.
2018-02-16 17:49:31 -08:00
Alex Dadgar
eff4455c68
Fix original client server list behavior
2018-02-15 16:04:53 -08:00
Alex Dadgar
0ebf7f3b7f
remove tmp file
2018-02-15 15:51:27 -08:00
Alex Dadgar
f9cf642436
Client tls
2018-02-15 15:22:57 -08:00
Alex Dadgar
0e85ae77b4
fix flaky gc tests
2018-02-15 13:59:03 -08:00
Alex Dadgar
38b695b69c
feedback and rebasing
2018-02-15 13:59:03 -08:00
Alex Dadgar
9117ef4650
HTTP agent
2018-02-15 13:59:03 -08:00
Alex Dadgar
d7029965ca
Server side impl + touch ups
2018-02-15 13:59:02 -08:00
Alex Dadgar
ce0caccad2
client implementation of alloc gc and stats
2018-02-15 13:59:02 -08:00
Alex Dadgar
e685211892
Code review feedback
2018-02-15 13:59:02 -08:00
Alex Dadgar
a9c4f8a4c8
clarify force
2018-02-15 13:59:02 -08:00
Alex Dadgar
dc75501c69
Respond to comments
2018-02-15 13:59:02 -08:00
Alex Dadgar
cea77df6a7
Add Streaming RPC ack
...
This PR introduces an ack allowing the receiving end of the streaming
RPC to return any error that may have occured during the establishment
of the streaming RPC.
2018-02-15 13:59:02 -08:00
Alex Dadgar
2f9d33f479
vet
2018-02-15 13:59:02 -08:00
Alex Dadgar
f5f43218f5
HTTP and tests
2018-02-15 13:59:02 -08:00
Alex Dadgar
6546b43a17
Client implementation of stream
2018-02-15 13:59:02 -08:00
Alex Dadgar
9a5569678c
Client Stat/List impl
2018-02-15 13:59:02 -08:00
Alex Dadgar
8854b35b34
Agent logs
2018-02-15 13:59:02 -08:00