Chelsea Komlo
4338360da9
Merge pull request #4065 from hashicorp/emit-node-event-on-first-health-change
...
Emit first node event after initialization on health status change
2018-03-29 11:23:25 -04:00
Chelsea Holland Komlo
2174ede6b9
add clarifying comment
2018-03-29 10:58:39 -04:00
Michael Schurter
3a79c32677
Merge pull request #4059 from hashicorp/b-drain-health-svc-only
...
only service allocs should have health watched
2018-03-28 16:49:22 -07:00
Michael Schurter
5eb0cb7176
only service allocs should have health watched
2018-03-28 16:20:11 -07:00
Chelsea Holland Komlo
e3319afee1
emit first node event
2018-03-28 17:26:53 -04:00
Chelsea Komlo
7812ac5abf
Merge pull request #4057 from hashicorp/specify-docker-msg
...
Specify docker name in driver health messages
2018-03-28 13:32:36 -04:00
Preetha
177d2d6010
Merge pull request #4052 from hashicorp/f-specify-total-memory
...
Allow to specify total memory on agent configuration
2018-03-28 12:28:41 -05:00
Chelsea Holland Komlo
efc03e252c
specify driver health messages
2018-03-28 11:35:21 -04:00
Preetha Appan
329428b49f
Code review feedback and unit test
2018-03-28 10:07:15 -05:00
Charlie Voiselle
ea10588227
rkt: logging enhancements ( #4044 )
...
* Added extra debug logging; extended timeout; added jitter.
* small log changes
* increase timeout
* remove unneccessary uuid
2018-03-27 17:30:06 -07:00
Michael Schurter
fcaee471a0
client: always mark exited sys/svc allocs as failed
...
When restarts.attempts=0 was set in a jobspec a system or service alloc
that exited with 0 status would be marked as `completed` instead of
`failed`. Since system and service jobs are intended to run until
stopped or updated, they should always be marked as failed when they
exit even in cases where the exit code is 0.
2018-03-27 14:30:19 -07:00
Mildred Ki'Lya
1017cbe8ab
Allow to specify total memory on agent configuration
...
Allow to set the total memory of an agent in its configuration file. This
can be used in case the automatic detection doesn't work or in specific
environments when memory overcommit (using swap for example) can be
desirable.
2018-03-27 15:46:18 -05:00
Chelsea Holland Komlo
003bc209b9
use time.Time for node events for compatibility
2018-03-27 15:43:57 -04:00
Alex Dadgar
432784dae3
Fix alloc watcher snapshot streaming
2018-03-27 11:14:53 -07:00
Alex Dadgar
05449fea09
drop stats fetching log
2018-03-23 12:01:50 -07:00
Chelsea Komlo
5f0c382021
Merge pull request #4030 from hashicorp/health-check-ux
...
UX improvments to driver health checks
2018-03-23 09:46:50 -04:00
Alex Dadgar
da27fc3880
Driver Info output
2018-03-22 17:18:32 -07:00
Chelsea Holland Komlo
e9005d8cfb
ux improvments to driver health checks
2018-03-22 18:38:29 -04:00
Michael Schurter
a318684738
Merge pull request #4022 from hashicorp/f-more-executor-logging
...
executor: increase level for helpful log lines
2018-03-22 15:21:20 -07:00
Michael Schurter
a4f346abeb
remove spurious TODOs and FIXMEs
2018-03-21 16:55:22 -07:00
Michael Schurter
8b346c6176
test: try to prevent flakiness on travis
2018-03-21 16:51:45 -07:00
Michael Schurter
1b7ac447e9
alloc_runner: watch health for deployed batch jobs
2018-03-21 16:51:45 -07:00
Michael Schurter
62960ed7bd
client: don't monitor health of non-service jobs
...
Also fix system job draining; won't work without deadline fixes
2018-03-21 16:51:44 -07:00
Alex Dadgar
a37329189a
Improve DeadlineTime helper
2018-03-21 16:51:44 -07:00
Alex Dadgar
db4a634072
RPC, FSM, State Store for marking DesiredTransistion
...
fix build tag
2018-03-21 16:49:48 -07:00
Michael Schurter
bb0ff44fb4
mock_driver: improve Kill() logging
2018-03-21 16:49:48 -07:00
Michael Schurter
c0542474db
drain: initial drainv2 structs and impl
2018-03-21 16:49:48 -07:00
Chelsea Holland Komlo
f329e45e03
always set initial health status for every driver
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
bbaffe3eca
set driver to unhealthy once if it cannot be detected in periodic check
2018-03-21 15:15:26 -04:00
Alex Dadgar
5df4b3728d
Docker driver doesn't return errors but injects into the DriverInfo
2018-03-21 15:15:26 -04:00
Alex Dadgar
4365bb7f59
Only run health check if driver is detected
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
f801709a0a
fix issue when updating node events
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
285729aee2
function rename and re-arrange functions in fingerprint_manager
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
60f12d206f
improve comments; update watchDriver
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
739784736a
remove unused function
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
d92703617c
simplify logic
...
bump log level
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
86b7b3d2d9
fix up health check logic comparison; add node events to client driver checks
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
53a5bc2bb3
Code review feedback
2018-03-21 15:15:26 -04:00
Alex Dadgar
34dc58421c
notes from walk through
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
44b6951dda
improve tests
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
d740a6a46e
refresh driver information for non-health checking drivers periodically
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
d8f68e5ef8
fix up codereview feedback
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
d5f6c940c4
fix up racy tests
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
0425be8f48
updating comments; locking concurrent node access
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
c50d02ae93
go style; update comments
2018-03-21 15:15:25 -04:00
Chelsea Holland Komlo
3aa726baab
fix scheduler driver name; create node structs file
2018-03-21 15:15:25 -04:00
Chelsea Holland Komlo
3cba95e8a7
allow nomad to schedule based on the status of a client driver health check
...
Slight updates for go style
2018-03-21 15:15:25 -04:00
Chelsea Holland Komlo
0bde357731
add concept of health checks to fingerprinters and nodes
...
fix up feedback from code review
add driver info for all drivers to node
2018-03-21 15:15:25 -04:00
Michael Schurter
1022170bf3
executor: increase level for helpful log lines
...
Should help with debugging issues like #3971
2018-03-21 11:53:58 -07:00
Michael Schurter
1044bc0feb
Merge pull request #3984 from hashicorp/f-loosen-consul-skipverify
...
Replace Consul TLSSkipVerify handling
2018-03-16 11:21:28 -07:00