Michael Schurter
533bc93b3a
implement all boltdb interactions behind StateDB
2018-10-16 16:53:30 -07:00
Michael Schurter
a5d3e3fb0a
Implement alloc updates in arv2
...
Updates are applied asynchronously but sequentially
2018-10-16 16:53:30 -07:00
Michael Schurter
a4b4d7b266
consul service hook
...
Deregistration works but is difficult to test because terminal updates
are not fully implemented in the new client/ar/tr.
2018-10-16 16:53:29 -07:00
Michael Schurter
5be982e674
restore vault client
2018-10-16 16:53:29 -07:00
Alex Dadgar
fd3bc1bd39
Update state with server
2018-10-16 16:53:29 -07:00
Michael Schurter
7f4ec50906
missed locking around c.allocs access
2018-10-16 16:53:29 -07:00
Michael Schurter
516d641db0
client: implement all-or-nothing alloc restoration
...
Restoring calls NewAR -> Restore -> Run
NewAR now calls NewTR
AR.Restore calls TR.Restore
AR.Run calls TR.Run
2018-10-16 16:53:29 -07:00
Alex Dadgar
80f6ce50c0
vault hook
2018-10-16 16:53:29 -07:00
Michael Schurter
b360f6f96e
fix hclog level
2018-10-16 16:53:29 -07:00
Michael Schurter
4f43ff5c51
pass statedb into allocrunnerv2
2018-10-16 16:53:29 -07:00
Michael Schurter
0f7dcfdc9a
example redis job "runs" on arv2! see below
...
Tons left to do and lots of churn:
1. No state saving
2. No shutdown or gc
3. Removed AR factory *for now*
4. Made all "Config" structs local to the package they configure
5. Added allocID to GC to avoid a lookup
Really hating how many things use *structs.Allocation. It's not bad
without state saving, but if AllocRunner starts updating its copy things
get racy fast.
2018-10-16 16:53:29 -07:00
Alex Dadgar
01f8e5b95f
renames
2018-10-04 14:57:25 -07:00
Alex Dadgar
52f9cd7637
fixing tests
2018-10-04 14:26:19 -07:00
Alex Dadgar
5c8697667e
Node reserved resources
2018-09-29 18:44:55 -07:00
Alex Dadgar
3183153315
Node resources on client
2018-09-29 17:23:41 -07:00
Alex Dadgar
9971b3393f
yamux
2018-09-17 14:22:40 -07:00
Alex Dadgar
7739ef51ce
agent + consul
2018-09-13 10:43:40 -07:00
Michael Schurter
08862fc177
fix race around error handling
2018-09-05 17:34:17 -07:00
Preetha
043f4c208b
Merge pull request #3882 from burdandrei/telemetry-add-node-class-tag
...
Added node class to tagged metrics
2018-06-21 17:04:35 -05:00
Alex Dadgar
b61051b3cd
Merge pull request #4409 from hashicorp/r-client-packages
...
Refactor client packages
2018-06-13 17:32:25 -07:00
Alex Dadgar
90c2108bfb
Fix gc tests + parallel destroy + small test fixes
2018-06-12 10:23:45 -07:00
Alex Dadgar
f5ff509fa5
Refactor - wip
2018-06-12 10:23:45 -07:00
Chelsea Holland Komlo
f74e74b22d
add client logic to determine whether TLS RPC connections should reload
2018-06-08 14:38:58 -04:00
Chelsea Holland Komlo
064b5481e0
add server join info to server and client
2018-05-31 10:50:03 -07:00
Chelsea Holland Komlo
38f611a7f2
refactor NewTLSConfiguration to pass in verifyIncoming/verifyOutgoing
...
add missing fields to TLS merge method
2018-05-23 18:35:30 -04:00
Chelsea Holland Komlo
796bae6f1b
allow configurable cipher suites
...
disallow 3DES and RC4 ciphers
add documentation for tls_cipher_suites
2018-05-09 17:15:31 -04:00
Chelsea Holland Komlo
9b8a079558
fix up comments
2018-04-17 11:53:08 -04:00
Alex Dadgar
9d612c8cb0
Cleanup
2018-04-16 15:48:34 -07:00
Alex Dadgar
32adaf9dfc
Copy the config given to the alloc runner
2018-04-16 15:45:52 -07:00
Alex Dadgar
4f2a7b6949
Fix copying drivers
2018-04-16 15:45:51 -07:00
Alex Dadgar
0b799822ff
Operate on copy
2018-04-16 15:45:49 -07:00
Alex Dadgar
ff1a1a63e8
Move where attribute for driver detection is set
2018-04-12 15:50:25 -07:00
Alex Dadgar
f24ce2c50c
Driver health detection cleanups
...
This PR does:
1. Health message based on detection has format "Driver XXX detected"
and "Driver XXX not detected"
2. Set initial health description based on detection status and don't
wait for the first health check.
3. Combine updating attributes on the node, fingerprint and health
checking update for drivers into a single call back.
4. Condensed driver info in `node status` only shows detected drivers
and makes the output less wide by removing spaces.
2018-04-12 12:46:40 -07:00
Andrei Burd
502d17fa90
Added node class to tagged metrics
2018-04-11 12:20:59 +03:00
Alex Dadgar
3d367d6fd7
Fix client uptime metric missing client prefix
2018-04-10 10:39:36 -07:00
Alex Dadgar
ae1f76477e
Start rebalance after discovering new servers
2018-04-05 15:41:59 -07:00
Alex Dadgar
be2513e0f9
more jitter
2018-04-05 13:48:33 -07:00
Alex Dadgar
bd3345942c
Handle no leader and faster retries near limit
...
Handle the ErrNoLeader case and apply slower retries. Also, when we have
missed the heartbeat, retry aggressively, backing off after we have
missed for more than 30 seconds.
2018-04-05 11:22:47 -07:00
Alex Dadgar
279b5c22e5
Scale heartbeat retrying based on remaining heartbeat time
2018-04-05 10:58:13 -07:00
Alex Dadgar
7941f4eb2d
Fire retry only when consul discovers new servers
2018-04-05 10:40:17 -07:00
Alex Dadgar
86c32358d4
Spelling error
2018-04-03 18:30:01 -07:00
Alex Dadgar
01a6beafbf
RPC Retry Watcher
2018-04-03 18:05:28 -07:00
Alex Dadgar
58a3ec3fb2
Improve Vault error handling
2018-04-03 14:29:22 -07:00
Chelsea Holland Komlo
2174ede6b9
add clarifying comment
2018-03-29 10:58:39 -04:00
Chelsea Holland Komlo
e3319afee1
emit first node event
2018-03-28 17:26:53 -04:00
Chelsea Holland Komlo
efc03e252c
specify driver health messages
2018-03-28 11:35:21 -04:00
Chelsea Holland Komlo
003bc209b9
use time.Time for node events for compatibility
2018-03-27 15:43:57 -04:00
Chelsea Holland Komlo
f801709a0a
fix issue when updating node events
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
60f12d206f
improve comments; update watchDriver
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
739784736a
remove unused function
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
d92703617c
simplify logic
...
bump log level
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
86b7b3d2d9
fix up health check logic comparison; add node events to client driver checks
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
53a5bc2bb3
Code review feedback
2018-03-21 15:15:26 -04:00
Alex Dadgar
34dc58421c
notes from walk through
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
44b6951dda
improve tests
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
0425be8f48
updating comments; locking concurrent node access
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
c50d02ae93
go style; update comments
2018-03-21 15:15:25 -04:00
Chelsea Holland Komlo
3aa726baab
fix scheduler driver name; create node structs file
2018-03-21 15:15:25 -04:00
Chelsea Holland Komlo
3cba95e8a7
allow nomad to schedule based on the status of a client driver health check
...
Slight updates for go style
2018-03-21 15:15:25 -04:00
Chelsea Holland Komlo
0bde357731
add concept of health checks to fingerprinters and nodes
...
fix up feedback from code review
add driver info for all drivers to node
2018-03-21 15:15:25 -04:00
Preetha Appan
3c38eededd
Fix spelling in comment
2018-03-14 15:54:25 -05:00
Alex Dadgar
bef4a8ee09
fix clearing node events
2018-03-14 09:48:59 -07:00
Chelsea Komlo
810eedfa2a
Merge pull request #3945 from hashicorp/f-add-node-events
...
Add node events
2018-03-14 08:42:55 -04:00
Preetha
360d6e5a92
Merge pull request #3968 from hashicorp/f-nicer-vault-error
...
Make server side error messages from vault clearer
2018-03-13 20:49:39 -05:00
Alex Dadgar
de6ebb6e6c
small cleanup
2018-03-13 18:08:22 -07:00
Chelsea Holland Komlo
b41501e442
code review feedback
2018-03-13 18:08:21 -07:00
Chelsea Holland Komlo
1488b076d1
code review feedback
2018-03-13 18:08:21 -07:00
Chelsea Holland Komlo
a8655320fd
fix up go check warnings
2018-03-13 18:08:21 -07:00
Chelsea Holland Komlo
0934769b04
add client side emitting of node events
...
Changelog
2018-03-13 18:08:21 -07:00
Preetha Appan
914eaed64f
Address some code review comments
2018-03-13 18:19:16 -05:00
Preetha Appan
09c231ce43
Return the err from server correctly
2018-03-13 18:10:14 -05:00
Preetha Appan
9618f52746
Remove error wrapping and make vault connection server side errors clearer.
2018-03-13 17:09:03 -05:00
Alex Dadgar
4844317cc2
Merge pull request #3890 from hashicorp/b-heartbeat
...
Heartbeat improvements and handling failures during establishing leadership
2018-03-12 14:41:59 -07:00
Josh Soref
173ce63fe9
spelling: transition
2018-03-11 19:06:05 +00:00
Josh Soref
782c704de6
spelling: thresholds
2018-03-11 19:03:47 +00:00
Josh Soref
8149694f3a
spelling: server
2018-03-11 18:55:30 +00:00
Josh Soref
258d76ec13
spelling: registry
2018-03-11 18:41:13 +00:00
Josh Soref
3c1ce6d16d
spelling: otherwise
2018-03-11 18:34:27 +00:00
Josh Soref
1ef6d6319e
spelling: labels
2018-03-11 18:21:44 +00:00
Josh Soref
52b83328fc
spelling: heartbeating
2018-03-11 18:12:19 +00:00
Josh Soref
c9b86bbc2f
spelling: controls
2018-03-11 17:50:39 +00:00
Josh Soref
e78cf9c81a
spelling: already
2018-03-11 17:39:04 +00:00
Josh Soref
b8b46d3f74
spelling: allocation
2018-03-11 17:37:22 +00:00
Chelsea Holland Komlo
122d1c4e4a
simplify retry logic
2018-03-01 09:48:26 -05:00
Chelsea Holland Komlo
355805db56
reset timer after updating node copy
2018-02-27 17:18:10 -05:00
Chelsea Holland Komlo
a72aaaf47f
add network resources equal method, use time ticker
...
remove impossible test case
2018-02-27 12:42:53 -05:00
Chelsea Holland Komlo
e736e31820
use time ticker, update how network resources are compared
2018-02-26 18:47:11 -05:00
Chelsea Holland Komlo
5059065b52
improved testing; node networks comparison
2018-02-26 15:55:38 -05:00
Chelsea Holland Komlo
1f31b39fe8
code review fixups
2018-02-26 12:36:30 -05:00
Chelsea Holland Komlo
ed8c8afbcd
edge trigger node update
...
test update config copy trigger
2018-02-26 12:36:04 -05:00
Alex Dadgar
49a47483d1
Registering back to initializing
...
Fix a bug in which if the node attributes/meta changed, we would
re-register the node in status initializing. This would incorrectly
trigger the client to log that it missed its heartbeat.
It would change the status of the Node to initializing until the next
heartbeat occurred.
2018-02-16 17:49:31 -08:00
Alex Dadgar
eff4455c68
Fix original client server list behavior
2018-02-15 16:04:53 -08:00
Alex Dadgar
f9cf642436
Client tls
2018-02-15 15:22:57 -08:00
Alex Dadgar
e685211892
Code review feedback
2018-02-15 13:59:02 -08:00
Alex Dadgar
2c0ad26374
New RPC Modes and basic setup for streaming RPC handlers
2018-02-15 13:59:01 -08:00
Alex Dadgar
9bc75f0ad4
Fix manager tests and make testagent recover from port conflicts
2018-02-15 13:59:01 -08:00
Alex Dadgar
3f1f8604bb
initial round of comment review
2018-02-15 13:59:01 -08:00
Alex Dadgar
c8c1284bc3
SetServer command actually returns an error if given an invalid server
2018-02-15 13:59:01 -08:00
Alex Dadgar
3f786b904b
use server manager
2018-02-15 13:59:01 -08:00
Alex Dadgar
6dd1c9f49d
Refactor
2018-02-15 13:59:00 -08:00