Alex Dadgar
6d8bb3a7bd
Duplicate blocked evals cancelling improved
...
The old logic for cancelling duplicate blocked evaluations by job ID had
a flaw: the newer evaluation could carry additional node classes that it
is (in)eligible for, and we would not capture them. As a result, cluster
state could change in a way that would let the job make progress, yet no
evaluation would be unblocked.
2018-11-07 10:08:23 -08:00
Alex Dadgar
9971b3393f
yamux
2018-09-17 14:22:40 -07:00
Alex Dadgar
b2f500b48c
Serf/Raft/Memberlist logger
2018-09-17 13:57:52 -07:00
Alex Dadgar
ca28afa3b2
small fixes
2018-09-15 16:42:38 -07:00
Alex Dadgar
3c19d01d7a
server
2018-09-15 16:23:13 -07:00
Chelsea Holland Komlo
de03ce8070
move logic to determine whether to reload tls configuration to tlsutil helper
2018-06-08 14:33:58 -04:00
Chelsea Holland Komlo
38f611a7f2
refactor NewTLSConfiguration to pass in verifyIncoming/verifyOutgoing
...
add missing fields to TLS merge method
2018-05-23 18:35:30 -04:00
Chelsea Komlo
687c26093c
Merge pull request #4269 from hashicorp/f-tls-remove-weak-standards
...
Configurable TLS cipher suites and versions; disallow weak ciphers
2018-05-11 08:11:46 -04:00
Preetha Appan
ca5758741b
Update serf to pick up graceful leave fix
2018-05-10 11:16:24 -05:00
Chelsea Holland Komlo
620558c107
log error if unable to create TLS configuration
2018-05-10 11:51:54 -04:00
Chelsea Holland Komlo
796bae6f1b
allow configurable cipher suites
...
disallow 3DES and RC4 ciphers
add documentation for tls_cipher_suites
2018-05-09 17:15:31 -04:00
Alex Dadgar
a510774451
Use UpdateAllocDesiredTransistion instead of UpsertEval but no transistions yet
2018-05-07 14:50:01 -05:00
Alex Dadgar
4a23307baf
Track all client connections
2018-04-26 13:22:09 -07:00
Alex Dadgar
7f28cfcdfe
small cleanup
2018-03-30 15:49:56 -07:00
Chelsea Holland Komlo
a77dd08dd9
prevent double close due to error in creating listener
2018-03-30 17:15:56 -04:00
Chelsea Holland Komlo
402a026c88
add further error handling for rpc connection handling
2018-03-30 17:03:36 -04:00
Chelsea Holland Komlo
58ada9bc42
return error when setting checksum; don't reload
2018-03-28 18:15:50 -04:00
Chelsea Holland Komlo
2d5af7ff4d
set TLS checksum when parsing config
...
Refactor checksum comparison, always set checksum if it is empty
2018-03-28 09:56:11 -04:00
Chelsea Holland Komlo
dd5f627feb
set server configuration checksum on reload
2018-03-27 18:03:52 -04:00
Chelsea Komlo
57e2cd04bd
Merge pull request #4025 from hashicorp/reload-http-tls
...
Allow TLS configurations for HTTP and RPC connections to be reloaded …
2018-03-26 18:00:30 -04:00
Michael Schurter
9898edfa90
Switch to drainerv2 impl
2018-03-21 16:51:44 -07:00
Alex Dadgar
e63bcb474d
Drainer
2018-03-21 16:51:44 -07:00
Michael Schurter
8b41e9b2e1
drainer: drainer should shutdown with server
2018-03-21 16:51:44 -07:00
Michael Schurter
0a17076ad2
refactor drainer into a subpkg
2018-03-21 16:51:44 -07:00
Chelsea Holland Komlo
66e44cdb73
Allow TLS configurations for HTTP and RPC connections to be reloaded separately
2018-03-21 17:51:08 -04:00
Alex Dadgar
b8607ad6d6
Heartbeat uses client rpc advertise and server defaults server rpc advertise addr
2018-03-16 16:47:08 -07:00
Alex Dadgar
52b7fb5361
Separate client and server rpc advertise addresses
2018-03-16 16:47:08 -07:00
Alex Dadgar
92cb552ff6
Always add core scheduler and detect invalid schedulers
2018-03-14 10:53:27 -07:00
Alex Dadgar
55e4f5cdc4
Require core scheduler
2018-03-14 10:37:49 -07:00
Josh Soref
173ce63fe9
spelling: transition
2018-03-11 19:06:05 +00:00
Josh Soref
258d76ec13
spelling: registry
2018-03-11 18:41:13 +00:00
Josh Soref
c9b86bbc2f
spelling: controls
2018-03-11 17:50:39 +00:00
Alex Dadgar
a1faab0e58
Server TLS
2018-02-15 15:03:12 -08:00
Alex Dadgar
d7029965ca
Server side impl + touch ups
2018-02-15 13:59:02 -08:00
Alex Dadgar
2f9d33f479
vet
2018-02-15 13:59:02 -08:00
Alex Dadgar
ce3674ea17
Server stat/list impl
2018-02-15 13:59:02 -08:00
Alex Dadgar
ddd67f5f11
Server streaming
2018-02-15 13:59:01 -08:00
Alex Dadgar
2c0ad26374
New RPC Modes and basic setup for streaming RPC handlers
2018-02-15 13:59:01 -08:00
Alex Dadgar
46770d57e5
Forwarding
2018-02-15 13:59:01 -08:00
Alex Dadgar
cfe9afc567
Store connection time
2018-02-15 13:59:01 -08:00
Alex Dadgar
6dd1c9f49d
Refactor
2018-02-15 13:59:00 -08:00
Alex Dadgar
ad7bc0c6bd
Server can forward ClientStats.Stats
2018-02-15 13:59:00 -08:00
Alex Dadgar
940a2df8a1
Pull inmem codec to helper
2018-02-15 13:59:00 -08:00
Alex Dadgar
13bbf3fbbb
Track client connections
2018-02-15 13:59:00 -08:00
Alex Dadgar
ba5ecb8c1a
Dynamic RPC servers with context
2018-02-15 13:59:00 -08:00
Alex Dadgar
288b3c0e05
Helper to populate RPC server endpoints
2018-02-15 13:59:00 -08:00
Kyle Havlovitz
709b693d39
Clean up some leftover autopilot differences from Consul
2018-02-08 10:27:26 -08:00
Kyle Havlovitz
2ccf565bf6
Refactor redundancy_zone/upgrade_version out of client meta
2018-01-29 20:03:38 -08:00
Kyle Havlovitz
a162b9ce14
Move server health loop into autopilot leader actions
2018-01-23 12:57:02 -08:00
Chelsea Komlo
d09cc2a69f
Merge pull request #3492 from hashicorp/f-client-tls-reload
...
Client/Server TLS dynamic reload
2018-01-23 05:51:32 -05:00
Chelsea Holland Komlo
7d3c240871
swap raft layer tls wrapper
2018-01-19 17:00:15 -05:00
Chelsea Holland Komlo
a8f655fbb3
allow for similar error messages for closed connections
2018-01-17 12:02:40 -05:00
Chelsea Holland Komlo
35466a331a
fixing up raft reload tests
...
close second goroutine in raft-net
2018-01-17 10:29:15 -05:00
Kyle Havlovitz
7b980c42d8
Add raft remove by id endpoint/command
2018-01-16 13:35:32 -08:00
Chelsea Holland Komlo
5f52e8e103
feedback from code review
2018-01-16 11:55:11 -05:00
Chelsea Holland Komlo
649f86f094
refactor creating a new tls configuration
2018-01-16 08:02:39 -05:00
Chelsea Holland Komlo
214d128eb9
reload raft transport layer
...
fix up linting
2018-01-08 14:52:28 -05:00
Chelsea Holland Komlo
0708d34135
call reload on agent, client, and server separately
2018-01-08 09:56:31 -05:00
Chelsea Holland Komlo
909bb0af07
refactor rpc listener methods, wait for proper shutdown
2018-01-08 09:21:06 -05:00
Chelsea Holland Komlo
6a2432659a
code review fixups
2018-01-08 09:21:06 -05:00
Chelsea Holland Komlo
9741097406
reloading tls config should be atomic for clients/servers
2018-01-08 09:21:06 -05:00
Chelsea Holland Komlo
e7bd156ef2
check error on generating tls context
2018-01-08 09:21:06 -05:00
Chelsea Holland Komlo
9b0a7a7f7c
remove code duplication
2018-01-08 09:21:06 -05:00
Chelsea Holland Komlo
4e0dbd23cf
prevent races when reloading, fully shut down raft
2018-01-08 09:21:06 -05:00
Chelsea Holland Komlo
ae7fc4695e
fixups from code review
...
Revert "close raft long-lived connections"
This reverts commit 3ffda28206fcb3d63ad117fd1d27ae6f832b6625.
reload raft connections on changing tls
2018-01-08 09:21:06 -05:00
Chelsea Holland Komlo
dfb6a3d9a8
close raft long-lived connections
2018-01-08 09:21:06 -05:00
Chelsea Holland Komlo
acd3d1b162
fix up downgrading client to plaintext
...
add locks around changing server configuration
2018-01-08 09:21:06 -05:00
Chelsea Holland Komlo
c0ad9a4627
add ability to upgrade/downgrade nomad agents tls configurations via sighup
2018-01-08 09:21:06 -05:00
Preetha Appan
fcded9ba61
Add a TODO comment around handling peer address for remove peer correctly for raft protocol 3
2018-01-05 14:22:45 -06:00
Kyle Havlovitz
1c07066064
Add autopilot functionality based on Consul's autopilot
2017-12-18 14:29:41 -08:00
Kyle Havlovitz
b775fc7b33
Added support for v2 raft APIs and -raft-protocol option
2017-12-12 10:17:16 -06:00
Chelsea Komlo
2dfda33703
Nomad agent reload TLS configuration on SIGHUP ( #3479 )
...
* Allow server TLS configuration to be reloaded via SIGHUP
* dynamic tls reloading for nomad agents
* code cleanup and refactoring
* ensure keyloader is initialized, add comments
* allow downgrading from TLS
* initialize keyloader if necessary
* integration test for tls reload
* fix up test to assert success on reloaded TLS configuration
* failure in loading a new TLS config should remain at current
Reload only the config if agent is already using TLS
* reload agent configuration before specific server/client
lock keyloader before loading/caching a new certificate
* introduce a get-or-set method for keyloader
* fixups from code review
* fix up linting errors
* fixups from code review
* add lock for config updates; improve copy of tls config
* GetCertificate only reloads certificates dynamically for the server
* config updates/copies should be on agent
* improve http integration test
* simplify agent reloading storing a local copy of config
* reuse the same keyloader when reloading
* Test that server and client get reloaded but keep keyloader
* Keyloader exposes GetClientCertificate as well for outgoing connections
* Fix spelling
* correct changelog style
2017-11-14 17:53:23 -08:00
Alex Dadgar
5c34af1ee1
leader acl token
2017-10-23 14:10:14 -07:00
Alex Dadgar
c1cc51dbee
sync
2017-10-13 14:36:02 -07:00
Alex Dadgar
4173834231
Enable more linters
2017-09-26 15:26:33 -07:00
Alex Dadgar
e5ec915ac3
sync
2017-09-19 10:08:23 -05:00
Alex Dadgar
84d06f6abe
Sync namespace changes
2017-09-07 17:04:21 -07:00
Armon Dadgar
3e46094cee
Passthrough replication token for token/policy replication
2017-09-04 13:05:53 -07:00
Armon Dadgar
dc1904b57a
nomad: adding ACL token resolution logic
2017-09-04 13:04:45 -07:00
Armon Dadgar
e4f5f305ea
nomad: adding Get/List endpoints for ACL policies
2017-09-04 13:03:15 -07:00
Alex Dadgar
62c14c21a5
Merge pull request #3142 from hashicorp/f-deployment-watcher
...
Deployment watcher takes state store
2017-08-31 10:45:17 -07:00
Jeremy Olexa
f94f237597
Update peers.info message for operators
2017-08-31 08:51:04 -05:00
Alex Dadgar
590ff91bf3
Deployment watcher takes state store
2017-08-30 18:51:59 -07:00
Chelsea Holland Komlo
465c4d7082
change endpoint to /v1/search
2017-08-14 17:38:10 +00:00
Chelsea Holland Komlo
5ee58a391b
rename to cluster search
...
comment updates
2017-08-14 17:36:14 +00:00
Luke Farnell
f0ced87b95
fixed all spelling mistakes for goreport
2017-08-07 17:13:05 -04:00
Chelsea Holland Komlo
4dd6b46198
Retrieve job information for resources endpoint
...
requires further refactoring and logic for more contexts
2017-08-04 14:34:25 +00:00
Alex Dadgar
dad9e69822
more comment fixes
2017-07-07 12:03:11 -07:00
Alex Dadgar
7154e4e08f
Remove setters
2017-07-07 12:03:11 -07:00
Alex Dadgar
87d187d777
Tests
2017-07-07 12:03:11 -07:00
Alex Dadgar
6f821beec4
fix integration slightly
2017-07-07 12:03:11 -07:00
Alex Dadgar
7af65aa3d7
Add watcher to server
2017-07-07 12:03:11 -07:00
Alex Dadgar
d04877d23c
initial impl
2017-07-07 12:03:11 -07:00
Michael Schurter
33318501b6
Backoff on Consul lookup failures
2017-04-19 12:42:47 -07:00
Michael Schurter
e204a287ed
Refactor Consul Syncer into new ServiceClient
...
Fixes #2478 #2474 #1995 #2294
The new client only handles agent and task service advertisement. Server
discovery is mostly unchanged.
The Nomad client agent now handles all Consul operations instead of the
executor handling task-related operations. When upgrading from an
earlier version of Nomad, existing executors will be told to deregister
from Consul so that the Nomad agent can re-register the task's services
and checks.
Drivers - other than qemu - now support an Exec method for executing
arbitrary commands in a task's environment. This is used to implement
script checks.
Interfaces are used extensively to avoid interacting with Consul in
tests that don't assert any Consul related behavior.
2017-04-19 12:42:47 -07:00
Alex Dadgar
a9c8b09da8
Push to configs
2017-04-14 15:24:55 -07:00
Alex Dadgar
5be806a3df
Fix vet script and fix vet problems
...
This PR fixes our vet script and fixes all the missed vet changes.
It also fixes pointers being printed in `nomad stop <job>` and `nomad
node-status <node>`.
2017-02-27 16:00:19 -08:00
Alex Dadgar
8bfc4255eb
Add server metrics
2017-02-14 16:02:18 -08:00
Alex Dadgar
b1cd81e997
Remove todos
2017-02-10 15:41:23 -08:00
Alex Dadgar
2d4d9b79d8
Operator command/endpoint/documentation
2017-02-09 18:04:46 -08:00
Alex Dadgar
ae31f4c84e
Respond to comments
2017-02-08 14:50:19 -08:00
Alex Dadgar
ee368762ae
It builds
2017-02-02 16:07:15 -08:00
Alex Dadgar
26db1bd12c
Join + Leave peer
2017-02-02 15:49:06 -08:00
Alex Dadgar
ac10aed731
Update setupRaft
2017-02-02 15:31:36 -08:00
Alex Dadgar
78cfcd2724
Bump protocol version and update numOtherPeers
2017-02-02 13:52:31 -08:00
Alex Dadgar
15ffdff497
Vault Client on Server handles SIGHUP
...
This PR allows the Vault client on the server to handle a SIGHUP. This
allows updating the Vault token and any other configuration without
downtime.
2017-02-01 14:24:10 -08:00
Diptanu Choudhury
e927de02d2
Moved functions to helper from structs
2017-01-18 15:55:14 -08:00
Diptanu Choudhury
1a8fa8c8d5
Making Nomad TLS configs region aware
2016-11-01 11:55:29 -07:00
Diptanu Choudhury
7c61e115bd
Moved tlsutil into helpers
2016-10-25 16:05:37 -07:00
Diptanu Choudhury
cf35aeac84
Moving the TLSConfig to structs
2016-10-25 15:57:38 -07:00
Diptanu Choudhury
e03927bb5c
Changed the way TLS config is parsed
2016-10-24 13:56:19 -07:00
Diptanu Choudhury
2e3118e69c
Implemented TLS support for http and rpc
2016-10-23 22:22:00 -07:00
Diptanu Choudhury
0f6e0d10b6
Enable serf encryption ( #1791 )
...
* Added the keygen command
* Added support for gossip encryption
* Changed the URL for keyring management
* Fixed the cli
* Added some tests
* Added tests for keyring operations
* Added a test for removal of keys
* Added some docs
* Fixed some docs
* Added general options
2016-10-17 10:48:04 -07:00
Alex Dadgar
48696ba0cc
Use tomb to shutdown
...
Token revocation
Remove from the statestore
Revoke tokens
Don't error when Vault is disabled, as this could cause an issue if the operator ever goes from enabled to disabled
update server interface to allow enable/disable and config loading
test the new functions
Leader revoke
Use active
2016-08-28 14:06:25 -07:00
Alex Dadgar
713e310670
Renew loop
2016-08-17 16:25:38 -07:00
Alex Dadgar
750a44b2c0
Create a Vault interface for the server
2016-08-17 16:25:38 -07:00
Alex Dadgar
6e2f0a2776
Server has Vault API client
2016-08-17 16:25:38 -07:00
Sean Chittenden
8bdb38d016
Code golf
...
Pointed out by: @dadgar
2016-06-21 14:26:01 -07:00
Sean Chittenden
df4fe2e502
Fix the shuffling of remote datacenters.
...
Pointed out by: @ryanuber
2016-06-21 13:37:22 -07:00
Sean Chittenden
46e2d54acf
Provide `nomad.Config` with a default `LogOutput` of `os.Stderr`
2016-06-17 06:44:10 -07:00
Sean Chittenden
9a60999100
Pass a logger arg to `NewClient` and `NewServer`
2016-06-16 23:29:23 -07:00
Sean Chittenden
7c24487850
Fix up various error handling
2016-06-16 14:40:09 -07:00
Sean Chittenden
71cd9984ae
Immediately query Consul upon initialization if we have no peers.
...
Also don't attempt to join the Server with itself.
2016-06-16 14:27:10 -07:00
Sean Chittenden
65319252b9
Rework `server_auto_join` to use a timer instead of the peer count.
...
It is perfectly viable for an admin to downsize a Nomad Server cluster
down to 1, 2, or `num % 2 == 0` (however ill-advised such activities
may be). And instead of using `bootstrap_expect`, use a timeout-based
strategy. If the `bootstrapFn` hasn't observed a leader in 15s it will
fall back to Consul and will poll every ~60s until it sees a leader.
2016-06-16 12:14:03 -07:00
Sean Chittenden
b0fecbefc1
Define `BootstrapExpect` as an `int32` so it can be manipulated atomically.
2016-06-16 12:00:15 -07:00
Sean Chittenden
5b0def194a
Namespace the log messages
2016-06-15 12:40:51 -07:00
Sean Chittenden
bffc82d668
Do not consider the number of Serf members when considering falling back to Consul.
2016-06-15 12:40:51 -07:00
Sean Chittenden
324af8d7f1
Guard the auto-join functionality behind its `consul.server_auto_join` tunable
2016-06-15 12:40:51 -07:00
Sean Chittenden
5e0ced2ae7
Shuffle all datacenters vs only the nearest N datacenters.
...
Per discussion, we want to be aggressive about fanning out vs possibly
fixating on only local DCs. With RPC forwarding in place, a random walk
may be less optimal from a network latency perspective, but it is guaranteed
to eventually result in a converged state because all DCs are candidates
during the bootstrapping process.
2016-06-15 12:40:51 -07:00
Sean Chittenden
2123460cf0
Bump various Consul search limits
...
Client: Search limit increased from 4 random DCs to 8 random DCs, plus nearest.
Server: Search factor increased from 3 to 5 times the bootstrap_expect.
This should allow for faster convergence in large environments (e.g.
sub-5min for 10K Consul DCs).
2016-06-15 12:40:51 -07:00
Sean Chittenden
e8d1264dbc
Short-circuit the bootstrapFn if we have a leader
2016-06-15 12:40:51 -07:00
Sean Chittenden
f05514335b
Teach Nomad servers how to fall back to Consul.
2016-06-15 12:40:51 -07:00
Sean Chittenden
3d64daafd9
Fold RaftPeers() into its only call site now
2016-06-10 15:54:39 -04:00
Sean Chittenden
bff57a0dce
Reconcile, clean up, and centralize API version numbers (major and minor).
...
Reduce future confusion by introducing a minor version that is gossiped out
via the `mvn` Serf tag (Minor Version Number; `vsn` is already being used
to communicate the Major Version Number).
Background: hashicorp/consul/issues/1346#issuecomment-151663152
2016-06-10 15:50:11 -04:00
Sean Chittenden
d76c042a13
Invert error handling logic
2016-06-10 15:50:11 -04:00
Sean Chittenden
89168b0c51
Invert check definition so the error is first
2016-06-10 15:50:11 -04:00
Sean Chittenden
17116fc5a7
Rebalance Nomad client RPCs among different Nomad servers.
...
Implement client/rpc_proxy.RpcProxy.
2016-06-10 15:50:11 -04:00
Sean Chittenden
49deaae2ae
Seed random once in main
2016-06-10 15:48:36 -04:00
Sean Chittenden
dc28ab0cb5
Speling police
2016-05-15 09:41:34 -07:00
Diptanu Choudhury
26d1b60369
Adding raft peers in agent info
2016-04-05 10:30:46 -07:00
Diptanu Choudhury
d472dc2988
Adding the raft leader addr to server stats
2016-04-03 16:38:39 -07:00
Alex Dadgar
bf74e2f790
display server leaders per region
2016-03-17 16:04:09 -07:00
Armon Dadgar
7fc7cd9453
nomad: batch client updates for 50msec
2016-02-21 18:51:34 -08:00
Alex Dadgar
143972b6d9
Job GC endpoint
2016-02-20 15:50:41 -08:00
Alex Dadgar
25c5e543f4
Use crypto random seed
2016-02-17 11:47:02 -08:00
Alex Dadgar
01cadf7cb0
Seed the server's random number generator
2016-02-16 19:40:02 -08:00
Alex Dadgar
c55eb0816c
Address comments
2016-01-31 18:46:45 -08:00
Alex Dadgar
74135f02a4
Blocked Eval tracker
2016-01-31 18:04:45 -08:00
Alex Dadgar
80dd30b03d
Add force spawn endpoint
2016-01-13 10:19:53 -08:00
Alex Dadgar
b3e87b6719
Remove the periodicRunner interface and pass the server as an interface to the periodicDispatcher
2015-12-23 18:26:39 -08:00