Commit graph

78 commits

Author SHA1 Message Date
Sean Chittenden 46e2d54acf
Provide nomad.Config with a default LogOutput of os.StdErr 2016-06-17 06:44:10 -07:00
Sean Chittenden 9a60999100
Pass a logger arg to NewClient and NewServer 2016-06-16 23:29:23 -07:00
Sean Chittenden 7c24487850
Fix up various error handling 2016-06-16 14:40:09 -07:00
Sean Chittenden 71cd9984ae
Immediately query Consul upon initialization if we have no peers.
Also don't attempt to join the Server with itself.
2016-06-16 14:27:10 -07:00
Sean Chittenden 65319252b9
Rework server_auto_join to use a timer instead of the peer count.
It is perfectly viable for an admin to downsize a Nomad Server cluster
down to 1, 2, or `num % 2 == 0` (however ill-advised such activities
may be).  And instead of using `bootstrap_expect`, use a timeout-based
strategy.  If the `bootstrapFn` hasn't observed a leader in 15s it will
fall back to Consul and will poll every ~60s until it sees a leader.
2016-06-16 12:14:03 -07:00
Sean Chittenden b0fecbefc1
Define BootstrapExepct as an int32 so it can be manipulated atomically. 2016-06-16 12:00:15 -07:00
Sean Chittenden 5b0def194a
Namespace the log messages 2016-06-15 12:40:51 -07:00
Sean Chittenden bffc82d668
Do not consider the number of Serf members when considering falling back to Consul. 2016-06-15 12:40:51 -07:00
Sean Chittenden 324af8d7f1
Guard the auto-join functionality behind its consul.server_auto_join tunable 2016-06-15 12:40:51 -07:00
Sean Chittenden 5e0ced2ae7
Shuffle all datacenters vs only the nearest N datacenters.
Per discussion, we want to be aggressive about fanning out vs possibly
fixating on only local DCs.  With RPC forwarding in place, a random walk
may be less optimal from a network latency perspective, but it is guaranteed
to eventually result in a converged state because all DCs are candidates
during the bootstrapping process.
2016-06-15 12:40:51 -07:00
Sean Chittenden 2123460cf0
Bump various Consul search limits
Client: Search limit increased from 4 random DCs to 8 random DCs, plus nearest.
Server: Search factor increased from 3 to 5 times the bootstrap_expect.

This should allow for faster convergence in large environments (e.g.
sub-5min for 10K Consul DCs).
2016-06-15 12:40:51 -07:00
Sean Chittenden e8d1264dbc
Short-circuit the bootstrapFn if we have a leader 2016-06-15 12:40:51 -07:00
Sean Chittenden f05514335b
Teach Nomad servers how to fall back to Consul. 2016-06-15 12:40:51 -07:00
Sean Chittenden 3d64daafd9
Fold RaftPeers() into its only call site now 2016-06-10 15:54:39 -04:00
Sean Chittenden bff57a0dce
Reconcile, clean up, and centralize API version numbers (major and minor).
Reduce future confusion by introducing a minor version that is gossiped out
via the `mvn` Serf tag (Minor Version Number, `vsn` is already being used for
to communicate `Major Version Number`).

Background: hashicorp/consul/issues/1346#issuecomment-151663152
2016-06-10 15:50:11 -04:00
Sean Chittenden d76c042a13
Invert error handling logic 2016-06-10 15:50:11 -04:00
Sean Chittenden 89168b0c51
Invert check definition so the error is first 2016-06-10 15:50:11 -04:00
Sean Chittenden 17116fc5a7
Rebalance Nomad client RPCs among different Nomad servers.
Implement client/rpc_proxy.RpcProxy.
2016-06-10 15:50:11 -04:00
Sean Chittenden 49deaae2ae
Seed random once in main 2016-06-10 15:48:36 -04:00
Sean Chittenden dc28ab0cb5
Speling police 2016-05-15 09:41:34 -07:00
Diptanu Choudhury 26d1b60369 Adding raft peers in agent info 2016-04-05 10:30:46 -07:00
Diptanu Choudhury d472dc2988 Adding the raft leader addr to server stats 2016-04-03 16:38:39 -07:00
Alex Dadgar bf74e2f790 display server leaders per region 2016-03-17 16:04:09 -07:00
Armon Dadgar 7fc7cd9453 nomad: batch client updates for 50msec 2016-02-21 18:51:34 -08:00
Alex Dadgar 143972b6d9 Job GC endpoint 2016-02-20 15:50:41 -08:00
Alex Dadgar 25c5e543f4 Use crypto random seed 2016-02-17 11:47:02 -08:00
Alex Dadgar 01cadf7cb0 Seed the servers random number generator 2016-02-16 19:40:02 -08:00
Alex Dadgar c55eb0816c Address comments 2016-01-31 18:46:45 -08:00
Alex Dadgar 74135f02a4 Blocked Eval tracker 2016-01-31 18:04:45 -08:00
Alex Dadgar 80dd30b03d Add force spawn endpoint 2016-01-13 10:19:53 -08:00
Alex Dadgar b3e87b6719 Remove the periodicRunner interface and pass the server as an interface to the periodicDispatcher 2015-12-23 18:26:39 -08:00
Alex Dadgar 670cc50a02 merge 2015-12-23 18:26:39 -08:00
Alex Dadgar a892d61ae7 FSM integration 2015-12-23 18:26:39 -08:00
Ryan Uber d983f266b0 nomad: sort regions before returning 2015-11-24 13:15:01 -08:00
Ryan Uber 39b2c3a07b nomad: use a read-only lock 2015-11-23 22:27:07 -08:00
Ryan Uber ad6b55a37a nomad: support listing regions 2015-11-23 22:27:03 -08:00
Armon Dadgar b68c8404b1 nomad: remove noisy logs 2015-09-21 14:14:19 -07:00
Chris Bednarski da93d4a30f Change error to err to be consistent with other usage 2015-09-11 10:26:33 -07:00
Chris Bednarski 39feffd67f Change debug to info 2015-09-11 10:24:52 -07:00
Chris Bednarski 6ea318cacb Change to use the logger instance defined earlier 2015-09-11 10:23:53 -07:00
Chris Bednarski 3000ef07c1 Added logging to server startup so we can see where it fails 2015-09-11 10:20:36 -07:00
Armon Dadgar 7d69aa78c1 nomad: using Raft StartAsLeader to make tests faster 2015-09-07 10:46:41 -07:00
Armon Dadgar 0702f17989 Rename client endpoint to node endpoint 2015-09-06 20:31:32 -07:00
Armon Dadgar 46cbe8285d nomad: adding Alloc endpoint 2015-09-06 15:34:28 -07:00
Armon Dadgar 4d6238ebb8 client: test updating alloc status 2015-08-29 14:22:24 -07:00
Armon Dadgar 2ea99f211a nomad: updating for new alloc representation 2015-08-25 17:36:52 -07:00
Armon Dadgar d4e2faf216 nomad: track the workers in the pool 2015-08-23 10:53:53 -07:00
Armon Dadgar 3d48ff4c6f nomad: no heartbeat for nodes in terminal status 2015-08-22 17:17:13 -07:00
Armon Dadgar 40def1a187 nomad: expose RuntimeStats 2015-08-20 15:29:30 -07:00
Armon Dadgar c1d8ab0983 nomad: rename client endpoint struct 2015-08-16 17:40:35 -07:00