Commit graph

224 commits

Author SHA1 Message Date
Sean Chittenden 65319252b9
Rework server_auto_join to use a timer instead of the peer count.
It is perfectly viable for an admin to downsize a Nomad Server cluster
down to 1, 2, or `num % 2 == 0` (however ill-advised such activities
may be).  And instead of using `bootstrap_expect`, use a timeout-based
strategy.  If the `bootstrapFn` hasn't observed a leader in 15s it will
fall back to Consul and will poll every ~60s until it sees a leader.
2016-06-16 12:14:03 -07:00
Sean Chittenden b0fecbefc1
Define BootstrapExepct as an int32 so it can be manipulated atomically. 2016-06-16 12:00:15 -07:00
Sean Chittenden 5b0def194a
Namespace the log messages 2016-06-15 12:40:51 -07:00
Sean Chittenden bffc82d668
Do not consider the number of Serf members when considering falling back to Consul. 2016-06-15 12:40:51 -07:00
Sean Chittenden 324af8d7f1
Guard the auto-join functionality behind its consul.server_auto_join tunable 2016-06-15 12:40:51 -07:00
Sean Chittenden 5e0ced2ae7
Shuffle all datacenters vs only the nearest N datacenters.
Per discussion, we want to be aggressive about fanning out vs possibly
fixating on only local DCs.  With RPC forwarding in place, a random walk
may be less optimal from a network latency perspective, but it is guaranteed
to eventually result in a converged state because all DCs are candidates
during the bootstrapping process.
2016-06-15 12:40:51 -07:00
Sean Chittenden 2123460cf0
Bump various Consul search limits
Client: Search limit increased from 4 random DCs to 8 random DCs, plus nearest.
Server: Search factor increased from 3 to 5 times the bootstrap_expect.

This should allow for faster convergence in large environments (e.g.
sub-5min for 10K Consul DCs).
2016-06-15 12:40:51 -07:00
Sean Chittenden e8d1264dbc
Short-circuit the bootstrapFn if we have a leader 2016-06-15 12:40:51 -07:00
Sean Chittenden f05514335b
Teach Nomad servers how to fall back to Consul. 2016-06-15 12:40:51 -07:00
Sean Chittenden 3d64daafd9
Fold RaftPeers() into its only call site now 2016-06-10 15:54:39 -04:00
Sean Chittenden bff57a0dce
Reconcile, clean up, and centralize API version numbers (major and minor).
Reduce future confusion by introducing a minor version that is gossiped out
via the `mvn` Serf tag (Minor Version Number, `vsn` is already being used for
to communicate `Major Version Number`).

Background: hashicorp/consul/issues/1346#issuecomment-151663152
2016-06-10 15:50:11 -04:00
Sean Chittenden d76c042a13
Invert error handling logic 2016-06-10 15:50:11 -04:00
Sean Chittenden 89168b0c51
Invert check definition so the error is first 2016-06-10 15:50:11 -04:00
Sean Chittenden 17116fc5a7
Rebalance Nomad client RPCs among different Nomad servers.
Implement client/rpc_proxy.RpcProxy.
2016-06-10 15:50:11 -04:00
Sean Chittenden 49deaae2ae
Seed random once in main 2016-06-10 15:48:36 -04:00
Sean Chittenden dc28ab0cb5
Speling police 2016-05-15 09:41:34 -07:00
Diptanu Choudhury 26d1b60369 Adding raft peers in agent info 2016-04-05 10:30:46 -07:00
Diptanu Choudhury d472dc2988 Adding the raft leader addr to server stats 2016-04-03 16:38:39 -07:00
Alex Dadgar bf74e2f790 display server leaders per region 2016-03-17 16:04:09 -07:00
Armon Dadgar 7fc7cd9453 nomad: batch client updates for 50msec 2016-02-21 18:51:34 -08:00
Alex Dadgar 143972b6d9 Job GC endpoint 2016-02-20 15:50:41 -08:00
Alex Dadgar 25c5e543f4 Use crypto random seed 2016-02-17 11:47:02 -08:00
Alex Dadgar 01cadf7cb0 Seed the servers random number generator 2016-02-16 19:40:02 -08:00
Alex Dadgar c55eb0816c Address comments 2016-01-31 18:46:45 -08:00
Alex Dadgar 74135f02a4 Blocked Eval tracker 2016-01-31 18:04:45 -08:00
Alex Dadgar 80dd30b03d Add force spawn endpoint 2016-01-13 10:19:53 -08:00
Alex Dadgar b3e87b6719 Remove the periodicRunner interface and pass the server as an interface to the periodicDispatcher 2015-12-23 18:26:39 -08:00
Alex Dadgar 670cc50a02 merge 2015-12-23 18:26:39 -08:00
Alex Dadgar a892d61ae7 FSM integration 2015-12-23 18:26:39 -08:00
Ryan Uber d983f266b0 nomad: sort regions before returning 2015-11-24 13:15:01 -08:00
Ryan Uber 39b2c3a07b nomad: use a read-only lock 2015-11-23 22:27:07 -08:00
Ryan Uber ad6b55a37a nomad: support listing regions 2015-11-23 22:27:03 -08:00
Armon Dadgar b68c8404b1 nomad: remove noisy logs 2015-09-21 14:14:19 -07:00
Chris Bednarski da93d4a30f Change error to err to be consistent with other usage 2015-09-11 10:26:33 -07:00
Chris Bednarski 39feffd67f Change debug to info 2015-09-11 10:24:52 -07:00
Chris Bednarski 6ea318cacb Change to use the logger instance defined earlier 2015-09-11 10:23:53 -07:00
Chris Bednarski 3000ef07c1 Added logging to server startup so we can see where it fails 2015-09-11 10:20:36 -07:00
Armon Dadgar 7d69aa78c1 nomad: using Raft StartAsLeader to make tests faster 2015-09-07 10:46:41 -07:00
Armon Dadgar 0702f17989 Rename client endpoint to node endpoint 2015-09-06 20:31:32 -07:00
Armon Dadgar 46cbe8285d nomad: adding Alloc endpoint 2015-09-06 15:34:28 -07:00
Armon Dadgar 4d6238ebb8 client: test updating alloc status 2015-08-29 14:22:24 -07:00
Armon Dadgar 2ea99f211a nomad: updating for new alloc representation 2015-08-25 17:36:52 -07:00
Armon Dadgar d4e2faf216 nomad: track the workers in the pool 2015-08-23 10:53:53 -07:00
Armon Dadgar 3d48ff4c6f nomad: no heartbeat for nodes in terminal status 2015-08-22 17:17:13 -07:00
Armon Dadgar 40def1a187 nomad: expose RuntimeStats 2015-08-20 15:29:30 -07:00
Armon Dadgar c1d8ab0983 nomad: rename client endpoint struct 2015-08-16 17:40:35 -07:00
Armon Dadgar 79a1471b85 nomad: add delivery limit to eval broker 2015-08-16 10:55:55 -07:00
Armon Dadgar aadecdd805 nomad: adding version API endpoint 2015-08-15 12:59:10 -07:00
Armon Dadgar 8c2f8cddd0 nomad: create system scheduler as needed 2015-08-06 17:08:40 -07:00
Armon Dadgar 5f1ebb9274 nomad: adding special 'system' scheduler 2015-08-06 17:04:35 -07:00
Armon Dadgar 7cdcc45c64 nomad: cleanup stats goroutines 2015-08-05 16:45:50 -07:00
Armon Dadgar 3c2a16038b nomad: first pass adding scheduling workers 2015-07-28 15:12:08 -07:00
Armon Dadgar 4be2c1b9db nomad: adding plan endpoint 2015-07-27 15:31:49 -07:00
Armon Dadgar 743c7f32ef nomad: integrate plan queue 2015-07-27 15:11:42 -07:00
Armon Dadgar 487acdd3c7 nomad: threading eval broker through fsm and state store 2015-07-23 22:30:08 -07:00
Armon Dadgar 93b832cf59 nomad: emit metrics from the eval broker 2015-07-23 22:17:37 -07:00
Armon Dadgar fc11808fe9 nomad: add eval broker, configurable nack timeout 2015-07-23 21:44:17 -07:00
Armon Dadgar eff7170c77 nomad: adding RPC endpoints for fetching Eval 2015-07-23 16:00:19 -07:00
Armon Dadgar 501a57f9d8 nomad: Adding the CRUD endpoints for jobs 2015-07-23 14:41:18 -07:00
Armon Dadgar 5fdf3c5d33 nomad: adding Client endpoint 2015-06-07 12:14:41 -07:00
Armon Dadgar 4f78d6a7e9 nomad: adding connection pool 2015-06-07 11:50:29 -07:00
Armon Dadgar ac6ed01a1c nomad: track local peers 2015-06-07 11:32:01 -07:00
Armon Dadgar 25aad83ea4 nomad: testing leader bridging of serf 2015-06-05 23:54:45 +02:00
Armon Dadgar c18c068bf2 nomad: testing remove peer 2015-06-04 13:02:39 +02:00
Armon Dadgar b456588d1c nomad: adding reconcile channel 2015-06-04 12:42:56 +02:00
Armon Dadgar 490fa1b7db nomad: testing serf join 2015-06-04 12:33:12 +02:00
Armon Dadgar 4f6ecce727 nomad: working on serf member parsing 2015-06-03 13:35:48 +02:00
Armon Dadgar a088a302e5 nomad: expose serf methods, add leave support 2015-06-03 13:25:50 +02:00
Armon Dadgar 6ebfa2bed4 nomad: expose serf stats 2015-06-03 13:08:04 +02:00
Armon Dadgar e219bc1e71 nomad: adding Serf integration 2015-06-03 12:58:00 +02:00
Armon Dadgar d52122f041 nomad: more skeleton 2015-06-03 12:26:50 +02:00
Armon Dadgar bad4bfadc4 nomad: embed the state store to reduce duplication 2015-06-03 11:26:49 +02:00
Armon Dadgar 4c554f14bb nomad: support DevMode configuration 2015-06-01 21:11:40 +02:00
Armon Dadgar 1e7f84f3e6 nomad: adding basic structure for raft 2015-06-01 17:49:10 +02:00