Commit graph

124 commits

Author SHA1 Message Date
Mahmood Ali e436d2701a Handle Nomad leadership flapping
Fixes a deadlock in leadership handling if leadership flapped.

Raft propagates leadership transition to Nomad through a NotifyCh channel.
Raft blocks when writing to this channel, so channel must be buffered or
aggressively consumed[1]. Otherwise, Raft blocks indefinitely in `raft.runLeader`
until the channel is consumed[1] and does not move on to executing follower
related logic (in `raft.runFollower`).

While Raft `runLeader` defer function blocks, raft cannot process any other
raft operations.  For example, `run{Leader|Follower}` methods consume
`raft.applyCh`, and while runLeader defer is blocked, all raft log applications
or config lookup will block indefinitely.

Sadly, `leaderLoop` and `establishLeader` makes few Raft calls!
`establishLeader` attempts to auto-create autopilot/scheduler config [3]; and
`leaderLoop` attempts to check raft configuration [4].  All of these calls occur
without a timeout.

Thus, if leadership flapped quickly while `leaderLoop/establishLeadership` is
invoked and hit any of these Raft calls, Raft handler _deadlock_ forever.

Depending on how many times it flapped and where exactly we get stuck, I suspect
it's possible to get in the following case:

* Agent metrics/stats http and RPC calls hang as they check raft.Configurations
* raft.State remains in Leader state, and server attempts to handle RPC calls
  (e.g. node/alloc updates) and these hang as well

As we create goroutines per RPC call, the number of goroutines grow over time
and may trigger a out of memory errors in addition to missed updates.

[1] d90d6d6bda/config.go (L190-L193)
[2] d90d6d6bda/raft.go (L425-L436)
[3] 2a89e47746/nomad/leader.go (L198-L202)
[4] 2a89e47746/nomad/leader.go (L877)
2020-01-22 13:08:34 -05:00
Mahmood Ali 129c884105 extract leader step function 2020-01-22 10:55:48 -05:00
Mahmood Ali d699a70875
Merge pull request #5911 from hashicorp/b-rpc-consistent-reads
Block rpc handling until state store is caught up
2019-08-20 09:29:37 -04:00
Jasmine Dahilig 8d980edd2e
add create and modify timestamps to evaluations (#5881) 2019-08-07 09:50:35 -07:00
Pete Woods 9096aa3d23
Add job status metrics
This avoids having to write services to repeatedly hit the jobs API
2019-07-26 10:12:49 +01:00
Mahmood Ali ea3a98357f Block rpc handling until state store is caught up
Here, we ensure that when leader only responds to RPC calls when state
store is up to date.  At leadership transition or launch with restored
state, the server local store might not be caught up with latest raft
logs and may return a stale read.

The solution here is to have an RPC consistency read gate, enabled when
`establishLeadership` completes before we respond to RPC calls.
`establishLeadership` is gated by a `raft.Barrier` which ensures that
all prior raft logs have been applied.

Conversely, the gate is disabled when leadership is lost.

This is very much inspired by https://github.com/hashicorp/consul/pull/3154/files
2019-07-02 16:07:37 +08:00
Preetha Appan 10e7d6df6d
Remove compat code associated with many previous versions of nomad
This removes compat code for namespaces (0.7), Drain(0.8) and other
older features from releases older than Nomad 0.7
2019-06-25 19:05:25 -05:00
Chris Baker e0170e1c67 metrics: add namespace label to allocation metrics 2019-06-17 20:50:26 +00:00
Michael Schurter 073893f529 nomad: disable service+batch preemption by default
Enterprise only.

Disable preemption for service and batch jobs by default.

Maintain backward compatibility in a x.y.Z release. Consider switching
the default for new clusters in the future.
2019-06-04 15:54:50 -07:00
Preetha Appan ad3c263d3f
Rename to match system scheduler config.
Also added docs
2019-05-03 14:06:12 -05:00
Preetha Appan 6615d5c868
Add config to disable preemption for batch/service jobs 2019-04-29 18:48:07 -05:00
Arshneet Singh b977748a4b Add code for plan normalization 2019-04-23 09:18:01 -07:00
Charlie Voiselle c28c195f42 Set NextEval when making failed-follow-up evals
This allows users to locate failed-follow-up evals more easily
2019-02-20 16:07:11 -08:00
Preetha Appan 7578522f58
variable name fix 2019-01-29 13:48:45 -06:00
Preetha Appan a6cebbbf9e
Make sure that all servers are 0.9 before applying scheduler config entry 2019-01-29 12:47:42 -06:00
Alex Dadgar 4bdccab550 goimports 2019-01-22 15:44:31 -08:00
Nick Ethier b1484aec33
nomad: fix hclog usage 2018-11-29 22:27:39 -05:00
Nick Ethier 5c5cae79ab
nomad: only lookup job is disable_dispatched_job_summary_metrics is set 2018-11-19 23:22:23 -05:00
Nick Ethier 8ac69f440d
nomad: lookup job instead of adding Dispatched to summary 2018-11-19 23:22:02 -05:00
Nick Ethier 85b221a1d6
nomad: add flag to disable publishing of job_summary metrics for dispatched jobs 2018-11-19 23:21:19 -05:00
Preetha Appan 57fe5050f0
more minor review feedback 2018-11-01 17:05:17 -05:00
Preetha Appan 12278527c7
make default config a variable 2018-10-30 11:06:32 -05:00
Preetha Appan c1c1c230e4
Make preemption config a struct to allow for enabling based on scheduler type 2018-10-30 11:06:32 -05:00
Preetha Appan bd34cbb1f7
Support for new scheduler config API, first use case is to disable preemption 2018-10-30 11:06:32 -05:00
Alex Dadgar ca28afa3b2 small fixes 2018-09-15 16:42:38 -07:00
Alex Dadgar 3c19d01d7a server 2018-09-15 16:23:13 -07:00
Andrei Burd 444ee45aff Parametrized/periodic jobs per child tagged metric emmision 2018-06-21 10:40:56 +03:00
Preetha Appan 2fd20310ea
Remove checks in member reconcile that was causing servers in protocol 3 to not change their ID in raft forever 2018-05-30 11:34:45 -05:00
Alex Dadgar ea24513d38 Allow nomad to restore bad periodic job 2018-04-26 15:51:47 -07:00
Alex Dadgar d0f237086b UX touchups 2018-04-26 15:24:27 -07:00
Chelsea Holland Komlo fca0169dbc handle potential panic in cron parsing 2018-04-26 16:57:45 -04:00
Michael Schurter 959d447d38 Remove unused context 2018-03-21 16:51:44 -07:00
Michael Schurter 0a17076ad2 refactor drainer into a subpkg 2018-03-21 16:51:44 -07:00
Michael Schurter c0542474db drain: initial drainv2 structs and impl 2018-03-21 16:49:48 -07:00
Alex Dadgar 4844317cc2
Merge pull request #3890 from hashicorp/b-heartbeat
Heartbeat improvements and handling failures during establishing leadership
2018-03-12 14:41:59 -07:00
Josh Soref 2c79e590ec spelling: maintenance 2018-03-11 18:26:20 +00:00
Alex Dadgar 64a45a1603 Need to revoke leadership to clean up in case there was a failure during leadership establishment 2018-02-20 12:52:00 -08:00
Alex Dadgar 9a54abd3a8 timers 2018-02-20 10:23:11 -08:00
Alex Dadgar 601177c250 Add escape hatches when non-leader 2018-02-20 10:22:15 -08:00
Kyle Havlovitz 2ccf565bf6 Refactor redundancy_zone/upgrade_version out of client meta 2018-01-29 20:03:38 -08:00
Kyle Havlovitz a162b9ce14 Move server health loop into autopilot leader actions 2018-01-23 12:57:02 -08:00
Kyle Havlovitz 1c07066064 Add autopilot functionality based on Consul's autopilot 2017-12-18 14:29:41 -08:00
Kyle Havlovitz 045f346293
Use region instead of datacenter for version checking 2017-12-12 10:17:16 -06:00
Kyle Havlovitz b775fc7b33
Added support for v2 raft APIs and -raft-protocol option 2017-12-12 10:17:16 -06:00
Alex Dadgar 86608124ca Fix followers not creating periodic launch
Fix an issue in which periodic launches wouldn't be made on followers.
2017-12-11 13:55:17 -08:00
Alex Dadgar 2c587fd67b
Merge pull request #3402 from hashicorp/leader-loop
Applies leader loop fixes from Consul.
2017-11-03 13:40:59 -07:00
Diptanu Choudhury 5a0edf646b Resetting the timer at the beginning of the loop 2017-11-01 13:15:06 -07:00
Diptanu Choudhury 46bc4280b2 Adding support for tagged metrics 2017-11-01 13:15:06 -07:00
Diptanu Choudhury 524a1f0712 Publishing metrics for job summary 2017-11-01 13:15:06 -07:00
Alex Dadgar 794daefa5e clear the token 2017-10-23 15:11:13 -07:00