Commit Graph

58 Commits

Author SHA1 Message Date
Joel May a8fc048151
Emit metrics for eval waitUntil as nomad.nomad.broker.eval_waiting (#10236) 2022-01-06 15:57:40 -05:00
Michael Schurter d7e123d7cd test: fix fake by increasing time window
Test originally only had a 10ms time window tolerance. Increased to
100ms and also improved assertions and docstrings.
2021-09-28 12:22:59 -07:00
Michael Schurter 9732bc37ff nomad: refactor waitForIndex into SnapshotAfter
Generalize wait for index logic in the state store for reuse elsewhere.
Also begin plumbing in a context to combine handling of timeouts and
shutdown.
2019-05-17 13:30:23 -07:00
Danielle Lancashire d9815888ed evalbroker: Simplify nextDelayedEval locking 2019-05-14 14:06:27 +02:00
Danielle Lancashire 38562afbc1 evalbroker: No new enqueues when disabled
Currently when an evalbroker is disabled, it still recieves delayed
enqueues via log application in the fsm. This causes an ever growing
heap of evaluations that will never be drained, and can cause memory
issues in larger clusters, or when left running for an extended period
of time without a leader election.

This commit prevents the enqueuing of evaluations while we are
disabled, and relies on the leader restoreEvals routine to handle
reconciling state during a leadership transition.

Existing dequeues during an Enabled->Disabled broker state transition are
handled by the enqueueLocked function dropping evals.
2019-05-14 13:59:10 +02:00
Danielle Lancashire c91ae21a6c evalbroker: Flush within update lock
Primarily a cleanup commit, however, currently there is a potential race
condition (that I'm not sure we've ever actually hit) during a flapping
SetEnabled/Disabled state where we may never correctly restart the eval
broker, if it was being called from multiple routines.
2019-05-14 13:26:56 +02:00
Michael Schurter 004fa574cb test: fix race in eval broker update chan
Similar to previous commits the delayed eval update chan was set and
access from different goroutines causing a race. Passing the chan on the
stack resolves the race.

Race output from `go test -race -run 'Server_RPC$'` in nomad/

```
==================
WARNING: DATA RACE
Write at 0x00c000339150 by goroutine 63:
  github.com/hashicorp/nomad/nomad.(*EvalBroker).flush()
      /home/schmichael/go/src/github.com/hashicorp/nomad/nomad/eval_broker.go:708
+0x3dc
  github.com/hashicorp/nomad/nomad.(*EvalBroker).SetEnabled()
      /home/schmichael/go/src/github.com/hashicorp/nomad/nomad/eval_broker.go:174
+0xc4
  github.com/hashicorp/nomad/nomad.(*Server).revokeLeadership()
      /home/schmichael/go/src/github.com/hashicorp/nomad/nomad/leader.go:718
+0x1fd
  github.com/hashicorp/nomad/nomad.(*Server).leaderLoop()
      /home/schmichael/go/src/github.com/hashicorp/nomad/nomad/leader.go:122
+0x95d
  github.com/hashicorp/nomad/nomad.(*Server).monitorLeadership.func1()
      /home/schmichael/go/src/github.com/hashicorp/nomad/nomad/leader.go:72
+0x6c

Previous read at 0x00c000339150 by goroutine 73:
  github.com/hashicorp/nomad/nomad.(*EvalBroker).runDelayedEvalsWatcher()
      /home/schmichael/go/src/github.com/hashicorp/nomad/nomad/eval_broker.go:771
+0x176

Goroutine 63 (running) created at:
  github.com/hashicorp/nomad/nomad.(*Server).monitorLeadership()
      /home/schmichael/go/src/github.com/hashicorp/nomad/nomad/leader.go:70
+0x269

Goroutine 73 (running) created at:
  github.com/hashicorp/nomad/nomad.(*EvalBroker).SetEnabled()
      /home/schmichael/go/src/github.com/hashicorp/nomad/nomad/eval_broker.go:170
+0x173
  github.com/hashicorp/nomad/nomad.(*Server).establishLeadership()
      /home/schmichael/go/src/github.com/hashicorp/nomad/nomad/leader.go:207
+0x355
  github.com/hashicorp/nomad/nomad.(*Server).leaderLoop()
      /home/schmichael/go/src/github.com/hashicorp/nomad/nomad/leader.go:117
+0x82e
  github.com/hashicorp/nomad/nomad.(*Server).monitorLeadership.func1()
      /home/schmichael/go/src/github.com/hashicorp/nomad/nomad/leader.go:72
+0x6c
==================
```
2018-12-19 15:48:02 -08:00
Preetha Appan 4f8e925b54
Move topk and delay heap to separate packages under lib 2018-09-04 16:10:11 -05:00
Preetha Appan 9e29cfee76
Use readlock 2018-09-04 11:45:05 -05:00
Preetha Appan 062e5f1898
Use eval broker lock when reading/modifying delay heap 2018-08-31 10:59:48 -05:00
Charlie Voiselle ba88f00ccb Changed "til" to "until"
Should be "till" or "until"; chose "until" because it is unambiguous as to meaning.
2018-04-11 12:36:28 -05:00
Preetha Appan 56e60e5840
Fix linting warning 2018-03-14 16:12:22 -05:00
Preetha Appan 4193015e48
Update comment 2018-03-14 16:10:32 -05:00
Preetha Appan 3e96c6c4e0
Address more code review feedback 2018-03-14 16:10:32 -05:00
Preetha Appan 9749b7a05c
Avoids unnecessary timer object creation 2018-03-14 16:10:32 -05:00
Preetha Appan c6f333c90f
Move delayheap to lib package 2018-03-14 16:10:32 -05:00
Preetha Appan 1ab8f2b57a
Address some code review comments 2018-03-14 16:10:32 -05:00
Preetha Appan 7887f39ff4
Added a delay heap to track evals with WaitUntil set, and use in eval broker 2018-03-14 16:10:32 -05:00
Preetha Appan 9a4fcaaf34
Fix bug with not including namespace in indexing blocked evals 2018-03-13 13:23:11 -05:00
Josh Soref 6f06f0e6f2 spelling: requeuing 2018-03-11 18:43:05 +00:00
Josh Soref d300623abe spelling: evaluation 2018-03-11 18:01:35 +00:00
Josh Soref f1d21bfdfe spelling: enqueuing 2018-03-11 18:00:07 +00:00
Michael Schurter a66c53d45a Remove `structs` import from `api`
Goes a step further and removes structs import from api's tests as well
by moving GenerateUUID to its own package.
2017-09-29 10:36:08 -07:00
Alex Dadgar 84d06f6abe Sync namespace changes 2017-09-07 17:04:21 -07:00
Alex Dadgar a331a234d4 NewEvalBroker comment 2017-04-14 15:26:54 -07:00
Alex Dadgar a9c8b09da8 Push to configs 2017-04-14 15:24:55 -07:00
Alex Dadgar ef875f6dda Delay Nack re-enqueue
Add a delay when an evaluation is nacked that starts off small but
compounds to a larger delay for subsequent Nacks. This creates some
back pressure.
2017-04-12 13:41:40 -07:00
Alex Dadgar fd3e469d5e Remove requeue because it is a subset of EnqueueAll now 2016-06-24 10:14:34 -07:00
Alex Dadgar 2f8bb4b235 When enqueuing into eval broker always pass blocked eval's token 2016-06-23 22:40:22 -07:00
Sean Chittenden 1aefdb1e15
Use the correctly typed `rand.Int*` variant 2016-06-10 15:50:11 -04:00
Sean Chittenden 3a1dc9a194
Use `rand.Int*n()` where appropriate 2016-06-10 15:50:11 -04:00
Sean Chittenden 4e2835d5ff
Use the correctly typed `rand.Int*` variant 2016-06-10 15:48:36 -04:00
Sean Chittenden 66b4b2a99f
Use `rand.Int*n()` where appropriate 2016-06-10 15:48:36 -04:00
Alex Dadgar 060318845f Comments addressed 2016-05-31 11:39:03 -07:00
Alex Dadgar 1f9f015c1b Fix race condition in which a reblocked evaluation could be dropped 2016-05-27 16:53:10 -07:00
Alex Dadgar 1c6d3e129a EnqueueAll inserts all evaluations before unblocking dequeue calls 2016-05-18 12:13:59 -07:00
Alex Dadgar 045f7807e0 eval_broker.Enqueue no longer returns an error 2016-05-18 11:35:15 -07:00
Alex Dadgar 74726278b9 core: Pause NackTimeout while in the plan_queue as progress is being made 2016-03-04 12:59:35 -08:00
Alex Dadgar d2e88f0116 Fix panic when Ack occurs at delivery limit 2016-02-11 11:07:18 -08:00
Alex Dadgar 74135f02a4 Blocked Eval tracker 2016-01-31 18:04:45 -08:00
Alex Dadgar cbb1992929 Make a zero timeout for eval_broker.Dequeue() block 2015-11-23 11:59:49 -08:00
Alex Dadgar af7d896c2a nil-pointer dereference with empty timeout Dequeue 2015-11-20 16:49:55 -08:00
Armon Dadgar b9bb7bdaaa nomad: OutstandingReset returns specific errors 2015-10-23 10:22:17 -07:00
Armon Dadgar 16fd84f25a nomad: Adding OutstandingReset to EvalBroker 2015-10-23 10:14:16 -07:00
Armon Dadgar ea0795995d Use a single implementation of GenerateUUID 2015-09-07 15:23:03 -07:00
Armon Dadgar fe2e046481 nomad: support time wait for evaluations 2015-09-07 13:00:45 -07:00
Armon Dadgar 79a1471b85 nomad: add delivery limit to eval broker 2015-08-16 10:55:55 -07:00
Armon Dadgar 183a238481 nomad: avoid split-brain eval handling after leader transition 2015-08-12 15:25:31 -07:00
Armon Dadgar 011484ea74 nomad: eval broker serializes by JobID 2015-08-05 17:55:15 -07:00
Armon Dadgar 078d80432a nomad: deduplicate enqueue of evaluations 2015-08-05 17:06:02 -07:00