51 commits

Author SHA1 Message Date
Michael Schurter ace5faf948
core: backoff considerably when worker is behind raft (#15523)
Upon dequeuing an evaluation, workers snapshot their state store at the
eval's wait index or later. This ensures we process an eval at a point
in time after it was created or updated. Processing an eval on an old
snapshot could cause any number of problems such as:

1. Since job registration atomically updates an eval and job in a single
   raft entry, scheduling against indexes before that may not have the
   eval's job or may have an older version.
2. The older the scheduler's snapshot, the higher the likelihood
   something has changed in the cluster state which will cause the plan
   applier to reject the scheduler's plan. This could waste work or
   even cause evals to be failed needlessly.

However, the workers run in parallel with a new server pulling the
cluster state from a peer. During this time, which may be many minutes
long, the state store is likely far behind the minimum index required
to process evaluations.

This PR addresses this by adding an additional long backoff period after
an eval is nacked. If the scheduler's index catches up within the
additional backoff, it will unblock early to dequeue the next eval.

When the server shuts down we'll get a `context.Canceled` error from the state
store method. We need to bubble this error up so that other callers can detect
it. Handle this case separately when waiting after dequeue so that we can warn
on shutdown instead of throwing an ambiguous error message with just the text
"canceled."

While there may be more precise ways to block scheduling until the
server catches up, this approach adds little risk and covers additional
cases where a server may be temporarily behind due to a spike in load or
a saturated network.

For testing, we make the `raftSyncLimit` into a parameter on the worker's `run` method
so that we can run backoff tests without waiting 30+ seconds. We haven't followed through
and made all the worker globals into worker parameters, because there isn't much
use outside of testing, but we can consider that in the future.

Co-authored-by: Tim Gross <tgross@hashicorp.com>
2023-01-24 08:56:35 -05:00
Tim Gross 6415fb4284
eval broker: shed all but one blocked eval per job after ack (#14621)
When an evaluation is acknowledged by a scheduler, the resulting plan is
guaranteed to cover up to the `waitIndex` set by the worker based on the most
recent evaluation for that job in the state store. At that point, we no longer
need to retain blocked evaluations in the broker that are older than that index.

Move all but the highest priority / highest `ModifyIndex` blocked eval into a
canceled set. When the `Eval.Ack` RPC returns from the eval broker it will
signal a reap of a batch of cancelable evals to write to raft. This paces the
cancelations by how frequently the schedulers are acknowledging evals, which
should reduce the risk of cancelations overwhelming raft relative to
scheduler progress. To avoid straggling batches when the cluster is
quiet, we also include a periodic sweep through the cancelable list.
2022-11-16 16:10:11 -05:00
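As a rough illustration of the shedding step, the sketch below keeps one blocked eval per job and moves the rest into a cancelable list. It assumes that "highest priority / highest `ModifyIndex`" means priority is compared first and `ModifyIndex` breaks ties; the `Eval` struct and the `shedBlocked` and `newer` helpers are hypothetical simplifications, not the broker's actual types.

```go
package main

import "fmt"

// Eval is a simplified, hypothetical stand-in for a blocked evaluation.
type Eval struct {
	ID          string
	JobID       string
	Priority    int
	ModifyIndex uint64
}

// newer reports whether a should be retained in preference to b.
// Assumption: priority wins, ModifyIndex breaks ties.
func newer(a, b Eval) bool {
	if a.Priority != b.Priority {
		return a.Priority > b.Priority
	}
	return a.ModifyIndex > b.ModifyIndex
}

// shedBlocked keeps a single eval per job and returns every other blocked
// eval as cancelable.
func shedBlocked(blocked []Eval) (keep map[string]Eval, cancelable []Eval) {
	keep = make(map[string]Eval)
	for _, e := range blocked {
		cur, ok := keep[e.JobID]
		switch {
		case !ok:
			keep[e.JobID] = e
		case newer(e, cur):
			cancelable = append(cancelable, cur)
			keep[e.JobID] = e
		default:
			cancelable = append(cancelable, e)
		}
	}
	return keep, cancelable
}

func main() {
	blocked := []Eval{
		{ID: "a", JobID: "job1", Priority: 50, ModifyIndex: 10},
		{ID: "b", JobID: "job1", Priority: 50, ModifyIndex: 12},
		{ID: "c", JobID: "job1", Priority: 70, ModifyIndex: 5},
		{ID: "d", JobID: "job2", Priority: 50, ModifyIndex: 3},
	}
	keep, cancelable := shedBlocked(blocked)
	fmt.Println(keep["job1"].ID, keep["job2"].ID, len(cancelable))
}
```

In the commit's design the cancelable list is then written to raft in batches, paced by eval acknowledgements and a periodic sweep; that part is not modeled here.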
Tim Gross 80ec5e1346
fix panic from keyring raft entries being written during upgrade (#14821)
During an upgrade to Nomad 1.4.0, if a server running 1.4.0 becomes the leader
before all of the 1.3.x servers have been upgraded, the old servers will crash
because the keyring is initialized and writes a raft entry.

Wait until all members are on a version that supports the keyring before
initializing it.
2022-10-06 12:47:02 -04:00
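The gating idea can be sketched as a version check over all cluster members before the feature is initialized. This is a standard-library-only illustration with hypothetical helpers (`parseVersion`, `atLeast`, `allMembersSupport`); it handles only plain `MAJOR.MINOR.PATCH` strings and does not reflect how Nomad actually inspects member versions.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion parses a plain "MAJOR.MINOR.PATCH" string.
func parseVersion(s string) ([3]int, error) {
	var v [3]int
	parts := strings.Split(s, ".")
	if len(parts) != 3 {
		return v, fmt.Errorf("invalid version %q", s)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return v, fmt.Errorf("invalid version %q: %w", s, err)
		}
		v[i] = n
	}
	return v, nil
}

// atLeast reports whether v >= min, comparing major, minor, then patch.
func atLeast(v, min [3]int) bool {
	for i := range v {
		if v[i] != min[i] {
			return v[i] > min[i]
		}
	}
	return true
}

// allMembersSupport returns true only if every member version parses and
// meets the minimum. Unparseable versions are treated as unsupported.
func allMembersSupport(versions []string, min [3]int) bool {
	for _, s := range versions {
		v, err := parseVersion(s)
		if err != nil || !atLeast(v, min) {
			return false
		}
	}
	return true
}

func main() {
	min := [3]int{1, 4, 0}
	fmt.Println(allMembersSupport([]string{"1.4.0", "1.3.5", "1.4.0"}, min))
	fmt.Println(allMembersSupport([]string{"1.4.0", "1.4.1"}, min))
}
```

A leader would re-run a check like this periodically and initialize the feature only once it passes.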
Seth Hoenig 2631659551 ci: swap ci parallelization for unconstrained gomaxprocs 2022-03-15 12:58:52 -05:00
Luiz Aoqui b1753d0568
scheduler: detect and log unexpected scheduling collisions (#11793) 2022-01-14 20:09:14 -05:00
Charlie Voiselle 98a240cd99
Make number of scheduler workers reloadable (#11593)
## Development Environment Changes
* Added stringer to build deps

## New HTTP APIs
* Added scheduler worker config API
* Added scheduler worker info API

## New Internals
* (Scheduler)Worker API refactor—Start(), Stop(), Pause(), Resume()
* Update shutdown to use context
* Add mutex for contended server data
    - `workerLock` for the `workers` slice
    - `workerConfigLock` for the `Server.Config.NumSchedulers` and
      `Server.Config.EnabledSchedulers` values

## Other
* Adding docs for scheduler worker api
* Add changelog message

Co-authored-by: Derek Strickland <1111455+DerekStrickland@users.noreply.github.com>
2022-01-06 11:56:13 -05:00
Charlie Voiselle 176de1bfe6
Refactor sendAck(3) into sendAck(2),sendNack(2),sendAcknowledgement(3) (#11506) 2021-11-17 10:49:55 -05:00
Drew Bailey 6c788fdccd
Events/msgtype cleanup (#9117)
* use msgtype in upsert node

adds message type to signature for upsert node, update tests, remove placeholder method

* UpsertAllocs msg type test setup

* use upsertallocs with msg type in signature

update test usage of delete node

delete placeholder msgtype method

* add msgtype to upsert evals signature, update test call sites with test setup msg type

handle snapshot upsert eval outside of FSM and ignore eval event

remove placeholder upsertevalsmsgtype

handle job plan rpc and prevent event creation for plan

msgtype cleanup upsertnodeevents

updatenodedrain msgtype

msg type 0 is a node registration event, so set the default to the ignore type

* fix named import

* fix signature ordering on upsertnode to match
2020-10-19 09:30:15 -04:00
Lang Martin 069840bef8
scheduler/reconcile: set FollowupEvalID on lost stop_after_client_disconnect (#8105) (#8138)
* scheduler/reconcile: set FollowupEvalID on lost stop_after_client_disconnect

* scheduler/reconcile: thread follupEvalIDs through to results.stop

* scheduler/reconcile: comment typo

* nomad/_test: correct arguments for plan.AppendStoppedAlloc

* scheduler/reconcile: avoid nil, cleanup handleDelayed(Lost|Reschedules)
2020-06-09 17:13:53 -04:00
Seth Hoenig f0c3dca49c tests: swap lib/freeport for tweaked helper/freeport
Copy the updated version of freeport (sdk/freeport), and tweak it for use
in Nomad tests. This means staying below port 10000 to avoid conflicts with
the lib/freeport that is still transitively used by the old version of
consul that we vendor. Also provide implementations to find ephemeral ports
of macOS and Windows environments.

Ports acquired through freeport are supposed to be returned to freeport,
which this change now also introduces. Many tests are modified to include
calls to a cleanup function for Server objects.

This should help quite a bit with some flakey tests, but not all of them.
Our port problems will not go away completely until we upgrade our vendor
version of consul. With Go modules, we'll probably do a 'replace' to swap
out other copies of freeport with the one now in 'nomad/helper/freeport'.
2019-12-09 08:37:32 -06:00
Michael Schurter e4bc943a68 nomad: SnapshotAfter -> SnapshotMinIndex
Rename SnapshotAfter to SnapshotMinIndex. The old name was not
technically accurate. SnapshotAtOrAfter is more accurate, but wordy and
still lacks context about what precisely it is at or after (the index).

SnapshotMinIndex was chosen as it describes the action (snapshot), a
constraint (minimum), and the object of the constraint (index).
2019-06-24 12:16:46 -07:00
Michael Schurter 0e39927782 nomad: emit more detailed error
Avoid returning context.DeadlineExceeded as it lacks helpful information
and is often ignored or handled specially by callers.
2019-05-17 14:37:42 -07:00
Michael Schurter 9732bc37ff nomad: refactor waitForIndex into SnapshotAfter
Generalize wait for index logic in the state store for reuse elsewhere.
Also begin plumbing in a context to combine handling of timeouts and
shutdown.
2019-05-17 13:30:23 -07:00
Arshneet Singh b7b050cdd1 Change min version required for plan optimization 2019-04-24 12:36:07 -07:00
Arshneet Singh 4cf4324b8f Remove allowPlanOptimization from schedulers 2019-04-23 09:18:02 -07:00
Arshneet Singh 65f5fab131 Add tests for plan normalization 2019-04-23 09:18:01 -07:00
Arshneet Singh b977748a4b Add code for plan normalization 2019-04-23 09:18:01 -07:00
Alex Dadgar 3c19d01d7a server 2018-09-15 16:23:13 -07:00
Alex Dadgar a6dfffa4fa Add testing interfaces 2018-02-15 13:59:00 -08:00
Preetha Appan 3c36abfe14
Update eval modify index as part of plan apply. 2017-12-18 10:03:55 -06:00
Alex Dadgar c1cc51dbee sync 2017-10-13 14:36:02 -07:00
Michael Schurter a66c53d45a Remove structs import from api
Goes a step further and removes structs import from api's tests as well
by moving GenerateUUID to its own package.
2017-09-29 10:36:08 -07:00
Alex Dadgar 4173834231 Enable more linters 2017-09-26 15:26:33 -07:00
Alex Dadgar 6911bd7676 Worker waits til max ModifyIndex across EvalsByJob
This PR fixes a scheduling race condition in which the plan results from
one invocation of the scheduler were not being considered by the next
since the Worker was not waiting for the correct index.

Fixes https://github.com/hashicorp/nomad/issues/3198
2017-09-14 14:28:43 -07:00
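The fix can be illustrated by computing the wait index as the maximum `ModifyIndex` over every eval for the same job, rather than using only the dequeued eval's own index. The `Eval` struct and `waitIndexFor` function are hypothetical simplifications for this sketch.

```go
package main

import "fmt"

// Eval is a simplified, hypothetical stand-in for an evaluation record.
type Eval struct {
	ID          string
	JobID       string
	ModifyIndex uint64
}

// waitIndexFor returns the index a worker should wait for before scheduling
// eval: the largest ModifyIndex among the eval itself and every other eval
// recorded for the same job.
func waitIndexFor(eval Eval, evals []Eval) uint64 {
	idx := eval.ModifyIndex
	for _, e := range evals {
		if e.JobID == eval.JobID && e.ModifyIndex > idx {
			idx = e.ModifyIndex
		}
	}
	return idx
}

func main() {
	dequeued := Eval{ID: "e2", JobID: "web", ModifyIndex: 100}
	all := []Eval{
		{ID: "e1", JobID: "web", ModifyIndex: 140}, // touched by an earlier plan apply
		dequeued,
		{ID: "e3", JobID: "db", ModifyIndex: 900}, // different job, ignored
	}
	fmt.Println(waitIndexFor(dequeued, all)) // prints 140
}
```

Waiting for index 140 instead of 100 ensures the snapshot already includes the results of the previous scheduler invocation for the same job.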
Alex Dadgar 06eddf243c parallel nomad tests 2017-07-25 17:39:36 -07:00
Alex Dadgar 04862ca10e Tests compile 2017-02-07 21:30:57 -08:00
Diptanu Choudhury 6193529040 Fixed more tests 2016-07-25 17:31:40 -07:00
Diptanu Choudhury 50842b88c7 Fixed some bugs 2016-07-25 17:26:38 -07:00
Diptanu Choudhury 7bafb7c675 Updating the job summary while mutating jobs and allocation objects 2016-07-25 17:26:38 -07:00
Alex Dadgar 25decca3ca Worker waitForIndex uses StateStore index, not Raft Applied Index 2016-06-22 09:04:22 -07:00
Alex Dadgar a3336b7761 test fixes and delete 2016-05-24 20:20:06 -07:00
Alex Dadgar bfdd5846e1 Track unblock indexes and check evals on block to see if they missed an update while in the scheduler 2016-05-24 20:10:56 -07:00
Alex Dadgar 15936822a4 Worker annotates evals with their snapshot index 2016-05-24 20:10:56 -07:00
Alex Dadgar 18d9e89065 Reuse the same evaluation and reblock it until there is no more work to do 2016-05-24 20:10:56 -07:00
Alex Dadgar 045f7807e0 eval_broker.Enqueue no longer returns an error 2016-05-18 11:35:15 -07:00
Armon Dadgar 41efc6b7a9 nomad: make tests more robust 2015-10-12 09:42:25 -07:00
Armon Dadgar ea0795995d Use a single implementation of GenerateUUID 2015-09-07 15:23:03 -07:00
Armon Dadgar df16cea2a4 nomad: worker supports create eval 2015-09-07 14:23:48 -07:00
Armon Dadgar 8bedd3769c nomad: unifying the state store API 2015-09-06 20:56:38 -07:00
Armon Dadgar 8c2bc337e6 nomad: adding ability to pause a worker 2015-08-23 10:52:31 -07:00
Armon Dadgar cae67b7f60 nomad: expose UpdateEval as a planner 2015-08-15 14:25:00 -07:00
Armon Dadgar f7007bfeb5 nomad: avoid split-brain in plan processing due to leader transition or eval retry 2015-08-12 15:44:36 -07:00
Armon Dadgar 183a238481 nomad: avoid split-brain eval handling after leader transition 2015-08-12 15:25:31 -07:00
Armon Dadgar 343b1b9c89 nomad: move state and mocks into shared packages 2015-08-11 14:27:14 -07:00
Armon Dadgar b908a76d29 nomad: Adding AllocsByJob 2015-08-06 17:36:10 -07:00
Armon Dadgar 4b35992d06 nomad: test worker plan submission 2015-08-05 16:23:47 -07:00
Armon Dadgar 3938231a9e nomad: more worker testing 2015-07-28 17:20:06 -07:00
Armon Dadgar 8e66fdfdab nomad: test waitForIndex 2015-07-28 17:11:00 -07:00
Armon Dadgar 0cfb67e918 nomad: testing worker acking 2015-07-28 17:05:59 -07:00
Armon Dadgar c65b2c472b nomad: testing worker dequeue 2015-07-28 17:02:54 -07:00