Commit graph

123 commits

Author SHA1 Message Date
nicoche 649831c1d3
deploymentwatcher: fail early whenever possible (#17341)
Given a deployment that has a `progress_deadline`, if a task group runs
out of reschedule attempts, allow it to fail at this time instead of
waiting until the `progress_deadline` is reached.

Fixes: #17260
2023-06-26 14:01:03 -04:00
Seth Hoenig ba728f8f97
api: enable support for setting original job source (#16763)
* api: enable support for setting original source alongside job

This PR adds support for setting job source material along with
the registration of a job.

This includes a new HTTP endpoint and a new RPC endpoint for
making queries for the original source of a job. The
HTTP endpoint is /v1/job/<id>/submission?version=<version> and
the RPC method is Job.GetJobSubmission.

The job source (if submitted, and doing so is always optional), is
stored in the job_submission memdb table, separately from the
actual job. This way we do not incur overhead of reading the large
string field throughout normal job operations.

The server config now includes job_max_source_size for configuring
the maximum size the job source may be, before the server simply
drops the source material. This should help prevent Bad Things from
happening when huge jobs are submitted. If the value is set to 0,
all job source material will be dropped.

* api: avoid writing var content to disk for parsing

* api: move submission validation into RPC layer

* api: return an error if updating a job submission without namespace or job id

* api: be exact about the job index we associate a submission with (modify)

* api: reword api docs scheduling

* api: prune all but the last 6 job submissions

* api: protect against nil job submission in job validation

* api: set max job source size in test server

* api: fixups from pr
2023-04-11 08:45:08 -05:00
hashicorp-copywrite[bot] 005636afa0 [COMPLIANCE] Add Copyright and License Headers 2023-04-10 15:36:59 +00:00
Piotr Kazmierczak 14b53df3b6
renamed stanza to block for consistency with other projects (#15941) 2023-01-30 15:48:43 +01:00
Piotr Kazmierczak b63944b5c1
cleanup: replace TypeToPtr helper methods with pointer.Of (#14151)
Bumping compile time requirement to go 1.18 allows us to simplify our pointer helper methods.
2022-08-17 18:26:34 +02:00
Michael Schurter 285979e96c
deployments: fix data races (#14121)
* deployments: fix data races

Both priority and state related fields may be mutated concurrently and
need to be accessed with the lock acquired.
2022-08-16 10:50:40 -07:00
Charles Z 7a8ec90fbe
allow unhealthy canaries without blocking autopromote (#14001) 2022-08-04 11:53:50 -04:00
Seth Hoenig 2631659551 ci: swap ci parallelization for unconstrained gomaxprocs 2022-03-15 12:58:52 -05:00
Luiz Aoqui ab8ce87bba
Add pagination, filtering and sort to more API endpoints (#12186) 2022-03-08 20:54:17 -05:00
Seth Hoenig 40c714a681 api: return sorted results in certain list endpoints
These API endpoints now return results in chronological order. They
can return results in reverse chronological order by setting the
query parameter ascending=true.

- Eval.List
- Deployment.List
2022-02-15 13:48:28 -06:00
kainoaseto d575b3f4ae rename test variable names to something a easier to identify 2022-01-31 14:59:52 -08:00
kainoaseto 4bb8dee26a fix auto promote of canary task groups when deploying them alongside rolling deploy taskgroups that do not use the canary deployment system 2022-01-18 17:04:03 -08:00
James Rasell 751c8217d1
core: allow setting and propagation of eval priority on job de/registration (#11532)
This change modifies the Nomad job register and deregister RPCs to
accept an updated option set which includes eval priority. This
param is optional and override the use of the job priority to set
the eval priority.

In order to ensure all evaluations as a result of the request use
the same eval priority, the priority is shared to the
allocReconciler and deploymentWatcher. This creates a new
distinction between eval priority and job priority.

The Nomad agent HTTP API has been modified to allow setting the
eval priority on job update and delete. To keep consistency with
the current v1 API, job update accepts this as a payload param;
job delete accepts this as a query param.

Any user supplied value is validated within the agent HTTP handler
removing the need to pass invalid requests to the server.

The register and deregister opts functions now all for setting
the eval priority on requests.

The change includes a small change to the DeregisterOpts function
which handles nil opts. This brings the function inline with the
RegisterOpts.
2021-11-23 09:23:31 +01:00
Mahmood Ali 4d90afb425 gofmt all the files
mostly to handle build directives in 1.17.
2021-10-01 10:14:28 -04:00
Mahmood Ali 33dfe98770
deployment watcher: Reuse allocsCh if allocIndex remains the same (#10756)
Fix deployment watchers to avoid creating unnecessary deployment watcher goroutines and blocking queries. `deploymentWatcher.getAllocsCh` creates a new goroutine that makes a blocking query to fetch updates of deployment allocs.

## Background

When operators submit a new or updated service job, Nomad create a new deployment by default. The deployment object controls how fast to place the allocations through [`max_parallel`](https://www.nomadproject.io/docs/job-specification/update#max_parallel) and health checks configurations.

The `scheduler` and `deploymentwatcher` package collaborate to achieve deployment logic: The scheduler only places the canaries and `max_parallel` allocations for a new deployment; the `deploymentwatcher` monitors for alloc progress and then enqueues a new evaluation whenever the scheduler should reprocess a job and places the next `max_parallel` round of allocations.

The `deploymentwatcher` package makes blocking queries against the state store, to fetch all deployments and the relevant allocs for each running deployments. If `deploymentwatcher` fails or is hindered from fetching the state, the deployments fail to make progress.

`Deploymentwatcher` logic only runs on the leader.

## Why unnecessary deployment watchers can halt cluster progress
Previously, `getAllocsCh` is called on every for loop iteration in `deploymentWatcher.watch()` function. However, the for-loop may iterate many times before the allocs get updated. In fact, whenever a new deployment is created/updated/deleted, *all* `deploymentWatcher`s get notified through `w.deploymentUpdateCh`. The `getAllocsCh` goroutines and blocking queries spike significantly and grow quadratically with respect to the number of running deployments. The growth leads to two adverse outcomes:

1. it spikes the CPU/Memory usage resulting potentially leading to OOM or very slow processing
2. it activates the [query rate limiter](abaa9c5c5b/nomad/deploymentwatcher/deployment_watcher.go (L896-L898)), so later the watcher fails to get updates and consequently fails to make progress towards placing new allocations for the deployment!

So the cluster fails to catch up and fails to make progress in almost all deployments. The cluster recovers after a leader transition: the deposed leader stops all watchers and free up goroutines and blocking queries; the new leader recreates the watchers without the quadratic growth and remaining under the rate limiter.  Well, until a spike of deployments are created triggering the condition again.

### Relevant Code References

Path for deployment monitoring:
* [`Watcher.watchDeployments`](abaa9c5c5b/nomad/deploymentwatcher/deployments_watcher.go (L164-L192)) loops waiting for deployment updates.
* On every deployment update, [`w.getDeploys`](abaa9c5c5b/nomad/deploymentwatcher/deployments_watcher.go (L194-L229)) returns all deployments in the system
* `watchDeployments` calls `w.add(d)` on every active deployment
* which in turns, [updates existing watcher if one is found](abaa9c5c5b/nomad/deploymentwatcher/deployments_watcher.go (L251-L255)).
* The deployment watcher [updates local local deployment field and trigger `deploymentUpdateCh` channel]( abaa9c5c5b/nomad/deploymentwatcher/deployment_watcher.go (L136-L147))
* The [deployment watcher `deploymentUpdateCh` selector is activated](abaa9c5c5b/nomad/deploymentwatcher/deployment_watcher.go (L455-L489)). Most of the time the selector clause is a no-op, because the flow was triggered due to another deployment update
* The `watch` for-loop iterates again and in the previous code we create yet another goroutine and blocking call that risks being rate limited.

Co-authored-by: Tim Gross <tgross@hashicorp.com>
2021-06-14 16:01:01 -04:00
James Rasell ff75b4da09
deploymentwatcher: fix test data race. 2021-06-11 11:55:21 +02:00
Tim Gross b764f52ab9
deploymentwatcher: reset progress deadline on promotion (#10042)
In a deployment with two groups (ex. A and B), if group A's canary becomes
healthy before group B's, the deadline for the overall deployment will be set
to that of group A. When the deployment is promoted, if group A is done it
will not contribute to the next deadline cutoff. Group B's old deadline will
be used instead, which will be in the past and immediately trigger a
deployment progress failure. Reset the progress deadline when the job is
promotion to avoid this bug, and to better conform with implicit user
expectations around how the progress deadline should interact with promotions.
2021-02-22 16:44:03 -05:00
Kris Hicks 0a3a748053
Add gosimple linter (#9590) 2020-12-09 11:05:18 -08:00
Drew Bailey 6c788fdccd
Events/msgtype cleanup (#9117)
* use msgtype in upsert node

adds message type to signature for upsert node, update tests, remove placeholder method

* UpsertAllocs msg type test setup

* use upsertallocs with msg type in signature

update test usage of delete node

delete placeholder msgtype method

* add msgtype to upsert evals signature, update test call sites with test setup msg type

handle snapshot upsert eval outside of FSM and ignore eval event

remove placeholder upsertevalsmsgtype

handle job plan rpc and prevent event creation for plan

msgtype cleanup upsertnodeevents

updatenodedrain msgtype

msg type 0 is a node registration event, so set the default  to the ignore type

* fix named import

* fix signature ordering on upsertnode to match
2020-10-19 09:30:15 -04:00
Michael Schurter dd09fa1a4a
Merge pull request #9055 from hashicorp/f-9017-resources
api: add field filters to /v1/{allocations,nodes}
2020-10-14 14:49:39 -07:00
Michael Schurter 8ccbd92cb6 api: add field filters to /v1/{allocations,nodes}
Fixes #9017

The ?resources=true query parameter includes resources in the object
stub listings. Specifically:

- For `/v1/nodes?resources=true` both the `NodeResources` and
  `ReservedResources` field are included.
- For `/v1/allocations?resources=true` the `AllocatedResources` field is
  included.

The ?task_states=false query parameter removes TaskStates from
/v1/allocations responses. (By default TaskStates are included.)
2020-10-14 10:35:22 -07:00
Drew Bailey df96b89958
Add EvictCallbackFn to handle removing entries from go-memdb when they
are removed from the event buffer.

Wire up event buffer size config, use pointers for structs.Events
instead of copying.
2020-10-14 12:44:42 -04:00
Drew Bailey 9d48818eb8
writetxn can return error, add alloc and job generic events. Add events
table for durability
2020-10-14 12:44:39 -04:00
Drew Bailey 400455d302
Events/eval alloc events (#9012)
* generic eval update event

first pass at alloc client update events

* api/event client
2020-10-14 12:44:37 -04:00
Drew Bailey 4793bb4e01
Events/deployment events (#9004)
* Node Drain events and Node Events (#8980)

Deployment status updates

handle deployment status updates (paused, failed, resume)

deployment alloc health

generate events from apply plan result

txn err check, slim down deployment event

one ndjson line per index

* consolidate down to node event + type

* fix UpdateDeploymentAllocHealth test invocations

* fix test
2020-10-14 12:44:37 -04:00
Tim Gross d3341a2019 refactor: make it clear where we're accessing dstate
The field name `Deployment.TaskGroups` contains a map of `DeploymentState`,
which makes it a little harder to follow state updates when combined with
inconsistent naming conventions, particularly when we also have the state
store or actual `TaskGroup`s in scope. This changeset changes all uses to
`dstate` so as not to be confused with actual TaskGroups.
2020-07-20 11:25:53 -04:00
Tim Gross c14a75bfab multiregion: use pending instead of paused
The `paused` state is used as an operator safety mechanism, so that they can
debug a deployment or halt one that's causing a wider failure. By using the
`paused` state as the first state of a multiregion deployment, we risked
resuming an intentionally operator-paused deployment because of activity in a
peer region.

This changeset replaces the use of the `paused` state with a `pending` state,
and provides a `Deployment.Run` internal RPC to replace the use of the
`Deployment.Pause` (resume) RPC we were using in `deploymentwatcher`.
2020-06-17 11:06:14 -04:00
Tim Gross fd50b12ee2 multiregion: integrate with deploymentwatcher
* `nextRegion` should take status parameter
* thread Deployment/Job RPCs thru `nextRegion`
* add `nextRegion` calls to `deploymentwatcher`
* use a better description for paused for peer
2020-06-17 11:06:00 -04:00
Tim Gross 48e9f75c1e multiregion: deploymentwatcher hooks
This changeset establishes hooks in deploymentwatcher for multiregion
deployments (for the enterprise version of Nomad).
2020-06-17 11:05:18 -04:00
Tim Gross 473a0f1d44 multiregion: unblock and cancel RPCs 2020-06-17 11:02:26 -04:00
Mahmood Ali 19141f8103 {volume|deployment}watcher: check for nil batcher 2020-05-26 14:54:27 -04:00
Mahmood Ali 81ac098a22 deploymentwatcher: no batcher when disabling
When disabling deploymentwatcher (at the end of a test), avoid starting a
new update batcher with its new goroutine.
2020-05-26 14:44:47 -04:00
Mahmood Ali 9d88f1d568 tests: deflake deploymentwatcher package
This deflake the tests in the deploymentwatcher package.  The package
uses a mock deployment watcher backend, where the watcher in a
background goroutine calls UpdateDeploymentStatus .  If the mock isn't
configured to expect the call, the background goroutine will fail.  One
UpdateDeploymentStatus call is made at the end of the background
goroutine, which may occur after the test completes, thus explaining the
flakiness.
2020-03-12 15:42:01 -04:00
Jasmine Dahilig 8d980edd2e
add create and modify timestamps to evaluations (#5881) 2019-08-07 09:50:35 -07:00
Mahmood Ali d30c3d10b0
Merge pull request #5747 from hashicorp/b-test-fixes-20190521-1
More test fixes
2019-06-05 19:09:18 -04:00
Lang Martin 0f6f543a5f deployment_watcher auto promote iff every task group is auto promotable 2019-05-22 12:34:57 -04:00
Lang Martin 0c668ecc7a log error on autoPromoteDeployment failure 2019-05-22 12:32:08 -04:00
Lang Martin b5fd735960 add update AutoPromote bool 2019-05-22 12:32:08 -04:00
Lang Martin 3c5a9fed22 deployments_watcher_test new TestWatcher_AutoPromoteDeployment 2019-05-22 12:32:08 -04:00
Lang Martin 0bebf5d7f8 deployment_watcher when it's ok to autopromote, do so 2019-05-22 12:32:08 -04:00
Lang Martin 0cf4168ed9 deployments_watcher comments 2019-05-22 12:32:08 -04:00
Lang Martin e1e28307be new deploymentwatcher/doc.go for package level documentation 2019-05-22 12:32:08 -04:00
Mahmood Ali 3b0152d778 tests: fix deploymentwatcher tests data races 2019-05-21 14:29:45 -04:00
Alex Dadgar be54e56570 review fixes 2018-11-08 09:48:36 -08:00
Alex Dadgar 1c31970464 Fix multiple tgs with progress deadline handling
Fix an issue in which the deployment watcher would fail the deployment
based on the earliest progress deadline of the deployment regardless of
if the task group has finished.

Further fix an issue where the blocked eval optimization would make it
so no evals were created to progress the deployment. To reproduce this
issue, prior to this commit, you can create a job with two task groups.
The first group has count 1 and resources such that it can not be
placed. The second group has count 3, max_parallel=1, and can be placed.
Run this first and then update the second group to do a deployment. It
will place the first of three, but never progress since there exists a
blocked eval. However, that doesn't capture the fact that there are two
groups being deployed.
2018-11-05 16:06:17 -08:00
Alex Dadgar de442226ae Fix other instances of blocking queries 2018-09-24 13:52:39 -07:00
Alex Dadgar 7f0d241ef4 always handle failed allocation 2018-09-21 15:13:54 -07:00
Alex Dadgar b2449ae1ce Fix deployment watcher index usage
Fixes three issues:
1. Retrieving the latest evaluation index was not properly selecting the
greatest index. This would undermine checks we had to reduce the number
of evaluations created when the latest eval index was greater than any
alloc change
2. Fix an issue where the blocking query code was using the incorrect
index such that the index was higher than necassary.
3. Special case handling of blocked evaluation since the create/snapshot
index is no particularly useful since they can be reblocked.
2018-09-21 13:59:11 -07:00
Alex Dadgar 3c19d01d7a server 2018-09-15 16:23:13 -07:00
Alex Dadgar c6576ddac1 Fix make check errors 2018-09-04 16:03:52 -07:00