Implement the new `nomad job restart` command that allows operators to
restart allocations tasks or reschedule then entire allocation.
Restarts can be batched to target multiple allocations in parallel.
Between each batch the command can stop and hold for a predefined time
or until the user confirms that the process should proceed.
This implements the "Stateless Restarts" alternative from the original
RFC
(https://gist.github.com/schmichael/e0b8b2ec1eb146301175fd87ddd46180).
The original concept is still worth implementing, as it allows this
functionality to be exposed over an API that can be consumed by the
Nomad UI and other clients. But the implementation turned out to be more
complex than we initially expected so we thought it would be better to
release a stateless CLI-based implementation first to gather feedback
and validate the restart behaviour.
Co-authored-by: Shishir Mahajan <smahajan@roblox.com>
The job evaluate endpoint creates a new evaluation for the job which is
a write operation. This change modifies the necessary capability from
`read-job` to `submit-job` to better reflect this.
Most job subcommands allow for job ID prefix match as a convenience
functionality so users don't have to type the full job ID.
But this introduces a hard ACL requirement that the token used to run
these commands have the `list-jobs` permission, even if the token has
enough permission to execute the basic command action and the user
passed an exact job ID.
This change softens this requirement by not failing the prefix match in
case the request results in a permission denied error and instead using
the information passed by the user directly.
* Extend variables under the nomad path prefix to allow for job-templates (#15570)
* Extend variables under the nomad path prefix to allow for job-templates
* Add job-templates to error message hinting
* RadioCard component for Job Templates (#15582)
* chore: add
* test: component API
* ui: component template
* refact: remove bc naming collission
* styles: remove SASS var causing conflicts
* Disallow specific variable at nomad/job-templates (#15681)
* Disallows variables at exactly nomad/job-templates
* idiomatic refactor
* Expanding nomad job init to accept a template flag (#15571)
* Adding a string flag for templates on job init
* data-down actions-up version of a custom template editor within variable
* Dont force grid on job template editor
* list-templates flag started
* Correctly slice from end of path name
* Pre-review cleanup
* Variable form acceptance test for job template editing
* Some review cleanup
* List Job templates test
* Example from template test
* Using must.assertions instead of require etc
* ui: add choose template button (#15596)
* ui: add new routes
* chore: update file directory
* ui: add choose template button
* test: button and page navigation
* refact: update var name
* ui: use `Button` component from `HDS` (#15607)
* ui: integrate buttons
* refact: remove helper
* ui: remove icons on non-tertiary buttons
* refact: update normalize method for key/value pairs (#15612)
* `revert`: `onCancel` for `JobDefinition`
The `onCancel` method isn't included in the component API for `JobEditor` and the primary cancel behavior exists outside of the component. With the exception of the `JobDefinition` page where we include this button in the top right of the component instead of next to the `Plan` button.
* style: increase button size
* style: keep lime green
* ui: select template (#15613)
* ui: deprecate unused component
* ui: deprecate tests
* ui: jobs.run.templates.index
* ui: update logic to handle templates
* refact: revert key/value changes
* style: padding for cards + buttons
* temp: fixtures for mirage testing
* Revert "refact: revert key/value changes"
This reverts commit 124e95d12140be38fc921f7e15243034092c4063.
* ui: guard template for unsaved job
* ui: handle reading template variable
* Revert "refact: update normalize method for key/value pairs (#15612)"
This reverts commit 6f5ffc9b610702aee7c47fbff742cc81f819ab74.
* revert: remove test fixtures
* revert: prettier problems
* refact: test doesnt need filter expression
* styling: button sizes and responsive cards
* refact: remove route guarding
* ui: update variable adapter
* refact: remove model editing behavior
* refact: model should query variables to populate editor
* ui: clear qp on exit
* refact: cleanup deprecated API
* refact: query all namespaces
* refact: deprecate action
* ui: rely on collection
* refact: patch deprecate transition API
* refact: patch test to expect namespace qp
* styling: padding, conditionals
* ui: flashMessage on 404
* test: update for o(n+1) query
* ui: create new job template (#15744)
* refact: remove unused code
* refact: add type safety
* test: select template flow
* test: add data-test attrs
* chore: remove dead code
* test: create new job flow
* ui: add create button
* ui: create job template
* refact: no need for wildcard
* refact: record instead of delete
* styling: spacing
* ui: add error handling and form validation to job create template (#15767)
* ui: handle server side errors
* ui: show error to prevent duplicate
* refact: conditional namespace
* ui: save as template flow (#15787)
* bug: patches failing tests associated with `pretender` (#15812)
* refact: update assertion
* refact: test set-up
* ui: job templates manager view (#15815)
* ui: manager list view
* test: edit flow
* refact: deprecate column-helper
* ui: template edit and delete flow (#15823)
* ui: manager list view
* refact: update title
* refact: update permissions
* ui: template edit page
* bug: typo
* refact: update toast messages
* bug: clear selections on exit (#15827)
* bug: clear controllers on exit
* test: mirage config changes (#15828)
* refact: deprecate column-helper
* style: update z-index for HDS
* Revert "style: update z-index for HDS"
This reverts commit d3d87ceab6d083f7164941587448607838944fc1.
* refact: update delete button
* refact: edit redirect
* refact: patch reactivity issues
* styling: fixed width
* refact: override defaults
* styling: edit text causing overflow
* styling: add inline text
Co-authored-by: Phil Renaud <phil.renaud@hashicorp.com>
* bug: edit `text` to `template`
Co-authored-by: Phil Renaud <phil.renaud@hashicorp.com>
Co-authored-by: Phil Renaud <phil.renaud@hashicorp.com>
* test: delete flow job templates (#15896)
* refact: edit names
* bug: set correct ref to store
* chore: trim whitespace:
* test: delete flow
* bug: reactively update view (#15904)
* Initialized default jobs (#15856)
* Initialized default jobs
* More jobs scaffolded
* Better commenting on a couple example job specs
* Adapter doing the work
* fall back to epic config
* Label format helper and custom serialization logic
* Test updates to account for a never-empty state
* Test suite uses settled and maintain RecordArray in adapter return
* Updates to hello-world and variables example jobspecs
* Parameterized job gets optional payload output
* Formatting changes for param and service discovery job templates
* Multi-group service discovery job
* Basic test for default templates (#15965)
* Basic test for default templates
* Percy snapshot for manage page
* Some late-breaking design changes
* Some copy edits to the header paragraphs for job templates (#15967)
* Added some init options for job templates (#15994)
* Async method for populating default job templates from the variable adapter
---------
Co-authored-by: Jai <41024828+ChaiWithJai@users.noreply.github.com>
These options are mutually exclusive but, since `-hcl2-strict` defaults
to `true` users had to explicitily set it to `false` when using `-hcl1`.
Also return `255` when job plan fails validation as this is the expected
code in this situation.
Closes#12927Closes#12958
This PR updates the version of redis used in our examples from 3.2 to 7.
The old version is very not supported anymore, and we should be setting
a good example by using a supported version.
The long-form example job is now fixed so that the service stanza uses
nomad as the service discovery provider, and so now the job runs without
a requirement of having Consul running and configured.
* docs: update json jobs docs
Did you know that Nomad has not 1 but 2 JSON formats for jobs? 2½ if you
want to acknowledge that sometimes our JSON job representations have a
Job top-level wrapper and sometimes do not.
The 2½ formats are:
```
1. HCL JSON
2. Input API JSON (top-level Job field)
2.5. Output API JSON (lacks top-level Job field)
```
`#2` is what our docs consider our API JSON. `#2.5` seems to be an
accident of history we can't fix with breaking API compatibility.
`#1` is an even more interesting accident of history: the `jobspec2`
package automatically detects if the input to Parse is JSON and switches
to a JSON parser. This behavior is undocumented, the format is
unspecified, and there is no official HashiCorp tooling to produce this
JSON from HCL. The plot thickens when you discover popular third party
tools like hcl2json.com and https://github.com/tmccombs/hcl2json seem to
produce JSON that `nomad run` accepts!
Since we have no telemetry around whether or not anyone passes HCL JSON
to `nomad run`, and people don't file bugs around features that Just
Work, I'm choosing to leave that code path in place and *acknowledged
but not suggested* in documentation.
See https://github.com/hashicorp/hcl/issues/498 for a more comprehensive
discussion of what officially supporting HCL JSON in Nomad would look
like.
(I also added some of the missing fields to the (Input API flavor) JSON
Job documentation, but it still needs a lot of work to be
comprehensive.)
Co-authored-by: Tim Gross <tgross@hashicorp.com>
After a more detailed analysis of this feature, the approach taken in
PR #12449 was found to be not ideal due to poor UX (users are
responsible for setting the entity alias they would like to use) and
issues around jobs potentially masquerading itself as another Vault
entity.
* cli: add -json flag to support job commands
While the CLI has always supported running JSON jobs, its support has
been via HCLv2's JSON parsing. I have no idea what format it expects the
job to be in, but it's absolutely not in the same format as the API
expects.
So I ignored that and added a new -json flag to explicitly support *API*
style JSON jobspecs.
The jobspecs can even have the wrapping {"Job": {...}} envelope or not!
* docs: fix example for `nomad job validate`
We haven't been able to validate inside driver config stanzas ever since
the move to task driver plugins. 😭
Some operators use very long group/task `shutdown_delay` settings to
safely drain network connections to their workloads after service
deregistration. But during incident response, they may want to cause
that drain to be skipped so they can quickly shed load.
Provide a `-no-shutdown-delay` flag on the `nomad alloc stop` and
`nomad job stop` commands that bypasses the delay. This sets a new
desired transition state on the affected allocations that the
allocation/task runner will identify during pre-kill on the client.
Note (as documented here) that using this flag will almost always
result in failed inbound network connections for workloads as the
tasks will exit before clients receive updated service discovery
information and won't be gracefully drained.
This change modifies the Nomad job register and deregister RPCs to
accept an updated option set which includes eval priority. This
param is optional and override the use of the job priority to set
the eval priority.
In order to ensure all evaluations as a result of the request use
the same eval priority, the priority is shared to the
allocReconciler and deploymentWatcher. This creates a new
distinction between eval priority and job priority.
The Nomad agent HTTP API has been modified to allow setting the
eval priority on job update and delete. To keep consistency with
the current v1 API, job update accepts this as a payload param;
job delete accepts this as a query param.
Any user supplied value is validated within the agent HTTP handler
removing the need to pass invalid requests to the server.
The register and deregister opts functions now all for setting
the eval priority on requests.
The change includes a small change to the DeregisterOpts function
which handles nil opts. This brings the function inline with the
RegisterOpts.
Otherwise the spinner would just end, which felt a bit awkward.
I wanted to see a "✓" to know that everything was ok, and a "!" (maybe something else?) if something went wrong.
System and batch jobs don't create deployments, which means nomad tries
to monitor a non-existent deployment when it runs a job and outputs an
error message. This adds a check to make sure a deployment exists before
monitoring. Also fixes some formatting.
Co-authored-by: Tim Gross <tgross@hashicorp.com>
This PR adds the common OSS changes for adding support for Consul Namespaces,
which is going to be a Nomad Enterprise feature. There is no new functionality
provided by this changeset and hopefully no new bugs.