This fixes a bug where Nomad overrides a Dockerfile's STOPSIGNAL with
the default kill_signal (SIGTERM).
This adds a check for kill_signal. If it's not set, it calls
StopContainer instead of Signal, which uses STOPSIGNAL if it's
specified. If both kill_signal and STOPSIGNAL are set, Nomad tries to
stop the container with kill_signal first, before then calling
StopContainer.
Fixes#9989
Follow up to memory oversubscription - expose an env-var to indicate when memory oversubscription is enabled and what the limit is.
This will be helpful for setting hints to app for memory management.
Co-authored-by: Seth Hoenig <shoenig@hashicorp.com>
This commit ensures Nomad captures the task code more reliably even when the task is killed. This issue affect to `raw_exec` driver, as noted in https://github.com/hashicorp/nomad/issues/10430 .
We fix this issue by ensuring that the TaskRunner only calls `driver.WaitTask` once. The TaskRunner monitors the completion of the task by calling `driver.WaitTask` which should return the task exit code on completion. However, it also could return a "context canceled" error if the agent/executor is shutdown.
Previously, when a task is to be stopped, the killTask path makes two WaitTask calls, and the second returns "context canceled" occasionally because of a "race" in task shutting down and depending on driver, and how fast it shuts down after task completes.
By having a single WaitTask call and consistently waiting for the task, we ensure we capture the exit code reliably before the executor is shutdown or the contexts expired.
I opted to change the TaskRunner implementation to avoid changing the driver interface or requiring 3rd party drivers to update.
Additionally, the PR ensures that attempts to kill the task terminate when the task "naturally" dies. Without this change, if the task dies at the right moment, the `killTask` call may retry to kill an already-dead task for up to 5 minutes before giving up.
The error returned from the stdlib's `exec` package is always a message with
the exit code of the exec'd process, not any error message that process might
have given us. This results in opaque failures for the Nomad user. Cast to an
`ExitError` so that we can access the output from stderr.
This changeset expands on the existing demonstration we had for Ceph by
showing volume creation. It includes a demo setup for Ceph on Vagrant so that
you don't need a whole Ceph cluster to try it out.
This PR uses the checksum of the check for which a dynamic exposed
port is being generated (instead of a UUID prefix) so that the
generated port label is deterministic.
This fixes 2 bugs:
- 'job plan' output is now idempotent for jobs making use of injected ports
- tasks will no longer be destructively updated when jobs making use of
injected ports are re-run without changing any user specified part of
job config.
Closes: https://github.com/hashicorp/nomad/issues/10099
Cluster operators want to have better control over memory
oversubscription and may want to enable/disable it based on their
experience.
This PR adds a scheduler configuration field to control memory
oversubscription. It's additional field that can be set in the [API via Scheduler Config](https://www.nomadproject.io/api-docs/operator/scheduler), or [the agent server config](https://www.nomadproject.io/docs/configuration/server#configuring-scheduler-config).
I opted to have the memory oversubscription be an opt-in, but happy to change it. To enable it, operators should call the API with:
```json
{
"MemoryOversubscriptionEnabled": true
}
```
If memory oversubscription is disabled, submitting jobs specifying `memory_max` will get a "Memory oversubscription is not
enabled" warnings, but the jobs will be accepted without them accessing
the additional memory.
The warning message is like:
```
$ nomad job run /tmp/j
Job Warnings:
1 warning(s):
* Memory oversubscription is not enabled; Task cache.redis memory_max value will be ignored
==> Monitoring evaluation "7c444157"
Evaluation triggered by job "example"
==> Monitoring evaluation "7c444157"
Evaluation within deployment: "9d826f13"
Allocation "aa5c3cad" created: node "9272088e", group "cache"
Evaluation status changed: "pending" -> "complete"
==> Evaluation "7c444157" finished with status "complete"
# then you can examine the Alloc AllocatedResources to validate whether the task is allowed to exceed memory:
$ nomad alloc status -json aa5c3cad | jq '.AllocatedResources.Tasks["redis"].Memory'
{
"MemoryMB": 256,
"MemoryMaxMB": 0
}
```
This rethinks namespaces as a filter on list pages rather than a global setting.
The biggest net-new feature here is being able to select All (*) to list all jobs
or CSI volumes across namespaces.
This is the first step in #10268. If a maximum is not specified, the
task group sum uses the memory number instead. The maximum is only
shown when it’s higher than the memory sum.
This updates the UI to use the new fuzzy search API. It’s a drop-in
replacement so the / shortcut to jump to search is preserved, and
results can be cycled through and chosen via arrow keys and the
enter key.
It doesn’t use everything returned by the API:
* deployments and evaluations: these match by id, doesn’t seem like
people would know those or benefit from quick navigation to them
* namespaces: doesn’t seem useful as they currently function
* scaling policies
* tasks: the response doesn’t include an allocation id, which means they
can’t be navigated to in the UI without an additional query
* CSI volumes: aren’t actually returned by the API
Since there’s no API to check the server configuration and know whether
the feature has been disabled, this adds another query in
route:application#beforeModel that acts as feature detection: if the
attempt to query fails (500), the global search field is hidden.
Upon having added another query on load, I realised that beforeModel was
being triggered any time service:router#transitionTo was being called,
which happens upon navigating to a search result, for instance, because
of refreshModel being present on the region query parameter. This PR
adds a check for transition.queryParamsOnly and skips rerunning the
onload queries (token permissions check, license check, fuzzy search
feature detection).
Implementation notes:
* there are changes to unrelated tests to ignore the on-load feature
detection query
* some lifecycle-related guards against undefined were required to
address failures when navigating to an allocation
* the minimum search length of 2 characters is hard-coded as there’s
currently no way to determine min_term_length in the UI
The `Terminates At` field can't be removed from the struct for backwards
compatibility reasons, but there's no purpose to it anymore so we shouldn't be
showing it to end users of the command.