Use the same output format when listing multiple evals in the `eval
list` command and when `eval status <prefix>` matches more than one
eval.
Include the eval namespace in all output formats and always include the
job ID in `eval status` since, even `node-update` evals are related to a
job.
Add Node ID to the evals table output to help differentiate
`node-update` evals.
Co-authored-by: James Rasell <jrasell@hashicorp.com>
Whenever a node joins the cluster, either for the first time or after
being `down`, we emit a evaluation for every system job to ensure all
applicable system jobs are running on the node.
This patch adds an optimization to skip creating evaluations for system
jobs not in the current node's DC. While the scheduler performs the same
feasability check, skipping the creation of the evaluation altogether
saves disk, network, and memory.
This PR fixes a bug where client configuration max_kill_timeout was
not being enforced. The feature was introduced in 9f44780 but seems
to have been removed during the major drivers refactoring.
We can make sure the value is enforced by pluming it through the DriverHandler,
which now uses the lesser of the task.killTimeout or client.maxKillTimeout.
Also updates Event.SetKillTimeout to require both the task.killTimeout and
client.maxKillTimeout so that we don't make the mistake of using the wrong
value - as it was being given only the task.killTimeout before.
api: apply new ACL check for wildcard namespace
In #13606 the ACL check was refactored to better support the all
namespaces wildcard (`*`). This commit applies the changes to the jobs
and alloc list endpoints.
Improve how the all namespaces wildcard (`*`) is handled when checking
ACL permissions. When using the wildcard namespace the `AllowNsOp` would
return false since it looks for a namespace called `*` to match.
This commit changes this behavior to return `true` when the queried
namespace is `*` and the token allows the operation in _any_ namespace.
Actual permission must be checked per object. The helper function
`AllowNsOpFunc` returns a function that can be used to make this
verification.
The `operator debug` command doesn't output the leader anywhere in the
output, which adds extra burden to offline debugging (away from an
ongoing incident where you can simply check manually). Query the
`/v1/status/leader` API but degrade gracefully.
* core: allow pause/un-pause of eval broker on region leader.
* agent: add ability to pause eval broker via scheduler config.
* cli: add operator scheduler commands to interact with config.
* api: add ability to pause eval broker via scheduler config
* e2e: add operator scheduler test for eval broker pause.
* docs: include new opertor scheduler CLI and pause eval API info.
This PR adds 2 helper functions to the helpers/bbolt package
- Iterate: iterate every key in a bucket. Automatically decodes the
msg pack value into the provided value argument.
- DeletePrefix: deletes every key in a bucket starting with a
given prefix. Manages the wrapper's hash values accordingly. Uses
a cursor & sync to operate efficiently.
It appears way back when this was first implemented in
9a917281af9c0a97a6c59575eaa52c5c86ffc60d, it was renamed from
NodeEvict (with a correct comment) to NodeUpdate. The comment was
changed from referring to only evictions to referring to "all allocs" in
the first sentence and "stop or evict" in the second.
This confuses every time I see it because I read the name (NodeUpdate)
and first sentence ("all the allocs") and assume this represents *all*
allocations... which isn't true.
I'm going to assume I'm the only one who doesn't read the 2nd sentence
and that's why this suboptimal wording has lasted 7 years, but can we
change it for my sake?
- Use latest nomad 1.3.1 and consul 1.12.2
- Use latest Ubuntu LTS (easier to get podman plugin working, 18.04 does not have podman in main repositories).
- Add increased memory for libvirt to match other VM providers
When calculating a services namespace for registration, the code
assumed the first task within the task array would include a
service block. This is incorrect as it is possible only a latter
task within the array contains a service definition.
This change fixes the logic, so we correctly search for a service
definition before identifying the namespace.
This PR adds the 'choose' query parameter to the '/v1/service/<service>' endpoint.
The value of 'choose' is in the form '<number>|<key>', number is the number
of desired services and key is a value unique but consistent to the requester
(e.g. allocID).
Folks aren't really expected to use this API directly, but rather through consul-template
which will soon be getting a new helper function making use of this query parameter.
Example,
curl 'localhost:4646/v1/service/redis?choose=2|abc123'
Note: consul-templte v0.29.1 includes the necessary nomadServices functionality.