This PR adds 2 helper functions to the helpers/bbolt package
- Iterate: iterate every key in a bucket. Automatically decodes the
msg pack value into the provided value argument.
- DeletePrefix: deletes every key in a bucket starting with a
given prefix. Manages the wrapper's hash values accordingly. Uses
a cursor & sync to operate efficiently.
It appears way back when this was first implemented in
9a917281af9c0a97a6c59575eaa52c5c86ffc60d, it was renamed from
NodeEvict (with a correct comment) to NodeUpdate. The comment was
changed from referring to only evictions to referring to "all allocs" in
the first sentence and "stop or evict" in the second.
This confuses every time I see it because I read the name (NodeUpdate)
and first sentence ("all the allocs") and assume this represents *all*
allocations... which isn't true.
I'm going to assume I'm the only one who doesn't read the 2nd sentence
and that's why this suboptimal wording has lasted 7 years, but can we
change it for my sake?
- Use latest nomad 1.3.1 and consul 1.12.2
- Use latest Ubuntu LTS (easier to get podman plugin working, 18.04 does not have podman in main repositories).
- Add increased memory for libvirt to match other VM providers
When calculating a services namespace for registration, the code
assumed the first task within the task array would include a
service block. This is incorrect as it is possible only a latter
task within the array contains a service definition.
This change fixes the logic, so we correctly search for a service
definition before identifying the namespace.
This PR adds the 'choose' query parameter to the '/v1/service/<service>' endpoint.
The value of 'choose' is in the form '<number>|<key>', number is the number
of desired services and key is a value unique but consistent to the requester
(e.g. allocID).
Folks aren't really expected to use this API directly, but rather through consul-template
which will soon be getting a new helper function making use of this query parameter.
Example,
curl 'localhost:4646/v1/service/redis?choose=2|abc123'
Note: consul-templte v0.29.1 includes the necessary nomadServices functionality.
The plan applier has to get a snapshot with a minimum index for the
plan it's working on in order to ensure consistency. Under heavy raft
loads, we can exceed the timeout. When this happens, we hit a bug
where the plan applier blocks waiting on the `indexCh` forever, and
all schedulers will block in `Plan.Submit`.
Closing the `indexCh` when the `asyncPlanWait` is done with it will
prevent the deadlock without impacting correctness of the previous
snapshot index.
This changeset includes the a PoC failing test that works by injecting
a large timeout into the state store. We need to turn this into a test
we can run normally without breaking the state store before we can
merge this PR.
Increase `snapshotMinIndex` timeout to 10s.
This timeout creates backpressure where any concurrent `Plan.Submit`
RPCs will block waiting for results. This sheds load across all
servers and gives raft some CPU to catch up, because schedulers won't
dequeue more work while waiting. Increase it to 10s based on
observations of large production clusters.
The changelog entry for #13340 indicated it was an improvement. But on
discussion, it was determined that this was a workaround for a
regression. Update the changelog to make this clear.
In addition to jobs, there are other objects in Nomad that have a
specific format and can be provided to commands and API endpoints.
This commit creates a new menu section to hold the specification for
volumes and update the command pages to point to the new centralized
definition.
Redirecting the previous entries is not possible with `redirect.js`
because they are done server-side and URL fragments are not accessible
to detect a match. So we provide hidden anchors with a link to the new
page to guide users towards the new documentation.
Co-authored-by: Tim Gross <tgross@hashicorp.com>