The iowait metric obtained from `/proc/stat` can under some circumstances
decrease. The relevant condition is when an interrupt arrives on a different
core than the one that gets woken up for the IO, and a particular counter in the
kernel for that core gets interrupted. This is documented in the man page for
the `proc(5)` pseudo-filesystem, and considered an unfortunate behavior that
can't be changed for the sake of ABI compatibility.
In Nomad, we get the current "busy" time (everything except for idle) and
compare it to the previous busy time to get the counter incremeent. If the
iowait counter decreases and the idle counter increases more than the increase
in the total busy time, we can get a negative total. This previously caused a
panic in our metrics collection (see #15861) but that is being prevented by
reporting an error message.
Fix the bug by putting a zero floor on the values we return from the host CPU
stats calculator.
Backport-of: #18835
this can save a bit of cpu when
running plans for tasks that already exist,
and prevents Nomad tokens from changing,
which can cause task template{}s to restart
unnecessarily.
When waiting on a previous alloc we must query against the leader before
switching to a stale query with index set.
Also check to ensure the response is fresh before using it like #18269
Similar to #18269, it is possible that even if Node.GetClientAllocs
retrieves fresh allocs that the subsequent Alloc.GetAllocs call
retrieves stale allocs. While `diffAlloc(existing, updated)` properly
ignores stale alloc *updates*, alloc deletions have no such check.
So if a client retrieves an alloc created at index 123, and then a
subsequent Alloc.GetAllocs call hits a new server which returns results
at index 100, the client will stop the alloc created at 123 because it
will be missing from the stale response.
This change applies the same logic as #18269 and ensures only fresh
responses are used.
Glossary:
* fresh - modified at an index > the query index
* stale - modified at an index <= the query index
* Rename pages to include roles
* Models and adapters
* [ui] Any policy checks in the UI now check for roles' policies as well as token policies (#18346)
* combinedPolicies as a concept
* Classic decorator on role adapter
* We added a new request for roles, so the test based on a specific order of requests got fickle fast
* Mirage roles cluster scaffolded
* Acceptance test for roles and policies on the login page
* Update mirage mock for nodes fetch to account for role policies / empty token.policies
* Roles-derived policies checks
* [ui] Access Control with Roles and Tokens (#18413)
* top level policies routes moved into access control
* A few more routes and name cleanup
* Delog and test fixes to account for new url prefix and document titles
* Overview page
* Tokens and Roles routes
* Tokens helios table
* Add a role
* Hacky role page and deletion
* New policy keyboard shortcut and roles breadcrumb nav
* If you leave New Role but havent made any changes, remove the newly-created record from store
* Roles index list and general role route crud
* Roles index actually links to roles now
* Helios button styles for new roles and policies
* Handle when you try to create a new role without having any policies
* Token editing generally
* Create Token functionality
* Cant delete self-token but management token editing and deleting is fine
* Upgrading helios caused codemirror to explode, shimmed
* Policies table fix
* without bang-element condition, modifier would refire over and over
* Token TTL or Time setting
* time will take you on
* Mirage hooks for create and list roles
* Ensure policy names only use allow characters in mirage mocks
* Mirage mocked roles and policies in the default cluster
* log and lintfix
* chromedriver to 2.1.2
* unused unit tests removed
* Nice profile dropdown
* With the HDS accordion, rename our internal component scss ref
* design revisions after discussion
* Tooltip on deleted-policy tokens
* Two-step button peripheral isDeleting gcode removed
* Never to null on token save
* copywrite headers added and empty routefiles removed
* acceptance test fixes for policies endpoint
* Route for updating a token
* Policies testfixes
* Ember on-click-outside modifier upgraded with general ember-modifier upgrade
* Test adjustments to account for new profile header dropdown
* Test adjustments for tokens via policy pages
* Removed an unused route
* Access Control index page tests
* a11y tests
* Tokens index acceptance tests generally
* Lintfix
* Token edit page tests
* Token editing tests
* New token expiration tests
* Roles Index tests
* Role editing policies tests
* A complete set of Access Control Roles tests
* Policies test
* Be more specific about which row to check for expiration time
* Nil check on expirationTime equality
* Management tokens shouldnt show No Roles/Policies, give them their own designation
* Route guard on selftoken, conditional columns, and afterModel at parent to prevent orphaned policies on tokens/roles from stopping a new save
* Policy unloading on delete and other todos plus autofocus conditionally re-enabled
* Invalid policies non-links now a concept for Roles index
* HDS style links to make job.variables.alert links look like links again
* Mirage finding looks weird so making model async in hash even though redundant
* Drop rsvp
* RSVP wasnt the problem, cached lookups were
* remove old todo comments
* de-log