open-nomad/nomad
Tim Gross a51149736d
Rename nomad.broker.total_blocked metric (#15835)
This changeset fixes a long-standing point of confusion in metrics emitted by
the eval broker. The eval broker has a queue of "blocked" evals that are waiting
for an in-flight ("unacked") eval of the same job to be completed. But this
"blocked" state is not the same as the `blocked` status that we write to raft
and expose in the Nomad API to end users. There's a second metric
`nomad.blocked_eval.total_blocked` that refers to evaluations in that
state. This has caused ongoing confusion in major customer incidents and even in
our own documentation! (Fixed in this PR.)

There's little functional change in this PR aside from the name of the metric
emitted, but there's a bit of refactoring to clean up the names in `eval_broker.go`
so that there aren't name collisions and multiple names for the same
state. Changes included are:
* Everything that was previously called "pending" referred to entities that were
  associated with the "ready" metric. These are all now called "ready" to match
  the metric.
* Everything named "blocked" in `eval_broker.go` is now named "pending", except
  for a couple of comments that actually refer to blocked RPCs.
* Added a note to the upgrade guide docs for 1.5.0.
* Fixed the scheduling performance metrics docs because the description for
  `nomad.broker.total_blocked` was actually the description for
  `nomad.blocked_eval.total_blocked`.
2023-01-20 14:23:56 -05:00
..
deploymentwatcher cleanup: replace TypeToPtr helper methods with pointer.Of (#14151) 2022-08-17 18:26:34 +02:00
drainer cleanup: replace TypeToPtr helper methods with pointer.Of (#14151) 2022-08-17 18:26:34 +02:00
mock rpc: add OIDC login related endpoints. 2023-01-13 13:14:29 +00:00
state bugfix: unit test for GetACLBindingRules (#15583) 2022-12-20 15:06:09 +01:00
stream event stream: ensure token expiry is correctly checked for subs. 2022-10-27 13:08:05 -04:00
structs consul: correctly interpret missing consul checks as unhealthy (#15822) 2023-01-19 14:01:12 -06:00
volumewatcher volumewatcher: prevent panic on nil volume (#15101) 2022-11-01 16:53:10 -04:00
acl.go Authenticate method improvements (#15734) 2023-01-10 09:46:38 -05:00
acl_endpoint.go Merge branch 'main' into sso/gh-13120-oidc-login 2023-01-18 10:05:31 +00:00
acl_endpoint_test.go rpc: add OIDC login related endpoints. 2023-01-13 13:14:29 +00:00
acl_test.go Authenticate method improvements (#15734) 2023-01-10 09:46:38 -05:00
alloc_endpoint.go provide RPCContext to all RPC handlers (#15430) 2022-12-01 10:05:15 -05:00
alloc_endpoint_test.go cleanup: replace TypeToPtr helper methods with pointer.Of (#14151) 2022-08-17 18:26:34 +02:00
autopilot.go autopilot: include only servers from the same region (#15290) 2022-11-17 12:09:36 -05:00
autopilot_oss.go migrate autopilot implementation to raft-autopilot (#14441) 2022-09-01 14:27:10 -04:00
autopilot_test.go autopilot: include only servers from the same region (#15290) 2022-11-17 12:09:36 -05:00
blocked_evals.go cleanup: remove more copies of min/max from helper 2022-08-24 09:56:15 -05:00
blocked_evals_stats.go
blocked_evals_stats_test.go
blocked_evals_system.go
blocked_evals_test.go
client_agent_endpoint.go provide RPCContext to all RPC handlers (#15430) 2022-12-01 10:05:15 -05:00
client_agent_endpoint_test.go Data race fixes in tests and a new semgrep rule (#14594) 2022-09-15 10:35:08 -07:00
client_alloc_endpoint.go provide RPCContext to all RPC handlers (#15430) 2022-12-01 10:05:15 -05:00
client_alloc_endpoint_test.go Data race fixes in tests and a new semgrep rule (#14594) 2022-09-15 10:35:08 -07:00
client_csi_endpoint.go provide RPCContext to all RPC handlers (#15430) 2022-12-01 10:05:15 -05:00
client_csi_endpoint_test.go remove most static RPC handlers (#15451) 2022-12-02 10:12:05 -05:00
client_fs_endpoint.go provide RPCContext to all RPC handlers (#15430) 2022-12-01 10:05:15 -05:00
client_fs_endpoint_test.go
client_rpc.go
client_rpc_test.go
client_stats_endpoint.go provide RPCContext to all RPC handlers (#15430) 2022-12-01 10:05:15 -05:00
client_stats_endpoint_test.go
config.go sso: add ACL auth-method HTTP API CRUD endpoints (#15338) 2022-11-23 09:38:02 +01:00
consul.go consul: Removed unused ConsulUsage.Kinds. (#11303) 2022-09-22 10:07:14 -05:00
consul_oss_test.go consul: Removed unused ConsulUsage.Kinds. (#11303) 2022-09-22 10:07:14 -05:00
consul_policy.go
consul_policy_oss_test.go
consul_policy_test.go
consul_test.go
core_sched.go variables: limit rekey eval to half the nack timeout (#15102) 2022-11-01 16:50:50 -04:00
core_sched_test.go keyring: safely handle missing keys and restore GC (#15092) 2022-11-01 15:00:50 -04:00
csi_endpoint.go provide RPCContext to all RPC handlers (#15430) 2022-12-01 10:05:15 -05:00
csi_endpoint_test.go remove most static RPC handlers (#15451) 2022-12-02 10:12:05 -05:00
deployment_endpoint.go provide RPCContext to all RPC handlers (#15430) 2022-12-01 10:05:15 -05:00
deployment_endpoint_test.go cleanup: replace TypeToPtr helper methods with pointer.Of (#14151) 2022-08-17 18:26:34 +02:00
deployment_watcher_shims.go
drainer_int_test.go cleanup: replace TypeToPtr helper methods with pointer.Of (#14151) 2022-08-17 18:26:34 +02:00
drainer_shims.go
encrypter.go keyring: update handle to state inside replication loop (#15227) 2022-11-17 08:40:12 -05:00
encrypter_test.go keyring: update handle to state inside replication loop (#15227) 2022-11-17 08:40:12 -05:00
endpoints_oss.go provide RPCContext to all RPC handlers (#15430) 2022-12-01 10:05:15 -05:00
eval_broker.go Rename nomad.broker.total_blocked metric (#15835) 2023-01-20 14:23:56 -05:00
eval_broker_test.go Rename nomad.broker.total_blocked metric (#15835) 2023-01-20 14:23:56 -05:00
eval_endpoint.go provide RPCContext to all RPC handlers (#15430) 2022-12-01 10:05:15 -05:00
eval_endpoint_test.go eval delete: move batching of deletes into RPC handler and state (#15117) 2022-11-14 14:08:13 -05:00
event_endpoint.go provide RPCContext to all RPC handlers (#15430) 2022-12-01 10:05:15 -05:00
event_endpoint_test.go event stream: ensure token expiry is correctly checked for subs. 2022-10-27 13:08:05 -04:00
fsm.go acl: add binding rule object state schema and functionality. (#15511) 2022-12-14 08:48:18 +01:00
fsm_oss.go
fsm_registry_oss.go
fsm_test.go acl: add binding rule object state schema and functionality. (#15511) 2022-12-14 08:48:18 +01:00
heartbeat.go remove most static RPC handlers (#15451) 2022-12-02 10:12:05 -05:00
heartbeat_test.go cleanup: replace TypeToPtr helper methods with pointer.Of (#14151) 2022-08-17 18:26:34 +02:00
job_endpoint.go provide RPCContext to all RPC handlers (#15430) 2022-12-01 10:05:15 -05:00
job_endpoint_hook_connect.go cleanup more helper updates (#14638) 2022-09-21 14:53:25 -05:00
job_endpoint_hook_connect_test.go provide RPCContext to all RPC handlers (#15430) 2022-12-01 10:05:15 -05:00
job_endpoint_hook_expose_check.go
job_endpoint_hook_expose_check_test.go
job_endpoint_hook_vault.go cleanup more helper updates (#14638) 2022-09-21 14:53:25 -05:00
job_endpoint_hook_vault_oss.go
job_endpoint_hooks.go servicedisco: implicit constraint for nomad v1.4 when using nsd checks (#14868) 2022-10-11 08:21:42 -05:00
job_endpoint_hooks_test.go servicedisco: implicit constraint for nomad v1.4 when using nsd checks (#14868) 2022-10-11 08:21:42 -05:00
job_endpoint_oss.go scheduler: create placements for non-register MRD (#15325) 2022-11-25 12:45:34 -05:00
job_endpoint_oss_test.go cleanup: replace TypeToPtr helper methods with pointer.Of (#14151) 2022-08-17 18:26:34 +02:00
job_endpoint_test.go [ui] Adds meta to job list stub and displays a pack logo on the jobs index (#14833) 2022-11-02 16:58:24 -04:00
job_endpoint_validators.go
job_endpoint_validators_test.go allocrunner: refactor task coordinator (#14009) 2022-08-22 18:38:49 -04:00
keyring_endpoint.go provide RPCContext to all RPC handlers (#15430) 2022-12-01 10:05:15 -05:00
keyring_endpoint_test.go
leader.go core: add ACL binding rule to replication system. (#15555) 2022-12-16 09:08:00 +01:00
leader_oss.go
leader_test.go cleanup: remove usage of consul/sdk/testutil/retry (#15609) 2023-01-02 08:06:20 -06:00
merge.go
namespace_endpoint.go Authenticate method improvements (#15734) 2023-01-10 09:46:38 -05:00
namespace_endpoint_test.go
node_endpoint.go provide RPCContext to all RPC handlers (#15430) 2022-12-01 10:05:15 -05:00
node_endpoint_test.go remove most static RPC handlers (#15451) 2022-12-02 10:12:05 -05:00
operator_endpoint.go provide RPCContext to all RPC handlers (#15430) 2022-12-01 10:05:15 -05:00
operator_endpoint_test.go ci: swap freeport for portal in packages (#15661) 2023-01-03 11:25:20 -06:00
periodic.go make version checks specific to region (1.4.x) (#14912) 2022-10-17 16:23:51 -04:00
periodic_endpoint.go provide RPCContext to all RPC handlers (#15430) 2022-12-01 10:05:15 -05:00
periodic_endpoint_test.go
periodic_test.go
plan_apply.go keyring: safely handle missing keys and restore GC (#15092) 2022-11-01 15:00:50 -04:00
plan_apply_node_tracker.go
plan_apply_node_tracker_test.go
plan_apply_oss.go
plan_apply_pool.go
plan_apply_pool_test.go
plan_apply_test.go fix panic from keyring raft entries being written during upgrade (#14821) 2022-10-06 12:47:02 -04:00
plan_endpoint.go provide RPCContext to all RPC handlers (#15430) 2022-12-01 10:05:15 -05:00
plan_endpoint_test.go
plan_normalization_test.go
plan_queue.go Add missing timer reset (#15134) 2022-11-03 18:57:57 -04:00
plan_queue_test.go
raft_rpc.go
regions_endpoint.go provide RPCContext to all RPC handlers (#15430) 2022-12-01 10:05:15 -05:00
regions_endpoint_test.go
rpc.go
rpc_test.go Pre forwarding authentication (#15417) 2022-12-06 14:44:03 -05:00
scaling_endpoint.go provide RPCContext to all RPC handlers (#15430) 2022-12-01 10:05:15 -05:00
scaling_endpoint_test.go
search_endpoint.go provide RPCContext to all RPC handlers (#15430) 2022-12-01 10:05:15 -05:00
search_endpoint_oss.go rename SecureVariables to Variables throughout 2022-08-26 16:06:24 -04:00
search_endpoint_test.go
serf.go migrate autopilot implementation to raft-autopilot (#14441) 2022-09-01 14:27:10 -04:00
serf_test.go ci: fix TestNomad_BootstrapExpect_NonVoter test (#14407) 2022-08-30 16:32:54 -04:00
server.go Add raft snapshot configuration options (#15522) 2023-01-20 14:21:51 -05:00
server_setup.go core: move LicenseConfig to shared file (#14247) 2022-08-23 13:44:10 -07:00
server_setup_oss.go migrate autopilot implementation to raft-autopilot (#14441) 2022-09-01 14:27:10 -04:00
server_test.go Add raft snapshot configuration options (#15522) 2023-01-20 14:21:51 -05:00
service_registration_endpoint.go provide RPCContext to all RPC handlers (#15430) 2022-12-01 10:05:15 -05:00
service_registration_endpoint_test.go deps: update set and test (#14680) 2022-09-26 08:28:03 -05:00
stats_fetcher.go migrate autopilot implementation to raft-autopilot (#14441) 2022-09-01 14:27:10 -04:00
stats_fetcher_test.go test: fix concurrent map access in TestStatsFetcher (#14496) 2022-09-08 10:41:15 -04:00
status_endpoint.go provide RPCContext to all RPC handlers (#15430) 2022-12-01 10:05:15 -05:00
status_endpoint_test.go
system_endpoint.go provide RPCContext to all RPC handlers (#15430) 2022-12-01 10:05:15 -05:00
system_endpoint_test.go
testing.go ci: swap freeport for portal in packages (#15661) 2023-01-03 11:25:20 -06:00
testing_oss.go
timetable.go
timetable_test.go
util.go make version checks specific to region (1.4.x) (#14912) 2022-10-17 16:23:51 -04:00
util_test.go make version checks specific to region (1.4.x) (#14912) 2022-10-17 16:23:51 -04:00
variables_endpoint.go provide RPCContext to all RPC handlers (#15430) 2022-12-01 10:05:15 -05:00
variables_endpoint_test.go deps: update shoenig/test to v0.6.0 (#15715) 2023-01-09 09:37:08 -06:00
vault.go vault: configure user agent on Nomad vault clients (#15745) 2023-01-10 10:39:45 -06:00
vault_test.go cleanup: replace TypeToPtr helper methods with pointer.Of (#14151) 2022-08-17 18:26:34 +02:00
vault_testing.go vault: detect namespace change in config reload (#14298) 2022-08-24 17:03:29 -04:00
worker.go make version checks specific to region (1.4.x) (#14912) 2022-10-17 16:23:51 -04:00
worker_string_schedulerworkerstatus.go
worker_string_workerstatus.go
worker_test.go eval broker: shed all but one blocked eval per job after ack (#14621) 2022-11-16 16:10:11 -05:00