00be4fc63c
This PR deflakes TestTaskRunner_StatsHook_Periodic tests and adds backoff when the driver closes the stats channel. TestTaskRunner_StatsHook_Periodic is currently the flakiest test, failing ~4% of the time (20 out of 486 workflows). A sample failure: https://app.circleci.com/pipelines/github/hashicorp/nomad/14028/workflows/957b674f-cbcc-4228-96d9-1094fdee5b9c/jobs/128563 . This change has two components. First, it updates the StatsHook so that it backs off when the stats channel is closed. In the test, where the mock driver emits a single stats update and then closes the channel, the hook may make tens of thousands of updates during the period. In a real deployment, if a driver doesn't implement the stats handler properly, or when a task finishes, we may generate far too many Stats queries in a tight loop. The backoff reduces these queries. I've added a failing test that shows 154,458 stats updates within 500ms in https://app.circleci.com/pipelines/github/hashicorp/nomad/14092/workflows/50672445-392d-4661-b19e-e3561ed32746/jobs/129423 . Second, the test ignores the first stats update after a task exit. Because updates are asynchronous and rely on channels and contexts, an update may be enqueued while the test marks the task as exited, resulting in a spurious update. |
||
---|---|---|
allocdir | ||
allochealth | ||
allocrunner | ||
allocwatcher | ||
config | ||
consul | ||
devicemanager | ||
dynamicplugins | ||
fingerprint | ||
interfaces | ||
lib | ||
logmon | ||
pluginmanager | ||
servers | ||
state | ||
stats | ||
structs | ||
taskenv | ||
testutil | ||
vaultclient | ||
acl.go | ||
acl_test.go | ||
agent_endpoint.go | ||
agent_endpoint_test.go | ||
alloc_endpoint.go | ||
alloc_endpoint_test.go | ||
alloc_watcher_e2e_test.go | ||
client.go | ||
client_stats_endpoint.go | ||
client_stats_endpoint_test.go | ||
client_test.go | ||
csi_endpoint.go | ||
csi_endpoint_test.go | ||
driver_manager_test.go | ||
enterprise_client_oss.go | ||
fingerprint_manager.go | ||
fingerprint_manager_test.go | ||
fs_endpoint.go | ||
fs_endpoint_test.go | ||
gc.go | ||
gc_test.go | ||
heartbeatstop.go | ||
heartbeatstop_test.go | ||
node_updater.go | ||
rpc.go | ||
rpc_test.go | ||
testing.go | ||
util.go | ||
util_test.go |