00be4fc63c
This PR deflakes TestTaskRunner_StatsHook_Periodic tests and adds backoff when the driver closes the channel. TestTaskRunner_StatsHook_Periodic is currently the most flaky test - failing ~4% of the time (20 out of 486 workflows). A sample failure: https://app.circleci.com/pipelines/github/hashicorp/nomad/14028/workflows/957b674f-cbcc-4228-96d9-1094fdee5b9c/jobs/128563 . This change has two components: First, it updates the StatsHook so that it backs off when stats channel is closed. In the context of the test where the mock driver emits a single stats update and closes the channel, the test may make tens of thousands update during the period. In real context, if a driver doesn't implement the stats handler properly or when a task finishes, we may generate way too many Stats queries in a tight loop. Here, the backoff reduces these queries. I've added a failing test that shows 154,458 stats updates within 500ms in https://app.circleci.com/pipelines/github/hashicorp/nomad/14092/workflows/50672445-392d-4661-b19e-e3561ed32746/jobs/129423 . Second, the test ignores the first stats update after a task exit. Due to the asynchronicity of updates and channel/context use, it's possible that an update is enqueued while the test marks the task as exited, resulting into a spurious update. |
||
---|---|---|
.. | ||
interfaces | ||
state | ||
taskrunner | ||
alloc_runner.go | ||
alloc_runner_hooks.go | ||
alloc_runner_test.go | ||
alloc_runner_unix_test.go | ||
allocdir_hook.go | ||
config.go | ||
consul_grpc_sock_hook.go | ||
consul_grpc_sock_hook_test.go | ||
consul_http_sock_hook.go | ||
consul_http_sock_hook_test.go | ||
csi_hook.go | ||
groupservice_hook.go | ||
groupservice_hook_test.go | ||
health_hook.go | ||
health_hook_test.go | ||
migrate_hook.go | ||
network_hook.go | ||
network_hook_test.go | ||
network_manager_linux.go | ||
network_manager_linux_test.go | ||
network_manager_nonlinux.go | ||
networking.go | ||
networking_bridge_linux.go | ||
networking_cni.go | ||
task_hook_coordinator.go | ||
task_hook_coordinator_test.go | ||
testing.go | ||
upstream_allocs_hook.go | ||
util.go |