open-nomad/client/allocrunner/taskrunner
Mahmood Ali 00be4fc63c
tests: deflake TestTaskRunner_StatsHook_Periodic (#9734)
This PR deflakes TestTaskRunner_StatsHook_Periodic tests and adds backoff when the driver closes the channel.

TestTaskRunner_StatsHook_Periodic is currently the most flaky test - failing ~4% of the time (20 out of 486 workflows). A sample failure: https://app.circleci.com/pipelines/github/hashicorp/nomad/14028/workflows/957b674f-cbcc-4228-96d9-1094fdee5b9c/jobs/128563 .

This change has two components:

First, it updates the StatsHook so that it backs off when the stats channel is closed. In the context of the test, where the mock driver emits a single stats update and then closes the channel, the test may make tens of thousands of updates during the period. In a real deployment, if a driver doesn't implement the stats handler properly, or when a task finishes, we may generate far too many Stats queries in a tight loop. Here, the backoff reduces these queries. I've added a failing test that shows 154,458 stats updates within 500ms in https://app.circleci.com/pipelines/github/hashicorp/nomad/14092/workflows/50672445-392d-4661-b19e-e3561ed32746/jobs/129423 .

Second, the test ignores the first stats update after a task exit. Because updates are asynchronous and pass through channels and contexts, an update may already be enqueued while the test marks the task as exited, resulting in a spurious update.
2021-01-06 16:03:00 -05:00
getter update template and artifact interpolation to use client-relative paths 2021-01-04 22:25:34 +00:00
interfaces template: trigger change_mode for dynamic secrets on restore (#9636) 2020-12-16 13:36:19 -05:00
restarts lifecycle: add poststop hook (#8194) 2020-11-12 08:01:42 -08:00
state client: test logmon cleanup 2019-03-04 13:15:15 -08:00
template enabled broken test that is no longer broken 2021-01-04 22:25:35 +00:00
testdata executor/linux: make chroot binary paths absolute 2019-04-01 15:45:31 -07:00
artifact_hook.go update template and artifact interpolation to use client-relative paths 2021-01-04 22:25:34 +00:00
artifact_hook_test.go update template and artifact interpolation to use client-relative paths 2021-01-04 22:25:34 +00:00
connect_native_hook.go consul/connect: add support for bridge networks with connect native tasks 2020-07-29 09:26:01 -05:00
connect_native_hook_test.go consul/connect: fixup tests to use new consul sdk 2020-08-24 12:02:41 -05:00
device_hook.go
device_hook_test.go
dispatch_hook.go
dispatch_hook_test.go
driver_handle.go implement client endpoint of nomad exec 2019-05-09 16:49:08 -04:00
envoy_bootstrap_hook.go consul/connect: dynamically select envoy sidecar at runtime 2020-10-13 09:14:12 -05:00
envoy_bootstrap_hook_test.go consul/connect: dynamically select envoy sidecar at runtime 2020-10-13 09:14:12 -05:00
envoy_version_hook.go consul/connect: fix regression where client connect images ignored 2020-12-14 09:47:55 -06:00
envoy_version_hook_test.go update template and artifact interpolation to use client-relative paths 2021-01-04 22:25:34 +00:00
errors.go
errors_test.go
lazy_handle.go
lifecycle.go template: trigger change_mode for dynamic secrets on restore (#9636) 2020-12-16 13:36:19 -05:00
logmon_hook.go address review comments 2019-12-13 11:21:00 -05:00
logmon_hook_test.go driver: allow disabling log collection 2019-12-08 14:15:03 -05:00
logmon_hook_unix_test.go driver: allow disabling log collection 2019-12-08 14:15:03 -05:00
plugin_supervisor_hook.go Add gosimple linter (#9590) 2020-12-09 11:05:18 -08:00
script_check_hook.go consul: fix validation of task in group-level script-checks 2020-09-28 15:02:59 -05:00
script_check_hook_test.go consul: fix validation of task in group-level script-checks 2020-09-28 15:02:59 -05:00
service_hook.go Consul with CNI and host_network addresses (#9095) 2020-10-15 15:32:21 -04:00
service_hook_test.go tr/service_hook: reset initialized flag during deregister 2020-04-06 16:05:36 -04:00
sids_hook.go client: PR cleanup - improved logging around kill task in SIDS hook 2020-01-31 19:05:23 -06:00
sids_hook_test.go tests: set consul token for nomad client for testing SIDS TR hook 2020-01-31 19:06:15 -06:00
stats_hook.go tests: deflake TestTaskRunner_StatsHook_Periodic (#9734) 2021-01-06 16:03:00 -05:00
stats_hook_test.go tests: deflake TestTaskRunner_StatsHook_Periodic (#9734) 2021-01-06 16:03:00 -05:00
task_dir_hook.go update template and artifact interpolation to use client-relative paths 2021-01-04 22:25:34 +00:00
task_runner.go lifecycle: add poststop hook (#8194) 2020-11-12 08:01:42 -08:00
task_runner_getters.go lifecycle: add poststop hook (#8194) 2020-11-12 08:01:42 -08:00
task_runner_hooks.go consul: always include task services hook 2021-01-05 08:47:19 -06:00
task_runner_test.go tests: restart restartpolicy for all tasks in tests 2020-03-24 21:52:48 -04:00
tasklet.go comments: cleanup some leftover debug comments and such 2020-01-31 19:04:35 -06:00
tasklet_test.go support script checks for task group services (#6197) 2019-09-03 15:09:04 -04:00
template_hook.go oss compoments for multi-vault namespaces 2020-07-24 10:14:59 -04:00
validate_hook.go s/0.13/1.0/g 2020-10-14 15:17:47 -07:00
validate_hook_test.go
vault_hook.go
vault_hook_test.go
volume_hook.go volumes: return better error messages for unsupported task drivers (#8030) 2020-05-21 09:18:02 -04:00
volume_hook_test.go volumes: return better error messages for unsupported task drivers (#8030) 2020-05-21 09:18:02 -04:00