open-nomad/client
Tim Gross 06972fae0c
prioritized client updates (#17354)
The allocrunner sends several updates to the server during the early lifecycle
of an allocation and its tasks. Clients batch up allocation updates every 200ms,
but experiments like the C2M challenge have shown that even with this batching,
servers can be overwhelmed with client updates during high volume
deployments. Benchmarking done in #9451 has shown that client updates can easily
represent ~70% of all Nomad Raft traffic.

Each allocation sends many updates during its lifetime, but only those that
change the `ClientStatus` field are critical for progressing a deployment or
kicking off a reschedule to recover from failures.

Add a priority to the client allocation sync and update the `syncTicker`
receiver so that we only send an update if there's a high priority update
waiting, or on every 5th tick. This means when there are no high priority
updates, the client will send updates at most once every 1s instead of once
every 200ms. Benchmarks have shown this can reduce overall Raft traffic by 10%, as
well as reduce client-to-server RPC traffic.
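
The tick rule described above can be sketched roughly as follows. This is a
minimal illustration, not the code from this changeset: `syncGate`,
`shouldSend`, and `lowPriorityInterval` are hypothetical names, and the sketch
assumes each update has already been classified as urgent (it changes
`ClientStatus`) or not.

```go
package main

import "fmt"

// lowPriorityInterval is the "every 5th tick" from the description above.
// With a 200ms ticker this works out to one low priority send per second.
const lowPriorityInterval = 5

// syncGate is a hypothetical helper that decides, once per tick, whether the
// pending batch of allocation updates should be sent to the server.
type syncGate struct {
	ticks int // number of ticks seen so far
}

// shouldSend is meant to be called once per syncTicker tick. It returns true
// when there is something to send and either an urgent update is waiting or
// this is a 5th tick.
func (g *syncGate) shouldSend(hasPending, urgent bool) bool {
	g.ticks++
	if !hasPending {
		return false
	}
	return urgent || g.ticks%lowPriorityInterval == 0
}

func main() {
	g := &syncGate{}

	// Only normal priority updates are pending for the first 10 ticks.
	for tick := 1; tick <= 10; tick++ {
		if g.shouldSend(true, false) {
			fmt.Println("normal priority batch sent on tick", tick)
		}
	}

	// An urgent update arrives before tick 11 and is sent without waiting.
	if g.shouldSend(true, true) {
		fmt.Println("urgent batch sent on tick 11")
	}
}
```

In this sketch, normal priority batches go out on ticks 5 and 10 only, while
the urgent batch goes out on the very next tick.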

This changeset also switches from a channel-based collection of updates to a
shared buffer, so as to split batching from sending and prevent backpressure
onto the allocrunner when the RPC is slow. This doesn't have a major performance
benefit in the benchmarks but makes the implementation of the prioritized update
simpler.
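
One way to picture the shared buffer is a mutex-protected map that producers
write into and the sender drains. The sketch below is an assumption-laden
illustration rather than the implementation in this changeset:
`pendingUpdates`, `allocUpdate`, `add`, and `drain` are hypothetical names, and
keeping only the latest update per allocation ID is a choice made for the
sketch.

```go
package main

import (
	"fmt"
	"sync"
)

// allocUpdate is a hypothetical, trimmed-down allocation update.
type allocUpdate struct {
	AllocID      string
	ClientStatus string
}

// pendingUpdates is a shared buffer between the allocrunners (producers) and
// the goroutine that sends the RPC (consumer).
type pendingUpdates struct {
	mu      sync.Mutex
	updates map[string]allocUpdate // latest update per allocation ID
}

func newPendingUpdates() *pendingUpdates {
	return &pendingUpdates{updates: make(map[string]allocUpdate)}
}

// add records an update. It only holds the lock long enough to write to the
// map, so a slow RPC on the sending side does not block the caller.
func (p *pendingUpdates) add(u allocUpdate) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.updates[u.AllocID] = u
}

// drain returns everything buffered so far and resets the buffer, so the
// sender can perform the RPC without holding the lock.
func (p *pendingUpdates) drain() []allocUpdate {
	p.mu.Lock()
	defer p.mu.Unlock()
	batch := make([]allocUpdate, 0, len(p.updates))
	for _, u := range p.updates {
		batch = append(batch, u)
	}
	p.updates = make(map[string]allocUpdate)
	return batch
}

func main() {
	p := newPendingUpdates()
	p.add(allocUpdate{AllocID: "a1", ClientStatus: "pending"})
	p.add(allocUpdate{AllocID: "a1", ClientStatus: "running"})
	p.add(allocUpdate{AllocID: "a2", ClientStatus: "pending"})

	fmt.Println("batch size:", len(p.drain()))
	fmt.Println("after drain:", len(p.drain()))
}
```

Because `add` never waits on the sender, batching and sending are decoupled,
unlike a channel whose buffer can fill up while the RPC is in flight.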

Fixes: #9451
2023-05-31 15:34:16 -04:00
allocdir users: eliminate nobody user memoization (#16904) 2023-04-17 12:30:30 -05:00
allochealth [COMPLIANCE] Add Copyright and License Headers 2023-04-10 15:36:59 +00:00
allocrunner prioritized client updates (#17354) 2023-05-31 15:34:16 -04:00
allocwatcher allocrunner: provide factory function so we can build mock ARs (#17161) 2023-05-12 13:29:44 -04:00
config compliance: add headers with fixed copywrite tool (#17353) 2023-05-30 09:20:32 -05:00
consul [COMPLIANCE] Add Copyright and License Headers 2023-04-10 15:36:59 +00:00
devicemanager [COMPLIANCE] Add Copyright and License Headers 2023-04-10 15:36:59 +00:00
dynamicplugins Merge pull request #16836 from hashicorp/compliance/add-headers 2023-04-10 16:32:03 -07:00
fingerprint client: fix Consul version fingerprint (#17349) 2023-05-30 11:07:57 -04:00
interfaces [COMPLIANCE] Add Copyright and License Headers 2023-04-10 15:36:59 +00:00
lib [COMPLIANCE] Add Copyright and License Headers 2023-04-10 15:36:59 +00:00
logmon [COMPLIANCE] Add Copyright and License Headers 2023-04-10 15:36:59 +00:00
pluginmanager Merge pull request #16836 from hashicorp/compliance/add-headers 2023-04-10 16:32:03 -07:00
servers [COMPLIANCE] Add Copyright and License Headers 2023-04-10 15:36:59 +00:00
serviceregistration [COMPLIANCE] Add Copyright and License Headers 2023-04-10 15:36:59 +00:00
state allocrunner: provide factory function so we can build mock ARs (#17161) 2023-05-12 13:29:44 -04:00
stats [COMPLIANCE] Add Copyright and License Headers 2023-04-10 15:36:59 +00:00
structs prioritized client updates (#17354) 2023-05-31 15:34:16 -04:00
taskenv fix host port handling for ipv6 (#16723) 2023-04-20 19:53:20 -07:00
testutil Revert "hashicorp/go-msgpack v2 (#16810)" (#17047) 2023-05-01 17:18:34 -04:00
vaultclient [COMPLIANCE] Add Copyright and License Headers 2023-04-10 15:36:59 +00:00
acl.go [COMPLIANCE] Add Copyright and License Headers 2023-04-10 15:36:59 +00:00
acl_test.go [COMPLIANCE] Add Copyright and License Headers 2023-04-10 15:36:59 +00:00
agent_endpoint.go Revert "hashicorp/go-msgpack v2 (#16810)" (#17047) 2023-05-01 17:18:34 -04:00
agent_endpoint_test.go Revert "hashicorp/go-msgpack v2 (#16810)" (#17047) 2023-05-01 17:18:34 -04:00
alloc_endpoint.go Revert "hashicorp/go-msgpack v2 (#16810)" (#17047) 2023-05-01 17:18:34 -04:00
alloc_endpoint_test.go Revert "hashicorp/go-msgpack v2 (#16810)" (#17047) 2023-05-01 17:18:34 -04:00
alloc_watcher_e2e_test.go cli: tls certs not created with correct SANs (#16959) 2023-05-22 09:31:56 -04:00
client.go prioritized client updates (#17354) 2023-05-31 15:34:16 -04:00
client_interface_test.go prioritized client updates (#17354) 2023-05-31 15:34:16 -04:00
client_stats_endpoint.go [COMPLIANCE] Add Copyright and License Headers 2023-04-10 15:36:59 +00:00
client_stats_endpoint_test.go [COMPLIANCE] Add Copyright and License Headers 2023-04-10 15:36:59 +00:00
client_test.go prioritized client updates (#17354) 2023-05-31 15:34:16 -04:00
csi_endpoint.go [COMPLIANCE] Add Copyright and License Headers 2023-04-10 15:36:59 +00:00
csi_endpoint_test.go [COMPLIANCE] Add Copyright and License Headers 2023-04-10 15:36:59 +00:00
drain.go compliance: add headers with fixed copywrite tool (#17353) 2023-05-30 09:20:32 -05:00
drain_test.go compliance: add headers with fixed copywrite tool (#17353) 2023-05-30 09:20:32 -05:00
driver_manager_test.go [COMPLIANCE] Add Copyright and License Headers 2023-04-10 15:36:59 +00:00
enterprise_client_oss.go [COMPLIANCE] Add Copyright and License Headers 2023-04-10 15:36:59 +00:00
fingerprint_manager.go [COMPLIANCE] Add Copyright and License Headers 2023-04-10 15:36:59 +00:00
fingerprint_manager_test.go [COMPLIANCE] Add Copyright and License Headers 2023-04-10 15:36:59 +00:00
fs_endpoint.go Revert "hashicorp/go-msgpack v2 (#16810)" (#17047) 2023-05-01 17:18:34 -04:00
fs_endpoint_test.go Revert "hashicorp/go-msgpack v2 (#16810)" (#17047) 2023-05-01 17:18:34 -04:00
gc.go allocrunner: provide factory function so we can build mock ARs (#17161) 2023-05-12 13:29:44 -04:00
gc_test.go allocrunner: provide factory function so we can build mock ARs (#17161) 2023-05-12 13:29:44 -04:00
heartbeatstop.go allocrunner: provide factory function so we can build mock ARs (#17161) 2023-05-12 13:29:44 -04:00
heartbeatstop_test.go [COMPLIANCE] Add Copyright and License Headers 2023-04-10 15:36:59 +00:00
meta_endpoint.go [COMPLIANCE] Add Copyright and License Headers 2023-04-10 15:36:59 +00:00
meta_endpoint_test.go [COMPLIANCE] Add Copyright and License Headers 2023-04-10 15:36:59 +00:00
node_updater.go [COMPLIANCE] Add Copyright and License Headers 2023-04-10 15:36:59 +00:00
rpc.go Revert "hashicorp/go-msgpack v2 (#16810)" (#17047) 2023-05-01 17:18:34 -04:00
rpc_test.go cli: tls certs not created with correct SANs (#16959) 2023-05-22 09:31:56 -04:00
testing.go [COMPLIANCE] Add Copyright and License Headers 2023-04-10 15:36:59 +00:00
util.go client: de-duplicate alloc updates and gate during restore (#17074) 2023-05-11 09:05:24 -04:00