rocksdb/util
Hui Xiao 240c4126fd Implement superior user & mid IO priority level in GenericRateLimiter (#8595)
Summary:
Context:
An extra IO_USER priority in rate limiter allows users to optionally charge WAL writes / SST reads to rate limiter at this priority level, which then has higher priority than IO_HIGH and IO_LOW. With an extra IO_USER priority, it allows users to better specify the relative urgency/importance among different requests in rate limiter. As a consequence, IO resource management can better prioritize and limit resource based on user's need.

The IO_USER is implemented as superior priority in GenericRateLimiter, in the sense that its request queue will always be iterated first without being constrained to fairness. The reason is that the notion of fairness is only meaningful in helping lower priorities in background IO (i.e, IO_HIGH/MID/LOW) to gain some fair chance to run so that it does not block foreground IO (i.e, the ones that are charged at the level of IO_USER). As we can see, the ultimate goal here is to not blocking foreground IO at IO_USER level, which justifies the superiority of IO_USER.

Similar benefits exist for IO_MID priority.
- Rewrote the logic of deciding the order of iterating request queues of high/low priorities to include the extra user/mid priority w/o affecting the existing behavior (see PR's [comment](https://github.com/facebook/rocksdb/pull/8595/files#r678749331))
- Included the request queue of user-pri/mid-pri in the code path of next-leader-candidate signaling and GenericRateLimiter's destructor
- Included the extra user/mid-pri in bookkeeping data structures: total_bytes_through_ and total_requests_
- Re-written the previous impl of explicitly iterating priorities with a loop from Env::IO_LOW to Env::IO_TOTAL

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8595

Test Plan:
- passed existing rate_limiter_test.cc
- passed added unit tests in rate_limiter_test.cc
- run performance test to verify performance with only high/low requests is not affected by this change
   - Set-up command:
   `TEST_TMPDIR=/dev/shm ./db_bench --benchmarks=fillrandom --duration=5 --compression_type=none --num=100000000 --disable_auto_compactions=true --write_buffer_size=1048576 --writable_file_max_buffer_size=65536 --target_file_size_base=1048576 --max_bytes_for_level_base=4194304 --level0_slowdown_writes_trigger=$(((1 << 31) - 1)) --level0_stop_writes_trigger=$(((1 << 31) - 1))`

    - Test command:
   `TEST_TMPDIR=/dev/shm ./db_bench --benchmarks=overwrite --use_existing_db=true --disable_wal=true --duration=30 --compression_type=none --num=100000000 --write_buffer_size=1048576 --writable_file_max_buffer_size=65536 --target_file_size_base=1048576 --max_bytes_for_level_base=4194304 --level0_slowdown_writes_trigger=$(((1 << 31) - 1)) --level0_stop_writes_trigger=$(((1 << 31) - 1)) --statistics=true --rate_limiter_bytes_per_sec=1048576 --rate_limiter_refill_period_us=1000  --threads=32 |& grep -E '(flush|compact)\.write\.bytes'`

   - Before (on branch upstream/master):
   `rocksdb.compact.write.bytes COUNT : 4014162`
   `rocksdb.flush.write.bytes COUNT : 26715832`
    rocksdb.flush.write.bytes/rocksdb.compact.write.bytes ~= 6.66

   - After (on branch rate_limiter_user_pri):
  `rocksdb.compact.write.bytes COUNT : 3807822`
  `rocksdb.flush.write.bytes COUNT : 26098659`
   rocksdb.flush.write.bytes/rocksdb.compact.write.bytes ~= 6.85

Reviewed By: ajkr

Differential Revision: D30577783

Pulled By: hx235

fbshipit-source-id: 0881f2705ffd13ecd331256bde7e8ec874a353f4
2021-08-31 11:24:27 -07:00
..
aligned_buffer.h Fix wrong comments about function TruncateToPageBoundary. (#6975) 2020-10-07 12:34:34 -07:00
autovector.h Change autovector to have a reserved size in LITE mode (#6868) 2020-05-21 14:48:10 -07:00
autovector_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
bloom_impl.h Ribbon: InterleavedSolutionStorage (#7598) 2020-11-03 12:46:36 -08:00
bloom_test.cc Add Bloom/Ribbon hybrid API support (#8679) 2021-08-20 18:00:16 -07:00
build_version.cc.in Make builds reproducible (#7866) 2021-01-28 17:42:16 -08:00
cast_util.h Replace reinterpret_cast with static_cast_with_check (#7067) 2020-07-02 19:25:41 -07:00
channel.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
coding.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
coding.h Refine Ribbon configuration, improve testing, add Homogeneous (#7879) 2021-02-26 08:50:42 -08:00
coding_lean.h Refine Ribbon configuration, improve testing, add Homogeneous (#7879) 2021-02-26 08:50:42 -08:00
coding_test.cc Fix potential overflow of unsigned type in for loop (#6902) 2020-06-02 15:05:07 -07:00
compaction_job_stats_impl.cc Update compaction statistics to include the amount of data read from blob files (#8022) 2021-03-04 00:43:48 -08:00
comparator.cc Add customizable_util.h to the public API (#8301) 2021-06-29 09:08:57 -07:00
compression.h rocksdb: don't call LZ4_loadDictHC with null dictionary 2021-08-09 16:05:46 -07:00
compression_context_cache.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
compression_context_cache.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
concurrent_task_limiter_impl.cc Remove TaskLimiterToken::ReleaseOnce for fix (#8567) 2021-07-21 17:37:53 -07:00
concurrent_task_limiter_impl.h Remove TaskLimiterToken::ReleaseOnce for fix (#8567) 2021-07-21 17:37:53 -07:00
core_local.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
crc32c.cc Using existing crc32c checksum in checksum handoff for Manifest and WAL (#8412) 2021-06-25 00:47:17 -07:00
crc32c.h Implementation of Crc32c combine function (#8305) 2021-06-16 18:30:34 -07:00
crc32c_arm64.cc Mac M1 crc32 intrinsics ARM64 check support proposal (#7893) 2021-03-10 09:05:56 -08:00
crc32c_arm64.h Fix compilation on Apple Silicon (#7714) 2020-12-04 15:22:33 -08:00
crc32c_ppc.c Fix Compilation on ppc64le using Clang 11 (#7713) 2020-12-01 11:21:44 -08:00
crc32c_ppc.h Fix Compilation on ppc64le using Clang 11 (#7713) 2020-12-01 11:21:44 -08:00
crc32c_ppc_asm.S Fix Compilation on ppc64le using Clang 11 (#7713) 2020-12-01 11:21:44 -08:00
crc32c_ppc_constants.h Remove PATENTS text from a few straggler files (#5326) 2019-05-21 16:22:35 -07:00
crc32c_test.cc Implementation of Crc32c combine function (#8305) 2021-06-16 18:30:34 -07:00
defer.h Stable cache keys on ingested SST files (#8669) 2021-08-18 11:33:03 -07:00
defer_test.cc Fix insecure internal API for GetImpl (#8590) 2021-07-29 17:23:01 -07:00
duplicate_detector.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
dynamic_bloom.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
dynamic_bloom.h Genericize and clean up FastRange (#7436) 2020-09-28 11:35:00 -07:00
dynamic_bloom_test.cc Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
fastrange.h Genericize and clean up FastRange (#7436) 2020-09-28 11:35:00 -07:00
file_checksum_helper.cc Refactor with VersionEditHandler (#6581) 2020-11-11 08:00:14 -08:00
file_checksum_helper.h Real fix for race in backup custom checksum checking (#7309) 2020-08-26 10:39:20 -07:00
file_reader_writer_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
filelock_test.cc Fix MSVC-related build issues (#7439) 2020-10-01 09:23:04 -07:00
filter_bench.cc Rename variables in ImmutableCFOptions to avoid conflicts with ImmutableDBOptions (#8227) 2021-04-26 12:43:45 -07:00
gflags_compat.h Fix many tests to run with MEM_ENV and ENCRYPTED_ENV; Introduce a MemoryFileSystem class (#7566) 2020-10-27 10:33:09 -07:00
hash.cc Built-in support for generating unique IDs, bug fix (#8708) 2021-08-30 15:20:41 -07:00
hash.h Built-in support for generating unique IDs, bug fix (#8708) 2021-08-30 15:20:41 -07:00
hash128.h Upgrade xxhash, add Hash128 (#8634) 2021-08-20 18:41:51 -07:00
hash_map.h Change HashMap::Insert()'s value to a const reference (#6567) 2020-03-20 14:59:54 -07:00
hash_test.cc Built-in support for generating unique IDs, bug fix (#8708) 2021-08-30 15:20:41 -07:00
heap.h Avoid self-move-assign in pop operation of binary heap. (#7942) 2021-02-19 13:47:25 -08:00
heap_test.cc Revert "Update googletest from 1.8.1 to 1.10.0 (#6808)" (#6923) 2020-06-03 15:55:03 -07:00
kv_map.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
log_write_bench.cc Add a SystemClock class to capture the time functions of an Env (#7858) 2021-01-25 22:09:11 -08:00
math.h Fix MSVC-related build issues (#7439) 2020-10-01 09:23:04 -07:00
math128.h Refine Ribbon configuration, improve testing, add Homogeneous (#7879) 2021-02-26 08:50:42 -08:00
murmurhash.cc C++20 compatibility (#6697) 2020-04-20 13:24:25 -07:00
murmurhash.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
mutexlock.h Prevents Table Cache to open same files more times (#6707) 2020-04-21 13:16:31 -07:00
ppc-opcode.h Remove PATENTS text from a few straggler files (#5326) 2019-05-21 16:22:35 -07:00
random.cc Add De/Serialization for CompactionInput/Result (#8247) 2021-05-12 12:36:43 -07:00
random.h Add De/Serialization for CompactionInput/Result (#8247) 2021-05-12 12:36:43 -07:00
random_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
rate_limiter.cc Implement superior user & mid IO priority level in GenericRateLimiter (#8595) 2021-08-31 11:24:27 -07:00
rate_limiter.h Implement superior user & mid IO priority level in GenericRateLimiter (#8595) 2021-08-31 11:24:27 -07:00
rate_limiter_test.cc Implement superior user & mid IO priority level in GenericRateLimiter (#8595) 2021-08-31 11:24:27 -07:00
repeatable_thread.h Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
repeatable_thread_test.cc Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
ribbon_alg.h Refine Ribbon configuration, improve testing, add Homogeneous (#7879) 2021-02-26 08:50:42 -08:00
ribbon_config.cc Refine Ribbon configuration, improve testing, add Homogeneous (#7879) 2021-02-26 08:50:42 -08:00
ribbon_config.h Refine Ribbon configuration, improve testing, add Homogeneous (#7879) 2021-02-26 08:50:42 -08:00
ribbon_impl.h Upgrade xxhash, add Hash128 (#8634) 2021-08-20 18:41:51 -07:00
ribbon_test.cc Upgrade xxhash, add Hash128 (#8634) 2021-08-20 18:41:51 -07:00
set_comparator.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
slice.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
slice_test.cc Handoff checksum Implementation (#7523) 2021-02-10 22:20:32 -08:00
slice_transform_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
status.cc Add remote compaction public API (#8300) 2021-05-19 21:41:31 -07:00
stderr_logger.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
stop_watch.h Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
string_util.cc Add remote compaction public API (#8300) 2021-05-19 21:41:31 -07:00
string_util.h Built-in support for generating unique IDs, bug fix (#8708) 2021-08-30 15:20:41 -07:00
thread_guard.h Introduce a ThreadGuard class and use it in ExternalSSTFileTest.PickedLevelBug (#8112) 2021-03-25 22:08:58 -07:00
thread_list_test.cc fix thread status synchronization in thread_list_test (#7825) 2021-01-04 10:46:24 -08:00
thread_local.cc Fix typo in ThreadData comment (#7131) 2020-07-15 09:23:23 -07:00
thread_local.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
thread_local_test.cc Add StartThread type checking wrapper (#8303) 2021-05-19 16:51:13 -07:00
thread_operation.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
threadpool_imp.cc Prevent joining detached thread in ThreadPoolImpl (#8635) 2021-08-06 19:06:02 -07:00
threadpool_imp.h Make it able to lower cpu priority to specific level in threadpool (#6969) 2020-06-13 13:25:20 -07:00
timer.h Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
timer_queue.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
timer_queue_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
timer_test.cc Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
user_comparator_wrapper.h Enable backward iterator for keys with user-defined timestamp (#8035) 2021-03-10 11:15:46 -08:00
vector_iterator.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
work_queue.h Revamp cache_bench to resemble a real workload (#6629) 2020-04-03 10:26:49 -07:00
work_queue_test.cc Add pipelined & parallel compression optimization (#6262) 2020-04-01 16:40:18 -07:00
xxhash.cc Upgrade xxhash, add Hash128 (#8634) 2021-08-20 18:41:51 -07:00
xxhash.h Upgrade xxhash, add Hash128 (#8634) 2021-08-20 18:41:51 -07:00
xxph3.h Upgrade xxhash, add Hash128 (#8634) 2021-08-20 18:41:51 -07:00