rocksdb/table/block_based
Peter Dillinger f9db0c6e9c Refactor block cache tracing w/improved MultiGet (#11339)
Summary:
After https://github.com/facebook/rocksdb/issues/11301, I wasn't sure whether I had regressed block cache tracing with MultiGet. Demo PR https://github.com/facebook/rocksdb/issues/11330 shows the flawed state of tracing MultiGet before my change, and based on the unit test, there was essentially no change in tracing behavior with https://github.com/facebook/rocksdb/issues/11301. This change is to leave that code and behavior better than I found it.

This change is not intended to change any production behaviors except when block cache tracing is active, though might improve general read path efficiency by disabling some related tracking when such tracing is disabled.

More detail on production code:
* Refactoring to consolidate the construction of BlockCacheTraceRecord, and other related functionality, in block-based table reader, though it's somewhat awkward to preserve an optimization to avoid copying Slices into temporary strings in BlockCacheLookupContext.
* Accurately track cache hits and misses (etc.) for each data block accessed by a MultiGet(). (Previously reported hits as misses.)
* Reduced repeated checking of `block_cache_tracer_` state (by creating lookup_context only when active) for efficiency and to reduce the risk of corner case bugs where tracing is enabled or disabled for different parts of a read op. (See a TODO below)
* Improved estimate calculation for num_keys_in_block (see code comment)

Possible follow-up:
* `XXX:` use_cache=true means double cache query? (possible double-query of block cache when allow_mmap_reads=true)
* `TODO:` need more than one lookup_context here to track individual filter and index partition hits and misses
* `TODO:` optimize more state checks of `block_cache_tracer_` down to `lookup_context != nullptr`
* Pre-existing `XXX:` There appear to be 'break' statements above that bypass this writing of the block cache trace record
* Expand test coverage (see below)

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11339

Test Plan:
* Added a basic unit test for block cache tracing MultiGet, for now just covering one data block with two keys.
* Added HitMissCountingCache to independently verify that the actual block cache trace and expected block cache trace also agree with the actual number of cache hits / misses (nothing missing or mislabeled). For now only used with MultiGet test.
* Better testing of num_keys_in_block, for now just with MultiGet
* Misc improvements to table_test to improve clarity, such as making it clear that certain keys are auto-inserted at the start of every test.

Performance test:
Testing multireadrandom as in https://github.com/facebook/rocksdb/issues/11301, except averaging over distinct runs rather than [-X30] which doesn't seem to sufficiently reset after each run to work as an independent test run.

Base with revert of 11301: 3148926 ops/sec
Base: 3019146 ops/sec
New: 2999529 ops/sec

Possibly a tiny MultiGet CPU regression with this change. We are now always allocating an additional vector for the LookupContexts. I'm still contemplating options to try to correct the regression in https://github.com/facebook/rocksdb/issues/11301.

Testing readrandom:
Base with revert of 11301: 2311988
Base: 2281726
New: 2299722

Possibly a tiny Get CPU improvement with this change. We are now avoiding some unnecessary LookupContext population.

Reviewed By: akankshamahajan15

Differential Revision: D44557845

Pulled By: pdillinger

fbshipit-source-id: b841691799d2a48fb59cc8880dc7cbb1e107ae3d
2023-04-07 12:55:56 -07:00
..
binary_search_index_reader.cc Use user-provided ReadOptions for metadata block reads more often (#11208) 2023-04-04 16:53:14 -07:00
binary_search_index_reader.h Extend Get/MultiGet deadline support to table open (#6982) 2020-06-29 14:53:17 -07:00
block.cc Better support for merge operation with data block hash index (#11356) 2023-04-07 10:06:03 -07:00
block.h Put Cache and CacheWrapper in new public header (#11192) 2023-02-09 12:12:02 -08:00
block_based_table_builder.cc Remove compressed block cache (#11117) 2023-01-24 17:09:19 -08:00
block_based_table_builder.h Major Cache refactoring, CPU efficiency improvement (#10975) 2023-01-11 14:20:40 -08:00
block_based_table_factory.cc Change default block cache from 8MB to 32MB (#11350) 2023-04-04 15:33:24 -07:00
block_based_table_factory.h Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
block_based_table_iterator.cc Provide support for direct_reads with async_io (#10197) 2022-07-06 11:42:59 -07:00
block_based_table_iterator.h Format files under table/ by clang-format (#10852) 2022-10-25 11:50:38 -07:00
block_based_table_reader.cc Refactor block cache tracing w/improved MultiGet (#11339) 2023-04-07 12:55:56 -07:00
block_based_table_reader.h Refactor block cache tracing w/improved MultiGet (#11339) 2023-04-07 12:55:56 -07:00
block_based_table_reader_impl.h HyperClockCache support for SecondaryCache, with refactoring (#11301) 2023-03-17 20:23:49 -07:00
block_based_table_reader_sync_and_async.h Refactor block cache tracing w/improved MultiGet (#11339) 2023-04-07 12:55:56 -07:00
block_based_table_reader_test.cc Add a new MultiGetEntity API (#11222) 2023-02-15 09:34:17 -08:00
block_builder.cc Format files under table/ by clang-format (#10852) 2022-10-25 11:50:38 -07:00
block_builder.h Format files under table/ by clang-format (#10852) 2022-10-25 11:50:38 -07:00
block_cache.cc Simplify tracking entries already in SecondaryCache (#11299) 2023-03-15 17:51:44 -07:00
block_cache.h Remove compressed block cache (#11117) 2023-01-24 17:09:19 -08:00
block_prefetcher.cc Fix stress test failure for async_io (#10660) 2022-09-12 14:48:06 -07:00
block_prefetcher.h Provide support for direct_reads with async_io (#10197) 2022-07-06 11:42:59 -07:00
block_prefix_index.cc Fix bug with kHashSearch and changing prefix_extractor with SetOptions (#10128) 2022-06-10 08:51:45 -07:00
block_prefix_index.h Fix bug with kHashSearch and changing prefix_extractor with SetOptions (#10128) 2022-06-10 08:51:45 -07:00
block_test.cc Print stack traces on frozen tests in CI (#10828) 2022-10-18 00:35:35 -07:00
block_type.h Remove deprecated block-based filter (#10184) 2022-06-16 15:51:33 -07:00
cachable_entry.h HyperClockCache support for SecondaryCache, with refactoring (#11301) 2023-03-17 20:23:49 -07:00
data_block_footer.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
data_block_footer.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
data_block_hash_index.cc Format files under table/ by clang-format (#10852) 2022-10-25 11:50:38 -07:00
data_block_hash_index.h Fix build with gcc 13 by including <cstdint> (#11118) 2023-01-25 14:30:32 -08:00
data_block_hash_index_test.cc Format files under table/ by clang-format (#10852) 2022-10-25 11:50:38 -07:00
filter_block.h Use user-provided ReadOptions for metadata block reads more often (#11208) 2023-04-04 16:53:14 -07:00
filter_block_reader_common.cc Use user-provided ReadOptions for metadata block reads more often (#11208) 2023-04-04 16:53:14 -07:00
filter_block_reader_common.h Use user-provided ReadOptions for metadata block reads more often (#11208) 2023-04-04 16:53:14 -07:00
filter_policy.cc Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
filter_policy_internal.h Remove deprecated block-based filter (#10184) 2022-06-16 15:51:33 -07:00
flush_block_policy.cc Remove FactoryFunc from LoadXXXObject (#11203) 2023-02-17 12:54:07 -08:00
flush_block_policy.h Make FlushBlockPolicyFactory into a Customizable class (#8432) 2021-07-12 09:04:59 -07:00
full_filter_block.cc Use user-provided ReadOptions for metadata block reads more often (#11208) 2023-04-04 16:53:14 -07:00
full_filter_block.h Use user-provided ReadOptions for metadata block reads more often (#11208) 2023-04-04 16:53:14 -07:00
full_filter_block_test.cc Use user-provided ReadOptions for metadata block reads more often (#11208) 2023-04-04 16:53:14 -07:00
hash_index_reader.cc Use user-provided ReadOptions for metadata block reads more often (#11208) 2023-04-04 16:53:14 -07:00
hash_index_reader.h Extend Get/MultiGet deadline support to table open (#6982) 2020-06-29 14:53:17 -07:00
index_builder.cc Make InternalKeyComparator not configurable (#10342) 2022-07-14 10:09:31 -07:00
index_builder.h Format files under table/ by clang-format (#10852) 2022-10-25 11:50:38 -07:00
index_reader_common.cc Use user-provided ReadOptions for metadata block reads more often (#11208) 2023-04-04 16:53:14 -07:00
index_reader_common.h Use user-provided ReadOptions for metadata block reads more often (#11208) 2023-04-04 16:53:14 -07:00
mock_block_based_table.h Remove deprecated block-based filter (#10184) 2022-06-16 15:51:33 -07:00
parsed_full_filter_block.cc Hide FilterBits{Builder,Reader} from public API (#9592) 2022-02-17 16:34:46 -08:00
parsed_full_filter_block.h Major Cache refactoring, CPU efficiency improvement (#10975) 2023-01-11 14:20:40 -08:00
partitioned_filter_block.cc Use user-provided ReadOptions for metadata block reads more often (#11208) 2023-04-04 16:53:14 -07:00
partitioned_filter_block.h Use user-provided ReadOptions for metadata block reads more often (#11208) 2023-04-04 16:53:14 -07:00
partitioned_filter_block_test.cc Use user-provided ReadOptions for metadata block reads more often (#11208) 2023-04-04 16:53:14 -07:00
partitioned_index_iterator.cc Provide support for direct_reads with async_io (#10197) 2022-07-06 11:42:59 -07:00
partitioned_index_iterator.h Format files under table/ by clang-format (#10852) 2022-10-25 11:50:38 -07:00
partitioned_index_reader.cc Use user-provided ReadOptions for metadata block reads more often (#11208) 2023-04-04 16:53:14 -07:00
partitioned_index_reader.h Meta-internal folly integration with F14FastMap (#9546) 2022-04-13 07:34:01 -07:00
reader_common.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
reader_common.h Put Cache and CacheWrapper in new public header (#11192) 2023-02-09 12:12:02 -08:00
uncompression_dict_reader.cc HyperClockCache support for SecondaryCache, with refactoring (#11301) 2023-03-17 20:23:49 -07:00
uncompression_dict_reader.h Format files under table/ by clang-format (#10852) 2022-10-25 11:50:38 -07:00