rocksdb/table
anand76 35cdd3e71e MultiGet async IO across multiple levels (#10535)
Summary:
This PR exploits parallelism in MultiGet across levels. It applies only to the coroutine version of MultiGet. Previously, MultiGet file reads from SST files in the same level were parallelized. With this PR, MultiGet batches with keys distributed across multiple levels are read in parallel. This is accomplished by splitting the keys not present in a level (determined by bloom filtering) into a separate batch, and processing the new batch in parallel with the original batch.
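
The batch-splitting idea described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `Level`, `MayContain`, and `LookupFromLevel` are hypothetical names, the bloom filter is replaced by an exact key set, and `std::async` threads stand in for the coroutines that the real MultiGet uses.

```cpp
#include <cstddef>
#include <future>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one LSM level (not a RocksDB type).
struct Level {
  std::set<std::string> filter;             // keys the filter reports as "may be present"
  std::map<std::string, std::string> data;  // keys actually stored in this level

  bool MayContain(const std::string& key) const { return filter.count(key) > 0; }
};

using Results = std::map<std::string, std::string>;

// Looks up `batch` starting at `level_idx`. Keys the filter rules out are
// split into a new batch that proceeds to the next level concurrently with
// the reads for the remaining keys in the current level.
Results LookupFromLevel(const std::vector<Level>& levels, std::size_t level_idx,
                        std::vector<std::string> batch) {
  Results found;
  if (level_idx >= levels.size() || batch.empty()) return found;
  const Level& level = levels[level_idx];

  // Split the batch according to the filter result for this level.
  std::vector<std::string> present, absent;
  for (auto& key : batch) {
    (level.MayContain(key) ? present : absent).push_back(std::move(key));
  }

  // The split-off batch starts on the next level right away.
  auto split_off = std::async(
      std::launch::async,
      [&levels, level_idx, keys = std::move(absent)]() mutable {
        return LookupFromLevel(levels, level_idx + 1, std::move(keys));
      });

  // Meanwhile, read the current level for the original batch. Filter false
  // positives are not found here and must continue to the next level.
  std::vector<std::string> missed;
  for (auto& key : present) {
    auto it = level.data.find(key);
    if (it != level.data.end()) {
      found.emplace(it->first, it->second);
    } else {
      missed.push_back(std::move(key));
    }
  }
  Results deeper = LookupFromLevel(levels, level_idx + 1, std::move(missed));
  found.insert(deeper.begin(), deeper.end());

  // Join the split-off batch and merge its results.
  Results from_split = split_off.get();
  found.insert(from_split.begin(), from_split.end());
  return found;
}
```

Calling `LookupFromLevel(levels, 0, keys)` returns a map from each found key to its value; keys that no level contains are simply absent from the result. A key found in an upper level is never looked up in a lower one, which matches the newest-version-wins order of an LSM tree.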

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10535

Test Plan:
1. Ensure existing MultiGet unit tests pass, updating them as necessary
2. New unit tests - TODO
3. Run stress test - TODO

No noticeable regression (<1%; ops/sec drops by about 0.6%) without async IO -
Without PR: `multireadrandom :       7.261 micros/op 1101724 ops/sec 60.007 seconds 66110936 operations;  571.6 MB/s (8168992 of 8168992 found)`
With PR: `multireadrandom :       7.305 micros/op 1095167 ops/sec 60.007 seconds 65717936 operations;  568.2 MB/s (8271992 of 8271992 found)`

For a fully cached DB, but with the async IO option on, no regression observed (<1%; ops/sec drops by about 0.9%) -
Without PR: `multireadrandom :       5.201 micros/op 1538027 ops/sec 60.005 seconds 92288936 operations;  797.9 MB/s (11540992 of 11540992 found) `
With PR: `multireadrandom :       5.249 micros/op 1524097 ops/sec 60.005 seconds 91452936 operations;  790.7 MB/s (11649992 of 11649992 found) `

Reviewed By: akankshamahajan15

Differential Revision: D38774009

Pulled By: anand1976

fbshipit-source-id: c955e259749f1c091590ade73105b3ee46cd0007
2022-08-19 16:52:52 -07:00
..
adaptive More refactoring ahead of footer & meta changes (#9240) 2021-12-10 08:13:26 -08:00
block_based Add support for wide-column point lookups (#10540) 2022-08-19 11:51:12 -07:00
cuckoo Add support for wide-column point lookups (#10540) 2022-08-19 11:51:12 -07:00
plain Add seqno to time mapping (#10338) 2022-07-14 21:49:34 -07:00
block_fetcher.cc Provide support for direct_reads with async_io (#10197) 2022-07-06 11:42:59 -07:00
block_fetcher.h Seek parallelization (#9994) 2022-05-20 16:09:33 -07:00
block_fetcher_test.cc Make MemoryAllocator into a Customizable class (#8980) 2021-12-17 04:20:47 -08:00
cleanable_test.cc Eliminate unnecessary (slow) block cache Ref()ing in MultiGet (#9899) 2022-04-26 21:59:24 -07:00
format.cc Set Read rate limiter priority dynamically and pass it to FS (#9996) 2022-05-18 19:41:44 -07:00
format.h Remove code that only compiles for Visual Studio versions older than 2015 (#10065) 2022-05-26 16:55:08 -07:00
get_context.cc Add support for wide-column point lookups (#10540) 2022-08-19 11:51:12 -07:00
get_context.h Add support for wide-column point lookups (#10540) 2022-08-19 11:51:12 -07:00
internal_iterator.h Make InternalKeyComparator not configurable (#10342) 2022-07-14 10:09:31 -07:00
iter_heap.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
iterator.cc Eliminate unnecessary (slow) block cache Ref()ing in MultiGet (#9899) 2022-04-26 21:59:24 -07:00
iterator_wrapper.h Reuse internal auto readhead_size at each Level (expect L0) for Iterations (#9056) 2021-11-10 16:20:04 -08:00
merger_test.cc Cleanup multiple implementations of VectorIterator (#8901) 2021-10-06 07:48:31 -07:00
merging_iterator.cc Seek parallelization (#9994) 2022-05-20 16:09:33 -07:00
merging_iterator.h Cleanup includes in dbformat.h (#8930) 2021-09-29 04:04:40 -07:00
meta_blocks.cc Add seqno to time mapping (#10338) 2022-07-14 21:49:34 -07:00
meta_blocks.h Tests for filter compatibility (#9773) 2022-04-06 15:54:40 -07:00
mock_table.cc Add rate limiter priority to ReadOptions (#9424) 2022-02-16 23:18:14 -08:00
mock_table.h Fix some minor issues in the Customizable infrastructure (#8566) 2021-08-19 10:10:47 -07:00
multiget_context.h MultiGet async IO across multiple levels (#10535) 2022-08-19 16:52:52 -07:00
persistent_cache_helper.cc New stable, fixed-length cache keys (#9126) 2021-12-16 17:15:13 -08:00
persistent_cache_helper.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
persistent_cache_options.h Use STATIC_AVOID_DESTRUCTION for static objects with non-trivial destructors (#9958) 2022-05-17 09:39:22 -07:00
scoped_arena_iterator.h Remove code that only compiles for Visual Studio versions older than 2015 (#10065) 2022-05-26 16:55:08 -07:00
sst_file_dumper.cc Remove customized naming from InternalKeyComparator (#10343) 2022-07-12 13:30:35 -07:00
sst_file_dumper.h Support using ZDICT_finalizeDictionary to generate zstd dictionary (#9857) 2022-05-20 12:09:09 -07:00
sst_file_reader.cc Fast path for detecting unchanged prefix_extractor (#9407) 2022-01-21 11:37:46 -08:00
sst_file_reader_test.cc Use the comparator from the sst file table properties in sst_dump_tool (#9491) 2022-02-08 12:15:35 -08:00
sst_file_writer.cc Add seqno to time mapping (#10338) 2022-07-14 21:49:34 -07:00
sst_file_writer_collectors.h Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
table_builder.h Add seqno to time mapping (#10338) 2022-07-14 21:49:34 -07:00
table_factory.cc Restore Regex support for ObjectLibrary::Register, rename new APIs to allow old one to be deprecated in the future (#9362) 2022-01-11 06:33:48 -08:00
table_properties.cc Add seqno to time mapping (#10338) 2022-07-14 21:49:34 -07:00
table_properties_internal.h Improve / clean up meta block code & integrity (#9163) 2021-11-18 11:43:44 -08:00
table_reader.h Break TableReader MultiGet into filter and lookup stages (#10432) 2022-08-04 12:51:57 -07:00
table_reader_bench.cc Add support for wide-column point lookups (#10540) 2022-08-19 11:51:12 -07:00
table_reader_caller.h Fix and detect headers with missing dependencies (#8893) 2021-09-10 10:00:26 -07:00
table_test.cc Add support for wide-column point lookups (#10540) 2022-08-19 11:51:12 -07:00
two_level_iterator.cc Clarify caching behavior for index and filter partitions (#9068) 2021-10-27 17:23:04 -07:00
two_level_iterator.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
unique_id.cc Derive cache keys from SST unique IDs (#10394) 2022-08-12 13:49:49 -07:00
unique_id_impl.h Derive cache keys from SST unique IDs (#10394) 2022-08-12 13:49:49 -07:00