rocksdb/utilities/blob_db
anand76 fefd4b98c5 Introduce a new MultiGet batching implementation (#5011)
Summary:
This PR introduces a new MultiGet() API whose underlying implementation groups keys by SST file and batches the lookups within each file. The reason for the new API is twofold. First, its definition lets callers allocate storage for statuses and values on the stack instead of in a std::vector, and it returns values as PinnableSlices to avoid copying. Second, it keeps the original MultiGet() implementation intact while we experiment with batching.
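The grouping step described above can be illustrated with a minimal, self-contained sketch. It is not the RocksDB implementation: `FileRange` and `GroupKeysByFile` are hypothetical names, and the sketch assumes the files of a level are sorted and have non-overlapping key ranges.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the key range covered by one SST file.
// This is an illustration only, not a RocksDB type.
struct FileRange {
  std::string smallest;
  std::string largest;
};

// Groups keys by the index of the file whose [smallest, largest] range
// contains them. Keys that fall in no file are skipped.
// Assumption: `files` is sorted and the ranges do not overlap.
std::map<std::size_t, std::vector<std::string>> GroupKeysByFile(
    const std::vector<FileRange>& files,
    const std::vector<std::string>& keys) {
  for (std::size_t i = 1; i < files.size(); ++i) {
    assert(files[i - 1].largest < files[i].smallest);
  }
  std::map<std::size_t, std::vector<std::string>> batches;
  for (const std::string& key : keys) {
    // Binary search for the first file whose largest key is >= key.
    auto it = std::lower_bound(
        files.begin(), files.end(), key,
        [](const FileRange& f, const std::string& k) { return f.largest < k; });
    if (it != files.end() && it->smallest <= key) {
      batches[static_cast<std::size_t>(it - files.begin())].push_back(key);
    }
  }
  return batches;
}
```

Each entry of the returned map is one batch that could then be looked up in a single pass over that file, which is where the savings in function calls would come from.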

Batching is useful when the keys being queried have some spatial locality and when batch sizes are larger. The main benefits are:
1. Fewer function calls, especially to BlockBasedTableReader::MultiGet() and FullFilterBlockReader::KeysMayMatch()
2. Bloom filter cachelines can be prefetched, hiding the cache miss latency

The next step is to optimize the binary searches in the level_storage_info, index blocks and data blocks, since we could reduce the number of key comparisons if the keys are relatively close to each other. The batching optimizations also need to be extended to other formats, such as PlainTable and filter formats. This also needs to be added to db_stress.

Benchmark results from db_bench for various combinations of batch size and locality of reference are given below. Locality was simulated by offsetting the keys in a batch by a stride length. Each SST file is about 8.6MB uncompressed, and the key/value size is 16/100 bytes uncompressed. To focus on the CPU benefit of batching, the runs were single-threaded and bound to the same CPU to eliminate interference from other system events. The results show a 10-25% improvement in micros/op from smaller to larger batch sizes (4 - 32).
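As a rough illustration of how a stride can simulate locality, the sketch below builds one batch of key indices. It is an assumption about the general idea, not the db_bench code: `MakeBatch` is a hypothetical helper, stride 0 is taken to mean independently random keys, and a non-zero stride spaces the keys evenly from a random starting point, wrapping around at the key count.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper (not part of db_bench): returns the key indices for one
// batch. With stride == 0 every key is drawn independently at random; with a
// non-zero stride the keys start at a random index and are `stride` apart,
// wrapping around at `num_keys`.
std::vector<std::uint64_t> MakeBatch(std::mt19937_64& rng,
                                     std::uint64_t num_keys,
                                     std::size_t batch_size,
                                     std::uint64_t stride) {
  std::uniform_int_distribution<std::uint64_t> pick(0, num_keys - 1);
  std::vector<std::uint64_t> keys;
  keys.reserve(batch_size);
  if (stride == 0) {
    for (std::size_t i = 0; i < batch_size; ++i) {
      keys.push_back(pick(rng));
    }
    return keys;
  }
  std::uint64_t key = pick(rng);
  for (std::size_t i = 0; i < batch_size; ++i) {
    keys.push_back(key);
    key = (key + stride) % num_keys;
  }
  return keys;
}
```

A smaller stride keeps the keys of a batch closer together, so more of them are likely to land in the same SST file or data block.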

Results in micros/op, by batch size:

Random pattern (Stride length 0)
Batch size              |     1 |     2 |     4 |     8 |    16 |    32
Get                     | 4.158 | 4.109 | 4.026 | 4.05  | 4.1   | 4.074
MultiGet (no batching)  | 4.438 | 4.302 | 4.165 | 4.122 | 4.096 | 4.075
MultiGet (w/ batching)  | 4.461 | 4.256 | 4.277 | 4.11  | 4.182 | 4.14

Good locality (Stride length 16)
Batch size              |     1 |     2 |     4 |     8 |    16 |    32
Get                     | 4.048 | 3.659 | 3.248 | 2.99  | 2.84  | 2.753
MultiGet (no batching)  | 4.429 | 3.728 | 3.406 | 3.053 | 2.911 | 2.781
MultiGet (w/ batching)  | 4.452 | 3.45  | 2.833 | 2.451 | 2.233 | 2.135

Good locality (Stride length 256)
Batch size              |     1 |     2 |     4 |     8 |    16 |    32
Get                     | 4.066 | 3.786 | 3.581 | 3.447 | 3.415 | 3.232
MultiGet (no batching)  | 4.406 | 4.005 | 3.644 | 3.49  | 3.381 | 3.268
MultiGet (w/ batching)  | 4.393 | 3.649 | 3.186 | 2.882 | 2.676 | 2.62

Medium locality (Stride length 4096)
Batch size              |     1 |     2 |     4 |     8 |    16 |    32
Get                     | 4.012 | 3.922 | 3.768 | 3.61  | 3.582 | 3.555
MultiGet (no batching)  | 4.364 | 4.057 | 3.791 | 3.65  | 3.57  | 3.465
MultiGet (w/ batching)  | 4.479 | 3.758 | 3.316 | 3.077 | 2.959 | 2.891
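The improvement percentages can be checked against the figures above with a small calculation. The sketch assumes that, within each stride group, the second row is MultiGet without batching and the third row is MultiGet with batching, following the order of the labelled rows in the first group. `ImprovementPercent` is a hypothetical helper name.

```cpp
// Percentage reduction in micros/op when going from `baseline` to `batched`.
// Hypothetical helper for checking the table; not part of db_bench.
double ImprovementPercent(double baseline, double batched) {
  return (baseline - batched) / baseline * 100.0;
}

// Batch size 32, taken from the table (assumed row order: no batching, then
// batching).
const double kStride16 = ImprovementPercent(2.781, 2.135);    // about 23%
const double kStride256 = ImprovementPercent(3.268, 2.62);    // about 20%
const double kStride4096 = ImprovementPercent(3.465, 2.891);  // about 17%
```

Under that assumption, the batch-size-32 results fall inside the 10-25% range quoted in the summary.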

db_bench command used (on a DB with 4 levels and 12 million keys):
TEST_TMPDIR=/dev/shm numactl -C 10  ./db_bench.tmp -use_existing_db=true -benchmarks="readseq,multireadrandom" -write_buffer_size=4194304 -target_file_size_base=4194304 -max_bytes_for_level_base=16777216 -num=12000000 -reads=12000000 -duration=90 -threads=1 -compression_type=none -cache_size=4194304000 -batch_size=32 -disable_auto_compactions=true -bloom_bits=10 -cache_index_and_filter_blocks=true -pin_l0_filter_and_index_blocks_in_cache=true -multiread_batched=true -multiread_stride=4
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5011

Differential Revision: D14348703

Pulled By: anand1976

fbshipit-source-id: 774406dab3776d979c809522a67bedac6c17f84b
2019-04-11 14:28:26 -07:00
blob_compaction_filter.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
blob_compaction_filter.h Blob DB: Improve FIFO eviction 2018-03-06 11:57:42 -08:00
blob_db.cc BlobDB: Remove GC interval option (#5044) 2019-03-07 10:19:05 -08:00
blob_db.h Introduce a new MultiGet batching implementation (#5011) 2019-04-11 14:28:26 -07:00
blob_db_impl.cc Fix many bugs in log statement arguments (#5089) 2019-04-04 12:12:11 -07:00
blob_db_impl.h BlobDB::Open() should put all existing trash files to delete scheduler (#5103) 2019-03-26 10:53:19 -07:00
blob_db_impl_filesnapshot.cc Support pragma once in all header files and cleanup some warnings (#4339) 2018-09-05 18:13:31 -07:00
blob_db_iterator.h Change and clarify the relationship between Valid(), status() and Seek*() for all iterators. Also fix some bugs 2018-05-17 02:56:56 -07:00
blob_db_listener.h Blob DB: Improve FIFO eviction 2018-03-06 11:57:42 -08:00
blob_db_test.cc Smooth the deletion of WAL files (#5116) 2019-03-28 15:17:13 -07:00
blob_dump_tool.cc Digest ZSTD compression dictionary once when writing SST file (#4849) 2019-01-18 19:12:57 -08:00
blob_dump_tool.h Ensure delete[] and not delete is used on buffer_ (#4647) 2018-11-07 11:59:50 -08:00
blob_file.cc Fix many bugs in log statement arguments (#5089) 2019-04-04 12:12:11 -07:00
blob_file.h BlobDB: handle IO error on read (#4410) 2018-09-20 16:58:45 -07:00
blob_index.h Blob DB: Inline small values in base DB 2017-10-26 12:30:54 -07:00
blob_log_format.cc utilities: Fix build failure with -Werror=maybe-uninitialized (#5074) 2019-03-18 11:35:06 -07:00
blob_log_format.h BlobDB: use char array instead of string as buffer (#4662) 2018-11-13 12:49:29 -08:00
blob_log_reader.cc Remove some "using std::..." from header files. (#5113) 2019-03-27 10:28:21 -07:00
blob_log_reader.h BlobDB: use char array instead of string as buffer (#4662) 2018-11-13 12:49:29 -08:00
blob_log_writer.cc Remove some "using std::..." from header files. (#5113) 2019-03-27 10:28:21 -07:00
blob_log_writer.h BlobDB: refactor DB open logic 2017-12-11 12:12:38 -08:00