mirror of
https://github.com/facebook/rocksdb.git
synced 2024-11-27 02:44:18 +00:00
15053f3ab4
Summary: Logically strip the user-defined timestamp when L0 files are created during flush when `AdvancedColumnFamilyOptions.persist_user_defined_timestamps` is false. Logically stripping timestamp here means replacing the original user-defined timestamp with a mininum timestamp, which for now is hard coded to be all zeros bytes. While working on this, I caught a missing piece on the `BlockBuilder` level for this feature. The current quick path `std::min(buffer_size, last_key_size)` needs a bit tweaking to work for this feature. When user-defined timestamp is stripped during block building, on writing first entry or right after resetting, `buffer` is empty and `buffer_size` is zero as usual. However, in follow-up writes, depending on the size of the stripped user-defined timestamp, and the size of the value, what's in `buffer` can sometimes be smaller than `last_key_size`, leading `std::min(buffer_size, last_key_size)` to truncate the `last_key`. Previous test doesn't caught the bug because in those tests, the size of the stripped user-defined timestamps bytes is smaller than the length of the value. In order to avoid the conditional operation, this PR changed the original trivial `std::min` operation into an arithmetic operation. Since this is a change in a hot and performance critical path, I did the following benchmark to check no observable regression is introduced. ```TEST_TMPDIR=/dev/shm/rocksdb1 ./db_bench -benchmarks=fillseq -memtablerep=vector -allow_concurrent_memtable_write=false -num=50000000``` Compiled with DEBUG_LEVEL=0 Test vs. control runs simulaneous for better accuracy, units = ops/sec PR vs base: Round 1: 350652 vs 349055 Round 2: 365733 vs 364308 Round 3: 355681 vs 354475 Pull Request resolved: https://github.com/facebook/rocksdb/pull/11557 Test Plan: New timestamp specific test added or existing tests augmented, both are parameterized with `UserDefinedTimestampTestMode`: `UserDefinedTimestampTestMode::kNormal` -> UDT feature enabled, write / read with min timestamp `UserDefinedTimestampTestMode::kStripUserDefinedTimestamps` -> UDT feature enabled, write / read with min timestamp, set Options.persist_user_defined_timestamps to false. ``` make all check ./db_wal_test --gtest_filter="*WithTimestamp*" ./flush_job_test --gtest_filter="*WithTimestamp*" ./repair_test --gtest_filter="*WithTimestamp*" ./block_based_table_reader_test ``` Reviewed By: pdillinger Differential Revision: D47027664 Pulled By: jowlyzhang fbshipit-source-id: e729193b6334dfc63aaa736d684d907a022571f5 |
||
---|---|---|
.. | ||
binary_search_index_reader.cc | ||
binary_search_index_reader.h | ||
block.cc | ||
block.h | ||
block_based_table_builder.cc | ||
block_based_table_builder.h | ||
block_based_table_factory.cc | ||
block_based_table_factory.h | ||
block_based_table_iterator.cc | ||
block_based_table_iterator.h | ||
block_based_table_reader.cc | ||
block_based_table_reader.h | ||
block_based_table_reader_impl.h | ||
block_based_table_reader_sync_and_async.h | ||
block_based_table_reader_test.cc | ||
block_builder.cc | ||
block_builder.h | ||
block_cache.cc | ||
block_cache.h | ||
block_prefetcher.cc | ||
block_prefetcher.h | ||
block_prefix_index.cc | ||
block_prefix_index.h | ||
block_test.cc | ||
block_type.h | ||
cachable_entry.h | ||
data_block_footer.cc | ||
data_block_footer.h | ||
data_block_hash_index.cc | ||
data_block_hash_index.h | ||
data_block_hash_index_test.cc | ||
filter_block.h | ||
filter_block_reader_common.cc | ||
filter_block_reader_common.h | ||
filter_policy.cc | ||
filter_policy_internal.h | ||
flush_block_policy.cc | ||
flush_block_policy_impl.h | ||
full_filter_block.cc | ||
full_filter_block.h | ||
full_filter_block_test.cc | ||
hash_index_reader.cc | ||
hash_index_reader.h | ||
index_builder.cc | ||
index_builder.h | ||
index_reader_common.cc | ||
index_reader_common.h | ||
mock_block_based_table.h | ||
parsed_full_filter_block.cc | ||
parsed_full_filter_block.h | ||
partitioned_filter_block.cc | ||
partitioned_filter_block.h | ||
partitioned_filter_block_test.cc | ||
partitioned_index_iterator.cc | ||
partitioned_index_iterator.h | ||
partitioned_index_reader.cc | ||
partitioned_index_reader.h | ||
reader_common.cc | ||
reader_common.h | ||
uncompression_dict_reader.cc | ||
uncompression_dict_reader.h |