rocksdb/table
Yu Zhang 15053f3ab4 Logically strip timestamp during flush (#11557)
Summary:
Logically strip the user-defined timestamp when L0 files are created during flush and `AdvancedColumnFamilyOptions.persist_user_defined_timestamps` is false. Logically stripping the timestamp here means replacing the original user-defined timestamp with a minimum timestamp, which for now is hard-coded to be all zero bytes.
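
As a minimal illustration of what "logically stripping" means, the sketch below overwrites a trailing timestamp with zero bytes while keeping the key length unchanged. It assumes the timestamp occupies the last `ts_sz` bytes of the key it is given; `ReplaceTimestampWithMin` is a hypothetical helper written for this example, not a RocksDB function, and it is not the code used in this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not a RocksDB API). Returns a copy of `key_with_ts`
// whose trailing `ts_sz` bytes, assumed to hold the user-defined timestamp,
// are overwritten with zero bytes, i.e. the minimum timestamp.
std::string ReplaceTimestampWithMin(const std::string& key_with_ts,
                                    std::size_t ts_sz) {
  assert(key_with_ts.size() >= ts_sz);
  std::string result = key_with_ts;
  // Replace the last ts_sz bytes with ts_sz copies of '\0'.
  result.replace(result.size() - ts_sz, ts_sz, ts_sz, '\0');
  return result;
}
```

With an 8-byte timestamp, a key such as `"foo"` followed by eight non-zero bytes would come back as `"foo"` followed by eight zero bytes, still 11 bytes long.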

While working on this, I caught a missing piece at the `BlockBuilder` level for this feature. The current quick path `std::min(buffer_size, last_key_size)` needs a small tweak to work with this feature. When the user-defined timestamp is stripped during block building, `buffer` is empty and `buffer_size` is zero on the first write or right after a reset, as usual. However, in follow-up writes, depending on the size of the stripped user-defined timestamp and the size of the value, what is in `buffer` can sometimes be smaller than `last_key_size`, causing `std::min(buffer_size, last_key_size)` to truncate `last_key`. Previous tests did not catch the bug because, in those tests, the size of the stripped user-defined timestamp bytes is smaller than the length of the value. To keep avoiding a conditional operation, this PR changes the original trivial `std::min` operation into an arithmetic operation. Since this is a change in a hot, performance-critical path, I ran the following benchmark to check that no observable regression is introduced.
```
TEST_TMPDIR=/dev/shm/rocksdb1 ./db_bench -benchmarks=fillseq -memtablerep=vector -allow_concurrent_memtable_write=false -num=50000000
```
Compiled with DEBUG_LEVEL=0
Test vs. control runs executed simultaneously for better accuracy, units = ops/sec
                       PR  vs base:
Round 1: 350652 vs 349055
Round 2: 365733 vs 364308
Round 3: 355681 vs 354475
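
The truncation described in the summary can be reproduced with plain integers. The sketch below compares the `std::min` quick path with one possible branch-free arithmetic form; both function names are hypothetical, and the arithmetic expression is an assumption chosen for illustration, not necessarily the exact expression used in this PR.

```cpp
#include <algorithm>
#include <cstddef>

// Quick path from the summary: yields 0 for an empty buffer, but also
// shrinks the result whenever buffer_size < last_key_size.
std::size_t LastKeyLenWithMin(std::size_t buffer_size,
                              std::size_t last_key_size) {
  return std::min(buffer_size, last_key_size);
}

// Assumed branch-free alternative: multiply by 0 when the buffer is empty
// and by 1 otherwise, so a non-empty buffer never truncates the last key.
std::size_t LastKeyLenArithmetic(std::size_t buffer_size,
                                 std::size_t last_key_size) {
  return last_key_size * static_cast<std::size_t>(buffer_size > 0);
}
```

For example, with `buffer_size = 12` and `last_key_size = 20`, the `std::min` form returns 12 and would cut the last key short, while the arithmetic form returns 20. Both return 0 when `buffer_size` is 0.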

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11557

Test Plan:
New timestamp-specific tests were added and existing tests were augmented; both are parameterized with `UserDefinedTimestampTestMode`:
`UserDefinedTimestampTestMode::kNormal` -> UDT feature enabled, write / read with min timestamp
`UserDefinedTimestampTestMode::kStripUserDefinedTimestamps` -> UDT feature enabled, write / read with min timestamp, set `Options.persist_user_defined_timestamps` to false.

```
make all check
./db_wal_test --gtest_filter="*WithTimestamp*"
./flush_job_test --gtest_filter="*WithTimestamp*"
./repair_test --gtest_filter="*WithTimestamp*"
./block_based_table_reader_test
```

Reviewed By: pdillinger

Differential Revision: D47027664

Pulled By: jowlyzhang

fbshipit-source-id: e729193b6334dfc63aaa736d684d907a022571f5
2023-06-29 15:50:50 -07:00
..
adaptive
block_based Logically strip timestamp during flush (#11557) 2023-06-29 15:50:50 -07:00
cuckoo Group rocksdb.sst.read.micros stat by IOActivity flush and compaction (#11288) 2023-04-21 09:07:18 -07:00
plain Group rocksdb.sst.read.micros stat by IOActivity flush and compaction (#11288) 2023-04-21 09:07:18 -07:00
block_fetcher.cc Change internal headers with duplicate names (#11408) 2023-05-17 11:27:09 -07:00
block_fetcher.h Change internal headers with duplicate names (#11408) 2023-05-17 11:27:09 -07:00
block_fetcher_test.cc Record and use the tail size to prefetch table tail (#11406) 2023-05-08 13:14:28 -07:00
cleanable_test.cc
compaction_merging_iterator.cc
compaction_merging_iterator.h
format.cc Change internal headers with duplicate names (#11408) 2023-05-17 11:27:09 -07:00
format.h Change internal headers with duplicate names (#11408) 2023-05-17 11:27:09 -07:00
get_context.cc Change internal headers with duplicate names (#11408) 2023-05-17 11:27:09 -07:00
get_context.h
internal_iterator.h
iter_heap.h
iterator.cc
iterator_wrapper.h
merger_test.cc
merging_iterator.cc
merging_iterator.h
meta_blocks.cc Record the persist_user_defined_timestamps flag in manifest (#11515) 2023-06-21 21:49:01 -07:00
meta_blocks.h Group rocksdb.sst.read.micros stat by IOActivity flush and compaction (#11288) 2023-04-21 09:07:18 -07:00
mock_table.cc Group rocksdb.sst.read.micros stat by IOActivity flush and compaction (#11288) 2023-04-21 09:07:18 -07:00
mock_table.h
multiget_context.h
persistent_cache_helper.cc
persistent_cache_helper.h Change internal headers with duplicate names (#11408) 2023-05-17 11:27:09 -07:00
persistent_cache_options.h Change internal headers with duplicate names (#11408) 2023-05-17 11:27:09 -07:00
scoped_arena_iterator.h
sst_file_dumper.cc Fix crash_test crash (#11554) 2023-06-22 12:36:22 -07:00
sst_file_dumper.h
sst_file_reader.cc Block per key-value checksum (#11287) 2023-04-25 12:08:23 -07:00
sst_file_reader_test.cc
sst_file_writer.cc validate SstFileWriter range tombstones cover positive ranges (#11322) 2023-03-22 21:03:13 -07:00
sst_file_writer_collectors.h
table_builder.h Add support to strip / pad timestamp when creating / reading a block based table (#11495) 2023-06-01 11:10:03 -07:00
table_factory.cc
table_properties.cc Record the persist_user_defined_timestamps flag in manifest (#11515) 2023-06-21 21:49:01 -07:00
table_properties_internal.h
table_reader.h Group rocksdb.sst.read.micros stat by IOActivity flush and compaction (#11288) 2023-04-21 09:07:18 -07:00
table_reader_bench.cc Block per key-value checksum (#11287) 2023-04-25 12:08:23 -07:00
table_test.cc Change internal headers with duplicate names (#11408) 2023-05-17 11:27:09 -07:00
two_level_iterator.cc
two_level_iterator.h
unique_id.cc
unique_id_impl.h