rocksdb/db/compaction
Changyu Bi fd165c869d Add memtable per key-value checksum (#10281)
Summary:
Append per key-value checksum to internal key. These checksums are verified on read paths including Get, Iterator and during Flush. Get and Iterator will return `Corruption` status if there is a checksum verification failure. Flush will make DB become read-only upon memtable entry checksum verification failure.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10281

Test Plan:
- Added new unit test cases: `make check`
- Benchmark on memtable insert
```
TEST_TMPDIR=/dev/shm/memtable_write ./db_bench -benchmarks=fillseq -disable_wal=true -max_write_buffer_number=100 -num=10000000 -min_write_buffer_number_to_merge=100

# avg over 10 runs
Baseline: 1166936 ops/sec
memtable 2 bytes kv checksum : 1.11674e+06 ops/sec (-4%)
memtable 2 bytes kv checksum + write batch 8 bytes kv checksum: 1.08579e+06 ops/sec (-6.95%)
write batch 8 bytes kv checksum: 1.17979e+06 ops/sec (+1.1%)
```
-  Benchmark on only memtable read: ops/sec dropped 31% for `readseq` due to time spend on verifying checksum.
ops/sec for `readrandom` dropped ~6.8%.
```
# Readseq
sudo TEST_TMPDIR=/dev/shm/memtable_read ./db_bench -benchmarks=fillseq,readseq"[-X20]" -disable_wal=true -max_write_buffer_number=100 -num=10000000 -min_write_buffer_number_to_merge=100

readseq [AVG    20 runs] : 7432840 (± 212005) ops/sec;  822.3 (± 23.5) MB/sec
readseq [MEDIAN 20 runs] : 7573878 ops/sec;  837.9 MB/sec

With -memtable_protection_bytes_per_key=2:

readseq [AVG    20 runs] : 5134607 (± 119596) ops/sec;  568.0 (± 13.2) MB/sec
readseq [MEDIAN 20 runs] : 5232946 ops/sec;  578.9 MB/sec

# Readrandom
sudo TEST_TMPDIR=/dev/shm/memtable_read ./db_bench -benchmarks=fillrandom,readrandom"[-X10]" -disable_wal=true -max_write_buffer_number=100 -num=1000000 -min_write_buffer_number_to_merge=100
readrandom [AVG    10 runs] : 140236 (± 3938) ops/sec;    9.8 (± 0.3) MB/sec
readrandom [MEDIAN 10 runs] : 140545 ops/sec;    9.8 MB/sec

With -memtable_protection_bytes_per_key=2:
readrandom [AVG    10 runs] : 130632 (± 2738) ops/sec;    9.1 (± 0.2) MB/sec
readrandom [MEDIAN 10 runs] : 130341 ops/sec;    9.1 MB/sec
```

- Stress test: `python3 -u tools/db_crashtest.py whitebox --duration=1800`

Reviewed By: ajkr

Differential Revision: D37607896

Pulled By: cbi42

fbshipit-source-id: fdaefb475629d2471780d4a5f5bf81b44ee56113
2022-08-12 13:51:32 -07:00
..
clipping_iterator.h Make InternalKeyComparator not configurable (#10342) 2022-07-14 10:09:31 -07:00
clipping_iterator_test.cc Cleanup multiple implementations of VectorIterator (#8901) 2021-10-06 07:48:31 -07:00
compaction.cc Provide support for subcompactions with user-defined timestamps (#10344) 2022-07-31 11:39:16 -07:00
compaction.h Change `bottommost_temperture` to `last_level_temperture` (#10471) 2022-08-08 14:36:34 -07:00
compaction_iteration_stats.h Support readahead during compaction for blob files (#9187) 2021-11-19 17:53:47 -08:00
compaction_iterator.cc Add memtable per key-value checksum (#10281) 2022-08-12 13:51:32 -07:00
compaction_iterator.h Include minimal contextual information in `CompactionIterator` (#10505) 2022-08-09 17:07:24 -07:00
compaction_iterator_test.cc Tiered Compaction: per key placement support (#9964) 2022-07-13 20:54:49 -07:00
compaction_job.cc LOG more info on oldest snapshot and sequence numbers (#10454) 2022-08-12 13:08:50 -07:00
compaction_job.h Support subcmpct using reserved resources for round-robin priority (#10341) 2022-07-24 11:12:44 -07:00
compaction_job_stats_test.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
compaction_job_test.cc Remove unused fields from FileMetaData (temporarily) (#10443) 2022-08-01 17:56:13 -07:00
compaction_outputs.cc Add seqno to time mapping (#10338) 2022-07-14 21:49:34 -07:00
compaction_outputs.h Add seqno to time mapping (#10338) 2022-07-14 21:49:34 -07:00
compaction_picker.cc Support subcmpct using reserved resources for round-robin priority (#10341) 2022-07-24 11:12:44 -07:00
compaction_picker.h Support subcmpct using reserved resources for round-robin priority (#10341) 2022-07-24 11:12:44 -07:00
compaction_picker_fifo.cc Fix underflow in FIFOCompactionPicker (#10386) 2022-07-22 09:20:35 -07:00
compaction_picker_fifo.h Add OpenAndTrimHistory API to support trimming data with specified timestamp (#9410) 2022-03-11 16:13:23 -08:00
compaction_picker_level.cc Support subcmpct using reserved resources for round-robin priority (#10341) 2022-07-24 11:12:44 -07:00
compaction_picker_level.h Make ImmutableOptions struct that inherits from ImmutableCFOptions and ImmutableDBOptions (#8262) 2021-05-05 14:00:17 -07:00
compaction_picker_test.cc Improve universal compaction picker for tiered compaction (#10467) 2022-08-08 14:34:36 -07:00
compaction_picker_universal.cc Improve universal compaction picker for tiered compaction (#10467) 2022-08-08 14:34:36 -07:00
compaction_picker_universal.h Make ImmutableOptions struct that inherits from ImmutableCFOptions and ImmutableDBOptions (#8262) 2021-05-05 14:00:17 -07:00
compaction_service_job.cc Improve SubCompaction Partitioning (#10393) 2022-07-23 17:38:49 -07:00
compaction_service_test.cc Fix failed VerifySstUniqueIds unittests (#10043) 2022-05-24 09:00:06 -07:00
compaction_state.cc Tiered Compaction: per key placement support (#9964) 2022-07-13 20:54:49 -07:00
compaction_state.h Tiered Compaction: per key placement support (#9964) 2022-07-13 20:54:49 -07:00
file_pri.h Try to start TTL earlier with kMinOverlappingRatio is used (#8749) 2021-11-01 14:36:31 -07:00
sst_partitioner.cc Restore Regex support for ObjectLibrary::Register, rename new APIs to allow old one to be deprecated in the future (#9362) 2022-01-11 06:33:48 -08:00
subcompaction_state.cc Tiered Compaction: per key placement support (#9964) 2022-07-13 20:54:49 -07:00
subcompaction_state.h Support subcmpct using reserved resources for round-robin priority (#10341) 2022-07-24 11:12:44 -07:00
tiered_compaction_test.cc Change `bottommost_temperture` to `last_level_temperture` (#10471) 2022-08-08 14:36:34 -07:00