rocksdb/options
Changyu Bi defd97bc9d Add an option to verify memtable key order during reads (#12889)
Summary:
Add a new column family (CF) option `paranoid_memory_checks` that enables additional data integrity validation during reads and scans. Currently, the skiplist-based memtable validates the order of the keys it visits. Further data validation can be added in other layers. The option is opt-in because of its performance overhead.

The motivation for this feature is services where data correctness is critical and in-memory corruption should be detected as early as possible. For a corrupted memtable key, this feature can detect the corruption during reads instead of during flush, which is where the existing protections apply (OutputValidator, which verifies key order, or per-KV checksums). See internally linked task for more context.
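
The idea of validating key order while reading can be illustrated with a small standalone sketch. This is not the RocksDB skiplist code: `ValidatedScan` and its parameters are hypothetical names, a sorted `std::vector` stands in for the memtable, and plain `std::string` comparison stands in for the comparator. It only shows how an opt-in check can stop a scan at the first key that is not greater than its predecessor.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a memtable scan (hypothetical, not RocksDB's API).
// `keys` is expected to be in strictly ascending order, as a skiplist
// would keep them. When `paranoid_checks` is true, each key is compared
// with its predecessor before it is returned to the caller.
// Returns false as soon as an out-of-order key is seen; keys visited
// before that point are left in `visited`.
bool ValidatedScan(const std::vector<std::string>& keys,
                   bool paranoid_checks,
                   std::vector<std::string>* visited) {
  assert(visited != nullptr);
  for (std::size_t i = 0; i < keys.size(); ++i) {
    if (paranoid_checks && i > 0 && !(keys[i - 1] < keys[i])) {
      return false;  // corruption detected during the read itself
    }
    visited->push_back(keys[i]);
  }
  return true;
}
```

With the check disabled, the same scan returns every key without noticing the misordering, which mirrors the trade-off described above: the extra comparison per visited key is what buys earlier detection.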

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12889

Test Plan:
* new unit test added for paranoid_memory_checks=true.
* existing unit test for paranoid_memory_checks=false.
* enable in stress test.

Performance Benchmark: we check for performance regression in the read path when all data is in the memtable. For each benchmark, the script was run at the same time for main and for this PR:
* Memtable-only readrandom ops/sec:
```
(for I in $(seq 1 50);do ./db_bench --benchmarks=fillseq,readrandom --write_buffer_size=268435456 --writes=250000 --num=250000 --reads=500000  --seed=1723056275 2>&1 | grep "readrandom"; done;) | awk '{ t += $5; c++; print } END { print 1.0 * t / c }';

Main: 608146
PR with paranoid_memory_checks=false: 607727 (-0.07%)
PR with paranoid_memory_checks=true: 521889 (-14.2%)
```

* Memtable-only sequential scan ops/sec:
```
(for I in $(seq 1 50); do ./db_bench --benchmarks=fillseq,readseq[-X10] --write_buffer_size=268435456 --num=1000000  --seed=1723056275 2>&1 | grep "\[AVG 10 runs\]"; done;) | awk '{ t += $6; c++; print; } END { printf "%.0f\n", 1.0 * t / c }';

Main: 9180077
PR with paranoid_memory_checks=false: 9536241 (+3.8%)
PR with paranoid_memory_checks=true: 7653934 (-16.6%)
```

* Memtable-only reverse scan ops/sec:
```
(for I in $(seq 1 20); do ./db_bench --benchmarks=fillseq,readreverse[-X10] --write_buffer_size=268435456 --num=1000000  --seed=1723056275 2>&1 | grep "\[AVG 10 runs\]"; done;) | awk '{ t += $6; c++; print; } END { printf "%.0f\n", 1.0 * t / c }';

Main: 1285719
PR with paranoid_memory_checks=false: 1431626 (+11.3%)
PR with paranoid_memory_checks=true: 811031 (-36.9%)
```

With `paranoid_memory_checks=false`, the `readrandom` benchmark shows no regression. The scanning benchmarks show an improvement with the option disabled that I can't explain.
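
For readers who want to recheck the percentages quoted in the benchmark blocks, the relative change against main can be computed as below. This is a minimal sketch; `PercentChange` is a hypothetical helper name, not part of db_bench or RocksDB.

```cpp
#include <cassert>
#include <cmath>

// Relative change of `measured` against `baseline`, in percent.
// Negative values mean fewer ops/sec than the baseline.
double PercentChange(double baseline, double measured) {
  assert(baseline > 0.0);
  return (measured - baseline) / baseline * 100.0;
}
```

For example, `PercentChange(608146, 521889)` evaluates to roughly -14.2, matching the reported `readrandom` figure with `paranoid_memory_checks=true`.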

Reviewed By: pdillinger

Differential Revision: D60414267

Pulled By: cbi42

fbshipit-source-id: a70b0cbeea131f1a249a5f78f9dc3a62dacfaa91
2024-08-19 13:53:25 -07:00
cf_options.cc Add an option to verify memtable key order during reads (#12889) 2024-08-19 13:53:25 -07:00
cf_options.h Add an option to verify memtable key order during reads (#12889) 2024-08-19 13:53:25 -07:00
configurable.cc internal_repo_rocksdb (-8794174668376270091) (#12114) 2023-12-01 11:10:30 -08:00
configurable_helper.h Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
configurable_test.cc Run internal cpp modernizer on RocksDB repo (#12398) 2024-03-04 10:08:32 -08:00
configurable_test.h internal_repo_rocksdb (-8794174668376270091) (#12114) 2023-12-01 11:10:30 -08:00
customizable.cc Prefer static_cast in place of most reinterpret_cast (#12308) 2024-02-07 10:44:11 -08:00
customizable_test.cc Add some compressed and tiered secondary cache stats (#12150) 2023-12-15 11:34:08 -08:00
db_options.cc Ensure Close() before LinkFile() for WALs in Checkpoint (#12734) 2024-06-12 11:48:45 -07:00
db_options.h Remove unreachable code (#12846) 2024-07-09 09:24:43 -07:00
offpeak_time_info.cc Mark more files for periodic compaction during offpeak (#12031) 2023-11-06 11:43:59 -08:00
offpeak_time_info.h Fix build on alpine 3.19 (#12345) 2024-02-12 11:24:56 -08:00
options.cc Improve universal compaction sorted-run trigger (#12477) 2024-05-24 10:10:31 -07:00
options_helper.cc Add an option to verify memtable key order during reads (#12889) 2024-08-19 13:53:25 -07:00
options_helper.h Remove unnecessary, confusing 'extern' (#12300) 2024-01-29 10:38:08 -08:00
options_parser.cc Group SST write in flush, compaction and db open with new stats (#11910) 2023-12-29 15:29:23 -08:00
options_parser.h Group SST write in flush, compaction and db open with new stats (#11910) 2023-12-29 15:29:23 -08:00
options_settable_test.cc Add an option to verify memtable key order during reads (#12889) 2024-08-19 13:53:25 -07:00
options_test.cc default_write_temperature option (#12388) 2024-02-28 14:36:13 -08:00