rocksdb/db
Jay Huh 909ff2c208 MultiCFSnapshot Refactor - separate multiget key range info from CFD & superversion info (#12561)
Summary:
While implementing MultiCfIterators (CoalescingIterator and AttributeGroupIterator), we found that the existing `NewIterators()` API does not ensure a uniform view of the DB across all column families. `NewIterators()` is used to create the child iterators for the MultiCfIterators, and all child iterators are expected to share a consistent view of the DB.

For example, within the loop that obtains the super version for each CF, if a CF undergoes compaction after the super versions for earlier CFs have already been retrieved, the iterators no longer share a consistent view of the CFs, because the API does not run under the DB mutex.
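
To make the race concrete, here is a minimal toy model of one way to collect a consistent per-CF view: read optimistically, validate that nothing changed during the loop, retry, and fall back to holding the mutex. `ToyDB`, `InstallNewVersion`, `ConsistentView`, and the `between_reads` hook are all hypothetical names invented for this sketch; it is not RocksDB code and makes no claim about how `MultiCFSnapshot` is implemented.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <mutex>
#include <vector>

// Toy model only: versions_[i] stands in for the super version of CF i.
class ToyDB {
 public:
  explicit ToyDB(size_t num_cfs) : versions_(num_cfs) {}

  // Simulates a flush or compaction installing a new version for one CF.
  void InstallNewVersion(size_t cf) {
    assert(cf < versions_.size());
    std::lock_guard<std::mutex> guard(mu_);
    epoch_.fetch_add(1);         // announce the change first
    versions_[cf].fetch_add(1);  // then publish the new version
  }

  // Returns one version per CF. `between_reads` is a test hook, called after
  // each optimistic read, that lets a caller inject a concurrent change.
  std::vector<uint64_t> ConsistentView(
      int max_optimistic_tries,
      const std::function<void(size_t)>& between_reads = nullptr) {
    std::vector<uint64_t> view(versions_.size());
    for (int attempt = 0; attempt < max_optimistic_tries; ++attempt) {
      const uint64_t before = epoch_.load();
      for (size_t i = 0; i < versions_.size(); ++i) {
        view[i] = versions_[i].load();
        if (between_reads) between_reads(i);
      }
      // If the epoch did not move, no version was installed mid-loop.
      if (epoch_.load() == before) return view;
    }
    // Last resort: hold the mutex so no writer can interleave.
    std::lock_guard<std::mutex> guard(mu_);
    for (size_t i = 0; i < versions_.size(); ++i) {
      view[i] = versions_[i].load();
    }
    return view;
  }

 private:
  std::mutex mu_;
  std::atomic<uint64_t> epoch_{0};
  std::vector<std::atomic<uint64_t>> versions_;
};
```

If the hook installs new versions for CF 0 and CF 1 right after CF 0 has been read, a single unvalidated pass would return a mix of old and new versions that the DB never held at any one moment. The epoch check rejects that pass and the retry returns the current state.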

This preliminary refactoring of `MultiCFSnapshot` prepares for fixing this issue in the `NewIterators()` API in a later PR. Currently, `MultiCFSnapshot` is used to achieve a consistent view across CFs in `MultiGet`. `MultiGetColumnFamilyData` contains MultiGet-specific information that can be decoupled from the cfd and sv, allowing `MultiCFSnapshot` to be used in other places.
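
The decoupling idea can be sketched as follows: keep the CF and super version in one small struct, keep the MultiGet key range in another, and let the acquisition routine work over any container through an accessor. All names here (`ToySuperVersion`, `ToyColumnFamily`, `CfAndSv`, `KeyRange`, `AcquireSuperVersions`) are hypothetical stand-ins chosen for illustration and should not be read as the types or signatures introduced by this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; not RocksDB's real types.
struct ToySuperVersion {
  int number;
};
struct ToyColumnFamily {
  std::string name;
  ToySuperVersion current;
};

// CF plus super version only: what any multi-CF reader needs.
struct CfAndSv {
  ToyColumnFamily* cf;
  const ToySuperVersion* sv;
};

// MultiGet-only bookkeeping: the slice of the sorted key array for one CF.
struct KeyRange {
  size_t start;
  size_t num_keys;
};

// Generic over the container: `get_entry` maps an element to its CfAndSv,
// so callers may store extra per-CF data (such as a KeyRange) alongside it.
template <typename Container, typename GetEntry>
size_t AcquireSuperVersions(Container& cf_list, GetEntry get_entry) {
  size_t acquired = 0;
  for (auto& element : cf_list) {
    CfAndSv& entry = get_entry(element);
    assert(entry.cf != nullptr);
    entry.sv = &entry.cf->current;  // toy "acquire": point at current version
    ++acquired;
  }
  return acquired;
}
```

With this split, an iterator-style caller can pass a plain `std::vector<CfAndSv>` with an identity accessor, while a MultiGet-style caller can pass a `std::vector<std::pair<CfAndSv, KeyRange>>` with an accessor that returns `.first`. The acquisition routine never sees the key range.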

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12561

Test Plan:
**Existing Unit Tests for `MultiCFSnapshot()`**

```
./db_basic_test -- --gtest_filter="*MultiGet*"
```

**Performance Test**

Setup
```
make -j64 release

TEST_TMPDIR=/dev/shm/db_bench ./db_bench -benchmarks="filluniquerandom" -key_size=32 -value_size=512 -num=10000000 -compression_type=none
```
Run
```
TEST_TMPDIR=/dev/shm/db_bench ./db_bench -use_existing_db=1 -benchmarks="multireadrandom" -cache_size=10485760000
```
Before the change
```
DB path: [/dev/shm/db_bench/dbbench]
multireadrandom :       4.760 micros/op 210072 ops/sec 4.760 seconds 1000000 operations; (0 of 1000000 found)
```

After the change
```
DB path: [/dev/shm/db_bench/dbbench]
multireadrandom :       4.593 micros/op 217727 ops/sec 4.593 seconds 1000000 operations; (0 of 1000000 found)
```

Reviewed By: anand1976

Differential Revision: D56309422

Pulled By: jaykorean

fbshipit-source-id: 7a9164d12c810b6c2d2db062827fcc4a36cbc77b
2024-04-18 20:11:01 -07:00
blob Fix kBlockCacheTier read when merge-chain base value is in a blob file (#12462) 2024-03-21 12:38:53 -07:00
compaction Limit compaction input files expansion (#12484) 2024-03-29 11:34:29 -07:00
db_impl MultiCFSnapshot Refactor - separate multiget key range info from CFD & superversion info (#12561) 2024-04-18 20:11:01 -07:00
wide Fix the move semantics of PinnableWideColumns (#12557) 2024-04-17 18:56:23 -07:00
arena_wrapped_db_iter.cc Access DBImpl* and CFD* by CFHImpl* in Iterators (#12395) 2024-03-01 10:28:20 -08:00
arena_wrapped_db_iter.h Access DBImpl* and CFD* by CFHImpl* in Iterators (#12395) 2024-03-01 10:28:20 -08:00
attribute_group_iterator_impl.cc MultiCfIterator - AttributeGroupIter Impl & CoalescingIter Optimization (#12534) 2024-04-16 08:45:38 -07:00
attribute_group_iterator_impl.h MultiCfIterator - AttributeGroupIter Impl & CoalescingIter Optimization (#12534) 2024-04-16 08:45:38 -07:00
builder.cc Follow ups for TimedPut and write time property (#12455) 2024-03-21 10:00:15 -07:00
builder.h Replace ScopedArenaIterator with ScopedArenaPtr<InternalIterator> (#12470) 2024-03-22 13:40:42 -07:00
c.cc Run internal cpp modernizer on RocksDB repo (#12398) 2024-03-04 10:08:32 -08:00
c_test.c Fix windows build and CI (#12426) 2024-03-14 12:04:41 -07:00
coalescing_iterator.cc MultiCfIterator - AttributeGroupIter Impl & CoalescingIter Optimization (#12534) 2024-04-16 08:45:38 -07:00
coalescing_iterator.h MultiCfIterator - AttributeGroupIter Impl & CoalescingIter Optimization (#12534) 2024-04-16 08:45:38 -07:00
column_family.cc Replace ScopedArenaIterator with ScopedArenaPtr<InternalIterator> (#12470) 2024-03-22 13:40:42 -07:00
column_family.h Follow ups for TimedPut and write time property (#12455) 2024-03-21 10:00:15 -07:00
column_family_test.cc Run internal cpp modernizer on RocksDB repo (#12398) 2024-03-04 10:08:32 -08:00
compact_files_test.cc Run internal cpp modernizer on RocksDB repo (#12398) 2024-03-04 10:08:32 -08:00
comparator_db_test.cc Run internal cpp modernizer on RocksDB repo (#12398) 2024-03-04 10:08:32 -08:00
convenience.cc
convenience_impl.h
corruption_test.cc Prefer static_cast in place of most reinterpret_cast (#12308) 2024-02-07 10:44:11 -08:00
cuckoo_table_db_test.cc Run internal cpp modernizer on RocksDB repo (#12398) 2024-03-04 10:08:32 -08:00
db_basic_test.cc Add initial support for TimedPut API (#12419) 2024-03-14 15:44:55 -07:00
db_block_cache_test.cc don't run ZSTD_TrainDictionary in BlockBasedTableBuilder if there isn't compression needed (#12453) 2024-03-20 11:07:32 -07:00
db_bloom_filter_test.cc Run internal cpp modernizer on RocksDB repo (#12398) 2024-03-04 10:08:32 -08:00
db_clip_test.cc
db_compaction_filter_test.cc Replace ScopedArenaIterator with ScopedArenaPtr<InternalIterator> (#12470) 2024-03-22 13:40:42 -07:00
db_compaction_test.cc Add some debug print for flaky test `DBCompactionTest.CompactionLimiter` (#12509) 2024-04-04 15:21:40 -07:00
db_dynamic_level_test.cc Prefer static_cast in place of most reinterpret_cast (#12308) 2024-02-07 10:44:11 -08:00
db_encryption_test.cc
db_filesnapshot.cc Miscellaneous improvement to info printing (#12504) 2024-04-05 10:23:31 -07:00
db_flush_test.cc Run internal cpp modernizer on RocksDB repo (#12398) 2024-03-04 10:08:32 -08:00
db_info_dumper.cc Don't log an error when an auxiliary dir is missing (#12326) 2024-02-05 10:26:41 -08:00
db_info_dumper.h
db_inplace_update_test.cc
db_io_failure_test.cc Retry DB::Open upon a corruption detected while reading the MANIFEST (#12518) 2024-04-18 17:36:33 -07:00
db_iter.cc Reset user-facing wide-column structures upon deserialization failures (#12562) 2024-04-18 13:08:34 -07:00
db_iter.h Follow ups for TimedPut and write time property (#12455) 2024-03-21 10:00:15 -07:00
db_iter_stress_test.cc
db_iter_test.cc Add initial support for TimedPut API (#12419) 2024-03-14 15:44:55 -07:00
db_iterator_test.cc Support returning write unix time in iterator property (#12428) 2024-03-15 15:37:37 -07:00
db_kv_checksum_test.cc Prefer static_cast in place of most reinterpret_cast (#12308) 2024-02-07 10:44:11 -08:00
db_log_iter_test.cc Disable flaky part of TransactionLogIteratorCheckWhenArchive (#12423) 2024-03-12 12:54:53 -07:00
db_logical_block_size_cache_test.cc
db_memtable_test.cc Prefer static_cast in place of most reinterpret_cast (#12308) 2024-02-07 10:44:11 -08:00
db_merge_operand_test.cc Add `ContinueCallback` to `GetMergeOperands()` (#12438) 2024-03-15 12:25:49 -07:00
db_merge_operator_test.cc
db_options_test.cc Prefer static_cast in place of most reinterpret_cast (#12308) 2024-02-07 10:44:11 -08:00
db_properties_test.cc Remove the force mode for EnableFileDeletions API (#12337) 2024-02-13 18:36:25 -08:00
db_range_del_test.cc Replace ScopedArenaIterator with ScopedArenaPtr<InternalIterator> (#12470) 2024-03-22 13:40:42 -07:00
db_rate_limiter_test.cc
db_readonly_with_timestamp_test.cc
db_secondary_test.cc Prefer static_cast in place of most reinterpret_cast (#12308) 2024-02-07 10:44:11 -08:00
db_sst_test.cc Prefer static_cast in place of most reinterpret_cast (#12308) 2024-02-07 10:44:11 -08:00
db_statistics_test.cc
db_table_properties_test.cc Run internal cpp modernizer on RocksDB repo (#12398) 2024-03-04 10:08:32 -08:00
db_tailing_iter_test.cc Prefer static_cast in place of most reinterpret_cast (#12308) 2024-02-07 10:44:11 -08:00
db_test.cc MultiCfIterator - AttributeGroupIter Impl & CoalescingIter Optimization (#12534) 2024-04-16 08:45:38 -07:00
db_test2.cc Fix kBlockCacheTier read with table cache miss (#12443) 2024-03-15 14:41:58 -07:00
db_test_util.cc Replace ScopedArenaIterator with ScopedArenaPtr<InternalIterator> (#12470) 2024-03-22 13:40:42 -07:00
db_test_util.h Replace ScopedArenaIterator with ScopedArenaPtr<InternalIterator> (#12470) 2024-03-22 13:40:42 -07:00
db_universal_compaction_test.cc Prefer static_cast in place of most reinterpret_cast (#12308) 2024-02-07 10:44:11 -08:00
db_wal_test.cc Drop unsynced data in `TestFSWritableFile::Close()` (#12528) 2024-04-12 09:57:56 -07:00
db_with_timestamp_basic_test.cc Deprecate some variants of Get and MultiGet (#12327) 2024-02-16 09:21:06 -08:00
db_with_timestamp_compaction_test.cc Prefer static_cast in place of most reinterpret_cast (#12308) 2024-02-07 10:44:11 -08:00
db_with_timestamp_test_util.cc
db_with_timestamp_test_util.h
db_write_buffer_manager_test.cc Prefer static_cast in place of most reinterpret_cast (#12308) 2024-02-07 10:44:11 -08:00
db_write_test.cc Enable recycle_log_file_num option for point in time recovery (#12403) 2024-03-21 12:29:35 -07:00
dbformat.cc Add initial support for TimedPut API (#12419) 2024-03-14 15:44:55 -07:00
dbformat.h Follow ups for TimedPut and write time property (#12455) 2024-03-21 10:00:15 -07:00
dbformat_test.cc
deletefile_test.cc Add an option to wait for purge in WaitForCompact (#12520) 2024-04-17 17:33:27 -07:00
error_handler.cc Prefer static_cast in place of most reinterpret_cast (#12308) 2024-02-07 10:44:11 -08:00
error_handler.h Consolidate stats recording in error handler (#11992) 2024-01-22 14:57:30 -08:00
error_handler_fs_test.cc Drop unsynced data in `TestFSWritableFile::Close()` (#12528) 2024-04-12 09:57:56 -07:00
event_helpers.cc Fix/cleanup SeqnoToTimeMapping (#12253) 2024-01-19 21:50:38 -08:00
event_helpers.h
experimental.cc Run internal cpp modernizer on RocksDB repo (#12398) 2024-03-04 10:08:32 -08:00
external_sst_file_basic_test.cc Fix windows build and CI (#12426) 2024-03-14 12:04:41 -07:00
external_sst_file_ingestion_job.cc Replace ScopedArenaIterator with ScopedArenaPtr<InternalIterator> (#12470) 2024-03-22 13:40:42 -07:00
external_sst_file_ingestion_job.h Add support to bulk load external files for UDT in memtable only feature (#12356) 2024-02-21 15:41:53 -08:00
external_sst_file_test.cc Add support to bulk load external files for UDT in memtable only feature (#12356) 2024-02-21 15:41:53 -08:00
fault_injection_test.cc Drop unsynced data in `TestFSWritableFile::Close()` (#12528) 2024-04-12 09:57:56 -07:00
file_indexer.cc
file_indexer.h
file_indexer_test.cc
filename_test.cc
flush_job.cc Replace ScopedArenaIterator with ScopedArenaPtr<InternalIterator> (#12470) 2024-03-22 13:40:42 -07:00
flush_job.h Replace ScopedArenaIterator with ScopedArenaPtr<InternalIterator> (#12470) 2024-03-22 13:40:42 -07:00
flush_job_test.cc Follow ups for TimedPut and write time property (#12455) 2024-03-21 10:00:15 -07:00
flush_scheduler.cc
flush_scheduler.h
forward_iterator.cc Support returning write unix time in iterator property (#12428) 2024-03-15 15:37:37 -07:00
forward_iterator.h Support returning write unix time in iterator property (#12428) 2024-03-15 15:37:37 -07:00
forward_iterator_bench.cc
history_trimming_iterator.h
import_column_family_job.cc Fix error for CF smallest and largest keys computation in ImportColumnFamilyJob::Prepare (#12526) 2024-04-11 21:54:51 -07:00
import_column_family_job.h
import_column_family_test.cc Fix error for CF smallest and largest keys computation in ImportColumnFamilyJob::Prepare (#12526) 2024-04-11 21:54:51 -07:00
internal_stats.cc Log pending compaction bytes in a couple places (#12267) 2024-01-23 09:14:59 -08:00
internal_stats.h Remove unnecessary, confusing 'extern' (#12300) 2024-01-29 10:38:08 -08:00
job_context.h Support returning write unix time in iterator property (#12428) 2024-03-15 15:37:37 -07:00
kv_checksum.h
listener_test.cc
log_format.h
log_reader.cc Miscellaneous improvement to info printing (#12504) 2024-04-05 10:23:31 -07:00
log_reader.h Enable recycle_log_file_num option for point in time recovery (#12403) 2024-03-21 12:29:35 -07:00
log_test.cc Run internal cpp modernizer on RocksDB repo (#12398) 2024-03-04 10:08:32 -08:00
log_writer.cc Don't write to WAL after previous WAL write error (#12448) 2024-03-18 12:27:49 -07:00
log_writer.h
logs_with_prep_tracker.cc
logs_with_prep_tracker.h
lookup_key.h
malloc_stats.cc Prefer static_cast in place of most reinterpret_cast (#12308) 2024-02-07 10:44:11 -08:00
malloc_stats.h
manual_compaction_test.cc
memtable.cc Fix `max_successive_merges` counting CPU overhead regression (#12546) 2024-04-17 12:11:24 -07:00
memtable.h Fix `max_successive_merges` counting CPU overhead regression (#12546) 2024-04-17 12:11:24 -07:00
memtable_list.cc Support returning write unix time in iterator property (#12428) 2024-03-15 15:37:37 -07:00
memtable_list.h Support returning write unix time in iterator property (#12428) 2024-03-15 15:37:37 -07:00
memtable_list_test.cc Add initial support for TimedPut API (#12419) 2024-03-14 15:44:55 -07:00
merge_context.h Add `ContinueCallback` to `GetMergeOperands()` (#12438) 2024-03-15 12:25:49 -07:00
merge_helper.cc Add initial support for TimedPut API (#12419) 2024-03-14 15:44:55 -07:00
merge_helper.h
merge_helper_test.cc
merge_operator.cc
merge_test.cc Prefer static_cast in place of most reinterpret_cast (#12308) 2024-02-07 10:44:11 -08:00
multi_cf_iterator_impl.h MultiCfIterator - Tests for lower/upper bounds (#12548) 2024-04-16 14:20:13 -07:00
multi_cf_iterator_test.cc MultiCfIterator - Tests for lower/upper bounds (#12548) 2024-04-16 14:20:13 -07:00
obsolete_files_test.cc Remove the force mode for EnableFileDeletions API (#12337) 2024-02-13 18:36:25 -08:00
options_file_test.cc
output_validator.cc Removed `check_flush_compaction_key_order` (#12311) 2024-01-31 16:30:26 -08:00
output_validator.h Removed `check_flush_compaction_key_order` (#12311) 2024-01-31 16:30:26 -08:00
perf_context_test.cc Add `write_memtable_time` to perf level `kEnableWait` (#12394) 2024-02-29 15:08:26 -08:00
periodic_task_scheduler.cc Remove extra semi colon from instagram/ranking/mezql/shots/parser/fast/Token.cpp 2024-03-04 06:32:50 -08:00
periodic_task_scheduler.h
periodic_task_scheduler_test.cc Prefer static_cast in place of most reinterpret_cast (#12308) 2024-02-07 10:44:11 -08:00
pinned_iterators_manager.h Prefer static_cast in place of most reinterpret_cast (#12308) 2024-02-07 10:44:11 -08:00
plain_table_db_test.cc Prefer static_cast in place of most reinterpret_cast (#12308) 2024-02-07 10:44:11 -08:00
post_memtable_callback.h
pre_release_callback.h
prefix_test.cc
range_del_aggregator.cc Replace ScopedArenaIterator with ScopedArenaPtr<InternalIterator> (#12470) 2024-03-22 13:40:42 -07:00
range_del_aggregator.h Replace ScopedArenaIterator with ScopedArenaPtr<InternalIterator> (#12470) 2024-03-22 13:40:42 -07:00
range_del_aggregator_bench.cc
range_del_aggregator_test.cc
range_tombstone_fragmenter.cc Add support for range deletion when user timestamps are not persisted (#12254) 2024-01-29 11:37:34 -08:00
range_tombstone_fragmenter.h Add support for range deletion when user timestamps are not persisted (#12254) 2024-01-29 11:37:34 -08:00
range_tombstone_fragmenter_test.cc
read_callback.h
repair.cc Replace ScopedArenaIterator with ScopedArenaPtr<InternalIterator> (#12470) 2024-03-22 13:40:42 -07:00
repair_test.cc
seqno_time_test.cc Add initial support for TimedPut API (#12419) 2024-03-14 15:44:55 -07:00
seqno_to_time_mapping.cc Follow ups for TimedPut and write time property (#12455) 2024-03-21 10:00:15 -07:00
seqno_to_time_mapping.h Follow ups for TimedPut and write time property (#12455) 2024-03-21 10:00:15 -07:00
snapshot_checker.h Remove 'virtual' when implied by 'override' (#12319) 2024-01-31 13:14:42 -08:00
snapshot_impl.cc
snapshot_impl.h
table_cache.cc Fix kBlockCacheTier read when merge-chain base value is in a blob file (#12462) 2024-03-21 12:38:53 -07:00
table_cache.h Fix kBlockCacheTier read when merge-chain base value is in a blob file (#12462) 2024-03-21 12:38:53 -07:00
table_cache_sync_and_async.h Fix kBlockCacheTier read when merge-chain base value is in a blob file (#12462) 2024-03-21 12:38:53 -07:00
table_properties_collector.cc
table_properties_collector.h Rename IntTblPropCollector -> InternalTblPropColl (#12320) 2024-02-02 14:14:43 -08:00
table_properties_collector_test.cc Rename IntTblPropCollector -> InternalTblPropColl (#12320) 2024-02-02 14:14:43 -08:00
transaction_log_impl.cc
transaction_log_impl.h Remove 'virtual' when implied by 'override' (#12319) 2024-01-31 13:14:42 -08:00
trim_history_scheduler.cc
trim_history_scheduler.h
version_builder.cc
version_builder.h
version_builder_test.cc
version_edit.cc Remove extra semi colon from hbt/src/tagstack/tests/SlicerTest.cpp (#12461) 2024-03-20 12:44:50 -07:00
version_edit.h Remove unnecessary, confusing 'extern' (#12300) 2024-01-29 10:38:08 -08:00
version_edit_handler.cc Best-effort recovery support for atomic flush (#12406) 2024-03-06 14:40:40 -08:00
version_edit_handler.h Best-effort recovery support for atomic flush (#12406) 2024-03-06 14:40:40 -08:00
version_edit_test.cc
version_set.cc Retry DB::Open upon a corruption detected while reading the MANIFEST (#12518) 2024-04-18 17:36:33 -07:00
version_set.h Retry DB::Open upon a corruption detected while reading the MANIFEST (#12518) 2024-04-18 17:36:33 -07:00
version_set_sync_and_async.h
version_set_test.cc Best-effort recovery support for atomic flush (#12406) 2024-03-06 14:40:40 -08:00
version_util.h
wal_edit.cc
wal_edit.h
wal_edit_test.cc
wal_manager.cc Fix data race in WalManager (#12439) 2024-03-14 21:24:06 -07:00
wal_manager.h Fix data race in WalManager (#12439) 2024-03-14 21:24:06 -07:00
wal_manager_test.cc
write_batch.cc Fix `max_successive_merges` counting CPU overhead regression (#12546) 2024-04-17 12:11:24 -07:00
write_batch_base.cc
write_batch_internal.h Add initial support for TimedPut API (#12419) 2024-03-14 15:44:55 -07:00
write_batch_test.cc Replace ScopedArenaIterator with ScopedArenaPtr<InternalIterator> (#12470) 2024-03-22 13:40:42 -07:00
write_callback.h
write_callback_test.cc Retry DB::Open upon a corruption detected while reading the MANIFEST (#12518) 2024-04-18 17:36:33 -07:00
write_controller.cc
write_controller.h
write_controller_test.cc
write_stall_stats.cc
write_stall_stats.h Remove unnecessary, confusing 'extern' (#12300) 2024-01-29 10:38:08 -08:00
write_thread.cc WriteThread::EnterAsBatchGroupLeader reorder writers (#12138) 2024-02-27 15:23:54 -08:00
write_thread.h