rocksdb/db
leipeng 68ce5d84f6 Add new Iterator API Refresh(const snapshot*) (#10594)
Summary:
This PR resolves https://github.com/facebook/rocksdb/issues/10487 and https://github.com/facebook/rocksdb/issues/10536. User code needs to call Refresh() periodically.

The main code change is to support range deletions. A range tombstone iterator uses a sequence number as an upper bound to decide which range tombstones are effective. During Iterator refresh, this sequence number upper bound needs to be updated for all range tombstone iterators under DBIter and LevelIterator. LevelIterator may create new table iterators and range tombstone iterators during scanning, so it needs to be aware of iterator refresh. The code path that propagates this change is `db_iter_->set_sequence(read_seq) -> MergingIterator::SetRangeDelReadSeqno() -> TruncatedRangeDelIterator::SetRangeDelReadSeqno() and LevelIterator::SetRangeDelReadSeqno()`.

This change also fixes an issue where range tombstone iterators created by LevelIterator may access ReadOptions::snapshot, even though we do not explicitly require users to keep a snapshot alive after creating an Iterator.
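
The propagation described above can be illustrated with a minimal, self-contained sketch. It is a toy model, not RocksDB code: `ToyTombstone`, `ToyRangeDelIter`, `ToyLevelIter`, and `ToyMergingIter` are hypothetical stand-ins, and only the method name `SetRangeDelReadSeqno()` is taken from the summary. The sketch assumes a tombstone is effective when its sequence number is at most the read bound, and shows why a level iterator that opens files lazily has to remember the bound itself instead of consulting a snapshot later.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

using SequenceNumber = std::uint64_t;

// Hypothetical toy tombstone: deletes keys in [start, end), written at `seq`.
struct ToyTombstone {
  int start;
  int end;
  SequenceNumber seq;
};

// Toy range tombstone iterator: only tombstones with seq <= read_seq_ count.
class ToyRangeDelIter {
 public:
  ToyRangeDelIter(std::vector<ToyTombstone> tombstones, SequenceNumber read_seq)
      : tombstones_(std::move(tombstones)), read_seq_(read_seq) {}

  void SetRangeDelReadSeqno(SequenceNumber read_seq) { read_seq_ = read_seq; }

  // True if an effective tombstone newer than the key covers it.
  bool Covers(int key, SequenceNumber key_seq) const {
    for (const ToyTombstone& t : tombstones_) {
      if (t.seq <= read_seq_ && t.seq > key_seq && t.start <= key &&
          key < t.end) {
        return true;
      }
    }
    return false;
  }

 private:
  std::vector<ToyTombstone> tombstones_;
  SequenceNumber read_seq_;
};

// Toy level iterator: opens one file at a time, so it stores the current
// bound and hands it to every tombstone iterator it creates later.
class ToyLevelIter {
 public:
  ToyLevelIter(std::vector<std::vector<ToyTombstone>> files,
               SequenceNumber read_seq)
      : files_(std::move(files)), read_seq_(read_seq) {}

  void SetRangeDelReadSeqno(SequenceNumber read_seq) {
    read_seq_ = read_seq;  // used by files opened after the refresh
    if (current_) current_->SetRangeDelReadSeqno(read_seq);
  }

  void OpenFile(std::size_t i) {
    current_ = std::make_unique<ToyRangeDelIter>(files_.at(i), read_seq_);
  }

  bool Covers(int key, SequenceNumber key_seq) const {
    return current_ && current_->Covers(key, key_seq);
  }

 private:
  std::vector<std::vector<ToyTombstone>> files_;
  SequenceNumber read_seq_;
  std::unique_ptr<ToyRangeDelIter> current_;
};

// Toy merging iterator: fans the new bound out to every child (non-owning).
class ToyMergingIter {
 public:
  void AddMemtableRangeDel(ToyRangeDelIter* it) { mem_iters_.push_back(it); }
  void AddLevel(ToyLevelIter* it) { level_iters_.push_back(it); }

  void SetRangeDelReadSeqno(SequenceNumber read_seq) {
    for (ToyRangeDelIter* it : mem_iters_) it->SetRangeDelReadSeqno(read_seq);
    for (ToyLevelIter* it : level_iters_) it->SetRangeDelReadSeqno(read_seq);
  }

 private:
  std::vector<ToyRangeDelIter*> mem_iters_;
  std::vector<ToyLevelIter*> level_iters_;
};
```

In this model, a refresh is a single call to `ToyMergingIter::SetRangeDelReadSeqno()` with the new bound. Tombstone iterators that already exist are updated in place, and a file opened afterwards through `ToyLevelIter::OpenFile()` picks up the stored bound without touching any snapshot object.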

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10594

Test Plan:
* New unit tests.
* Add Iterator::Refresh(snapshot) to the stress test. Note that this change only adds tests for refreshing to the same snapshot, since this is the main target use case.

TODO in a following PR:
* Stress test Iterator::Refresh() to different snapshots or no snapshot.

Reviewed By: ajkr

Differential Revision: D48456896

Pulled By: cbi42

fbshipit-source-id: 2e642c04e91235cc9542ef4cd37b3c20823bd779
2023-09-15 10:44:43 -07:00
blob Track full_history_ts_low per SuperVersion (#11784) 2023-09-13 16:34:18 -07:00
compaction Conditionally exclude some L0 input files in size amp compaction (#11749) 2023-09-12 15:53:15 -07:00
db_impl Add new Iterator API Refresh(const snapshot*) (#10594) 2023-09-15 10:44:43 -07:00
wide GetEntity Support for ReadOnlyDB and SecondaryDB (#11799) 2023-09-15 08:30:44 -07:00
arena_wrapped_db_iter.cc Add new Iterator API Refresh(const snapshot*) (#10594) 2023-09-15 10:44:43 -07:00
arena_wrapped_db_iter.h Add new Iterator API Refresh(const snapshot*) (#10594) 2023-09-15 10:44:43 -07:00
builder.cc Compare the number of input keys and processed keys for compactions (#11571) 2023-07-28 09:47:31 -07:00
builder.h Group rocksdb.sst.read.micros stat by IOActivity flush and compaction (#11288) 2023-04-21 09:07:18 -07:00
c.cc Add C API for WaitForCompact (#11737) 2023-08-22 14:32:35 -07:00
c_test.c Add C API for WaitForCompact (#11737) 2023-08-22 14:32:35 -07:00
column_family.cc Track full_history_ts_low per SuperVersion (#11784) 2023-09-13 16:34:18 -07:00
column_family.h Track full_history_ts_low per SuperVersion (#11784) 2023-09-13 16:34:18 -07:00
column_family_test.cc Respect cutoff timestamp during flush (#11599) 2023-07-26 16:25:06 -07:00
compact_files_test.cc Add missing status check when compiling with ASSERT_STATUS_CHECKED=1 (#11686) 2023-08-09 15:46:44 -07:00
comparator_db_test.cc Add missing status check when compiling with ASSERT_STATUS_CHECKED=1 (#11686) 2023-08-09 15:46:44 -07:00
convenience.cc Group rocksdb.sst.read.micros stat by different user read IOActivity + misc (#11444) 2023-08-08 17:26:50 -07:00
convenience_impl.h Group rocksdb.sst.read.micros stat by different user read IOActivity + misc (#11444) 2023-08-08 17:26:50 -07:00
corruption_test.cc Make option level_compaction_dynamic_level_bytes true by default (#11525) 2023-06-15 21:12:39 -07:00
cuckoo_table_db_test.cc Make option level_compaction_dynamic_level_bytes true by default (#11525) 2023-06-15 21:12:39 -07:00
db_basic_test.cc Fix a bug in multiget for cleaning up SuperVersion (#11830) 2023-09-15 09:50:39 -07:00
db_block_cache_test.cc Placeholder for AutoHyperClockCache, more (#11692) 2023-08-11 16:27:38 -07:00
db_bloom_filter_test.cc Add missing status check when compiling with ASSERT_STATUS_CHECKED=1 (#11686) 2023-08-09 15:46:44 -07:00
db_clip_test.cc Support Clip DB to KeyRange (#11379) 2023-05-18 13:25:01 -07:00
db_compaction_filter_test.cc Always allow L0->L1 trivial move during manual compaction (#11375) 2023-04-20 11:10:48 -07:00
db_compaction_test.cc Fix unit test tsan failure (#11828) 2023-09-13 15:53:05 -07:00
db_dynamic_level_test.cc Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
db_encryption_test.cc Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
db_filesnapshot.cc Flush option in WaitForCompact() (#11483) 2023-05-31 12:53:51 -07:00
db_flush_test.cc Add missing status check when compiling with ASSERT_STATUS_CHECKED=1 (#11686) 2023-08-09 15:46:44 -07:00
db_info_dumper.cc Log host name (#11776) 2023-08-31 08:39:09 -07:00
db_info_dumper.h
db_inplace_update_test.cc
db_io_failure_test.cc Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
db_iter.cc Add helper methods WideColumnsHelper::{Has,Get}DefaultColumn (#11813) 2023-09-11 16:32:32 -07:00
db_iter.h Add new Iterator API Refresh(const snapshot*) (#10594) 2023-09-15 10:44:43 -07:00
db_iter_stress_test.cc Ran clang-format on db/ directory (#10910) 2022-11-02 14:34:24 -07:00
db_iter_test.cc Ran clang-format on db/ directory (#10910) 2022-11-02 14:34:24 -07:00
db_iterator_test.cc Add new Iterator API Refresh(const snapshot*) (#10594) 2023-09-15 10:44:43 -07:00
db_kv_checksum_test.cc
db_log_iter_test.cc Reverse sort order in dedup to enable iter checking in callback (#11725) 2023-08-22 11:22:35 -07:00
db_logical_block_size_cache_test.cc Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
db_memtable_test.cc
db_merge_operand_test.cc Make option level_compaction_dynamic_level_bytes true by default (#11525) 2023-06-15 21:12:39 -07:00
db_merge_operator_test.cc Make option level_compaction_dynamic_level_bytes true by default (#11525) 2023-06-15 21:12:39 -07:00
db_options_test.cc Change compaction_readahead_size default value to 2MB (#11762) 2023-08-30 14:57:08 -07:00
db_properties_test.cc format_version=6 and context-aware block checksums (#9058) 2023-07-30 16:40:01 -07:00
db_range_del_test.cc Add new Iterator API Refresh(const snapshot*) (#10594) 2023-09-15 10:44:43 -07:00
db_rate_limiter_test.cc Group rocksdb.sst.read.micros stat by different user read IOActivity + misc (#11444) 2023-08-08 17:26:50 -07:00
db_readonly_with_timestamp_test.cc Track full_history_ts_low per SuperVersion (#11784) 2023-09-13 16:34:18 -07:00
db_secondary_test.cc GetEntity Support for ReadOnlyDB and SecondaryDB (#11799) 2023-09-15 08:30:44 -07:00
db_sst_test.cc Remove wait_unscheduled from waitForCompact internal API (#11443) 2023-05-17 18:13:50 -07:00
db_statistics_test.cc Add missing status check when compiling with ASSERT_STATUS_CHECKED=1 (#11686) 2023-08-09 15:46:44 -07:00
db_table_properties_test.cc Make option level_compaction_dynamic_level_bytes true by default (#11525) 2023-06-15 21:12:39 -07:00
db_tailing_iter_test.cc Add an interface to provide support for underlying FS to pass their own buffer during reads (#11324) 2023-06-23 11:48:49 -07:00
db_test.cc Add statistics COMPACTION_CPU_TOTAL_TIME for total compaction time (#11741) 2023-09-12 15:48:36 -07:00
db_test2.cc Add unit test for default temperature (#11722) 2023-08-21 12:14:03 -07:00
db_test_util.cc Add missing status check when compiling with ASSERT_STATUS_CHECKED=1 (#11686) 2023-08-09 15:46:44 -07:00
db_test_util.h Make option level_compaction_dynamic_level_bytes true by default (#11525) 2023-06-15 21:12:39 -07:00
db_universal_compaction_test.cc Clarify usage for options ttl and periodic_compaction_seconds for universal compaction (#11552) 2023-07-26 11:31:54 -07:00
db_wal_test.cc Increase full_history_ts_low when flush happens during recovery (#11774) 2023-08-30 09:34:31 -07:00
db_with_timestamp_basic_test.cc Track full_history_ts_low per SuperVersion (#11784) 2023-09-13 16:34:18 -07:00
db_with_timestamp_compaction_test.cc Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
db_with_timestamp_test_util.cc
db_with_timestamp_test_util.h
db_write_buffer_manager_test.cc Add missing status check when compiling with ASSERT_STATUS_CHECKED=1 (#11686) 2023-08-09 15:46:44 -07:00
db_write_test.cc Add missing status check when compiling with ASSERT_STATUS_CHECKED=1 (#11686) 2023-08-09 15:46:44 -07:00
dbformat.cc Add documentation to some formatting util functions (#11674) 2023-08-14 22:04:18 -07:00
dbformat.h Add documentation to some formatting util functions (#11674) 2023-08-14 22:04:18 -07:00
dbformat_test.cc Logically strip timestamp during flush (#11557) 2023-06-29 15:50:50 -07:00
deletefile_test.cc Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
error_handler.cc Add some unit tests when file read returns error during compaction/scanning (#11788) 2023-09-06 10:23:41 -07:00
error_handler.h Ran clang-format on db/ directory (#10910) 2022-11-02 14:34:24 -07:00
error_handler_fs_test.cc Add missing status check when compiling with ASSERT_STATUS_CHECKED=1 (#11686) 2023-08-09 15:46:44 -07:00
event_helpers.cc Log user_defined_timestamps_persisted flag in event logger (#11683) 2023-08-08 12:25:21 -07:00
event_helpers.h Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
experimental.cc Record the persist_user_defined_timestamps flag in manifest (#11515) 2023-06-21 21:49:01 -07:00
external_sst_file_basic_test.cc Add missing status check when compiling with ASSERT_STATUS_CHECKED=1 (#11686) 2023-08-09 15:46:44 -07:00
external_sst_file_ingestion_job.cc Fix a bug of rocksdb.file.read.verify.file.checksums.micros not being populated (#11836) 2023-09-15 10:36:14 -07:00
external_sst_file_ingestion_job.h Add missing range conflict check between file ingestion and RefitLevel() (#10988) 2022-12-29 15:05:36 -08:00
external_sst_file_test.cc Wide Column support in ldb (#11754) 2023-08-30 12:45:52 -07:00
fault_injection_test.cc Add missing status check when compiling with ASSERT_STATUS_CHECKED=1 (#11686) 2023-08-09 15:46:44 -07:00
file_indexer.cc Simplify conditional judgment (#11580) 2023-07-03 09:41:48 -07:00
file_indexer.h Ran clang-format on db/ directory (#10910) 2022-11-02 14:34:24 -07:00
file_indexer_test.cc Ran clang-format on db/ directory (#10910) 2022-11-02 14:34:24 -07:00
filename_test.cc Ran clang-format on db/ directory (#10910) 2022-11-02 14:34:24 -07:00
flush_job.cc Increase full_history_ts_low when flush happens during recovery (#11774) 2023-08-30 09:34:31 -07:00
flush_job.h Respect cutoff timestamp during flush (#11599) 2023-07-26 16:25:06 -07:00
flush_job_test.cc Support switching on / off UDT together with in-Memtable-only feature (#11623) 2023-07-26 20:16:32 -07:00
flush_scheduler.cc
flush_scheduler.h
forward_iterator.cc Add an interface to provide support for underlying FS to pass their own buffer during reads (#11324) 2023-06-23 11:48:49 -07:00
forward_iterator.h Ignore async_io ReadOption if FileSystem doesn't support it (#11296) 2023-03-17 14:57:09 -07:00
forward_iterator_bench.cc Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
history_trimming_iterator.h Refactor AddRangeDels() + consider range tombstone during compaction file cutting (#11113) 2023-02-22 12:28:18 -08:00
import_column_family_job.cc Delay bottommost level single file compactions (#11701) 2023-08-16 17:45:44 -07:00
import_column_family_job.h Support to create a CF by importing multiple non-overlapping CFs (#11378) 2023-06-15 12:25:04 -07:00
import_column_family_test.cc Support to create a CF by importing multiple non-overlapping CFs (#11378) 2023-06-15 12:25:04 -07:00
internal_stats.cc add property "rocksdb.obsolete-sst-files-size" (#11533) 2023-06-13 15:52:45 -07:00
internal_stats.h add property "rocksdb.obsolete-sst-files-size" (#11533) 2023-06-13 15:52:45 -07:00
job_context.h Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
kv_checksum.h Block per key-value checksum (#11287) 2023-04-25 12:08:23 -07:00
listener_test.cc Add missing status check when compiling with ASSERT_STATUS_CHECKED=1 (#11686) 2023-08-09 15:46:44 -07:00
log_format.h Add support in log writer and reader for a user-defined timestamp size record (#11433) 2023-05-11 17:26:19 -07:00
log_reader.cc Add support in log writer and reader for a user-defined timestamp size record (#11433) 2023-05-11 17:26:19 -07:00
log_reader.h switch to use RocksDB UnorderedMap (#11507) 2023-06-05 13:36:26 -07:00
log_test.cc switch to use RocksDB UnorderedMap (#11507) 2023-06-05 13:36:26 -07:00
log_writer.cc switch to use RocksDB UnorderedMap (#11507) 2023-06-05 13:36:26 -07:00
log_writer.h switch to use RocksDB UnorderedMap (#11507) 2023-06-05 13:36:26 -07:00
logs_with_prep_tracker.cc
logs_with_prep_tracker.h Ran clang-format on db/ directory (#10910) 2022-11-02 14:34:24 -07:00
lookup_key.h Ran clang-format on db/ directory (#10910) 2022-11-02 14:34:24 -07:00
malloc_stats.cc Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
malloc_stats.h Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
manual_compaction_test.cc Make option level_compaction_dynamic_level_bytes true by default (#11525) 2023-06-15 21:12:39 -07:00
memtable.cc Add an option to trigger flush when the number of range deletions reach a threshold (#11358) 2023-08-02 19:58:56 -07:00
memtable.h Add an option to trigger flush when the number of range deletions reach a threshold (#11358) 2023-08-02 19:58:56 -07:00
memtable_list.cc remove a sub-condition that is always true (#11746) 2023-08-29 18:40:13 -07:00
memtable_list.h Respect cutoff timestamp during flush (#11599) 2023-07-26 16:25:06 -07:00
memtable_list_test.cc Respect cutoff timestamp during flush (#11599) 2023-07-26 16:25:06 -07:00
merge_context.h Ran clang-format on db/ directory (#10910) 2022-11-02 14:34:24 -07:00
merge_helper.cc Add helper methods WideColumnsHelper::{Has,Get}DefaultColumn (#11813) 2023-09-11 16:32:32 -07:00
merge_helper.h Add API to limit blast radius of merge operator failure (#11092) 2023-01-20 14:40:30 -08:00
merge_helper_test.cc
merge_operator.cc Add helper methods WideColumnsHelper::{Has,Get}DefaultColumn (#11813) 2023-09-11 16:32:32 -07:00
merge_test.cc Introduce a wide column aware MergeOperator API (#11807) 2023-09-11 12:13:58 -07:00
obsolete_files_test.cc Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
options_file_test.cc Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
output_validator.cc
output_validator.h
perf_context_test.cc Add missing status check when compiling with ASSERT_STATUS_CHECKED=1 (#11686) 2023-08-09 15:46:44 -07:00
periodic_task_scheduler.cc Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
periodic_task_scheduler.h Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
periodic_task_scheduler_test.cc Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
pinned_iterators_manager.h
plain_table_db_test.cc Make option level_compaction_dynamic_level_bytes true by default (#11525) 2023-06-15 21:12:39 -07:00
post_memtable_callback.h
pre_release_callback.h
prefix_test.cc Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
range_del_aggregator.cc Improve documentation for MergingIterator (#11161) 2023-03-03 12:17:30 -08:00
range_del_aggregator.h Add new Iterator API Refresh(const snapshot*) (#10594) 2023-09-15 10:44:43 -07:00
range_del_aggregator_bench.cc
range_del_aggregator_test.cc Refactor AddRangeDels() + consider range tombstone during compaction file cutting (#11113) 2023-02-22 12:28:18 -08:00
range_tombstone_fragmenter.cc
range_tombstone_fragmenter.h Add new Iterator API Refresh(const snapshot*) (#10594) 2023-09-15 10:44:43 -07:00
range_tombstone_fragmenter_test.cc snapshots of FragmentedRangeTombstoneList must in ascending order (#11046) 2022-12-19 15:06:22 -08:00
read_callback.h
repair.cc Delay bottommost level single file compactions (#11701) 2023-08-16 17:45:44 -07:00
repair_test.cc Respect cutoff timestamp during flush (#11599) 2023-07-26 16:25:06 -07:00
seqno_time_test.cc CompactRange() always compacts to bottommost level for leveled compaction (#11468) 2023-06-01 15:27:29 -07:00
seqno_to_time_mapping.cc
seqno_to_time_mapping.h
snapshot_checker.h Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
snapshot_impl.cc Ran clang-format on db/ directory (#10910) 2022-11-02 14:34:24 -07:00
snapshot_impl.h Ran clang-format on db/ directory (#10910) 2022-11-02 14:34:24 -07:00
table_cache.cc Add new Iterator API Refresh(const snapshot*) (#10594) 2023-09-15 10:44:43 -07:00
table_cache.h Add new Iterator API Refresh(const snapshot*) (#10594) 2023-09-15 10:44:43 -07:00
table_cache_sync_and_async.h Fix StopWatch bug; Remove setting record_read_stats (#11474) 2023-05-25 10:16:58 -07:00
table_properties_collector.cc Ran clang-format on db/ directory (#10910) 2022-11-02 14:34:24 -07:00
table_properties_collector.h Fix an assertion failure in TimestampTablePropertiesCollector for empty output (#11015) 2022-12-05 13:46:27 -08:00
table_properties_collector_test.cc Group rocksdb.sst.read.micros stat by IOActivity flush and compaction (#11288) 2023-04-21 09:07:18 -07:00
transaction_log_impl.cc Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
transaction_log_impl.h Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
trim_history_scheduler.cc
trim_history_scheduler.h Ran clang-format on db/ directory (#10910) 2022-11-02 14:34:24 -07:00
version_builder.cc Add unit test for default temperature (#11722) 2023-08-21 12:14:03 -07:00
version_builder.h Block per key-value checksum (#11287) 2023-04-25 12:08:23 -07:00
version_builder_test.cc Delay bottommost level single file compactions (#11701) 2023-08-16 17:45:44 -07:00
version_edit.cc Support switching on / off UDT together with in-Memtable-only feature (#11623) 2023-07-26 20:16:32 -07:00
version_edit.h Compare the number of input keys and processed keys for compactions (#11571) 2023-07-28 09:47:31 -07:00
version_edit_handler.cc Support switching on / off UDT together with in-Memtable-only feature (#11623) 2023-07-26 20:16:32 -07:00
version_edit_handler.h Support switching on / off UDT together with in-Memtable-only feature (#11623) 2023-07-26 20:16:32 -07:00
version_edit_test.cc Support switching on / off UDT together with in-Memtable-only feature (#11623) 2023-07-26 20:16:32 -07:00
version_set.cc Add new Iterator API Refresh(const snapshot*) (#10594) 2023-09-15 10:44:43 -07:00
version_set.h remove an unused typedef (#11286) 2023-08-25 18:01:14 -07:00
version_set_sync_and_async.h Add helper methods WideColumnsHelper::{Has,Get}DefaultColumn (#11813) 2023-09-11 16:32:32 -07:00
version_set_test.cc Delay bottommost level single file compactions (#11701) 2023-08-16 17:45:44 -07:00
version_util.h Group rocksdb.sst.read.micros stat by IOActivity flush and compaction (#11288) 2023-04-21 09:07:18 -07:00
wal_edit.cc
wal_edit.h
wal_edit_test.cc
wal_manager.cc Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
wal_manager.h Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
wal_manager_test.cc Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
write_batch.cc Add a helper method WideColumnsHelper::SortColumns (#11823) 2023-09-12 12:36:07 -07:00
write_batch_base.cc
write_batch_internal.h Set default cf ts sz for a reused transaction (#11685) 2023-08-09 13:49:42 -07:00
write_batch_test.cc add WriteBatch::Release() (#11482) 2023-05-26 18:15:14 -07:00
write_callback.h
write_callback_test.cc Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
write_controller.cc
write_controller.h Ran clang-format on db/ directory (#10910) 2022-11-02 14:34:24 -07:00
write_controller_test.cc Ran clang-format on db/ directory (#10910) 2022-11-02 14:34:24 -07:00
write_stall_stats.cc Fix initialization-order-fiasco in write_stall_stats.cc (#11355) 2023-04-05 14:42:31 -07:00
write_stall_stats.h Fix initialization-order-fiasco in write_stall_stats.cc (#11355) 2023-04-05 14:42:31 -07:00
write_thread.cc Ensure LockWAL() stall cleared for UnlockWAL() return (#11172) 2023-02-03 12:08:37 -08:00
write_thread.h Ensure LockWAL() stall cleared for UnlockWAL() return (#11172) 2023-02-03 12:08:37 -08:00