rocksdb/db
sdong e03f8a0c12 L0 Subcompaction to trim input files (#9802)
Summary:
When subcompaction is chosen for an L0->L1 compaction, in most cases all L0 files are involved in every subcompaction. However, this is not always the case. When files are generally (but not strictly) inserted in sequential order, only a subset of the L0 files may be involved in a given subcompaction. Yet RocksDB always opens all of those L0 files, builds an iterator over them, and reads the first or last block of many of the files with expensive readahead. We trim the input files that a subcompaction does not need, to reduce this overhead a little.
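
The trimming idea can be illustrated with a minimal sketch. This is not the code from this change: `FileRange`, `KeyBounds`, and `TrimL0Inputs` are hypothetical names, keys are compared bytewise as plain strings, and the subcompaction range is assumed to be inclusive at the start and exclusive at the end. An L0 file is skipped only when its whole key range lies outside the subcompaction's bounds.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an L0 file's metadata: only its user-key range.
struct FileRange {
  std::string smallest;  // smallest key in the file (inclusive)
  std::string largest;   // largest key in the file (inclusive)
};

// Hypothetical subcompaction bounds. A missing bound means "unbounded".
struct KeyBounds {
  bool has_start = false;
  std::string start;  // inclusive lower bound (assumption)
  bool has_end = false;
  std::string end;    // exclusive upper bound (assumption)
};

// Returns the indices of the L0 files whose key range can overlap the
// subcompaction bounds. L0 files may overlap each other, so each file
// has to be checked individually; no binary search is possible.
std::vector<std::size_t> TrimL0Inputs(const std::vector<FileRange>& files,
                                      const KeyBounds& bounds) {
  std::vector<std::size_t> kept;
  for (std::size_t i = 0; i < files.size(); ++i) {
    assert(files[i].smallest <= files[i].largest);
    // File ends before the subcompaction starts.
    const bool entirely_before =
        bounds.has_start && files[i].largest < bounds.start;
    // File starts at or after the (exclusive) end of the subcompaction.
    const bool entirely_after =
        bounds.has_end && files[i].smallest >= bounds.end;
    if (!entirely_before && !entirely_after) {
      kept.push_back(i);
    }
  }
  return kept;
}
```

With files covering a..c, d..f, b..e and x..z and bounds [d, g), only the second and third files would be opened. A real implementation would use the column family's comparator rather than string comparison.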

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9802

Test Plan: Add a unit test to cover this case and manually validate the behavior while running the test.

Reviewed By: ajkr

Differential Revision: D35371031

fbshipit-source-id: 701ed7375b5cbe41672e93b38fe8a1503dad08b6
2022-04-06 18:19:19 -07:00
..
blob Add rate limiter priority to ReadOptions (#9424) 2022-02-16 23:18:14 -08:00
compaction L0 Subcompaction to trim input files (#9802) 2022-04-06 18:19:19 -07:00
db_impl Fix GetMergeOperands() heap-use-after-free on flushed memtable (#9805) 2022-04-05 12:26:36 -07:00
arena_wrapped_db_iter.cc fix: Reusing-Iterator reads stale keys after DeleteRange() performed (#9258) 2022-03-15 09:50:21 -07:00
arena_wrapped_db_iter.h Cleanup includes in dbformat.h (#8930) 2021-09-29 04:04:40 -07:00
builder.cc Expand auto recovery to background read errors (#9679) 2022-03-15 14:45:34 -07:00
builder.h Expose blob file information through the EventListener interface (#8675) 2021-09-16 17:23:36 -07:00
c.cc Fix a major performance bug in 7.0 re: filter compatibility (#9736) 2022-03-23 10:00:54 -07:00
c_test.c Fix remaining uses of "backupable" (#9792) 2022-04-05 09:52:33 -07:00
column_family.cc Fix heap use-after-free race with DropColumnFamily (#9730) 2022-03-24 13:05:17 -07:00
column_family.h Fix heap use-after-free race with DropColumnFamily (#9730) 2022-03-24 13:05:17 -07:00
column_family_test.cc More refactoring ahead of footer & meta changes (#9240) 2021-12-10 08:13:26 -08:00
compact_files_test.cc Fix test race conditions with OnFlushCompleted() (#9617) 2022-02-22 12:23:00 -08:00
comparator_db_test.cc More refactoring ahead of footer & meta changes (#9240) 2021-12-10 08:13:26 -08:00
convenience.cc Add rate limiter priority to ReadOptions (#9424) 2022-02-16 23:18:14 -08:00
corruption_test.cc Make the Env class Customizable (#9293) 2022-01-04 16:45:49 -08:00
cuckoo_table_db_test.cc Experimental support for SST unique IDs (#8990) 2021-10-18 23:32:01 -07:00
db_basic_test.cc Fix some MultiGet batching stats (#9583) 2022-02-17 16:31:41 -08:00
db_block_cache_test.cc Enhance new cache key testing & comments (#9329) 2022-02-04 14:15:58 -08:00
db_bloom_filter_test.cc Tests for filter compatibility (#9773) 2022-04-06 15:54:40 -07:00
db_compaction_filter_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_compaction_test.cc L0 Subcompaction to trim input files (#9802) 2022-04-06 18:19:19 -07:00
db_dynamic_level_test.cc Remove deprecated API AdvancedColumnFamilyOptions::soft_rate_limit/hard_rate_limit (#9452) 2022-01-27 13:01:09 -08:00
db_encryption_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_filesnapshot.cc Improve comments for some files (#9793) 2022-04-01 16:06:14 -07:00
db_flush_test.cc Encode min_log_number_to_keep and delete_wals_before in one version edit (#9766) 2022-03-31 20:00:52 -07:00
db_info_dumper.cc Allow WAL dir to change with db dir (#8582) 2021-07-30 12:16:44 -07:00
db_info_dumper.h Add a DB Session ID (#6959) 2020-06-15 10:47:02 -07:00
db_inplace_update_test.cc fix a bug, c api, if enable inplace_update_support, and use create sn… (#9471) 2022-03-21 12:04:33 -07:00
db_io_failure_test.cc Enable a few unit tests to use custom Env objects (#9087) 2021-11-08 11:05:59 -08:00
db_iter.cc Remove iter_start_seqnum and preserve_deletes (#9430) 2022-01-28 13:28:38 -08:00
db_iter.h Cleanup includes in dbformat.h (#8930) 2021-09-29 04:04:40 -07:00
db_iter_stress_test.cc Make ImmutableOptions struct that inherits from ImmutableCFOptions and ImmutableDBOptions (#8262) 2021-05-05 14:00:17 -07:00
db_iter_test.cc Remove iter_start_seqnum and preserve_deletes (#9430) 2022-01-28 13:28:38 -08:00
db_iterator_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_kv_checksum_test.cc Revise APIs related to user-defined timestamp (#8946) 2022-02-01 22:19:01 -08:00
db_log_iter_test.cc Attempt to deflake DBTestXactLogIterator.TransactionLogIteratorCorruptedLog (#8627) 2021-08-10 11:10:07 -07:00
db_logical_block_size_cache_test.cc Attempt to deflake DBLogicalBlockSizeCacheTest.CreateColumnFamilies (#9516) 2022-03-04 11:35:28 -08:00
db_memtable_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_merge_operand_test.cc Fix GetMergeOperands() heap-use-after-free on flushed memtable (#9805) 2022-04-05 12:26:36 -07:00
db_merge_operator_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_options_test.cc Remove deprecated option new_table_reader_for_compaction_inputs (#9443) 2022-02-08 19:31:28 -08:00
db_properties_test.cc Tests for filter compatibility (#9773) 2022-04-06 15:54:40 -07:00
db_range_del_test.cc fix: Reusing-Iterator reads stale keys after DeleteRange() performed (#9258) 2022-03-15 09:50:21 -07:00
db_rate_limiter_test.cc Rate-limit automatic WAL flush after each user write (#9607) 2022-03-08 13:19:39 -08:00
db_secondary_test.cc Make the Env class Customizable (#9293) 2022-01-04 16:45:49 -08:00
db_sst_test.cc Account memory of big memory users in BlockBasedTable in global memory limit (#9748) 2022-04-06 10:33:00 -07:00
db_statistics_test.cc Bytes read stat for VerifyChecksum() and VerifyFileChecksums() APIs (#8741) 2021-09-07 13:28:29 -07:00
db_table_properties_test.cc Account memory of big memory users in BlockBasedTable in global memory limit (#9748) 2022-04-06 10:33:00 -07:00
db_tailing_iter_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_test.cc Fix FileStorageInfo fields from GetLiveFilesMetaData (#9769) 2022-03-29 14:36:35 -07:00
db_test2.cc Add manifest fix-up utility for file temperatures (#9683) 2022-03-18 16:35:51 -07:00
db_test_util.cc Use a sorted vector instead of a map to store blob file metadata (#9526) 2022-02-09 12:36:43 -08:00
db_test_util.h Update Cache::Release param from force_erase to erase_if_last_ref (#9728) 2022-03-22 10:22:18 -07:00
db_universal_compaction_test.cc Adhere to per-DB concurrency limit when bottom-pri compactions exist (#9179) 2021-11-18 17:31:50 -08:00
db_wal_test.cc Fix a race condition in WAL tracking causing DB open failure (#9715) 2022-03-23 19:41:31 -07:00
db_with_timestamp_basic_test.cc Add OpenAndTrimHistory API to support trimming data with specified timestamp (#9410) 2022-03-11 16:13:23 -08:00
db_with_timestamp_compaction_test.cc Use the comparator from the sst file table properties in sst_dump_tool (#9491) 2022-02-08 12:15:35 -08:00
db_write_buffer_manager_test.cc Enable a few unit tests to use custom Env objects (#9087) 2021-11-08 11:05:59 -08:00
db_write_test.cc Enable a few unit tests to use custom Env objects (#9087) 2021-11-08 11:05:59 -08:00
dbformat.cc Track per-SST user-defined timestamp information in MANIFEST (#9092) 2021-11-10 10:49:04 -08:00
dbformat.h Add OpenAndTrimHistory API to support trimming data with specified timestamp (#9410) 2022-03-11 16:13:23 -08:00
dbformat_test.cc Enable a few unit tests to use custom Env objects (#9087) 2021-11-08 11:05:59 -08:00
deletefile_test.cc Enable a few unit tests to use custom Env objects (#9087) 2021-11-08 11:05:59 -08:00
error_handler.cc Expand auto recovery to background read errors (#9679) 2022-03-15 14:45:34 -07:00
error_handler.h Expand auto recovery to background read errors (#9679) 2022-03-15 14:45:34 -07:00
error_handler_fs_test.cc Expand auto recovery to background read errors (#9679) 2022-03-15 14:45:34 -07:00
event_helpers.cc Fix a race condition in WAL tracking causing DB open failure (#9715) 2022-03-23 19:41:31 -07:00
event_helpers.h Add a listener callback for end of auto error recovery (#9244) 2021-12-08 14:30:57 -08:00
experimental.cc Add manifest fix-up utility for file temperatures (#9683) 2022-03-18 16:35:51 -07:00
external_sst_file_basic_test.cc Enable a few unit tests to use custom Env objects (#9087) 2021-11-08 11:05:59 -08:00
external_sst_file_ingestion_job.cc Do not rely on ADL when invoking std::max_element (#9608) 2022-03-02 17:41:02 -08:00
external_sst_file_ingestion_job.h New stable, fixed-length cache keys (#9126) 2021-12-16 17:15:13 -08:00
external_sst_file_test.cc Make the Env class Customizable (#9293) 2022-01-04 16:45:49 -08:00
fault_injection_test.cc Fix a bug causing duplicate trailing entries in WritableFile (buffered IO) (#9236) 2021-12-13 09:00:36 -08:00
file_indexer.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
file_indexer.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
file_indexer_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
filename_test.cc fixing issue #8345 RocksDB does not work when using UNC network paths (#9384) 2022-03-30 15:55:31 -07:00
flush_job.cc Fix a bug in PosixClock (#9695) 2022-03-21 16:11:02 -07:00
flush_job.h Expand auto recovery to background read errors (#9679) 2022-03-15 14:45:34 -07:00
flush_job_test.cc Use the comparator from the sst file table properties in sst_dump_tool (#9491) 2022-02-08 12:15:35 -08:00
flush_scheduler.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
flush_scheduler.h Include C++ standard library headers instead of C compatibility headers (#8068) 2021-03-19 12:09:47 -07:00
forward_iterator.cc Fast path for detecting unchanged prefix_extractor (#9407) 2022-01-21 11:37:46 -08:00
forward_iterator.h Fast path for detecting unchanged prefix_extractor (#9407) 2022-01-21 11:37:46 -08:00
forward_iterator_bench.cc Remove using namespace (#9369) 2022-01-12 09:31:12 -08:00
history_trimming_iterator.h Add OpenAndTrimHistory API to support trimming data with specified timestamp (#9410) 2022-03-11 16:13:23 -08:00
import_column_family_job.cc Add Temperature info in NewSequentialFile() (#9499) 2022-02-18 18:23:07 -08:00
import_column_family_job.h New stable, fixed-length cache keys (#9126) 2021-12-16 17:15:13 -08:00
import_column_family_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
internal_stats.cc Add manifest fix-up utility for file temperatures (#9683) 2022-03-18 16:35:51 -07:00
internal_stats.h Support GetMapProperty() with "rocksdb.dbstats" (#9057) 2021-10-20 13:17:00 -07:00
job_context.h Add manifest fix-up utility for file temperatures (#9683) 2022-03-18 16:35:51 -07:00
kv_checksum.h fix compile errors in db/kv_checksum.h (#9173) 2021-11-16 10:20:50 -08:00
listener_test.cc Fix test race conditions with OnFlushCompleted() (#9617) 2022-02-22 12:23:00 -08:00
log_format.h Add record to set WAL compression type if enabled (#9556) 2022-02-17 16:19:31 -08:00
log_reader.cc Integrate WAL compression into log reader/writer. (#9642) 2022-03-09 15:49:53 -08:00
log_reader.h Integrate WAL compression into log reader/writer. (#9642) 2022-03-09 15:49:53 -08:00
log_test.cc Integrate WAL compression into log reader/writer. (#9642) 2022-03-09 15:49:53 -08:00
log_writer.cc Integrate WAL compression into log reader/writer. (#9642) 2022-03-09 15:49:53 -08:00
log_writer.h Integrate WAL compression into log reader/writer. (#9642) 2022-03-09 15:49:53 -08:00
logs_with_prep_tracker.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
logs_with_prep_tracker.h Include C++ standard library headers instead of C compatibility headers (#8068) 2021-03-19 12:09:47 -07:00
lookup_key.h Cleanup includes in dbformat.h (#8930) 2021-09-29 04:04:40 -07:00
malloc_stats.cc Replace most typedef with using= (#8751) 2021-09-07 11:31:59 -07:00
malloc_stats.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
manual_compaction_test.cc Remove using namespace (#9369) 2022-01-12 09:31:12 -08:00
memtable.cc Fix major bug with MultiGet, DeleteRange, and memtable Bloom (#9453) 2022-01-27 14:55:04 -08:00
memtable.h Fix major bug with MultiGet, DeleteRange, and memtable Bloom (#9453) 2022-01-27 14:55:04 -08:00
memtable_list.cc Encode min_log_number_to_keep and delete_wals_before in one version edit (#9766) 2022-03-31 20:00:52 -07:00
memtable_list.h Expand auto recovery to background read errors (#9679) 2022-03-15 14:45:34 -07:00
memtable_list_test.cc Expand auto recovery to background read errors (#9679) 2022-03-15 14:45:34 -07:00
merge_context.h Add Merge Operator support to WriteBatchWithIndex (#8135) 2021-05-10 12:50:25 -07:00
merge_helper.cc Support readahead during compaction for blob files (#9187) 2021-11-19 17:53:47 -08:00
merge_helper.h Support readahead during compaction for blob files (#9187) 2021-11-19 17:53:47 -08:00
merge_helper_test.cc Support readahead during compaction for blob files (#9187) 2021-11-19 17:53:47 -08:00
merge_operator.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
merge_test.cc Make the Env class Customizable (#9293) 2022-01-04 16:45:49 -08:00
obsolete_files_test.cc Add commit marker with timestamp (#9266) 2021-12-10 11:05:35 -08:00
options_file_test.cc No elide constructors (#7798) 2020-12-23 16:55:53 -08:00
output_validator.cc Cleanup includes in dbformat.h (#8930) 2021-09-29 04:04:40 -07:00
output_validator.h Cleanup includes in dbformat.h (#8930) 2021-09-29 04:04:40 -07:00
perf_context_test.cc Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
periodic_work_scheduler.cc Fix a timer crash caused by invalid memory management (#9656) 2022-03-12 11:45:56 -08:00
periodic_work_scheduler.h Fix a timer crash caused by invalid memory management (#9656) 2022-03-12 11:45:56 -08:00
periodic_work_scheduler_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
pinned_iterators_manager.h Replace most typedef with using= (#8751) 2021-09-07 11:31:59 -07:00
plain_table_db_test.cc Fast path for detecting unchanged prefix_extractor (#9407) 2022-01-21 11:37:46 -08:00
pre_release_callback.h Fix and detect headers with missing dependencies (#8893) 2021-09-10 10:00:26 -07:00
prefix_test.cc Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
range_del_aggregator.cc In ParseInternalKey(), include corrupt key info in Status (#7515) 2020-10-28 10:12:58 -07:00
range_del_aggregator.h Fix some typos in comments (#8066) 2021-03-25 21:18:08 -07:00
range_del_aggregator_bench.cc Cleanup multiple implementations of VectorIterator (#8901) 2021-10-06 07:48:31 -07:00
range_del_aggregator_test.cc Cleanup multiple implementations of VectorIterator (#8901) 2021-10-06 07:48:31 -07:00
range_tombstone_fragmenter.cc Added memtable garbage statistics (#8411) 2021-06-18 04:57:27 -07:00
range_tombstone_fragmenter.h Added memtable garbage statistics (#8411) 2021-06-18 04:57:27 -07:00
range_tombstone_fragmenter_test.cc Cleanup multiple implementations of VectorIterator (#8901) 2021-10-06 07:48:31 -07:00
read_callback.h Fix and detect headers with missing dependencies (#8893) 2021-09-10 10:00:26 -07:00
repair.cc Fast path for detecting unchanged prefix_extractor (#9407) 2022-01-21 11:37:46 -08:00
repair_test.cc Some fixes and enhancements to ldb repair (#8544) 2021-07-28 16:44:14 -07:00
snapshot_checker.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
snapshot_impl.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
snapshot_impl.h Fix and detect headers with missing dependencies (#8893) 2021-09-10 10:00:26 -07:00
table_cache.cc fix a bug of the ticker NO_FILE_OPENS (#9677) 2022-03-15 09:55:49 -07:00
table_cache.h Fast path for detecting unchanged prefix_extractor (#9407) 2022-01-21 11:37:46 -08:00
table_properties_collector.cc Apply sample_for_compression to all block-based tables (#8105) 2021-03-25 15:00:45 -07:00
table_properties_collector.h Track each SST's timestamp information as user properties (#9093) 2021-11-19 11:37:06 -08:00
table_properties_collector_test.cc Improve / clean up meta block code & integrity (#9163) 2021-11-18 11:43:44 -08:00
transaction_log_impl.cc Add commit marker with timestamp (#9266) 2021-12-10 11:05:35 -08:00
transaction_log_impl.h Cleanup includes in dbformat.h (#8930) 2021-09-29 04:04:40 -07:00
trim_history_scheduler.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
trim_history_scheduler.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
version_builder.cc Use a sorted vector instead of a map to store blob file metadata (#9526) 2022-02-09 12:36:43 -08:00
version_builder.h Fast path for detecting unchanged prefix_extractor (#9407) 2022-01-21 11:37:46 -08:00
version_builder_test.cc Use a sorted vector instead of a map to store blob file metadata (#9526) 2022-02-09 12:36:43 -08:00
version_edit.cc Encode min_log_number_to_keep and delete_wals_before in one version edit (#9766) 2022-03-31 20:00:52 -07:00
version_edit.h Clean up VersionStorageInfo a bit (#9494) 2022-02-04 08:19:20 -08:00
version_edit_handler.cc Fix a race condition in WAL tracking causing DB open failure (#9715) 2022-03-23 19:41:31 -07:00
version_edit_handler.h Fixed manifest_dump issues when printing keys and values containing null characters (#8378) 2021-06-10 12:55:20 -07:00
version_edit_test.cc File temperature information should be preserved when restart the DB (#9242) 2021-12-03 14:43:14 -08:00
version_set.cc L0 Subcompaction to trim input files (#9802) 2022-04-06 18:19:19 -07:00
version_set.h L0 Subcompaction to trim input files (#9802) 2022-04-06 18:19:19 -07:00
version_set_test.cc Fix a race condition in WAL tracking causing DB open failure (#9715) 2022-03-23 19:41:31 -07:00
version_util.h Add manifest fix-up utility for file temperatures (#9683) 2022-03-18 16:35:51 -07:00
wal_edit.cc Always track WAL obsoletion (#7759) 2020-12-09 16:02:12 -08:00
wal_edit.h Always track WAL obsoletion (#7759) 2020-12-09 16:02:12 -08:00
wal_edit_test.cc Always track WAL obsoletion (#7759) 2020-12-09 16:02:12 -08:00
wal_manager.cc Allow WAL dir to change with db dir (#8582) 2021-07-30 12:16:44 -07:00
wal_manager.h Allow WAL dir to change with db dir (#8582) 2021-07-30 12:16:44 -07:00
wal_manager_test.cc Make SystemClock into a Customizable Class (#8636) 2021-09-21 09:23:48 -07:00
write_batch.cc Fix a few documentation errors including in public APIs (#9789) 2022-04-01 10:30:17 -07:00
write_batch_base.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
write_batch_internal.h Support WBWI for keys having timestamps (#9603) 2022-02-22 14:23:01 -08:00
write_batch_test.cc Support WBWI for keys having timestamps (#9603) 2022-02-22 14:23:01 -08:00
write_callback.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
write_callback_test.cc Move slow valgrind tests behind -DROCKSDB_FULL_VALGRIND_RUN (#8475) 2021-07-07 11:14:05 -07:00
write_controller.cc Revamp WriteController (#8064) 2021-03-18 09:47:31 -07:00
write_controller.h Revamp WriteController (#8064) 2021-03-18 09:47:31 -07:00
write_controller_test.cc Revamp WriteController (#8064) 2021-03-18 09:47:31 -07:00
write_thread.cc Rate-limit automatic WAL flush after each user write (#9607) 2022-03-08 13:19:39 -08:00
write_thread.h Rate-limit automatic WAL flush after each user write (#9607) 2022-03-08 13:19:39 -08:00