rocksdb/table
Changyu Bi f02c708aa3 Consider range tombstone in compaction output file cutting (#10802)
Summary:
This PR is the first step for Issue https://github.com/facebook/rocksdb/issues/4811. Currently, compaction output files are cut only at point keys, and the decision is made mainly in `CompactionOutputs::ShouldStopBefore()`. This makes it possible for range tombstones to cause large compactions that do not respect `max_compaction_bytes`. For example, a large range tombstone can overlap with too many files from the next level. Another example is when there is a gap between a range tombstone and another key. The first issue may be more acceptable, as a lot of data is deleted. This PR addresses the second issue by calling `ShouldStopBefore()` for range tombstone start keys. The main change is for `CompactionIterator` to emit range tombstone start keys to be processed by `CompactionOutputs`. A new `CompactionMergingIterator` is introduced and only used under `CompactionIterator` for this purpose. Further improvements after this PR include 1) cutting compaction output at some grandparent boundary key instead of at the next point key or range tombstone start key and 2) cutting a compaction output file within a large range tombstone (it may be easier and reasonable to only do this for range tombstones at the end of a compaction output).
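
The effect of also considering range tombstone start keys can be illustrated with a toy model. The sketch below is not the RocksDB implementation: `GrandparentFile`, `EmittedKey`, `CutPoints`, and `DemoCutPoints` are hypothetical names, keys are plain strings compared bytewise, and the overlap accounting is a simplification that counts a grandparent file as overlapped once the key stream has moved past it while an output file is open.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a file in the grandparent level.
struct GrandparentFile {
  std::string smallest;
  std::string largest;
  uint64_t bytes;
};

// Hypothetical stand-in for a key emitted by the compaction iterator.
struct EmittedKey {
  std::string user_key;
  bool is_range_tombstone_start;
};

// Returns the keys before which a new output file would be started.
// When consider_tombstone_starts is false, range tombstone start keys are
// skipped, which models cutting at point keys only.
std::vector<std::string> CutPoints(
    const std::vector<EmittedKey>& keys,
    const std::vector<GrandparentFile>& grandparents,
    uint64_t max_compaction_bytes, bool consider_tombstone_starts) {
  assert(max_compaction_bytes > 0);
  std::vector<std::string> cuts;
  std::size_t gp_index = 0;
  uint64_t overlapped = 0;    // grandparent bytes under the current output
  bool file_has_key = false;  // whether the current output holds any key
  for (const EmittedKey& k : keys) {
    if (k.is_range_tombstone_start && !consider_tombstone_starts) {
      continue;  // not visible to the cutting decision
    }
    // Account for grandparent files that end before this key.
    while (gp_index < grandparents.size() &&
           grandparents[gp_index].largest < k.user_key) {
      if (file_has_key) {
        overlapped += grandparents[gp_index].bytes;
      }
      ++gp_index;
    }
    if (file_has_key && overlapped > max_compaction_bytes) {
      cuts.push_back(k.user_key);  // start a new output file at this key
      overlapped = 0;
    }
    file_has_key = true;
  }
  return cuts;
}

// Example data (made up): point key "a", a range tombstone starting at "m",
// and point key "z", with three 40-byte grandparent files in the gap
// between "a" and "m" and a limit of 100 bytes.
std::vector<std::string> DemoCutPoints(bool consider_tombstone_starts) {
  const std::vector<EmittedKey> keys = {
      {"a", false}, {"m", true}, {"z", false}};
  const std::vector<GrandparentFile> grandparents = {
      {"b", "c", 40}, {"d", "e", 40}, {"f", "g", 40}};
  return CutPoints(keys, grandparents, 100, consider_tombstone_starts);
}
```

In this model, `DemoCutPoints(false)` cuts only before "z", so the first output file would carry both "a" and the tombstone and span the whole gap. `DemoCutPoints(true)` cuts before "m", so the first output file holds only "a" and the tombstone starts a new file.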

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10802

Test Plan:
- added unit tests in db_range_del_test.
- stress test: `python3 tools/db_crashtest.py whitebox --[simple|enable_ts] --verify_iterator_with_expected_state_one_in=5 --delrangepercent=5 --prefixpercent=2 --writepercent=58 --readpercent=21 --duration=36000 --range_deletion_width=1000000`

Reviewed By: ajkr, jay-zhuang

Differential Revision: D40308827

Pulled By: cbi42

fbshipit-source-id: a8fd6f70a3f09d0ef7a40e006f6c964bba8c00df
2022-12-15 09:11:54 -08:00
adaptive Improve error messages for SST footer and size errors (#11009) 2022-12-09 10:03:47 -08:00
block_based replace sprintf with its safe version snprintf (v2) (#11011) 2022-12-12 10:39:53 -08:00
cuckoo Format files under table/ by clang-format (#10852) 2022-10-25 11:50:38 -07:00
plain Format files under table/ by clang-format (#10852) 2022-10-25 11:50:38 -07:00
block_fetcher.cc Format files under table/ by clang-format (#10852) 2022-10-25 11:50:38 -07:00
block_fetcher.h Format files under table/ by clang-format (#10852) 2022-10-25 11:50:38 -07:00
block_fetcher_test.cc Improve error messages for SST footer and size errors (#11009) 2022-12-09 10:03:47 -08:00
cleanable_test.cc Eliminate unnecessary (slow) block cache Ref()ing in MultiGet (#9899) 2022-04-26 21:59:24 -07:00
compaction_merging_iterator.cc Consider range tombstone in compaction output file cutting (#10802) 2022-12-15 09:11:54 -08:00
compaction_merging_iterator.h Consider range tombstone in compaction output file cutting (#10802) 2022-12-15 09:11:54 -08:00
format.cc Improve error messages for SST footer and size errors (#11009) 2022-12-09 10:03:47 -08:00
format.h Improve error messages for SST footer and size errors (#11009) 2022-12-09 10:03:47 -08:00
get_context.cc Revisit the interface of MergeHelper::TimedFullMerge(WithEntity) (#10932) 2022-11-09 12:54:05 -08:00
get_context.h Support Merge for wide-column entities during point lookups (#10916) 2022-11-03 08:35:42 -07:00
internal_iterator.h Skip swaths of range tombstone covered keys in merging iterator (2022 edition) (#10449) 2022-09-02 09:51:19 -07:00
iter_heap.h Format files under table/ by clang-format (#10852) 2022-10-25 11:50:38 -07:00
iterator.cc Format files under table/ by clang-format (#10852) 2022-10-25 11:50:38 -07:00
iterator_wrapper.h Format files under table/ by clang-format (#10852) 2022-10-25 11:50:38 -07:00
merger_test.cc Print stack traces on frozen tests in CI (#10828) 2022-10-18 00:35:35 -07:00
merging_iterator.cc Consider range tombstone in compaction output file cutting (#10802) 2022-12-15 09:11:54 -08:00
merging_iterator.h Consider range tombstone in compaction output file cutting (#10802) 2022-12-15 09:11:54 -08:00
meta_blocks.cc Improve error messages for SST footer and size errors (#11009) 2022-12-09 10:03:47 -08:00
meta_blocks.h Refactor to avoid confusing "raw block" (#10408) 2022-09-22 11:25:32 -07:00
mock_table.cc Align compaction output file boundaries to the next level ones (#10655) 2022-09-29 19:43:55 -07:00
mock_table.h Align compaction output file boundaries to the next level ones (#10655) 2022-09-29 19:43:55 -07:00
multiget_context.h Format files under table/ by clang-format (#10852) 2022-10-25 11:50:38 -07:00
persistent_cache_helper.cc Format files under table/ by clang-format (#10852) 2022-10-25 11:50:38 -07:00
persistent_cache_helper.h Refactor to avoid confusing "raw block" (#10408) 2022-09-22 11:25:32 -07:00
persistent_cache_options.h Use STATIC_AVOID_DESTRUCTION for static objects with non-trivial destructors (#9958) 2022-05-17 09:39:22 -07:00
scoped_arena_iterator.h Format files under table/ by clang-format (#10852) 2022-10-25 11:50:38 -07:00
sst_file_dumper.cc Improve error messages for SST footer and size errors (#11009) 2022-12-09 10:03:47 -08:00
sst_file_dumper.h Support using ZDICT_finalizeDictionary to generate zstd dictionary (#9857) 2022-05-20 12:09:09 -07:00
sst_file_reader.cc Remove range tombstone test code from sst_file_reader (#10847) 2022-10-23 20:17:14 -07:00
sst_file_reader_test.cc Remove range tombstone test code from sst_file_reader (#10847) 2022-10-23 20:17:14 -07:00
sst_file_writer.cc Format files under table/ by clang-format (#10852) 2022-10-25 11:50:38 -07:00
sst_file_writer_collectors.h Refactor to avoid confusing "raw block" (#10408) 2022-09-22 11:25:32 -07:00
table_builder.h Always verify SST unique IDs on SST file open (#10532) 2022-09-07 22:52:42 -07:00
table_factory.cc Restore Regex support for ObjectLibrary::Register, rename new APIs to allow old one to be deprecated in the future (#9362) 2022-01-11 06:33:48 -08:00
table_properties.cc Format files under table/ by clang-format (#10852) 2022-10-25 11:50:38 -07:00
table_properties_internal.h
table_reader.h Format files under table/ by clang-format (#10852) 2022-10-25 11:50:38 -07:00
table_reader_bench.cc Format files under table/ by clang-format (#10852) 2022-10-25 11:50:38 -07:00
table_test.cc Improve error messages for SST footer and size errors (#11009) 2022-12-09 10:03:47 -08:00
two_level_iterator.cc Format files under table/ by clang-format (#10852) 2022-10-25 11:50:38 -07:00
two_level_iterator.h Format files under table/ by clang-format (#10852) 2022-10-25 11:50:38 -07:00
unique_id.cc Derive cache keys from SST unique IDs (#10394) 2022-08-12 13:49:49 -07:00
unique_id_impl.h Derive cache keys from SST unique IDs (#10394) 2022-08-12 13:49:49 -07:00