rocksdb/db/compaction
Changyu Bi f02c708aa3 Consider range tombstone in compaction output file cutting (#10802)
Summary:
This PR is the first step for Issue https://github.com/facebook/rocksdb/issues/4811. Currently compaction output files are cut at point keys, and the decision is made mainly in `CompactionOutputs::ShouldStopBefore()`. This makes it possible for range tombstones to cause large compactions that does not respect `max_compaction_bytes`. For example, we can have a large range tombstone that overlaps with too many files from the next level. Another example is when there is a gap between a range tombstone and another key. The first issue may be more acceptable, as a lot of data is deleted. This PR address the second issue by calling `ShouldStopBefore()` for range tombstone start keys. The main change is for `CompactionIterator` to emit range tombstone start keys to be processed by `CompactionOutputs`. A new `CompactionMergingIterator` is introduced and only used under `CompactionIterator` for this purpose. Further improvement after this PR include 1) cut compaction output at some grandparent boundary key instead of at the next point key or range tombstone start key and 2) cut compaction output file within a large range tombstone (it may be easier and reasonable to only do it for range tombstones at the end of a compaction output).

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10802

Test Plan:
- added unit tests in db_range_del_test.
- stress test: `python3 tools/db_crashtest.py whitebox --[simple|enable_ts] --verify_iterator_with_expected_state_one_in=5 --delrangepercent=5 --prefixpercent=2 --writepercent=58 --readpercen=21 --duration=36000 --range_deletion_width=1000000`

Reviewed By: ajkr, jay-zhuang

Differential Revision: D40308827

Pulled By: cbi42

fbshipit-source-id: a8fd6f70a3f09d0ef7a40e006f6c964bba8c00df
2022-12-15 09:11:54 -08:00
..
clipping_iterator.h Consider range tombstone in compaction output file cutting (#10802) 2022-12-15 09:11:54 -08:00
clipping_iterator_test.cc Print stack traces on frozen tests in CI (#10828) 2022-10-18 00:35:35 -07:00
compaction.cc Consider range tombstone in compaction output file cutting (#10802) 2022-12-15 09:11:54 -08:00
compaction.h Consider range tombstone in compaction output file cutting (#10802) 2022-12-15 09:11:54 -08:00
compaction_iteration_stats.h Support readahead during compaction for blob files (#9187) 2021-11-19 17:53:47 -08:00
compaction_iterator.cc Consider range tombstone in compaction output file cutting (#10802) 2022-12-15 09:11:54 -08:00
compaction_iterator.h Consider range tombstone in compaction output file cutting (#10802) 2022-12-15 09:11:54 -08:00
compaction_iterator_test.cc Basic Support for Merge with user-defined timestamp (#10819) 2022-10-31 22:28:58 -07:00
compaction_job.cc Consider range tombstone in compaction output file cutting (#10802) 2022-12-15 09:11:54 -08:00
compaction_job.h Sort L0 files by newly introduced epoch_num (#10922) 2022-12-13 13:29:37 -08:00
compaction_job_stats_test.cc clang-format for db/compaction (#10882) 2022-10-26 12:35:12 -07:00
compaction_job_test.cc Sort L0 files by newly introduced epoch_num (#10922) 2022-12-13 13:29:37 -08:00
compaction_outputs.cc Consider range tombstone in compaction output file cutting (#10802) 2022-12-15 09:11:54 -08:00
compaction_outputs.h Consider range tombstone in compaction output file cutting (#10802) 2022-12-15 09:11:54 -08:00
compaction_picker.cc Sort L0 files by newly introduced epoch_num (#10922) 2022-12-13 13:29:37 -08:00
compaction_picker.h Sort L0 files by newly introduced epoch_num (#10922) 2022-12-13 13:29:37 -08:00
compaction_picker_fifo.cc Sort L0 files by newly introduced epoch_num (#10922) 2022-12-13 13:29:37 -08:00
compaction_picker_fifo.h Sort L0 files by newly introduced epoch_num (#10922) 2022-12-13 13:29:37 -08:00
compaction_picker_level.cc Sort L0 files by newly introduced epoch_num (#10922) 2022-12-13 13:29:37 -08:00
compaction_picker_level.h Sort L0 files by newly introduced epoch_num (#10922) 2022-12-13 13:29:37 -08:00
compaction_picker_test.cc Sort L0 files by newly introduced epoch_num (#10922) 2022-12-13 13:29:37 -08:00
compaction_picker_universal.cc Sort L0 files by newly introduced epoch_num (#10922) 2022-12-13 13:29:37 -08:00
compaction_picker_universal.h Sort L0 files by newly introduced epoch_num (#10922) 2022-12-13 13:29:37 -08:00
compaction_service_job.cc Sort L0 files by newly introduced epoch_num (#10922) 2022-12-13 13:29:37 -08:00
compaction_service_test.cc Deflake CompactionServiceTest.BasicCompactions (#10697) 2022-09-19 14:10:05 -07:00
compaction_state.cc Tiered Compaction: per key placement support (#9964) 2022-07-13 20:54:49 -07:00
compaction_state.h Tiered Compaction: per key placement support (#9964) 2022-07-13 20:54:49 -07:00
file_pri.h Try to start TTL earlier with kMinOverlappingRatio is used (#8749) 2021-11-01 14:36:31 -07:00
sst_partitioner.cc Restore Regex support for ObjectLibrary::Register, rename new APIs to allow old one to be deprecated in the future (#9362) 2022-01-11 06:33:48 -08:00
subcompaction_state.cc Refactor Compaction file cut ShouldStopBefore() (#10629) 2022-09-14 22:09:12 -07:00
subcompaction_state.h Consider range tombstone in compaction output file cutting (#10802) 2022-12-15 09:11:54 -08:00
tiered_compaction_test.cc Add an unittest for Periodic compaction conflict with ongoing compaction (#10908) 2022-12-12 10:37:55 -08:00