rocksdb/db/db_impl
Zichen Zhu 8860fc902a Support subcmpct using reserved resources for round-robin priority (#10341)
Summary:
The earlier implementation of round-robin priority could only pick one file at a time and disallowed parallel compactions within the same level. In this PR, the round-robin compaction policy expands to more input files while respecting some additional constraints, summarized as follows:
 * Constraint 1: We can only pick consecutive files
   - Constraint 1a: When a file is being compacted (or some of the input files are being compacted after expanding), we cannot choose it and have to stop choosing more files
   - Constraint 1b: When we reach the last file (with the largest keys), we cannot choose more files (the next file would be the first one, with the smallest keys)
 * Constraint 2: We should ensure the total compaction bytes (including the overlapped files from the next level) is no more than `mutable_cf_options_.max_compaction_bytes`
 * Constraint 3: We try our best to pick as many files as possible so that the post-compaction level size can be just less than `MaxBytesForLevel(start_level_)`
 * Constraint 4: If trivial move is allowed, we reuse the logic of `TryNonL0TrivialMove()` instead of expanding files with Constraint 3

More details can be found in `LevelCompactionBuilder::SetupOtherFilesWithRoundRobinExpansion()`.
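
Constraints 1 through 3 can be illustrated with a minimal sketch. This is not the code in `LevelCompactionBuilder::SetupOtherFilesWithRoundRobinExpansion()`: `FileSketch` and `ExpandRoundRobinSketch` are hypothetical names, per-file overlap with the next level is simply summed (which may double count shared next-level files), the file at the cursor is assumed to be picked even if it alone exceeds the byte budget, and the trivial-move path of Constraint 4 is not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for one file in the start level.
struct FileSketch {
  uint64_t size;              // bytes of this start-level file
  uint64_t overlapped_bytes;  // bytes of next-level files it overlaps
  bool being_compacted;       // already an input of another compaction
};

// Returns how many consecutive files, starting at `cursor`, would be picked.
size_t ExpandRoundRobinSketch(const std::vector<FileSketch>& level,
                              size_t cursor, uint64_t max_compaction_bytes,
                              uint64_t max_bytes_for_level) {
  assert(cursor < level.size());
  uint64_t level_size = 0;
  for (const FileSketch& f : level) level_size += f.size;

  size_t picked = 0;
  uint64_t compaction_bytes = 0;  // start-level bytes plus next-level overlap
  uint64_t picked_bytes = 0;      // start-level bytes only
  // Constraint 1b: the loop ends at the last file and never wraps around.
  for (size_t i = cursor; i < level.size(); ++i) {
    const FileSketch& f = level[i];
    // Constraint 1a: stop at the first file that is being compacted.
    if (f.being_compacted) break;
    // Constraint 2: stay within the total compaction byte budget
    // (assumption: the first file is always accepted).
    uint64_t next_bytes = compaction_bytes + f.size + f.overlapped_bytes;
    if (picked > 0 && next_bytes > max_compaction_bytes) break;
    compaction_bytes = next_bytes;
    picked_bytes += f.size;
    ++picked;
    // Constraint 3: stop once the remaining level size drops below target.
    if (level_size - picked_bytes < max_bytes_for_level) break;
  }
  return picked;
}
```

With five 10-byte files that each overlap 5 bytes in the next level, a level target of 35 bytes, and a generous byte budget, the sketch picks two files from the start of the level: removing 20 bytes is the first point at which the remaining 30 bytes fall below the target.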

The above optimization moves the compaction cursor faster, which can further reduce write amplification. Because a large compaction may lead to long write stalls, we break it into several subcompactions **regardless of** the `max_subcompactions` limit. The number of subcompactions for round-robin compaction priority is determined through the following steps:
* Step 1: Initialize the number of planned subcompactions against `max_output_file_limit`, the number of input files in the start level, and the range size limit `ranges.size()`
* Step 2: Call `AcquireSubcompactionResources()` when max subcompactions is not sufficient. We may or may not obtain the desired resources; the number of additional resources actually reserved is stored in `extra_num_subcompaction_threads_reserved_`. The subcompaction limit changes accordingly, and `num_planned_subcompactions` is updated with `GetSubcompactionLimit()`
* Step 3: Call `ShrinkSubcompactionResources()` to ensure extra resources can be released (extra resources may exist for round-robin compaction when the actual number of subcompactions is less than the number of planned subcompactions)

More details can be found in `CompactionJob::AcquireSubcompactionResources()`, `CompactionJob::ShrinkSubcompactionResources()`, and `CompactionJob::ReleaseSubcompactionResources()`.
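
The three steps can be modeled with a small sketch. It is not the `CompactionJob` implementation: `ThreadPoolSketch` and `SubcompactionPlanSketch` are hypothetical types, "initialized against" is assumed to mean taking the minimum of the three limits, and the pool is reduced to a single counter of free threads.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical pool of background threads that may be reserved.
struct ThreadPoolSketch {
  uint64_t available;
};

// Hypothetical model of the planning steps; names do not mirror RocksDB.
struct SubcompactionPlanSketch {
  uint64_t max_subcompactions;
  uint64_t extra_reserved;  // plays the role of the reserved-extras counter
  uint64_t planned;

  explicit SubcompactionPlanSketch(uint64_t max_sub)
      : max_subcompactions(max_sub), extra_reserved(0), planned(0) {}

  // Limit after any extra reservation has been granted.
  uint64_t Limit() const { return max_subcompactions + extra_reserved; }

  void Plan(uint64_t max_output_file_limit, uint64_t num_input_files,
            uint64_t num_ranges, ThreadPoolSketch& pool) {
    // Step 1 (assumption): take the minimum of the three limits.
    planned = std::min({max_output_file_limit, num_input_files, num_ranges});
    // Step 2: try to reserve extras; the pool may grant fewer than wanted.
    if (planned > max_subcompactions) {
      uint64_t wanted = planned - max_subcompactions;
      uint64_t granted = std::min(wanted, pool.available);
      pool.available -= granted;
      extra_reserved = granted;
    }
    planned = std::min(planned, Limit());
  }

  // Step 3: give back extras that the actual subcompactions do not need.
  void Shrink(uint64_t actual, ThreadPoolSketch& pool) {
    assert(actual <= planned);
    uint64_t needed_extra =
        actual > max_subcompactions ? actual - max_subcompactions : 0;
    uint64_t surplus = extra_reserved - std::min(extra_reserved, needed_extra);
    pool.available += surplus;
    extra_reserved -= surplus;
  }

  // After the job finishes, return whatever is still reserved.
  void Release(ThreadPoolSketch& pool) {
    pool.available += extra_reserved;
    extra_reserved = 0;
  }
};
```

For example, with a max subcompactions value of 2, three free threads in the pool, and limits of 10, 8, and 6, the sketch wants 6 subcompactions, is granted 3 extras, and plans 5. If only 3 subcompactions are actually formed, shrinking returns 2 of the extras to the pool.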

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10341

Test Plan: Add `CompactionPriMultipleFilesRoundRobin[1-3]` unit tests in `compaction_picker_test.cc`, and `RoundRobinSubcompactionsAgainstResources.SubcompactionsUsingResources/[0-4]` and `RoundRobinSubcompactionsAgainstPressureToken.PressureTokenTest/[0-1]` in `db_compaction_test.cc`

Reviewed By: ajkr, hx235

Differential Revision: D37792644

Pulled By: littlepig2013

fbshipit-source-id: 7fecb7c4ffd97b34bbf6e3b760b2c35a772a0657
2022-07-24 11:12:44 -07:00
..
compacted_db_impl.cc Return "invalid argument" when read timestamp is too old (#10109) 2022-06-06 14:36:22 -07:00
compacted_db_impl.h Add API for writing wide-column entities (#10242) 2022-06-25 15:30:47 -07:00
db_impl.cc Do not hold mutex when write keys if not necessary (#7516) 2022-07-21 13:35:36 -07:00
db_impl.h Do not hold mutex when write keys if not necessary (#7516) 2022-07-21 13:35:36 -07:00
db_impl_compaction_flush.cc Support subcmpct using reserved resources for round-robin priority (#10341) 2022-07-24 11:12:44 -07:00
db_impl_debug.cc Do not hold mutex when write keys if not necessary (#7516) 2022-07-21 13:35:36 -07:00
db_impl_experimental.cc Track SST unique id in MANIFEST and verify (#9990) 2022-05-19 11:04:21 -07:00
db_impl_files.cc Do not hold mutex when write keys if not necessary (#7516) 2022-07-21 13:35:36 -07:00
db_impl_open.cc Do not hold mutex when write keys if not necessary (#7516) 2022-07-21 13:35:36 -07:00
db_impl_readonly.cc Return "invalid argument" when read timestamp is too old (#10109) 2022-06-06 14:36:22 -07:00
db_impl_readonly.h Add API for writing wide-column entities (#10242) 2022-06-25 15:30:47 -07:00
db_impl_secondary.cc Update code comment and logging for secondary instance (#10260) 2022-07-05 10:09:44 -07:00
db_impl_secondary.h Update code comment and logging for secondary instance (#10260) 2022-07-05 10:09:44 -07:00
db_impl_write.cc Do not hold mutex when write keys if not necessary (#7516) 2022-07-21 13:35:36 -07:00