rocksdb/db/compaction
Gang Liao e6432dfd4c Make it possible to enable blob files starting from a certain LSM tree level (#10077)
Summary:
Currently, if blob files are enabled (i.e. `enable_blob_files` is true), large values are extracted both during flush/recovery (when SST files are written into level 0 of the LSM tree) and during compaction into any LSM tree level. For certain use cases that have a mix of short-lived and long-lived values, it might make sense to support extracting large values only during compactions whose output level is greater than or equal to a specified LSM tree level (e.g. compactions into L1/L2/... or above). This could reduce the space amplification caused by large values that are turned into garbage shortly after being written at the price of some write amplification incurred by long-lived values whose extraction to blob files is delayed.

In order to achieve this, we would like to do the following:
- Add a new configuration option `blob_file_starting_level` (default: 0) to `AdvancedColumnFamilyOptions` (and `MutableCFOptions` and extend the related logic)
- Instantiate `BlobFileBuilder` in `BuildTable` (used during flush and recovery, where the LSM tree level is L0) and `CompactionJob` iff `enable_blob_files` is set and the LSM tree level is `>= blob_file_starting_level`
- Add unit tests for the new functionality, and add the new option to our stress tests (`db_stress` and `db_crashtest.py` )
- Add the new option to our benchmarking tool `db_bench` and the BlobDB benchmark script `run_blob_bench.sh`
- Add the new option to the `ldb` tool (see https://github.com/facebook/rocksdb/wiki/Administration-and-Data-Access-Tool)
- Ideally extend the C and Java bindings with the new option
- Update the BlobDB wiki to document the new option.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10077

Reviewed By: ltamasi

Differential Revision: D36884156

Pulled By: gangliao

fbshipit-source-id: 942bab025f04633edca8564ed64791cb5e31627d
2022-06-02 20:04:33 -07:00
..
clipping_iterator.h Add a clipping internal iterator (#8327) 2021-06-09 15:41:16 -07:00
clipping_iterator_test.cc Cleanup multiple implementations of VectorIterator (#8901) 2021-10-06 07:48:31 -07:00
compaction.cc Support specifying blob garbage collection parameters when CompactRange() (#10073) 2022-06-01 19:40:26 -07:00
compaction.h Support specifying blob garbage collection parameters when CompactRange() (#10073) 2022-06-01 19:40:26 -07:00
compaction_iteration_stats.h Support readahead during compaction for blob files (#9187) 2021-11-19 17:53:47 -08:00
compaction_iterator.cc Add a temporary option for user to opt-out enforcement of SingleDelete contract (#9983) 2022-05-16 15:44:59 -07:00
compaction_iterator.h Support specifying blob garbage collection parameters when CompactRange() (#10073) 2022-06-01 19:40:26 -07:00
compaction_iterator_test.cc Add a temporary option for user to opt-out enforcement of SingleDelete contract (#9983) 2022-05-16 15:44:59 -07:00
compaction_job.cc Make it possible to enable blob files starting from a certain LSM tree level (#10077) 2022-06-02 20:04:33 -07:00
compaction_job.h Track SST unique id in MANIFEST and verify (#9990) 2022-05-19 11:04:21 -07:00
compaction_job_stats_test.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
compaction_job_test.cc Track SST unique id in MANIFEST and verify (#9990) 2022-05-19 11:04:21 -07:00
compaction_picker.cc Support specifying blob garbage collection parameters when CompactRange() (#10073) 2022-06-01 19:40:26 -07:00
compaction_picker.h Add OpenAndTrimHistory API to support trimming data with specified timestamp (#9410) 2022-03-11 16:13:23 -08:00
compaction_picker_fifo.cc Add OpenAndTrimHistory API to support trimming data with specified timestamp (#9410) 2022-03-11 16:13:23 -08:00
compaction_picker_fifo.h Add OpenAndTrimHistory API to support trimming data with specified timestamp (#9410) 2022-03-11 16:13:23 -08:00
compaction_picker_level.cc Use std::numeric_limits<> (#9954) 2022-05-05 13:08:21 -07:00
compaction_picker_level.h Make ImmutableOptions struct that inherits from ImmutableCFOptions and ImmutableDBOptions (#8262) 2021-05-05 14:00:17 -07:00
compaction_picker_test.cc Track SST unique id in MANIFEST and verify (#9990) 2022-05-19 11:04:21 -07:00
compaction_picker_universal.cc Use std::numeric_limits<> (#9954) 2022-05-05 13:08:21 -07:00
compaction_picker_universal.h Make ImmutableOptions struct that inherits from ImmutableCFOptions and ImmutableDBOptions (#8262) 2021-05-05 14:00:17 -07:00
compaction_service_test.cc Fix failed VerifySstUniqueIds unittests (#10043) 2022-05-24 09:00:06 -07:00
file_pri.h Try to start TTL earlier with kMinOverlappingRatio is used (#8749) 2021-11-01 14:36:31 -07:00
sst_partitioner.cc Restore Regex support for ObjectLibrary::Register, rename new APIs to allow old one to be deprecated in the future (#9362) 2022-01-11 06:33:48 -08:00