sdong b397dcd390 Change The Way Level Target And Compaction Score Are Calculated (#10057)
Summary:
The current level targets for dynamic leveling have a problem: the target level size changes dramatically after an L0->L1 compaction. When there are many L0 bytes, lower-level compactions are delayed, but they resume after the L0->L1 compaction finishes, so the expected write amplification benefits might not be realized. The proposal here is to revert the level target sizing and instead rely on adjusting the score of each level to prioritize the levels that most need compaction.
Basic idea:
(1) The target level size isn't adjusted, but the score is. The reasoning is that with parallel compactions, holding compactions back might not be desirable, but we would like compactions to be scheduled from the level that needs them most. For example, if we have an extra-large L2, we would like all compactions to be scheduled as L2->L3 rather than L4->L5. This gets complicated when a large L0->L1 compaction is going on: should we compact L2->L3 or L4->L5? So the proposal for that is:
(2) The score is calculated as actual level size / (target size + estimated upper bytes coming down). The reasoning is that if a large amount of pending L0/L1 bytes is coming down, compacting L2->L3 might be more expensive, because once the L0 bytes are compacted down to L2, the actual L2->L3 fanout would change dramatically. On the other hand, when that amount of bytes comes down to L5, the impact on the L5->L6 fanout is much smaller. So when calculating the score, we can adjust it by adding the estimated downward bytes to the target level size.
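
The scoring rule in (2) can be illustrated with a small standalone sketch. This is not RocksDB's actual implementation: the function name `SketchCompactionScores`, its parameters, and in particular the way "estimated upper bytes coming down" is approximated (L0 bytes plus each upper level's excess over its target) are assumptions made only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, NOT RocksDB code.
// level_bytes[i]  : actual size in bytes of level i+1 (L1, L2, ...)
// target_bytes[i] : target size in bytes of that same level
// l0_bytes        : bytes currently sitting in L0
// Returns one score per level: actual / (target + estimated incoming bytes).
std::vector<double> SketchCompactionScores(
    const std::vector<uint64_t>& level_bytes,
    const std::vector<uint64_t>& target_bytes,
    uint64_t l0_bytes) {
  assert(level_bytes.size() == target_bytes.size());
  std::vector<double> scores;
  scores.reserve(level_bytes.size());

  // Assumption: everything in L0 is expected to come down eventually.
  uint64_t incoming = l0_bytes;
  for (std::size_t i = 0; i < level_bytes.size(); ++i) {
    assert(target_bytes[i] > 0);
    // Target sizes stay fixed; only the score's denominator is enlarged.
    scores.push_back(static_cast<double>(level_bytes[i]) /
                     static_cast<double>(target_bytes[i] + incoming));
    // Assumption: only the excess over target continues to lower levels.
    if (level_bytes[i] > target_bytes[i]) {
      incoming += level_bytes[i] - target_bytes[i];
    }
  }
  return scores;
}
```

Under these assumptions, a call with level sizes {100, 3000, 10000}, targets {100, 1000, 10000}, and no L0 bytes gives the oversized L2 the highest score, so L2->L3 would be picked first. Adding pending L0 bytes lowers every level's score, with the largest relative effect on the levels with the smallest targets.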

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10057

Test Plan: Repurpose the VersionStorageInfoTest.MaxBytesForLevelDynamicWithLargeL0_* tests to cover this scenario.

Reviewed By: ajkr

Differential Revision: D37539742

fbshipit-source-id: 9c154cbfe92023f918cf5d80875d8776ad4831a4
2022-06-30 13:32:47 -07:00
.circleci Adapt benchmark result script to new fields. (#10120) 2022-06-22 09:26:13 -07:00
.github/workflows Use released clang-format instead of the one from dev branch (#9646) 2022-03-01 10:51:38 -08:00
buckifier Use optimized folly DistributedMutex in LRUCache when available (#10179) 2022-06-17 13:08:45 -07:00
build_tools Adapt benchmark result script to new fields. (#10120) 2022-06-22 09:26:13 -07:00
cache Clock cache (#10273) 2022-06-29 21:50:39 -07:00
cmake gcc-11 and cmake related cleanup (#9286) 2021-12-17 17:04:35 -08:00
coverage Fix commit_prereq and other targets (#9797) 2022-04-04 09:58:18 -07:00
db Change The Way Level Target And Compaction Score Are Calculated (#10057) 2022-06-30 13:32:47 -07:00
db_stress_tool Clock cache (#10273) 2022-06-29 21:50:39 -07:00
docs Bump nokogiri from 1.13.4 to 1.13.6 in /docs (#10019) 2022-05-20 11:00:15 -07:00
env Fix typo in comments and code (#10233) 2022-06-22 15:45:21 -07:00
examples Add a simple example of backup and restore (#10054) 2022-06-03 23:25:31 -07:00
file Fix typo in comments and code (#10233) 2022-06-22 15:45:21 -07:00
fuzz Fix compilation errors and add fuzzers to CircleCI (#9420) 2022-02-01 10:32:15 -08:00
include/rocksdb Add load_latest_options() to C api (#10152) 2022-06-30 11:03:52 -07:00
java Clock cache (#10273) 2022-06-29 21:50:39 -07:00
logging Deflake unit test BackupEngineTest.Concurrency (#10069) 2022-05-31 09:36:32 -07:00
memory Fail DB::Open() if logger cannot be created (#9984) 2022-05-27 07:23:31 -07:00
memtable Rewrite memory-charging feature's option API (#9926) 2022-05-17 15:01:51 -07:00
microbench Fix overflow in ribbon_bench after #10184 (#10195) 2022-06-17 12:53:57 -07:00
monitoring Add blob cache tickers, perf context statistics, and DB properties (#10203) 2022-06-28 13:52:35 -07:00
options Dynamically changeable `MemPurge` option (#10011) 2022-06-23 09:42:18 -07:00
plugin Add initial CMake support to plugin (#9214) 2021-11-30 17:16:53 -08:00
port Fix GetWindowsErrSz nullptr bug (#10282) 2022-06-29 20:41:54 -07:00
table Add API for writing wide-column entities (#10242) 2022-06-25 15:30:47 -07:00
test_util Fix a bug of not setting enforce_single_del_contracts (#10027) 2022-05-20 16:48:50 -07:00
third-party Meta-internal folly integration with F14FastMap (#9546) 2022-04-13 07:34:01 -07:00
tools Clock cache (#10273) 2022-06-29 21:50:39 -07:00
trace_replay Use std::numeric_limits<> (#9954) 2022-05-05 13:08:21 -07:00
util Pass rate_limiter_priority through filter block reader functions to FS (#10251) 2022-06-24 16:13:44 -07:00
utilities Deflake RateLimiting/BackupEngineRateLimitingTestWithParam (#10271) 2022-06-28 14:27:49 -07:00
.clang-format A script that automatically reformat affected lines 2014-01-14 12:21:24 -08:00
.gitignore Snapshots with user-specified timestamps (#9879) 2022-06-10 16:07:03 -07:00
.lgtm.yml Create lgtm.yml for LGTM.com C/C++ analysis (#4058) 2018-06-26 12:43:04 -07:00
.travis.yml Fix remaining uses of "backupable" (#9792) 2022-04-05 09:52:33 -07:00
.watchmanconfig Added .watchmanconfig file to rocksdb repo (#5593) 2019-07-19 15:00:33 -07:00
AUTHORS Update RocksDB Authors File 2017-10-18 14:42:10 -07:00
CMakeLists.txt Add API for writing wide-column entities (#10242) 2022-06-25 15:30:47 -07:00
CODE_OF_CONDUCT.md Adopt Contributor Covenant 2019-08-29 23:21:01 -07:00
CONTRIBUTING.md Add Code of Conduct 2017-12-05 18:42:35 -08:00
COPYING Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
DEFAULT_OPTIONS_HISTORY.md Add Options::DisableExtraChecks, clarify force_consistency_checks (#9363) 2022-01-18 17:31:03 -08:00
DUMP_FORMAT.md First version of rocksdb_dump and rocksdb_undump. 2015-06-19 16:24:36 -07:00
HISTORY.md Change The Way Level Target And Compaction Score Are Calculated (#10057) 2022-06-30 13:32:47 -07:00
INSTALL.md Update supported VS versions in INSTALL.md (#9823) 2022-04-13 13:03:40 -07:00
LANGUAGE-BINDINGS.md Update branch name to "main" in README/LANGUAGE_BINDINGS (#8727) 2021-09-01 15:26:34 -07:00
LICENSE.Apache Change RocksDB License 2017-07-15 16:11:23 -07:00
LICENSE.leveldb Add back the LevelDB license file 2017-07-16 18:42:18 -07:00
Makefile Add API for writing wide-column entities (#10242) 2022-06-25 15:30:47 -07:00
PLUGINS.md Add pmem-rocksdb-plugin link in PLUGINs.md (#9934) 2022-05-12 22:02:28 -07:00
README.md README: De-list slack channel, list Google group (#9387) 2022-01-18 08:19:48 -08:00
ROCKSDB_LITE.md Fix remaining uses of "backupable" (#9792) 2022-04-05 09:52:33 -07:00
TARGETS Add API for writing wide-column entities (#10242) 2022-06-25 15:30:47 -07:00
USERS.md Add Solana's RocksDB use case in USERS.md (#9558) 2022-02-16 09:23:01 -08:00
Vagrantfile Adding CentOS 7 Vagrantfile & build script 2018-02-26 15:27:17 -08:00
WINDOWS_PORT.md Update branch name in WINDOWS_PORT.md (#8745) 2021-09-01 19:26:39 -07:00
common.mk Clean up variables for temporary directory (#9961) 2022-05-06 16:38:06 -07:00
crash_test.mk Clean up variables for temporary directory (#9961) 2022-05-06 16:38:06 -07:00
issue_template.md Add Google Group to Issue Template 2020-01-28 14:40:37 -08:00
rocksdb.pc.in build: fix pkg-config file generation (#9953) 2022-05-30 12:46:40 -07:00
src.mk Add API for writing wide-column entities (#10242) 2022-06-25 15:30:47 -07:00
thirdparty.inc Fix build jemalloc api (#5470) 2019-06-24 17:40:32 -07:00

README.md

RocksDB: A Persistent Key-Value Store for Flash and RAM Storage


RocksDB is developed and maintained by the Facebook Database Engineering Team. It is built on earlier work on LevelDB by Sanjay Ghemawat (sanjay@google.com) and Jeff Dean (jeff@google.com).

This code is a library that forms the core building block for a fast key-value server, especially suited for storing data on flash drives. It has a Log-Structured-Merge-Database (LSM) design with flexible tradeoffs between Write-Amplification-Factor (WAF), Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF). It has multi-threaded compactions, making it especially suitable for storing multiple terabytes of data in a single database.

Start with example usage here: https://github.com/facebook/rocksdb/tree/main/examples

See the GitHub wiki for more explanation.

The public interface is in include/. Callers should not include or rely on the details of any other header files in this package. Those internal APIs may be changed without warning.

Questions and discussions are welcome on the RocksDB Developers Public Facebook group and email list on Google Groups.

License

RocksDB is dual-licensed under both the GPLv2 (found in the COPYING file in the root directory) and Apache 2.0 License (found in the LICENSE.Apache file in the root directory). You may select, at your option, one of the above-listed licenses.