Find a file
Peter Dillinger 17bc27741f Improve memory efficiency of many OptimisticTransactionDBs (#11439)
Summary:
Currently it's easy to use a ton of memory with many small OptimisticTransactionDB instances, because each one by default allocates a million mutexes (40 bytes each on my compiler) for validating transactions. It even puts a lot of pressure on the allocator by allocating each one individually!

In this change:
* Create a new object and option that enables sharing these buckets of mutexes between instances. This is generally good for load balancing potential contention as various DBs become hotter or colder with txn writes. About the only cases where this sharing wouldn't make sense (e.g. each DB usually written by one thread) are cases that would be better off with OccValidationPolicy::kValidateSerial which doesn't use the buckets anyway.
* Allocate the mutexes in a contiguous array, for efficiency
* Add an option to ensure the mutexes are cache-aligned. In several other places we use cache-aligned mutexes but OptimisticTransactionDB historically does not. It should be a space-time trade-off the user can choose.
* Provide some visibility into the memory used by the mutex buckets with an ApproximateMemoryUsage() function (also used in unit testing)
* Share code with other users of "striped" mutexes, appropriate refactoring for customization & efficiency (e.g. using FastRange instead of modulus)

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11439

Test Plan: unit tests added. Ran sized-up versions of stress test in unit test, including a before-and-after performance test showing no consistent difference. (NOTE: OptimisticTransactionDB not currently covered by db_stress!)

Reviewed By: ltamasi

Differential Revision: D45796393

Pulled By: pdillinger

fbshipit-source-id: ae2b3a26ad91ceeec15debcdc63ff48df6736a54
2023-05-24 11:57:15 -07:00
.circleci Revert enabling IO uring in db_stress (#11242) 2023-02-21 12:53:55 -08:00
.github/workflows ci: add GitHub token permissions for workflow (#10549) 2022-10-04 12:10:30 -07:00
buckifier Fix duplicate symbols in linking with buck2 (#11421) 2023-05-01 10:45:36 -07:00
build_tools Repair/instate jemalloc build on M1 (#11257) 2023-05-22 11:06:41 -07:00
cache Compatibility step for separating BlockCache and GeneralCache APIs (#11450) 2023-05-18 20:40:19 -07:00
cmake gcc-11 and cmake related cleanup (#9286) 2021-12-17 17:04:35 -08:00
coverage Remove platform009 and default to platform010 (#11333) 2023-03-30 09:56:37 -07:00
db Improve memory efficiency of many OptimisticTransactionDBs (#11439) 2023-05-24 11:57:15 -07:00
db_stress_tool Fix stress test failure caused by #11424 (#11470) 2023-05-22 15:47:28 -07:00
docs Remove docs/Gemfile.lock and update github-pages version (#11173) 2023-02-14 12:17:23 -08:00
env Change internal headers with duplicate names (#11408) 2023-05-17 11:27:09 -07:00
examples Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
file Add rocksdb.file.read.db.open.micros (#11455) 2023-05-18 09:26:29 -07:00
fuzz Block per key-value checksum (#11287) 2023-04-25 12:08:23 -07:00
include/rocksdb Improve memory efficiency of many OptimisticTransactionDBs (#11439) 2023-05-24 11:57:15 -07:00
java Implement missing compactrangeoptions from Java API (#10880) 2023-05-24 11:04:46 -07:00
logging Disabling some IO error assertion in EnvLogger (#11314) 2023-03-20 13:23:29 -07:00
memory Change internal headers with duplicate names (#11408) 2023-05-17 11:27:09 -07:00
memtable remove redundant move (#11418) 2023-05-03 09:37:21 -07:00
microbench Remove wait_unscheduled from waitForCompact internal API (#11443) 2023-05-17 18:13:50 -07:00
monitoring Much better stats for seeks and prefix filtering (#11460) 2023-05-19 15:25:49 -07:00
options Change internal headers with duplicate names (#11408) 2023-05-17 11:27:09 -07:00
plugin Add initial CMake support to plugin (#9214) 2021-11-30 17:16:53 -08:00
port Simplify detection of x86 CPU features (#11419) 2023-05-09 22:25:45 -07:00
table Much better stats for seeks and prefix filtering (#11460) 2023-05-19 15:25:49 -07:00
test_util Add hash_seed to Caches (#11391) 2023-05-09 22:24:26 -07:00
third-party fix optimization-disabled test builds with platform010 (#11361) 2023-04-10 13:59:44 -07:00
tools Update db_crashtest.py for support for dir creation on remote storage (#11448) 2023-05-23 09:42:35 -07:00
trace_replay Fix error maybe-uninitialized #11100 (#11101) 2023-01-19 13:59:48 -08:00
util Improve memory efficiency of many OptimisticTransactionDBs (#11439) 2023-05-24 11:57:15 -07:00
utilities Improve memory efficiency of many OptimisticTransactionDBs (#11439) 2023-05-24 11:57:15 -07:00
.clang-format A script that automatically reformat affected lines 2014-01-14 12:21:24 -08:00
.gitignore remove redundant move (#11418) 2023-05-03 09:37:21 -07:00
.lgtm.yml Create lgtm.yml for LGTM.com C/C++ analysis (#4058) 2018-06-26 12:43:04 -07:00
.watchmanconfig Added .watchmanconfig file to rocksdb repo (#5593) 2019-07-19 15:00:33 -07:00
AUTHORS Update RocksDB Authors File 2017-10-18 14:42:10 -07:00
CMakeLists.txt Add utils to use for handling user defined timestamp size record in WAL (#11451) 2023-05-22 14:28:58 -07:00
CODE_OF_CONDUCT.md Adopt Contributor Covenant 2019-08-29 23:21:01 -07:00
common.mk Clean up variables for temporary directory (#9961) 2022-05-06 16:38:06 -07:00
CONTRIBUTING.md Add Code of Conduct 2017-12-05 18:42:35 -08:00
COPYING Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
crash_test.mk Allow a custom DB cleanup command to be passed to db_crashtest.py (#10883) 2022-10-27 19:47:01 -07:00
DEFAULT_OPTIONS_HISTORY.md Add Options::DisableExtraChecks, clarify force_consistency_checks (#9363) 2022-01-18 17:31:03 -08:00
DUMP_FORMAT.md First version of rocksdb_dump and rocksdb_undump. 2015-06-19 16:24:36 -07:00
HISTORY.md Improve memory efficiency of many OptimisticTransactionDBs (#11439) 2023-05-24 11:57:15 -07:00
INSTALL.md Simplify detection of x86 CPU features (#11419) 2023-05-09 22:25:45 -07:00
issue_template.md Add Google Group to Issue Template 2020-01-28 14:40:37 -08:00
LANGUAGE-BINDINGS.md Add grocksdb in Go language bindings (#10498) 2022-08-23 15:02:10 -07:00
LICENSE.Apache Change RocksDB License 2017-07-15 16:11:23 -07:00
LICENSE.leveldb Add back the LevelDB license file 2017-07-16 18:42:18 -07:00
Makefile Add utils to use for handling user defined timestamp size record in WAL (#11451) 2023-05-22 14:28:58 -07:00
PLUGINS.md Added encryption plugin based on Intel open-source ipp-crypto library (#11429) 2023-05-08 12:13:43 -07:00
README.md Remove deprecated integration tests from README.md (#11354) 2023-04-07 16:52:50 -07:00
rocksdb.pc.in build: fix pkg-config file generation (#9953) 2022-05-30 12:46:40 -07:00
src.mk Add utils to use for handling user defined timestamp size record in WAL (#11451) 2023-05-22 14:28:58 -07:00
TARGETS Add utils to use for handling user defined timestamp size record in WAL (#11451) 2023-05-22 14:28:58 -07:00
thirdparty.inc Fix build jemalloc api (#5470) 2019-06-24 17:40:32 -07:00
USERS.md Add more users (#11369) 2023-05-03 10:47:41 -07:00
Vagrantfile Adding CentOS 7 Vagrantfile & build script 2018-02-26 15:27:17 -08:00
WINDOWS_PORT.md Update branch name in WINDOWS_PORT.md (#8745) 2021-09-01 19:26:39 -07:00

RocksDB: A Persistent Key-Value Store for Flash and RAM Storage

CircleCI Status

RocksDB is developed and maintained by Facebook Database Engineering Team. It is built on earlier work on LevelDB by Sanjay Ghemawat (sanjay@google.com) and Jeff Dean (jeff@google.com)

This code is a library that forms the core building block for a fast key-value server, especially suited for storing data on flash drives. It has a Log-Structured-Merge-Database (LSM) design with flexible tradeoffs between Write-Amplification-Factor (WAF), Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF). It has multi-threaded compactions, making it especially suitable for storing multiple terabytes of data in a single database.

Start with example usage here: https://github.com/facebook/rocksdb/tree/main/examples

See the github wiki for more explanation.

The public interface is in include/. Callers should not include or rely on the details of any other header files in this package. Those internal APIs may be changed without warning.

Questions and discussions are welcome on the RocksDB Developers Public Facebook group and email list on Google Groups.

License

RocksDB is dual-licensed under both the GPLv2 (found in the COPYING file in the root directory) and Apache 2.0 License (found in the LICENSE.Apache file in the root directory). You may select, at your option, one of the above-listed licenses.