Find a file
Peter Dillinger 11b3f09eac Add a temporary hook for custom yielding in long-running op (#13103)
Summary:
This is a simplified version of https://github.com/facebook/rocksdb/issues/13096, which called for a way to hook into long-running loops completely within RocksDB to change their thread priority (or similar). The current prime hook point is `DBIter::FindNextUserEntryInternal` likely because of iterating over tombstones.

This is implemented using the weak symbol hack for ease of back-porting/patching, and while we get to know potential future requirements better for integration into the public API. (Consider potential relationships to `Env::GetThreadStatusUpdater()` and `TransactionDBMutexFactory`.)

Pull Request resolved: https://github.com/facebook/rocksdb/pull/13103

Test Plan:
Performance validated with db_bench and DEBUG_LEVEL=0: `./db_bench --benchmarks=fillseq,deleterandom,readseq[-X100] --value_size=1 --num=1000000`

No consistent difference seen; variances likely in how DB / executable / memory were laid out.

```
With an empty hook:
readseq [AVG    100 runs] : 1753018 (± 8850) ops/sec;   28.4 (± 0.1) MB/sec
readseq [MEDIAN 100 runs] : 1763746 ops/sec;   28.6 MB/sec
(recompile)
readseq [AVG    100 runs] : 1789019 (± 10260) ops/sec;   29.0 (± 0.2) MB/sec
readseq [MEDIAN 100 runs] : 1801849 ops/sec;   29.2 MB/sec

Base:
readseq [AVG    100 runs] : 1772196 (± 8240) ops/sec;   28.7 (± 0.1) MB/sec
readseq [MEDIAN 100 runs] : 1780453 ops/sec;   28.9 MB/sec
(recompile)
readseq [AVG    100 runs] : 1777637 (± 7613) ops/sec;   28.8 (± 0.1) MB/sec
readseq [MEDIAN 100 runs] : 1786657 ops/sec;   29.0 MB/sec

With a functional hook (count number of calls into it):
readseq [AVG    100 runs] : 1796733 (± 8854) ops/sec;   29.1 (± 0.1) MB/sec
readseq [MEDIAN 100 runs] : 1804690 ops/sec;   29.3 MB/sec
RocksDbThreadYield: 126915800
(recompile)
readseq [AVG    100 runs] : 1775371 (± 10529) ops/sec;   28.8 (± 0.2) MB/sec
readseq [MEDIAN 100 runs] : 1789046 ops/sec;   29.0 MB/sec
RocksDbThreadYield: 126977000

Base:
readseq [AVG    100 runs] : 1773071 (± 10657) ops/sec;   28.7 (± 0.2) MB/sec
readseq [MEDIAN 100 runs] : 1783414 ops/sec;   28.9 MB/sec
(recompile)
readseq [AVG    100 runs] : 1750852 (± 10184) ops/sec;   28.4 (± 0.2) MB/sec
readseq [MEDIAN 100 runs] : 1763587 ops/sec;   28.6 MB/sec
```

Reviewed By: george-reynya

Differential Revision: D65235379

Pulled By: pdillinger

fbshipit-source-id: 7829e4cc25a56d4c1801b8adf9c7f7aa49ab7aca
2024-11-01 08:11:34 -07:00
.circleci Enable io_uring in stress test (#12313) 2024-01-31 12:37:42 -08:00
.github convert circleci arm jobs to github actions (#12569) 2024-04-22 15:01:50 -07:00
buckifier add export_file to rockdb TARGETS generator and re-gen 2024-06-04 11:42:26 -07:00
build_tools Fix scan-build path (#12563) 2024-04-19 08:45:16 -07:00
cache Fix stale memory access with FSBuffer and tiered sec cache (#12712) 2024-06-02 16:50:41 -07:00
cmake Fix zstd typo in cmake (#12309) 2024-02-22 14:39:05 -08:00
coverage Remove platform009 and default to platform010 (#11333) 2023-03-30 09:56:37 -07:00
db Add a temporary hook for custom yielding in long-running op (#13103) 2024-11-01 08:11:34 -07:00
db_stress_tool Disallow memtable flush and sst ingest while WAL is locked (#12652) 2024-06-02 16:50:15 -07:00
docs Java FFI blog post - Post-publication issues with images (2) (#12372) 2024-02-22 15:01:55 -08:00
env Make linux file write life time hinting work (#12595) 2024-05-07 17:54:50 -07:00
examples Prefer static_cast in place of most reinterpret_cast (#12308) 2024-02-07 10:44:11 -08:00
file Fix file deletions in DestroyDB not rate limited (#12891) 2024-08-05 09:24:20 -07:00
fuzz Block per key-value checksum (#11287) 2023-04-25 12:08:23 -07:00
include/rocksdb update HISTORY.md and version.h for 9.3.2 2024-08-05 09:25:52 -07:00
java Set optimize_filters_for_memory by default (#12377) 2024-04-30 08:33:31 -07:00
logging Fix data race in AutoRollLogger (#12436) 2024-03-14 14:28:33 -07:00
memory Set optimize_filters_for_memory by default (#12377) 2024-04-30 08:33:31 -07:00
memtable Prefer static_cast in place of most reinterpret_cast (#12308) 2024-02-07 10:44:11 -08:00
microbench internal_repo_rocksdb (-8794174668376270091) (#12114) 2023-12-01 11:10:30 -08:00
monitoring New PerfContext counters for block cache bytes read (#12459) 2024-03-21 10:46:46 -07:00
options Basic RocksDB follower implementation (#12540) 2024-04-19 19:13:31 -07:00
plugin Add initial CMake support to plugin (#9214) 2021-11-30 17:16:53 -08:00
port Add a temporary hook for custom yielding in long-running op (#13103) 2024-11-01 08:11:34 -07:00
table Fix stale memory access with FSBuffer and tiered sec cache (#12712) 2024-06-02 16:50:41 -07:00
test_util Implement obsolete file deletion (GC) in follower (#12657) 2024-05-17 19:13:33 -07:00
third-party fix optimization-disabled test builds with platform010 (#11361) 2023-04-10 13:59:44 -07:00
tools Disallow memtable flush and sst ingest while WAL is locked (#12652) 2024-06-02 16:50:15 -07:00
trace_replay Remove 'virtual' when implied by 'override' (#12319) 2024-01-31 13:14:42 -08:00
unreleased_history Update HISTORY for 9.3.2 2024-08-05 10:40:16 -07:00
util Support read timestamp in ldb (#12641) 2024-05-13 15:43:12 -07:00
utilities Make transaction name conflict check more robust (#12895) 2024-08-05 10:37:59 -07:00
.clang-format A script that automatically reformat affected lines 2014-01-14 12:21:24 -08:00
.gitignore Add .arcconfig to .gitignore (fb internal use) (#11803) 2023-09-07 14:57:39 -07:00
.lgtm.yml Create lgtm.yml for LGTM.com C/C++ analysis (#4058) 2018-06-26 12:43:04 -07:00
.watchmanconfig Added .watchmanconfig file to rocksdb repo (#5593) 2019-07-19 15:00:33 -07:00
AUTHORS Update RocksDB Authors File 2017-10-18 14:42:10 -07:00
CMakeLists.txt Basic RocksDB follower implementation (#12540) 2024-04-19 19:13:31 -07:00
CODE_OF_CONDUCT.md Adopt Contributor Covenant 2019-08-29 23:21:01 -07:00
common.mk Clean up variables for temporary directory (#9961) 2022-05-06 16:38:06 -07:00
CONTRIBUTING.md Add Code of Conduct 2017-12-05 18:42:35 -08:00
COPYING Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
crash_test.mk Stress/Crash Test for OptimisticTransactionDB (#11513) 2023-06-17 16:27:37 -07:00
DEFAULT_OPTIONS_HISTORY.md Add Options::DisableExtraChecks, clarify force_consistency_checks (#9363) 2022-01-18 17:31:03 -08:00
DUMP_FORMAT.md First version of rocksdb_dump and rocksdb_undump. 2015-06-19 16:24:36 -07:00
HISTORY.md Update HISTORY for 9.3.2 2024-08-05 10:40:16 -07:00
INSTALL.md fix out of date macos instructions in INSTALL.md (#12393) 2024-02-28 12:38:15 -08:00
issue_template.md Add Google Group to Issue Template 2020-01-28 14:40:37 -08:00
LANGUAGE-BINDINGS.md Add grocksdb in Go language bindings (#10498) 2022-08-23 15:02:10 -07:00
LICENSE.Apache Change RocksDB License 2017-07-15 16:11:23 -07:00
LICENSE.leveldb Add back the LevelDB license file 2017-07-16 18:42:18 -07:00
Makefile Basic RocksDB follower implementation (#12540) 2024-04-19 19:13:31 -07:00
PLUGINS.md Add encfs plugin link (#12070) 2023-11-14 07:33:21 -08:00
README.md Remove deprecated integration tests from README.md (#11354) 2023-04-07 16:52:50 -07:00
rocksdb.pc.in build: fix pkg-config file generation (#9953) 2022-05-30 12:46:40 -07:00
src.mk Basic RocksDB follower implementation (#12540) 2024-04-19 19:13:31 -07:00
TARGETS add export_file to rockdb TARGETS generator and re-gen 2024-06-04 11:42:26 -07:00
thirdparty.inc Fix build jemalloc api (#5470) 2019-06-24 17:40:32 -07:00
USERS.md Add Qdrant to USERS.md (#12072) 2023-11-16 10:35:08 -08:00
Vagrantfile Adding CentOS 7 Vagrantfile & build script 2018-02-26 15:27:17 -08:00
WINDOWS_PORT.md Update branch name in WINDOWS_PORT.md (#8745) 2021-09-01 19:26:39 -07:00

RocksDB: A Persistent Key-Value Store for Flash and RAM Storage

CircleCI Status

RocksDB is developed and maintained by Facebook Database Engineering Team. It is built on earlier work on LevelDB by Sanjay Ghemawat (sanjay@google.com) and Jeff Dean (jeff@google.com)

This code is a library that forms the core building block for a fast key-value server, especially suited for storing data on flash drives. It has a Log-Structured-Merge-Database (LSM) design with flexible tradeoffs between Write-Amplification-Factor (WAF), Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF). It has multi-threaded compactions, making it especially suitable for storing multiple terabytes of data in a single database.

Start with example usage here: https://github.com/facebook/rocksdb/tree/main/examples

See the github wiki for more explanation.

The public interface is in include/. Callers should not include or rely on the details of any other header files in this package. Those internal APIs may be changed without warning.

Questions and discussions are welcome on the RocksDB Developers Public Facebook group and email list on Google Groups.

License

RocksDB is dual-licensed under both the GPLv2 (found in the COPYING file in the root directory) and Apache 2.0 License (found in the LICENSE.Apache file in the root directory). You may select, at your option, one of the above-listed licenses.