anand76 95b41eec6d Fix potential incorrect result for duplicate key in MultiGet (#12295)
Summary:
RocksDB correctness testing recently discovered a possible, but very unlikely, correctness issue with MultiGet. The issue occurs only when all of the following conditions are met:
1. Duplicate keys in a MultiGet batch
2. Key matches the last key in a non-zero, non-bottommost level file
3. Final value is not in the file (merge operand, entry not visible to the snapshot, etc.)
4. Multiple entries exist for the key in the file spanning more than 1 data block. This can happen due to snapshots, which would force multiple versions of the key in the file, and they may spill over to another data block
5. Lookup attempt in the SST for the first of the duplicates fails with IO error on a data block (NOT the first data block, but the second or subsequent uncached block), but no errors for the other duplicates
6. Value or merge operand for the key is present in the very next level

The problem is in FilePickerMultiGet. When looking up keys in a level, we use FileIndexer and the overlapping file in the current level to determine the search bounds for that key in the file list of the next level. If the next level is empty, the search bounds are reset and we do a full binary search in the next non-empty level's LevelFilesBrief. However, under conditions 1 and 2 listed above, only the first of the duplicates has its next-level search bounds updated, and the remaining duplicates are skipped.
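The skipped-duplicate behavior described above can be illustrated with a small toy model. This is a sketch under stated assumptions, not RocksDB code: `KeyState`, `MakeBatch`, `UpdateBounds`, `PrepareNextLevelSkippingDuplicates`, `PrepareNextLevelForAll`, and `CountUnsetBounds` are all hypothetical names invented for illustration, and the per-key bounds are reduced to a pair of integers.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified per-key state for a batched lookup.
// lower/upper stand in for the search bounds into the next level's
// file list; -1 means "not yet set for the next level".
struct KeyState {
  std::string key;
  int lower = -1;
  int upper = -1;
};

// Build a batch from a sorted key list (duplicates are adjacent).
std::vector<KeyState> MakeBatch(const std::vector<std::string>& keys) {
  std::vector<KeyState> batch;
  for (const auto& k : keys) {
    KeyState s;
    s.key = k;
    batch.push_back(s);
  }
  return batch;
}

// Pretend the file indexer narrowed the next-level search to [lo, hi].
void UpdateBounds(KeyState& s, int lo, int hi) {
  assert(lo <= hi);
  s.lower = lo;
  s.upper = hi;
}

// Models the problematic behavior: only the first key in each run of
// duplicates gets its next-level bounds updated; the rest are skipped.
void PrepareNextLevelSkippingDuplicates(std::vector<KeyState>& batch,
                                        int lo, int hi) {
  for (std::size_t i = 0; i < batch.size(); ++i) {
    if (i > 0 && batch[i].key == batch[i - 1].key) {
      continue;  // duplicate skipped, bounds left unset
    }
    UpdateBounds(batch[i], lo, hi);
  }
}

// Models the intended behavior: every key, duplicate or not, gets the
// same next-level bounds as the first occurrence.
void PrepareNextLevelForAll(std::vector<KeyState>& batch, int lo, int hi) {
  for (auto& s : batch) {
    UpdateBounds(s, lo, hi);
  }
}

// Number of keys that would enter the next level without valid bounds.
std::size_t CountUnsetBounds(const std::vector<KeyState>& batch) {
  std::size_t n = 0;
  for (const auto& s : batch) {
    if (s.lower < 0 || s.upper < 0) ++n;
  }
  return n;
}
```

In this model, a batch of `{"k1", "k1", "k2"}` leaves one key with unset bounds after `PrepareNextLevelSkippingDuplicates`, and none after `PrepareNextLevelForAll`. The real fix lives in RocksDB's `db` directory and is more involved than this sketch suggests.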

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12295

Test Plan: Add unit tests that fail an assertion or return wrong result without the fix

Reviewed By: hx235

Differential Revision: D53187634

Pulled By: anand1976

fbshipit-source-id: a5eadf4fede9bbdec784cd993b15e3341436d1ea
2024-02-02 11:48:35 -08:00
.circleci Enable io_uring in stress test (#12313) 2024-01-31 12:37:42 -08:00
.github Enable io_uring in stress test (#12313) 2024-01-31 12:37:42 -08:00
buckifier Error out in case of std errors in blackbox test and export file in TARGETS 2023-10-24 11:46:18 -07:00
build_tools Fix build on OpenBSD i386 (#12142) 2024-01-29 16:19:59 -08:00
cache Remove 'virtual' when implied by 'override' (#12319) 2024-01-31 13:14:42 -08:00
cmake Fix typo in CMakeList. (#12247) 2024-01-25 12:35:27 -08:00
coverage Remove platform009 and default to platform010 (#11333) 2023-03-30 09:56:37 -07:00
db Fix potential incorrect result for duplicate key in MultiGet (#12295) 2024-02-02 11:48:35 -08:00
db_stress_tool Remove 'virtual' when implied by 'override' (#12319) 2024-01-31 13:14:42 -08:00
docs FIX new blog post (JNI performance) Locate images correctly (#12050) 2023-11-07 11:58:58 -08:00
env Remove 'virtual' when implied by 'override' (#12319) 2024-01-31 13:14:42 -08:00
examples Remove 'virtual' when implied by 'override' (#12319) 2024-01-31 13:14:42 -08:00
file Remove 'virtual' when implied by 'override' (#12319) 2024-01-31 13:14:42 -08:00
fuzz Block per key-value checksum (#11287) 2023-04-25 12:08:23 -07:00
include/rocksdb Fix potential incorrect result for duplicate key in MultiGet (#12295) 2024-02-02 11:48:35 -08:00
java Remove 'virtual' when implied by 'override' (#12319) 2024-01-31 13:14:42 -08:00
logging Remove 'virtual' when implied by 'override' (#12319) 2024-01-31 13:14:42 -08:00
memory internal_repo_rocksdb (-8794174668376270091) (#12114) 2023-12-01 11:10:30 -08:00
memtable Remove 'virtual' when implied by 'override' (#12319) 2024-01-31 13:14:42 -08:00
microbench internal_repo_rocksdb (-8794174668376270091) (#12114) 2023-12-01 11:10:30 -08:00
monitoring Remove 'virtual' when implied by 'override' (#12319) 2024-01-31 13:14:42 -08:00
options Removed `check_flush_compaction_key_order` (#12311) 2024-01-31 16:30:26 -08:00
plugin Add initial CMake support to plugin (#9214) 2021-11-30 17:16:53 -08:00
port Remove 'virtual' when implied by 'override' (#12319) 2024-01-31 13:14:42 -08:00
table Augment sst_dump tool to verify num_entries in table property (#12322) 2024-02-01 14:35:03 -08:00
test_util Remove 'virtual' when implied by 'override' (#12319) 2024-01-31 13:14:42 -08:00
third-party fix optimization-disabled test builds with platform010 (#11361) 2023-04-10 13:59:44 -07:00
tools Augment sst_dump tool to verify num_entries in table property (#12322) 2024-02-01 14:35:03 -08:00
trace_replay Remove 'virtual' when implied by 'override' (#12319) 2024-01-31 13:14:42 -08:00
unreleased_history Fix potential incorrect result for duplicate key in MultiGet (#12295) 2024-02-02 11:48:35 -08:00
util Remove 'virtual' when implied by 'override' (#12319) 2024-01-31 13:14:42 -08:00
utilities Fix potential incorrect result for duplicate key in MultiGet (#12295) 2024-02-02 11:48:35 -08:00
.clang-format A script that automatically reformat affected lines 2014-01-14 12:21:24 -08:00
.gitignore Add .arcconfig to .gitignore (fb internal use) (#11803) 2023-09-07 14:57:39 -07:00
.lgtm.yml Create lgtm.yml for LGTM.com C/C++ analysis (#4058) 2018-06-26 12:43:04 -07:00
.watchmanconfig Added .watchmanconfig file to rocksdb repo (#5593) 2019-07-19 15:00:33 -07:00
AUTHORS Update RocksDB Authors File 2017-10-18 14:42:10 -07:00
CMakeLists.txt Fix typo in CMakeList. (#12247) 2024-01-25 12:35:27 -08:00
CODE_OF_CONDUCT.md Adopt Contributor Covenant 2019-08-29 23:21:01 -07:00
CONTRIBUTING.md Add Code of Conduct 2017-12-05 18:42:35 -08:00
COPYING Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
DEFAULT_OPTIONS_HISTORY.md Add Options::DisableExtraChecks, clarify force_consistency_checks (#9363) 2022-01-18 17:31:03 -08:00
DUMP_FORMAT.md First version of rocksdb_dump and rocksdb_undump. 2015-06-19 16:24:36 -07:00
HISTORY.md Add 8.11 release note for FileOperationType enum addition (#12263) 2024-01-22 12:43:44 -08:00
INSTALL.md Fix build on OpenBSD i386 (#12142) 2024-01-29 16:19:59 -08:00
LANGUAGE-BINDINGS.md Add grocksdb in Go language bindings (#10498) 2022-08-23 15:02:10 -07:00
LICENSE.Apache Change RocksDB License 2017-07-15 16:11:23 -07:00
LICENSE.leveldb Add back the LevelDB license file 2017-07-16 18:42:18 -07:00
Makefile Fix gcc12 build failure caused by INT_MIN in NumberToHumanString (#12215) 2024-01-10 10:17:31 -08:00
PLUGINS.md Add encfs plugin link (#12070) 2023-11-14 07:33:21 -08:00
README.md Remove deprecated integration tests from README.md (#11354) 2023-04-07 16:52:50 -07:00
TARGETS Make OffpeakTimeInfo available in VersionSet (#12018) 2023-10-27 15:56:48 -07:00
USERS.md Add Qdrant to USERS.md (#12072) 2023-11-16 10:35:08 -08:00
Vagrantfile Adding CentOS 7 Vagrantfile & build script 2018-02-26 15:27:17 -08:00
WINDOWS_PORT.md Update branch name in WINDOWS_PORT.md (#8745) 2021-09-01 19:26:39 -07:00
common.mk Clean up variables for temporary directory (#9961) 2022-05-06 16:38:06 -07:00
crash_test.mk Stress/Crash Test for OptimisticTransactionDB (#11513) 2023-06-17 16:27:37 -07:00
issue_template.md Add Google Group to Issue Template 2020-01-28 14:40:37 -08:00
rocksdb.pc.in build: fix pkg-config file generation (#9953) 2022-05-30 12:46:40 -07:00
src.mk Add native logger support to RocksJava (#12213) 2024-01-17 17:51:36 -08:00
thirdparty.inc Fix build jemalloc api (#5470) 2019-06-24 17:40:32 -07:00

README.md

RocksDB: A Persistent Key-Value Store for Flash and RAM Storage

RocksDB is developed and maintained by the Facebook Database Engineering Team. It is built on earlier work on LevelDB by Sanjay Ghemawat (sanjay@google.com) and Jeff Dean (jeff@google.com).

This code is a library that forms the core building block for a fast key-value server, especially suited for storing data on flash drives. It has a Log-Structured-Merge-Database (LSM) design with flexible tradeoffs between Write-Amplification-Factor (WAF), Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF). It has multi-threaded compactions, making it especially suitable for storing multiple terabytes of data in a single database.

Start with example usage here: https://github.com/facebook/rocksdb/tree/main/examples

See the GitHub wiki for more explanation.

The public interface is in include/. Callers should not include or rely on the details of any other header files in this package. Those internal APIs may be changed without warning.

Questions and discussions are welcome on the RocksDB Developers Public Facebook group and email list on Google Groups.

License

RocksDB is dual-licensed under both the GPLv2 (found in the COPYING file in the root directory) and Apache 2.0 License (found in the LICENSE.Apache file in the root directory). You may select, at your option, one of the above-listed licenses.