Changyu Bi fd165c869d Add memtable per key-value checksum (#10281)
Summary:
Append a per key-value checksum to the internal key. These checksums are verified on read paths, including Get and Iterator, and during Flush. Get and Iterator return a `Corruption` status if checksum verification fails. If a memtable entry fails checksum verification during Flush, the DB becomes read-only.
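The idea of a per-entry checksum that is verified on read can be sketched outside RocksDB as follows. This is a toy illustration only: `ToyEntry`, `MakeEntry`, and `ReadEntry` are hypothetical names, and Fletcher-16 is swapped in as a stand-in 2-byte checksum; it is not the hash or the entry layout RocksDB uses. A failed check is modeled as an empty `std::optional` rather than a `Corruption` status.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <utility>

// Fletcher-16 over the key bytes followed by the value bytes.
// Stand-in checksum for illustration; RocksDB's actual hash is different.
inline uint16_t Fletcher16(const std::string& key, const std::string& value) {
  uint16_t sum1 = 0;
  uint16_t sum2 = 0;
  auto mix = [&](const std::string& bytes) {
    for (unsigned char b : bytes) {
      sum1 = static_cast<uint16_t>((sum1 + b) % 255);
      sum2 = static_cast<uint16_t>((sum2 + sum1) % 255);
    }
  };
  mix(key);
  mix(value);
  return static_cast<uint16_t>((sum2 << 8) | sum1);
}

// Hypothetical in-memory entry: key, value, and a 2-byte checksum
// computed once at insert time.
struct ToyEntry {
  std::string key;
  std::string value;
  uint16_t checksum;
};

inline ToyEntry MakeEntry(std::string key, std::string value) {
  const uint16_t sum = Fletcher16(key, value);
  return ToyEntry{std::move(key), std::move(value), sum};
}

// Recompute the checksum on read. Returns the value when it matches the
// stored checksum, or std::nullopt when the entry no longer matches.
inline std::optional<std::string> ReadEntry(const ToyEntry& entry) {
  if (Fletcher16(entry.key, entry.value) != entry.checksum) {
    return std::nullopt;
  }
  return entry.value;
}
```

In this sketch, `ReadEntry(MakeEntry("k1", "v1"))` yields `"v1"`, while flipping a bit in the stored value afterwards makes `ReadEntry` return an empty optional. The recomputation on every read is the kind of extra work the read benchmarks in the test plan measure.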

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10281

Test Plan:
- Added new unit test cases: `make check`
- Benchmark on memtable insert
```
TEST_TMPDIR=/dev/shm/memtable_write ./db_bench -benchmarks=fillseq -disable_wal=true -max_write_buffer_number=100 -num=10000000 -min_write_buffer_number_to_merge=100

# avg over 10 runs
Baseline: 1166936 ops/sec
memtable 2 bytes kv checksum : 1.11674e+06 ops/sec (-4%)
memtable 2 bytes kv checksum + write batch 8 bytes kv checksum: 1.08579e+06 ops/sec (-6.95%)
write batch 8 bytes kv checksum: 1.17979e+06 ops/sec (+1.1%)
```
- Benchmark on memtable reads only: ops/sec dropped 31% for `readseq` because of the time spent verifying checksums; ops/sec for `readrandom` dropped ~6.8%.
```
# Readseq
sudo TEST_TMPDIR=/dev/shm/memtable_read ./db_bench -benchmarks=fillseq,readseq"[-X20]" -disable_wal=true -max_write_buffer_number=100 -num=10000000 -min_write_buffer_number_to_merge=100

readseq [AVG    20 runs] : 7432840 (± 212005) ops/sec;  822.3 (± 23.5) MB/sec
readseq [MEDIAN 20 runs] : 7573878 ops/sec;  837.9 MB/sec

With -memtable_protection_bytes_per_key=2:

readseq [AVG    20 runs] : 5134607 (± 119596) ops/sec;  568.0 (± 13.2) MB/sec
readseq [MEDIAN 20 runs] : 5232946 ops/sec;  578.9 MB/sec

# Readrandom
sudo TEST_TMPDIR=/dev/shm/memtable_read ./db_bench -benchmarks=fillrandom,readrandom"[-X10]" -disable_wal=true -max_write_buffer_number=100 -num=1000000 -min_write_buffer_number_to_merge=100
readrandom [AVG    10 runs] : 140236 (± 3938) ops/sec;    9.8 (± 0.3) MB/sec
readrandom [MEDIAN 10 runs] : 140545 ops/sec;    9.8 MB/sec

With -memtable_protection_bytes_per_key=2:
readrandom [AVG    10 runs] : 130632 (± 2738) ops/sec;    9.1 (± 0.2) MB/sec
readrandom [MEDIAN 10 runs] : 130341 ops/sec;    9.1 MB/sec
```

- Stress test: `python3 -u tools/db_crashtest.py whitebox --duration=1800`

Reviewed By: ajkr

Differential Revision: D37607896

Pulled By: cbi42

fbshipit-source-id: fdaefb475629d2471780d4a5f5bf81b44ee56113
2022-08-12 13:51:32 -07:00
.circleci Migrate to docker for CI run (#10496) 2022-08-10 17:34:38 -07:00
.github/workflows Use released clang-format instead of the one from dev branch (#9646) 2022-03-01 10:51:38 -08:00
buckifier Use optimized folly DistributedMutex in LRUCache when available (#10179) 2022-06-17 13:08:45 -07:00
build_tools Migrate to docker for CI run (#10496) 2022-08-10 17:34:38 -07:00
cache Derive cache keys from SST unique IDs (#10394) 2022-08-12 13:49:49 -07:00
cmake gcc-11 and cmake related cleanup (#9286) 2021-12-17 17:04:35 -08:00
coverage Fix commit_prereq and other targets (#9797) 2022-04-04 09:58:18 -07:00
db Add memtable per key-value checksum (#10281) 2022-08-12 13:51:32 -07:00
db_stress_tool Add memtable per key-value checksum (#10281) 2022-08-12 13:51:32 -07:00
docs Include some legal contents in website (#10491) 2022-08-05 15:24:20 -07:00
env Migrate to docker for CI run (#10496) 2022-08-10 17:34:38 -07:00
examples Add a simple example of backup and restore (#10054) 2022-06-03 23:25:31 -07:00
file WritableFileWriter tries to skip operations after failure (#10489) 2022-08-10 10:19:20 -07:00
fuzz Add some missing headers (#10519) 2022-08-11 12:45:50 -07:00
include/rocksdb Add memtable per key-value checksum (#10281) 2022-08-12 13:51:32 -07:00
java Add some missing headers (#10519) 2022-08-11 12:45:50 -07:00
logging Close the Logger before rolling to next one in AutoRollLogger (#10488) 2022-08-05 16:23:44 -07:00
memory Revert "Add a blob-specific cache priority (#10309)" (#10434) 2022-07-29 07:18:15 -07:00
memtable Migrate to docker for CI run (#10496) 2022-08-10 17:34:38 -07:00
microbench Avoid allocations/copies for large GetMergeOperands() results (#10458) 2022-08-04 00:42:13 -07:00
monitoring Use EnvLogger instead of PosixLogger (#10436) 2022-08-01 14:37:18 -07:00
options Add memtable per key-value checksum (#10281) 2022-08-12 13:51:32 -07:00
plugin Add initial CMake support to plugin (#9214) 2021-11-30 17:16:53 -08:00
port Use EnvLogger instead of PosixLogger (#10436) 2022-08-01 14:37:18 -07:00
table Derive cache keys from SST unique IDs (#10394) 2022-08-12 13:49:49 -07:00
test_util Add seqno to time mapping (#10338) 2022-07-14 21:49:34 -07:00
third-party Meta-internal folly integration with F14FastMap (#9546) 2022-04-13 07:34:01 -07:00
tools Add memtable per key-value checksum (#10281) 2022-08-12 13:51:32 -07:00
trace_replay Use std::numeric_limits<> (#9954) 2022-05-05 13:08:21 -07:00
util Derive cache keys from SST unique IDs (#10394) 2022-08-12 13:49:49 -07:00
utilities Derive cache keys from SST unique IDs (#10394) 2022-08-12 13:49:49 -07:00
.clang-format
.gitignore Snapshots with user-specified timestamps (#9879) 2022-06-10 16:07:03 -07:00
.lgtm.yml
.watchmanconfig
AUTHORS
CMakeLists.txt Fix the error path of PLUGIN_ROOT (#10446) 2022-08-03 11:06:27 -07:00
CODE_OF_CONDUCT.md
common.mk Clean up variables for temporary directory (#9961) 2022-05-06 16:38:06 -07:00
CONTRIBUTING.md
COPYING
crash_test.mk Tiered storage stress test (#10493) 2022-08-08 13:08:35 -07:00
DEFAULT_OPTIONS_HISTORY.md Add Options::DisableExtraChecks, clarify force_consistency_checks (#9363) 2022-01-18 17:31:03 -08:00
DUMP_FORMAT.md
HISTORY.md Add memtable per key-value checksum (#10281) 2022-08-12 13:51:32 -07:00
INSTALL.md Update supported VS versions in INSTALL.md (#9823) 2022-04-13 13:03:40 -07:00
issue_template.md
LANGUAGE-BINDINGS.md Update branch name to "main" in README/LANGUAGE_BINDINGS (#8727) 2021-09-01 15:26:34 -07:00
LICENSE.Apache
LICENSE.leveldb
Makefile Add seqno to time mapping (#10338) 2022-07-14 21:49:34 -07:00
PLUGINS.md Add pmem-rocksdb-plugin link in PLUGINs.md (#9934) 2022-05-12 22:02:28 -07:00
README.md Remove Travis CI (#10407) 2022-07-22 20:16:45 -07:00
rocksdb.pc.in build: fix pkg-config file generation (#9953) 2022-05-30 12:46:40 -07:00
ROCKSDB_LITE.md Fix remaining uses of "backupable" (#9792) 2022-04-05 09:52:33 -07:00
src.mk Charge blob cache usage against the global memory limit (#10321) 2022-07-18 23:26:57 -07:00
TARGETS Charge blob cache usage against the global memory limit (#10321) 2022-07-18 23:26:57 -07:00
thirdparty.inc
USERS.md Add Solana's RocksDB use case in USERS.md (#9558) 2022-02-16 09:23:01 -08:00
Vagrantfile
WINDOWS_PORT.md Update branch name in WINDOWS_PORT.md (#8745) 2021-09-01 19:26:39 -07:00

RocksDB: A Persistent Key-Value Store for Flash and RAM Storage


RocksDB is developed and maintained by the Facebook Database Engineering Team. It is built on earlier work on LevelDB by Sanjay Ghemawat (sanjay@google.com) and Jeff Dean (jeff@google.com).

This code is a library that forms the core building block for a fast key-value server, especially suited for storing data on flash drives. It has a Log-Structured-Merge-Database (LSM) design with flexible tradeoffs between Write-Amplification-Factor (WAF), Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF). It has multi-threaded compactions, making it especially suitable for storing multiple terabytes of data in a single database.

Start with example usage here: https://github.com/facebook/rocksdb/tree/main/examples

See the GitHub wiki for more explanation.

The public interface is in include/. Callers should not include or rely on the details of any other header files in this package. Those internal APIs may be changed without warning.

Questions and discussions are welcome on the RocksDB Developers Public Facebook group and email list on Google Groups.

License

RocksDB is dual-licensed under both the GPLv2 (found in the COPYING file in the root directory) and Apache 2.0 License (found in the LICENSE.Apache file in the root directory). You may select, at your option, one of the above-listed licenses.