Go to file
Maysam Yabandeh f383641a1d Unordered Writes (#5218)
Summary:
Performing unordered writes in rocksdb when unordered_write option is set to true. When enabled the writes to memtable are done without joining any write thread. This offers much higher write throughput since the upcoming writes would not have to wait for the slowest memtable write to finish. The tradeoff is that the writes visible to a snapshot might change over time. If the application cannot tolerate that, it should implement its own mechanisms to work around that. Using TransactionDB with WRITE_PREPARED write policy is one way to achieve that. Doing so increases the max throughput by 2.2x without however compromising the snapshot guarantees.
The patch is prepared based on an original by siying
Existing unit tests are extended to include unordered_write option.

Benchmark Results:
```
TEST_TMPDIR=/dev/shm/ ./db_bench_unordered --benchmarks=fillrandom --threads=32 --num=10000000 -max_write_buffer_number=16 --max_background_jobs=64 --batch_size=8 --writes=3000000 -level0_file_num_compaction_trigger=99999 --level0_slowdown_writes_trigger=99999 --level0_stop_writes_trigger=99999 -enable_pipelined_write=false -disable_auto_compactions  --unordered_write=1
```
With WAL
- Vanilla RocksDB: 78.6 MB/s
- WRITER_PREPARED with unordered_write: 177.8 MB/s (2.2x)
- unordered_write: 368.9 MB/s (4.7x with relaxed snapshot guarantees)

Without WAL
- Vanilla RocksDB: 111.3 MB/s
- WRITER_PREPARED with unordered_write: 259.3 MB/s MB/s (2.3x)
- unordered_write: 645.6 MB/s (5.8x with relaxed snapshot guarantees)

- WRITER_PREPARED with unordered_write disable concurrency control: 185.3 MB/s MB/s (2.35x)

Limitations:
- The feature is not yet extended to `max_successive_merges` > 0. The feature is also incompatible with `enable_pipelined_write` = true as well as with `allow_concurrent_memtable_write` = false.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5218

Differential Revision: D15219029

Pulled By: maysamyabandeh

fbshipit-source-id: 38f2abc4af8780148c6128acdba2b3227bc81759
2019-05-13 17:47:21 -07:00
buckifier Add copyright headers per FB open-source checkup tool. (#5199) 2019-04-18 10:55:01 -07:00
build_tools Add copyright headers per FB open-source checkup tool. (#5199) 2019-04-18 10:55:01 -07:00
cache Consolidate hash function used for non-persistent data in a new function (#5155) 2019-04-08 13:32:06 -07:00
cmake Make FindZLIB consistent with official definitions (#4823) 2019-01-02 12:49:57 -08:00
coverage Add copyright headers per FB open-source checkup tool. (#5199) 2019-04-18 10:55:01 -07:00
db Unordered Writes (#5218) 2019-05-13 17:47:21 -07:00
docs Blog post for format_version=4 2019-03-08 16:49:30 -08:00
env Optionally wait on bytes_per_sync to smooth I/O (#5183) 2019-04-22 11:51:39 -07:00
examples Support for single-primary, multi-secondary instances (#4899) 2019-03-26 16:45:31 -07:00
hdfs Add copyright headers per FB open-source checkup tool. (#5199) 2019-04-18 10:55:01 -07:00
include/rocksdb Unordered Writes (#5218) 2019-05-13 17:47:21 -07:00
java Add copyright headers per FB open-source checkup tool. (#5199) 2019-04-18 10:55:01 -07:00
memtable WriteBufferManager's dummy entry size to block cache 1MB -> 256KB (#5175) 2019-04-16 12:03:07 -07:00
monitoring Fix compilation errors for 32bits/LITE/ios build. (#5220) 2019-04-22 16:02:16 -07:00
options Unordered Writes (#5218) 2019-05-13 17:47:21 -07:00
port Optionally wait on bytes_per_sync to smooth I/O (#5183) 2019-04-22 11:51:39 -07:00
table Unordered Writes (#5218) 2019-05-13 17:47:21 -07:00
third-party/gtest-1.7.0/fused-src/gtest remove bundled but unused fbson library (#5108) 2019-03-26 16:37:52 -07:00
tools Unordered Writes (#5218) 2019-05-13 17:47:21 -07:00
util Refresh snapshot list during long compactions (2nd attempt) (#5278) 2019-05-03 17:30:22 -07:00
utilities Unordered Writes (#5218) 2019-05-13 17:47:21 -07:00
.clang-format A script that automatically reformat affected lines 2014-01-14 12:21:24 -08:00
.gitignore RocksDB Trace Analyzer (#4091) 2018-08-13 11:44:02 -07:00
.lgtm.yml Create lgtm.yml for LGTM.com C/C++ analysis (#4058) 2018-06-26 12:43:04 -07:00
.travis.yml Fix printf formatting on MacOS (#4533) 2018-10-19 14:46:09 -07:00
AUTHORS Update RocksDB Authors File 2017-10-18 14:42:10 -07:00
CMakeLists.txt compile gtest only when enable test 2019-04-30 09:33:44 -07:00
CODE_OF_CONDUCT.md Add Code of Conduct 2017-12-05 18:42:35 -08:00
CONTRIBUTING.md Add Code of Conduct 2017-12-05 18:42:35 -08:00
COPYING Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
DEFAULT_OPTIONS_HISTORY.md options.delayed_write_rate use the rate of rate_limiter by default. 2017-05-24 09:58:24 -07:00
DUMP_FORMAT.md First version of rocksdb_dump and rocksdb_undump. 2015-06-19 16:24:36 -07:00
HISTORY.md Unordered Writes (#5218) 2019-05-13 17:47:21 -07:00
INSTALL.md Update the version of the dependencies used by the RocksJava static build (#4761) 2018-12-18 20:25:43 -08:00
LANGUAGE-BINDINGS.md LANGUAGE-BINDINGS.md: mention python-rocksdb 2019-03-20 11:10:48 -07:00
LICENSE.Apache Change RocksDB License 2017-07-15 16:11:23 -07:00
LICENSE.leveldb Add back the LevelDB license file 2017-07-16 18:42:18 -07:00
Makefile Allow builds of RocksJava debug releases (#5274) 2019-05-02 14:27:20 -07:00
README.md Add LevelDB repository link in the Readme 2019-04-01 18:19:09 -07:00
ROCKSDB_LITE.md Fix some typos in comments and docs. 2018-03-08 10:27:25 -08:00
TARGETS Support for single-primary, multi-secondary instances (#4899) 2019-03-26 16:45:31 -07:00
USERS.md Adding IOTA Foundation to USERS.MD (#4436) 2018-10-02 10:03:46 -07:00
Vagrantfile Adding CentOS 7 Vagrantfile & build script 2018-02-26 15:27:17 -08:00
WINDOWS_PORT.md #5145 , rename port/dirent.h to port/port_dirent.h to avoid compile err when use port dir as header dir output (#5152) 2019-04-04 11:38:19 -07:00
appveyor.yml Add RocksJava build to AppVeyor 2019-01-03 10:44:44 -08:00
defs.bzl Add copyright headers per FB open-source checkup tool. (#5199) 2019-04-18 10:55:01 -07:00
issue_template.md Add a template for issues 2017-09-29 11:41:28 -07:00
src.mk RocksDB CRC32c optimization with ARMv8 Intrinsic (#5221) 2019-04-30 10:59:05 -07:00
thirdparty.inc Add copyright headers per FB open-source checkup tool. (#5199) 2019-04-18 10:55:01 -07:00

README.md

RocksDB: A Persistent Key-Value Store for Flash and RAM Storage

Linux/Mac Build Status Windows Build status PPC64le Build Status

RocksDB is developed and maintained by Facebook Database Engineering Team. It is built on earlier work on LevelDB by Sanjay Ghemawat (sanjay@google.com) and Jeff Dean (jeff@google.com)

This code is a library that forms the core building block for a fast key value server, especially suited for storing data on flash drives. It has a Log-Structured-Merge-Database (LSM) design with flexible tradeoffs between Write-Amplification-Factor (WAF), Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF). It has multi-threaded compactions, making it specially suitable for storing multiple terabytes of data in a single database.

Start with example usage here: https://github.com/facebook/rocksdb/tree/master/examples

See the github wiki for more explanation.

The public interface is in include/. Callers should not include or rely on the details of any other header files in this package. Those internal APIs may be changed without warning.

Design discussions are conducted in https://www.facebook.com/groups/rocksdb.dev/

License

RocksDB is dual-licensed under both the GPLv2 (found in the COPYING file in the root directory) and Apache 2.0 License (found in the LICENSE.Apache file in the root directory). You may select, at your option, one of the above-listed licenses.