Yi Wu 5d4fddfa52 WritePrepared: Fix visible key compacted out by compaction (#4883)
Summary:
With WritePrepared transactions, flush/compaction can contain uncommitted keys, and those keys can get committed during compaction. If a snapshot is taken before a key is committed, it should not see that key. On the other hand, compaction grabs the list of snapshots at its beginning and only considers those snapshots when deduplicating keys. Consider the case:
```
seq = 1: put "foo" = "bar"
seq = 2: transaction T: delete "foo", prepare
seq = 3: compaction start
seq = 4: take snapshot S
seq = 5: transaction T: commit.
...
seq = N: compaction iterator reached key "foo".
```
When compaction starts, the list of snapshots is empty, so compaction doesn't take snapshot S into account. By the time it reaches "foo", transaction T is committed. Compaction may then wrongly conclude that the value "foo=bar" is not visible to any snapshot and compact the value out, even though snapshot S was taken before T committed and should still see it.

The fix is to explicitly take a snapshot before compaction grabs the list of snapshots. Compaction then has to keep keys visible to this snapshot.
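
The scenario and the fix can be illustrated with a small toy model. This is only a sketch under simplifying assumptions, not RocksDB code: `Version`, `CommitMap`, `VisibleTo`, `CompactKey`, `Get`, and `ReadFooAfterCompaction` are hypothetical names invented for this example, and the real compaction iterator is considerably more involved. The model assumes a version is visible to a snapshot only if it was committed at or before the snapshot's sequence number.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <vector>

using Seq = std::uint64_t;

// One version of a key, identified by the sequence number it was written
// (prepared) at. Hypothetical type, for illustration only.
struct Version {
  Seq prepare_seq;
  bool is_delete;
  std::string value;
};

// prepare_seq -> commit_seq. A missing entry means "not committed yet".
using CommitMap = std::map<Seq, Seq>;

// A version is visible to a snapshot iff it committed at or before it.
bool VisibleTo(const Version& v, Seq snapshot, const CommitMap& commits) {
  auto it = commits.find(v.prepare_seq);
  return it != commits.end() && it->second <= snapshot;
}

// Toy compaction of one key. `versions` is ordered newest first. A version is
// dropped only if a newer committed version exists and no snapshot in
// `snapshots` still needs it as its latest visible version.
std::vector<Version> CompactKey(const std::vector<Version>& versions,
                                const std::vector<Seq>& snapshots,
                                const CommitMap& commits) {
  std::vector<Version> kept;
  bool newer_committed = false;
  for (std::size_t i = 0; i < versions.size(); ++i) {
    bool needed = false;
    for (Seq s : snapshots) {
      if (!VisibleTo(versions[i], s, commits)) continue;
      bool shadowed = false;
      for (std::size_t j = 0; j < i; ++j) {
        if (VisibleTo(versions[j], s, commits)) shadowed = true;
      }
      if (!shadowed) needed = true;
    }
    if (!newer_committed || needed) kept.push_back(versions[i]);
    if (commits.count(versions[i].prepare_seq) > 0) newer_committed = true;
  }
  return kept;
}

// Read a key at a snapshot: the newest visible version wins.
std::optional<std::string> Get(const std::vector<Version>& versions,
                               Seq snapshot, const CommitMap& commits) {
  for (const Version& v : versions) {
    if (!VisibleTo(v, snapshot, commits)) continue;
    if (v.is_delete) return std::nullopt;
    return v.value;
  }
  return std::nullopt;
}

// Replays the timeline from the summary and reads "foo" through snapshot S.
std::optional<std::string> ReadFooAfterCompaction(bool take_job_snapshot) {
  std::vector<Version> foo = {{2, true, ""}, {1, false, "bar"}};
  std::vector<Seq> snapshots;                    // list grabbed at seq = 3
  if (take_job_snapshot) snapshots.push_back(3); // the fix
  const Seq snapshot_s = 4;                      // taken after the list is grabbed
  CommitMap commits = {{1, 1}, {2, 5}};          // T has committed at seq = 5
  std::vector<Version> kept = CompactKey(foo, snapshots, commits);
  return Get(kept, snapshot_s, commits);
}
```

In this model, `ReadFooAfterCompaction(false)` finds nothing for "foo" through snapshot S because the put was compacted out, while `ReadFooAfterCompaction(true)` still returns "bar": the snapshot taken at compaction start does not see T's commit, so the older value has to be kept.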
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4883

Differential Revision: D13668775

Pulled By: maysamyabandeh

fbshipit-source-id: 1cab9615f94b7d3e8522cc3d44c3a14c7d4720e4
2019-01-15 21:34:38 -08:00
buckifier Fix skylark incompatible build files in rocksdb 2019-01-07 13:37:40 -08:00
build_tools Fix spelling errors (#4827) 2019-01-02 11:17:57 -08:00
cache Revert "Move MemoryAllocator option from Cache to BlockBasedTableOpti… (#4697) 2018-11-21 11:29:57 -08:00
cmake Make FindZLIB consistent with official definitions (#4823) 2019-01-02 12:49:57 -08:00
coverage Remove unused imports, from python scripts. (#4057) 2018-06-26 12:43:04 -07:00
db WritePrepared: Fix visible key compacted out by compaction (#4883) 2019-01-15 21:34:38 -08:00
docs Insane line length detected (#4813) 2018-12-21 14:54:34 -08:00
env Introduce a CPU time counter in perf_context (#4741) 2018-12-20 12:03:44 -08:00
examples Pin top-level index on partitioned index/filter blocks (#4037) 2018-06-22 15:27:46 -07:00
hdfs Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
include/rocksdb Remove some components (#4101) 2019-01-10 13:30:09 -08:00
java Fix typos in comments (#4819) 2018-12-26 09:43:56 -08:00
memtable WriteBufferManger doens't cost to cache if no limit is set (#4695) 2018-11-18 16:55:43 -08:00
monitoring Add a new per level counter for block cache hit (#4796) 2018-12-21 13:20:05 -08:00
options Concurrent task limiter for compaction thread control (#4332) 2018-12-13 13:18:28 -08:00
port Detect if Jemalloc is linked with the binary (#4844) 2019-01-03 16:30:12 -08:00
table fix accounting for range tombstones in TableProperties (#4841) 2019-01-02 15:08:53 -08:00
third-party Support pragma once in all header files and cleanup some warnings (#4339) 2018-09-05 18:13:31 -07:00
tools With ldb --try_load_options and wal_dir doesn't exist, ignore it (#4875) 2019-01-11 16:48:32 -08:00
util Detect if Jemalloc is linked with the binary (#4844) 2019-01-03 16:30:12 -08:00
utilities WritePrepared: Fix visible key compacted out by compaction (#4883) 2019-01-15 21:34:38 -08:00
.clang-format A script that automatically reformat affected lines 2014-01-14 12:21:24 -08:00
.gitignore RocksDB Trace Analyzer (#4091) 2018-08-13 11:44:02 -07:00
.lgtm.yml Create lgtm.yml for LGTM.com C/C++ analysis (#4058) 2018-06-26 12:43:04 -07:00
.travis.yml Fix printf formatting on MacOS (#4533) 2018-10-19 14:46:09 -07:00
AUTHORS Update RocksDB Authors File 2017-10-18 14:42:10 -07:00
CMakeLists.txt Remove some components (#4101) 2019-01-10 13:30:09 -08:00
CODE_OF_CONDUCT.md Add Code of Conduct 2017-12-05 18:42:35 -08:00
CONTRIBUTING.md Add Code of Conduct 2017-12-05 18:42:35 -08:00
COPYING Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
DEFAULT_OPTIONS_HISTORY.md options.delayed_write_rate use the rate of rate_limiter by default. 2017-05-24 09:58:24 -07:00
DUMP_FORMAT.md First version of rocksdb_dump and rocksdb_undump. 2015-06-19 16:24:36 -07:00
HISTORY.md With ldb --try_load_options and wal_dir doesn't exist, ignore it (#4875) 2019-01-11 16:48:32 -08:00
INSTALL.md Update the version of the dependencies used by the RocksJava static build (#4761) 2018-12-18 20:25:43 -08:00
LANGUAGE-BINDINGS.md Added PingCaps Rust RocksDB and ObjectiveRocks (#4065) 2018-06-27 15:43:21 -07:00
LICENSE.Apache Change RocksDB License 2017-07-15 16:11:23 -07:00
LICENSE.leveldb Add back the LevelDB license file 2017-07-16 18:42:18 -07:00
Makefile Fix downloaded filename of snappy (#4870) 2019-01-11 10:29:40 -08:00
README.md Create lgtm.yml for LGTM.com C/C++ analysis (#4058) 2018-06-26 12:43:04 -07:00
ROCKSDB_LITE.md Fix some typos in comments and docs. 2018-03-08 10:27:25 -08:00
TARGETS Remove some components (#4101) 2019-01-10 13:30:09 -08:00
USERS.md Adding IOTA Foundation to USERS.MD (#4436) 2018-10-02 10:03:46 -07:00
Vagrantfile Adding CentOS 7 Vagrantfile & build script 2018-02-26 15:27:17 -08:00
WINDOWS_PORT.md Add GCC 8 to Travis (#3433) 2018-07-13 10:58:06 -07:00
appveyor.yml Add RocksJava build to AppVeyor 2019-01-03 10:44:44 -08:00
issue_template.md Add a template for issues 2017-09-29 11:41:28 -07:00
src.mk Remove some components (#4101) 2019-01-10 13:30:09 -08:00
thirdparty.inc Provide a way to override windows memory allocator with jemalloc for ZSTD 2018-06-04 12:12:48 -07:00

README.md

RocksDB: A Persistent Key-Value Store for Flash and RAM Storage


RocksDB is developed and maintained by the Facebook Database Engineering Team. It is built on earlier work on LevelDB by Sanjay Ghemawat (sanjay@google.com) and Jeff Dean (jeff@google.com).

This code is a library that forms the core building block for a fast key-value server, especially suited for storing data on flash drives. It has a Log-Structured-Merge-Database (LSM) design with flexible tradeoffs between Write-Amplification-Factor (WAF), Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF). It has multi-threaded compactions, making it especially suitable for storing multiple terabytes of data in a single database.

Start with example usage here: https://github.com/facebook/rocksdb/tree/master/examples

See the GitHub wiki for more explanation.

The public interface is in include/. Callers should not include or rely on the details of any other header files in this package. Those internal APIs may be changed without warning.

Design discussions are conducted at https://www.facebook.com/groups/rocksdb.dev/

License

RocksDB is dual-licensed under both the GPLv2 (found in the COPYING file in the root directory) and Apache 2.0 License (found in the LICENSE.Apache file in the root directory). You may select, at your option, one of the above-listed licenses.