Fenggang Wu da40d45267 DataBlockHashIndex: avoiding expensive iiter->Next when handling hash kNoEntry (#4296)
Summary:
Previously, when the HashIndex lookup returned `kNoEntry`, we invalidated
`biter` by setting `current_=restarts_`, so that the search could continue to the
next block in case the result resided there.

There is one problem: when we are searching for a missing key and the lookup
returns `kNoEntry`, continuing the search to the next block carries a
non-trivial chance that the HashIndex returns `kNoEntry` there too, so the
expensive index iterator `Next()` can run several times for nothing.

The solution is that if the hash table returns `kNoEntry`, `SeekForGetImpl()` just searches the last restart interval for the key. It stops at the first key that is larger than the seek_key, or at the end of the block, and each case is handled correctly.
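The fallback described above can be sketched with a toy model. This is a minimal illustration under stated assumptions, not the code from the diff: `ToyBlock`, `ScanResult`, and `ScanLastRestartInterval` are hypothetical names, keys are plain strings compared bytewise, restart points are stored as indices rather than byte offsets, and the hash lookup itself is not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified data block: sorted keys plus the index of the
// first key in each restart interval. Real blocks store encoded entries
// and byte offsets; this model only keeps what the scan needs.
struct ToyBlock {
  std::vector<std::string> keys;      // sorted ascending
  std::vector<std::size_t> restarts;  // start index of each restart interval
};

// The two ways the scan can end, mirroring the two cases in the summary.
enum class ScanResult {
  kStoppedAtLargerOrEqualKey,  // a key >= seek_key exists in this block
  kReachedEndOfBlock,          // every key in the last interval is smaller
};

// Assumed to be called only after the (unmodeled) hash lookup returned
// kNoEntry. Instead of invalidating the iterator, scan just the last
// restart interval of the current block.
ScanResult ScanLastRestartInterval(const ToyBlock& block,
                                   const std::string& seek_key) {
  assert(!block.restarts.empty());
  for (std::size_t pos = block.restarts.back(); pos < block.keys.size();
       ++pos) {
    if (block.keys[pos] >= seek_key) {
      return ScanResult::kStoppedAtLargerOrEqualKey;
    }
  }
  return ScanResult::kReachedEndOfBlock;
}
```

One reading of why this helps, which the summary does not spell out: if the scan stops at a larger key inside the current block, the caller can conclude the key is absent without advancing the index iterator; only when the scan runs off the end of the block does the next block need to be considered.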

Microbenchmark script:
```
TEST_TMPDIR=/dev/shm ./db_bench --benchmarks=fillseq,readtocache,readmissing \
          --cache_size=20000000000  --use_data_block_hash_index={true|false}
```

`readmissing` performance (lower is better):
```
binary:                      3.6098 micros/op
hash (before applying diff): 4.1048 micros/op
hash (after  applying diff): 3.3502 micros/op
```
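To put the numbers above in relative terms, a small helper can compute the percentage change between two latencies. `RelativeChangePercent` is a hypothetical name introduced here for illustration; the inputs are the three figures reported above.

```cpp
// Percentage change from `before` to `after`; negative means faster
// (fewer micros/op). Hypothetical helper, not part of db_bench.
double RelativeChangePercent(double before, double after) {
  return (after - before) / before * 100.0;
}

// Figures reported for readmissing, in micros/op.
constexpr double kBinary = 3.6098;
constexpr double kHashBefore = 4.1048;
constexpr double kHashAfter = 3.3502;
```

With these inputs, the hash index after the diff is roughly 18.4% faster than before it and roughly 7.2% faster than binary search, whereas before the diff it was roughly 13.7% slower than binary search.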
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4296

Differential Revision: D9419159

Pulled By: fgwu

fbshipit-source-id: 21e3eedcccbc47a249aa8eb4bf405c9def0b8a05
2018-08-23 10:12:58 -07:00
buckifier Remove two CI tests (#4110) 2018-07-12 11:43:25 -07:00
build_tools Release 5.16 (#4298) 2018-08-21 14:43:08 -07:00
cache Support group commits of version edits (#3944) 2018-06-28 12:34:39 -07:00
cmake Search paths provided by intel's "tbbvars.sh". 2018-05-07 14:28:36 -07:00
coverage Remove unused imports, from python scripts. (#4057) 2018-06-26 12:43:04 -07:00
db Add path to WritableFileWriter. (#4039) 2018-08-23 10:12:58 -07:00
docs Advisor: README and blog, and also tests for DBBenchRunner, DatabaseOptions (#4201) 2018-08-01 16:13:09 -07:00
env Fix the build failure with OS_ANDROID (#4232) 2018-08-08 08:12:02 -07:00
examples Pin top-level index on partitioned index/filter blocks (#4037) 2018-06-22 15:27:46 -07:00
hdfs Comment out unused variables 2018-03-05 13:13:41 -08:00
include/rocksdb Add path to WritableFileWriter. (#4039) 2018-08-23 10:12:58 -07:00
java Add CompactRangeOptions for Java (#4220) 2018-08-17 10:57:25 -07:00
memtable Suppress clang analyzer error (#4299) 2018-08-21 16:43:05 -07:00
monitoring Support group commits of version edits (#3944) 2018-06-28 12:34:39 -07:00
options Add path to WritableFileWriter. (#4039) 2018-08-23 10:12:58 -07:00
port Add path to WritableFileWriter. (#4039) 2018-08-23 10:12:58 -07:00
table DataBlockHashIndex: avoiding expensive iiter->Next when handling hash kNoEntry (#4296) 2018-08-23 10:12:58 -07:00
third-party Add GCC 8 to Travis (#3433) 2018-07-13 10:58:06 -07:00
tools Add path to WritableFileWriter. (#4039) 2018-08-23 10:12:58 -07:00
util Add path to WritableFileWriter. (#4039) 2018-08-23 10:12:58 -07:00
utilities Add path to WritableFileWriter. (#4039) 2018-08-23 10:12:58 -07:00
.clang-format A script that automatically reformat affected lines 2014-01-14 12:21:24 -08:00
.gitignore RocksDB Trace Analyzer (#4091) 2018-08-13 11:44:02 -07:00
.lgtm.yml Create lgtm.yml for LGTM.com C/C++ analysis (#4058) 2018-06-26 12:43:04 -07:00
.travis.yml Add GCC 8 to Travis (#3433) 2018-07-13 10:58:06 -07:00
AUTHORS Update RocksDB Authors File 2017-10-18 14:42:10 -07:00
CMakeLists.txt Improve point-lookup performance using a data block hash index (#4174) 2018-08-15 14:30:03 -07:00
CODE_OF_CONDUCT.md Add Code of Conduct 2017-12-05 18:42:35 -08:00
CONTRIBUTING.md Add Code of Conduct 2017-12-05 18:42:35 -08:00
COPYING Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
DEFAULT_OPTIONS_HISTORY.md options.delayed_write_rate use the rate of rate_limiter by default. 2017-05-24 09:58:24 -07:00
DUMP_FORMAT.md First version of rocksdb_dump and rocksdb_undump. 2015-06-19 16:24:36 -07:00
HISTORY.md Require ZSTD 1.1.3+ to use dictionary trainer (#4295) 2018-08-22 18:27:52 -07:00
INSTALL.md Enable compilation on OpenBSD 2018-03-19 12:30:05 -07:00
LANGUAGE-BINDINGS.md Added PingCaps Rust RocksDB and ObjectiveRocks (#4065) 2018-06-27 15:43:21 -07:00
LICENSE.Apache Change RocksDB License 2017-07-15 16:11:23 -07:00
LICENSE.leveldb Add back the LevelDB license file 2017-07-16 18:42:18 -07:00
Makefile Adjusted the Makefile of trace_analyzer to isolate the Gflags from other (#4290) 2018-08-21 10:47:24 -07:00
README.md Create lgtm.yml for LGTM.com C/C++ analysis (#4058) 2018-06-26 12:43:04 -07:00
ROCKSDB_LITE.md Fix some typos in comments and docs. 2018-03-08 10:27:25 -08:00
TARGETS Improve point-lookup performance using a data block hash index (#4174) 2018-08-15 14:30:03 -07:00
USERS.md Support range deletion tombstones in IngestExternalFile SSTs (#3778) 2018-07-13 22:43:09 -07:00
Vagrantfile Adding CentOS 7 Vagrantfile & build script 2018-02-26 15:27:17 -08:00
WINDOWS_PORT.md Add GCC 8 to Travis (#3433) 2018-07-13 10:58:06 -07:00
appveyor.yml Upgrade Appveyor to VS2017 2018-02-01 13:57:01 -08:00
issue_template.md Add a template for issues 2017-09-29 11:41:28 -07:00
src.mk Adjusted the Makefile of trace_analyzer to isolate the Gflags from other (#4290) 2018-08-21 10:47:24 -07:00
thirdparty.inc Provide a way to override windows memory allocator with jemalloc for ZSTD 2018-06-04 12:12:48 -07:00

README.md

RocksDB: A Persistent Key-Value Store for Flash and RAM Storage


RocksDB is developed and maintained by the Facebook Database Engineering Team. It is built on earlier work on LevelDB by Sanjay Ghemawat (sanjay@google.com) and Jeff Dean (jeff@google.com).

This code is a library that forms the core building block for a fast key-value server, especially suited for storing data on flash drives. It has a Log-Structured-Merge-Database (LSM) design with flexible tradeoffs between Write-Amplification-Factor (WAF), Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF). It has multi-threaded compactions, making it especially suitable for storing multiple terabytes of data in a single database.

Start with example usage here: https://github.com/facebook/rocksdb/tree/master/examples

See the GitHub wiki for more explanation.

The public interface is in include/. Callers should not include or rely on the details of any other header files in this package. Those internal APIs may be changed without warning.

Design discussions are conducted in https://www.facebook.com/groups/rocksdb.dev/

License

RocksDB is dual-licensed under both the GPLv2 (found in the COPYING file in the root directory) and Apache 2.0 License (found in the LICENSE.Apache file in the root directory). You may select, at your option, one of the above-listed licenses.