Yueh-Hsuan Chiang a7d4eb2f34 Fix a bug where flush does not happen when a manual compaction is running
Summary:
Currently, when RocksDB runs a manual compaction that refits data into a level,
the ReFitLevel() process requires that no bg work is running.
When RocksDB plans to ReFitLevel(), it does the following:

 1. pause scheduling new bg work.
 2. wait until all bg work has finished.
 3. do the ReFitLevel().
 4. unpause scheduling new bg work.

However, because it pauses scheduling new bg work in step 1 and waits for all bg
work to finish in step 2, RocksDB will stop flushing until all bg work is done
(which could take a long time).

This patch fixes the issue by changing the way ReFitLevel() pauses background work:

1. pause scheduling compaction.
2. wait until all bg work has finished.
3. pause scheduling flush.
4. do ReFitLevel().
5. unpause both flush and compaction.

The major difference is that we only pause scheduling compaction in step 1 before
waiting for all bg work to finish in step 2.  This prevents flush from being blocked
for a long time.  There is a very rare case in which ReFitLevel() might starve in
step 2, but this is unlikely because flushes typically finish very quickly.
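
The five steps above can be sketched with a toy scheduler. This is only an illustration of the pause ordering, not the RocksDB implementation: `ToyScheduler`, `TryStartFlush()`, `TryStartCompaction()`, `FinishJob()`, `CompactionPaused()` and `FlushAllowedWhileWaiting()` are hypothetical names invented here, and a single mutex with a condition variable stands in for the real background-work bookkeeping.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical toy model of the pause ordering; none of these names are RocksDB APIs.
class ToyScheduler {
 public:
  // Each returns true if the job was allowed to start (and is now counted as running).
  bool TryStartFlush() { return TryStart(flush_paused_); }
  bool TryStartCompaction() { return TryStart(compaction_paused_); }

  // Marks one running job as finished and wakes any waiter.
  void FinishJob() {
    std::lock_guard<std::mutex> lock(mu_);
    --running_;
    cv_.notify_all();
  }

  bool CompactionPaused() {
    std::lock_guard<std::mutex> lock(mu_);
    return compaction_paused_;
  }

  template <typename Fn>
  void RefitLevel(Fn refit) {
    std::unique_lock<std::mutex> lock(mu_);
    compaction_paused_ = true;                          // 1. pause compaction only
    cv_.wait(lock, [this] { return running_ == 0; });   // 2. flushes may still start here
    flush_paused_ = true;                               // 3. pause flush once idle
    lock.unlock();
    refit();                                            // 4. no bg work is running
    lock.lock();
    flush_paused_ = false;                              // 5. unpause both
    compaction_paused_ = false;
  }

 private:
  bool TryStart(const bool& paused) {
    std::lock_guard<std::mutex> lock(mu_);
    if (paused) return false;
    ++running_;
    return true;
  }

  std::mutex mu_;
  std::condition_variable cv_;
  int running_ = 0;
  bool flush_paused_ = false;
  bool compaction_paused_ = false;
};

// Returns whether a flush can start while the refit is still waiting in step 2.
bool FlushAllowedWhileWaiting() {
  ToyScheduler s;
  s.TryStartCompaction();  // a long-running compaction is in flight
  std::thread refit([&s] { s.RefitLevel([] {}); });
  while (!s.CompactionPaused()) std::this_thread::yield();
  bool flush_started = s.TryStartFlush();  // not blocked by the waiting refit
  s.FinishJob();  // the flush finishes
  s.FinishJob();  // the compaction finishes; the refit can now proceed
  refit.join();
  return flush_started;
}
```

In this sketch, step 3 runs under the same lock as the wait in step 2, so no new job can slip in between the idle check and the flush pause. The starvation case mentioned above corresponds to flushes repeatedly starting during step 2 so that the running count never reaches zero.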

Test Plan: existing test.

Reviewers: anthony, IslamAbdelRahman, kradhakrishnan, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D55029
2016-03-04 14:24:52 -08:00
arcanist_util Updated all copyright headers to the new format. 2016-02-09 15:12:00 -08:00
build_tools Modify build_tools/build_detect_platform to detect and set -march=z10 on Linux s390x. 2016-02-29 15:02:52 -05:00
coverage Fix coverage script 2014-11-03 14:53:00 -08:00
db Fix a bug where flush does not happen when a manual compaction is running 2016-03-04 14:24:52 -08:00
doc Lint everything 2015-11-16 12:56:21 -08:00
examples Updated all copyright headers to the new format. 2016-02-09 15:12:00 -08:00
hdfs Updated all copyright headers to the new format. 2016-02-09 15:12:00 -08:00
include/rocksdb [backupable db] Remove file size embedded in name workaround 2016-03-03 13:32:20 -08:00
java Updated all copyright headers to the new format. 2016-02-09 15:12:00 -08:00
memtable Updated all copyright headers to the new format. 2016-02-09 15:12:00 -08:00
port Implement ConsistentChildrenAttribute 2016-02-19 14:20:34 -08:00
table Add Iterator Property rocksdb.iterator.version_number 2016-03-02 16:23:59 -08:00
third-party Updated all copyright headers to the new format. 2016-02-09 15:12:00 -08:00
tools Update benchmarks used to measure subcompaction performance 2016-03-04 12:32:11 -08:00
util Add parsing of missing DB options 2016-03-02 10:34:14 -08:00
utilities Fix Windows build 2016-03-03 15:08:24 -08:00
.arcconfig Integrate Jenkins with Phabricator 2015-04-07 11:56:29 -07:00
.clang-format A script that automatically reformat affected lines 2014-01-14 12:21:24 -08:00
.gitignore New amalgamation target 2015-10-01 08:29:31 +13:00
.travis.yml Travis CI to disable ROCKSDB_LITE tests 2016-02-01 18:42:01 -08:00
AUTHORS Add AUTHORS file. Fix #203 2014-09-29 10:52:18 -07:00
CMakeLists.txt Add Iterator Property rocksdb.iterator.version_number 2016-03-02 16:23:59 -08:00
CONTRIBUTING.md facebook accounts are not required for CLA signers 2014-07-08 05:57:54 -04:00
DUMP_FORMAT.md First version of rocksdb_dump and rocksdb_undump. 2015-06-19 16:24:36 -07:00
HISTORY.md Change BlockBasedTableOptions.format_version default to 2 2016-03-01 14:28:52 -08:00
INSTALL.md Simple changes to support builds for ppc64[le] consistent with X86 2016-01-19 09:08:19 -06:00
LICENSE Updated all copyright headers to the new format. 2016-02-09 15:12:00 -08:00
Makefile Add Iterator Property rocksdb.iterator.version_number 2016-03-02 16:23:59 -08:00
PATENTS Update Patent Grant. 2015-04-13 10:33:43 +01:00
README.md Replaced "built on on earlier work" by "built on earlier work" in README.md 2014-09-17 01:16:17 -07:00
ROCKSDB_LITE.md Optimistic Transactions 2015-05-29 14:36:35 -07:00
USERS.md Add Wingify to USERS.md 2016-01-28 01:04:49 +05:30
Vagrantfile RocksDB on FreeBSD support 2015-02-26 15:19:17 -08:00
WINDOWS_PORT.md Commit both PR and internal code review changes 2015-07-07 16:58:20 -07:00
appveyor.yml Exclude DBTest.FileCreationRandomFailure as a long running test 2015-11-17 13:54:13 -08:00
src.mk IOStatsContext::ToString() add option to exclude zero counters 2016-02-23 10:26:24 -08:00
thirdparty.inc Enable override to 3rd party linkage 2015-11-24 11:51:37 -08:00

README.md

RocksDB: A Persistent Key-Value Store for Flash and RAM Storage


RocksDB is developed and maintained by the Facebook Database Engineering Team. It is built on earlier work on LevelDB by Sanjay Ghemawat (sanjay@google.com) and Jeff Dean (jeff@google.com).

This code is a library that forms the core building block for a fast key-value server, especially suited for storing data on flash drives. It has a Log-Structured-Merge-Database (LSM) design with flexible tradeoffs between Write-Amplification-Factor (WAF), Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF). It has multi-threaded compactions, making it especially suitable for storing multiple terabytes of data in a single database.

Start with example usage here: https://github.com/facebook/rocksdb/tree/master/examples

See the GitHub wiki for more explanation.

The public interface is in include/. Callers should not include or rely on the details of any other header files in this package. Those internal APIs may be changed without warning.

Design discussions are conducted in https://www.facebook.com/groups/rocksdb.dev/