Go to file
Yanqin Jin 672befea2a Fix assertion failure in FIFO compaction with TTL (#5754)
Summary:
Before this PR, the following sequence of events can cause assertion failure as shown below.
Stack trace (partial):
```
(gdb) bt
2  0x00007f59b350ad15 in __assert_fail_base (fmt=<optimized out>, assertion=assertion@entry=0x9f8390 "mark_as_compacted ? !inputs_[i][j]->being_compacted : inputs_[i][j]->being_compacted", file=file@entry=0x9e347c "db/compaction/compaction.cc", line=line@entry=395, function=function@entry=0xa21ec0 <rocksdb::Compaction::MarkFilesBeingCompacted(bool)::__PRETTY_FUNCTION__> "void rocksdb::Compaction::MarkFilesBeingCompacted(bool)") at assert.c:92
3  0x00007f59b350adc3 in __GI___assert_fail (assertion=assertion@entry=0x9f8390 "mark_as_compacted ? !inputs_[i][j]->being_compacted : inputs_[i][j]->being_compacted", file=file@entry=0x9e347c "db/compaction/compaction.cc", line=line@entry=395, function=function@entry=0xa21ec0 <rocksdb::Compaction::MarkFilesBeingCompacted(bool)::__PRETTY_FUNCTION__> "void rocksdb::Compaction::MarkFilesBeingCompacted(bool)") at assert.c:101
4  0x0000000000492ccd in rocksdb::Compaction::MarkFilesBeingCompacted (this=<optimized out>, mark_as_compacted=<optimized out>) at db/compaction/compaction.cc:394
5  0x000000000049467a in rocksdb::Compaction::Compaction (this=0x7f59af013000, vstorage=0x7f581af53030, _immutable_cf_options=..., _mutable_cf_options=..., _inputs=..., _output_level=<optimized out>, _target_file_size=0, _max_compaction_bytes=0, _output_path_id=0, _compression=<incomplete type>, _compression_opts=..., _max_subcompactions=0, _grandparents=..., _manual_compaction=false, _score=4, _deletion_compaction=true, _compaction_reason=rocksdb::CompactionReason::kFIFOTtl) at db/compaction/compaction.cc:241
6  0x00000000004af9bc in rocksdb::FIFOCompactionPicker::PickTTLCompaction (this=0x7f59b31a6900, cf_name=..., mutable_cf_options=..., vstorage=0x7f581af53030, log_buffer=log_buffer@entry=0x7f59b1bfa930) at db/compaction/compaction_picker_fifo.cc:101
7  0x00000000004b0771 in rocksdb::FIFOCompactionPicker::PickCompaction (this=0x7f59b31a6900, cf_name=..., mutable_cf_options=..., vstorage=0x7f581af53030, log_buffer=0x7f59b1bfa930) at db/compaction/compaction_picker_fifo.cc:201
8  0x00000000004838cc in rocksdb::ColumnFamilyData::PickCompaction (this=this@entry=0x7f59b31b3700, mutable_options=..., log_buffer=log_buffer@entry=0x7f59b1bfa930) at db/column_family.cc:933
9  0x00000000004f3645 in rocksdb::DBImpl::BackgroundCompaction (this=this@entry=0x7f59b3176000, made_progress=made_progress@entry=0x7f59b1bfa6bf, job_context=job_context@entry=0x7f59b1bfa760, log_buffer=log_buffer@entry=0x7f59b1bfa930, prepicked_compaction=prepicked_compaction@entry=0x0, thread_pri=rocksdb::Env::LOW) at db/db_impl/db_impl_compaction_flush.cc:2541
10 0x00000000004f5e2a in rocksdb::DBImpl::BackgroundCallCompaction (this=this@entry=0x7f59b3176000, prepicked_compaction=prepicked_compaction@entry=0x0, bg_thread_pri=bg_thread_pri@entry=rocksdb::Env::LOW) at db/db_impl/db_impl_compaction_flush.cc:2312
11 0x00000000004f648e in rocksdb::DBImpl::BGWorkCompaction (arg=<optimized out>) at db/db_impl/db_impl_compaction_flush.cc:2087
```
This can be caused by the following sequence of events.
```
Time
|      thr          bg_compact_thr1                     bg_compact_thr2
|      write
|      flush
|                   mark all l0 as being compacted
|      write
|      flush
|                   add cf to queue again
|                                                       mark all l0 as being
|                                                       compacted, fail the
|                                                       assertion
V
```
Test plan (on devserver)
Since bg_compact_thr1 and bg_compact_thr2 are two threads executing the same
code, it is difficult to use sync point dependency to
coordinate their execution. Therefore, I choose to use db_stress.
```
$TEST_TMPDIR=/dev/shm/rocksdb ./db_stress --periodic_compaction_seconds=1 --max_background_compactions=20 --format_version=2 --memtablerep=skip_list --max_write_buffer_number=3 --cache_index_and_filter_blocks=1 --reopen=20 --recycle_log_file_num=0 --acquire_snapshot_one_in=10000 --delpercent=4 --log2_keys_per_lock=22 --compaction_ttl=1 --block_size=16384 --use_multiget=1 --compact_files_one_in=1000000 --target_file_size_multiplier=2 --clear_column_family_one_in=0 --max_bytes_for_level_base=10485760 --use_full_merge_v1=1 --target_file_size_base=2097152 --checkpoint_one_in=1000000 --mmap_read=0 --compression_type=zstd --writepercent=35 --readpercent=45 --subcompactions=4 --use_merge=0 --write_buffer_size=4194304 --test_batches_snapshots=0 --db=/dev/shm/rocksdb/rocksdb_crashtest_whitebox --use_direct_reads=0 --compact_range_one_in=1000000 --open_files=-1 --destroy_db_initially=0 --progress_reports=0 --compression_zstd_max_train_bytes=0 --snapshot_hold_ops=100000 --enable_pipelined_write=0 --nooverwritepercent=1 --compression_max_dict_bytes=0 --max_key=1000000 --prefixpercent=5 --flush_one_in=1000000 --ops_per_thread=40000 --index_block_restart_interval=7 --cache_size=1048576 --compaction_style=2 --verify_checksum=1 --delrangepercent=1 --use_direct_io_for_flush_and_compaction=0
```
This should see no assertion failure.
Last but not least,
```
$COMPILE_WITH_ASAN=1 make -j32 all
$make check
```
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5754

Differential Revision: D17109791

Pulled By: riversand963

fbshipit-source-id: 25fc46101235add158554e096540b72c324be078
2019-08-30 12:42:01 -07:00
buckifier Change buckifier to support parameterized dependencies (#5648) 2019-08-02 10:55:17 -07:00
build_tools Port folly/synchronization/DistributedMutex to rocksdb (#5642) 2019-08-07 14:34:19 -07:00
cache Cleaned up and simplified LRU cache implementation (#5579) 2019-07-16 19:17:45 -07:00
cmake cmake: s/SNAPPY_LIBRARIES/snappy_LIBRARIES/ (#5687) 2019-08-16 15:49:23 -07:00
coverage Fix interpreter lines for files with python2-only syntax. 2019-07-09 10:51:37 -07:00
db Fix assertion failure in FIFO compaction with TTL (#5754) 2019-08-30 12:42:01 -07:00
docs Blog post for write_unprepared (#5711) 2019-08-15 14:41:13 -07:00
env fix compile error: ‘FALLOC_FL_KEEP_SIZE’ undeclared (#5708) 2019-08-16 13:58:05 -07:00
examples Refactor trimming logic for immutable memtables (#5022) 2019-08-23 13:55:34 -07:00
file New API to get all merge operands for a Key (#5604) 2019-08-06 14:26:44 -07:00
hdfs Add copyright headers per FB open-source checkup tool. (#5199) 2019-04-18 10:55:01 -07:00
include/rocksdb Refactor trimming logic for immutable memtables (#5022) 2019-08-23 13:55:34 -07:00
java Refactor trimming logic for immutable memtables (#5022) 2019-08-23 13:55:34 -07:00
logging Auto Roll Logger to add some extra checking to avoid segfault. (#5623) 2019-07-24 15:14:40 -07:00
memory Move some logging related files to logging/ (#5387) 2019-05-31 17:23:59 -07:00
memtable simplify include directive involving inttypes (#5402) 2019-06-06 13:56:07 -07:00
monitoring fix compiling with `-DNPERF_CONTEXT` (#5704) 2019-08-16 14:38:08 -07:00
options Refactor trimming logic for immutable memtables (#5022) 2019-08-23 13:55:34 -07:00
port fix sign compare warnings (#5651) 2019-07-30 14:12:54 -07:00
table Fix comment of function NotifyCollectTableCollectorsOnFinish (#5738) 2019-08-29 10:57:01 -07:00
test_util Refactor trimming logic for immutable memtables (#5022) 2019-08-23 13:55:34 -07:00
third-party Fix TSAN failures in DistributedMutex tests (#5684) 2019-08-14 17:01:31 -07:00
tools crash_test to skip compaction TTL for FIFO compaction. (#5749) 2019-08-27 17:55:37 -07:00
trace_replay Block cache analyzer: Support reading from human readable trace file. (#5679) 2019-08-09 13:13:54 -07:00
util Refactor trimming logic for immutable memtables (#5022) 2019-08-23 13:55:34 -07:00
utilities replace some reinterpret_cast with static_cast_with_check (#5740) 2019-08-27 10:59:11 -07:00
.clang-format A script that automatically reformat affected lines 2014-01-14 12:21:24 -08:00
.gitignore Block cache simulator: Add pysim to simulate caches using reinforcement learning. (#5610) 2019-07-26 14:41:13 -07:00
.lgtm.yml Create lgtm.yml for LGTM.com C/C++ analysis (#4058) 2018-06-26 12:43:04 -07:00
.travis.yml Switch Travis to Xenial build (#4789) 2019-06-17 10:20:02 -07:00
.watchmanconfig Added .watchmanconfig file to rocksdb repo (#5593) 2019-07-19 15:00:33 -07:00
AUTHORS Update RocksDB Authors File 2017-10-18 14:42:10 -07:00
CMakeLists.txt Refactor trimming logic for immutable memtables (#5022) 2019-08-23 13:55:34 -07:00
CODE_OF_CONDUCT.md Adopt Contributor Covenant 2019-08-29 23:21:01 -07:00
CONTRIBUTING.md Add Code of Conduct 2017-12-05 18:42:35 -08:00
COPYING Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
DEFAULT_OPTIONS_HISTORY.md options.delayed_write_rate use the rate of rate_limiter by default. 2017-05-24 09:58:24 -07:00
DUMP_FORMAT.md First version of rocksdb_dump and rocksdb_undump. 2015-06-19 16:24:36 -07:00
HISTORY.md Lower the risk for users to run options.force_consistency_checks = true (#5744) 2019-08-29 14:07:37 -07:00
INSTALL.md Update the version of the dependencies used by the RocksJava static build (#4761) 2018-12-18 20:25:43 -08:00
LANGUAGE-BINDINGS.md LANGUAGE-BINDINGS.md: mention python-rocksdb 2019-03-20 11:10:48 -07:00
LICENSE.Apache Change RocksDB License 2017-07-15 16:11:23 -07:00
LICENSE.leveldb Add back the LevelDB license file 2017-07-16 18:42:18 -07:00
Makefile Atomic Flush Crash Test also covers the case that WAL is enabled. (#5729) 2019-08-22 16:32:55 -07:00
README.md Add LevelDB repository link in the Readme 2019-04-01 18:19:09 -07:00
ROCKSDB_LITE.md Fix some typos in comments and docs. 2018-03-08 10:27:25 -08:00
TARGETS Refactor trimming logic for immutable memtables (#5022) 2019-08-23 13:55:34 -07:00
USERS.md Add Crux to USERS.md 2019-08-29 11:29:48 -07:00
Vagrantfile Adding CentOS 7 Vagrantfile & build script 2018-02-26 15:27:17 -08:00
WINDOWS_PORT.md #5145 , rename port/dirent.h to port/port_dirent.h to avoid compile err when use port dir as header dir output (#5152) 2019-04-04 11:38:19 -07:00
appveyor.yml New API to get all merge operands for a Key (#5604) 2019-08-06 14:26:44 -07:00
defs.bzl Change buckifier to support parameterized dependencies (#5648) 2019-08-02 10:55:17 -07:00
issue_template.md Add a template for issues 2017-09-29 11:41:28 -07:00
src.mk Refactor trimming logic for immutable memtables (#5022) 2019-08-23 13:55:34 -07:00
thirdparty.inc Fix build jemalloc api (#5470) 2019-06-24 17:40:32 -07:00

README.md

RocksDB: A Persistent Key-Value Store for Flash and RAM Storage

Linux/Mac Build Status Windows Build status PPC64le Build Status

RocksDB is developed and maintained by Facebook Database Engineering Team. It is built on earlier work on LevelDB by Sanjay Ghemawat (sanjay@google.com) and Jeff Dean (jeff@google.com)

This code is a library that forms the core building block for a fast key value server, especially suited for storing data on flash drives. It has a Log-Structured-Merge-Database (LSM) design with flexible tradeoffs between Write-Amplification-Factor (WAF), Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF). It has multi-threaded compactions, making it specially suitable for storing multiple terabytes of data in a single database.

Start with example usage here: https://github.com/facebook/rocksdb/tree/master/examples

See the github wiki for more explanation.

The public interface is in include/. Callers should not include or rely on the details of any other header files in this package. Those internal APIs may be changed without warning.

Design discussions are conducted in https://www.facebook.com/groups/rocksdb.dev/

License

RocksDB is dual-licensed under both the GPLv2 (found in the COPYING file in the root directory) and Apache 2.0 License (found in the LICENSE.Apache file in the root directory). You may select, at your option, one of the above-listed licenses.