Go to file
Peter Dillinger 27f3af5966 Fix serious FSDirectory use-after-Close bug (missing fsync) (#10460)
Summary:
TL;DR: due to a recent change, if you drop a column family,
often that DB will no longer fsync after writing new SST files
to remaining or new column families, which could lead to data
loss on power loss.

More bug detail:
The intent of https://github.com/facebook/rocksdb/issues/10049 was to Close FSDirectory objects at
DB::Close time rather than waiting for DB object destruction.
Unfortunately, it also closes shared FSDirectory objects on
DropColumnFamily (& destroy remaining handles), which can lead
to use-after-Close on FSDirectory shared with remaining column
families. Those "uses" are only Fsyncs (or redundant Closes). In
the default Posix filesystem, an Fsync on a closed FSDirectory is a
quiet no-op. Consequently (under most configurations), if you drop
a column family, that DB will no longer fsync after writing new SST
files to column families sharing the same directory (true under most
configurations).

More fix detail:
Basically, this removes unnecessary Close ops on destroying
ColumnFamilyData. We let `shared_ptr` take care of calling the
destructor at the right time. If the intent was to require Close be
called before destroying FSDirectory, that was not made clear by the
author of FileSystem and was not at all enforced by https://github.com/facebook/rocksdb/issues/10049, which
could have added `assert(fd_ == -1)` to `~PosixDirectory()` but did
not. To keep this fix simple, we relax the unit test for https://github.com/facebook/rocksdb/issues/10049 to allow
timely destruction of FSDirectory to suffice as Close (in
CountedFileSystem). Added a TODO to revisit that.

Also in this PR:
* Added a TODO to share FSDirectory instances between DB and its column
families. (Already shared among column families.)
* Made DB::Close attempt to close all its open FSDirectory objects even
if there is a failure in closing one. Also code clean-up around this
logic.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10460

Test Plan:
add an assert to check for use-after-Close. With that
existing tests can detect the misuse. With fix, tests pass (except noted
relaxing of unit test for https://github.com/facebook/rocksdb/issues/10049)

Reviewed By: ajkr

Differential Revision: D38357922

Pulled By: pdillinger

fbshipit-source-id: d42079cadbedf0a969f03389bf586b3b4e1f9137
2022-08-02 10:54:32 -07:00
.circleci Run new benchmark script in branch. (#10303) 2022-07-25 14:44:10 -07:00
.github/workflows Use released clang-format instead of the one from dev branch (#9646) 2022-03-01 10:51:38 -08:00
buckifier Use optimized folly DistributedMutex in LRUCache when available (#10179) 2022-06-17 13:08:45 -07:00
build_tools Run new benchmark script in branch. (#10303) 2022-07-25 14:44:10 -07:00
cache Fix cache metrics update when secondary cache is used (#10440) 2022-07-29 14:24:44 -07:00
cmake gcc-11 and cmake related cleanup (#9286) 2021-12-17 17:04:35 -08:00
coverage Fix commit_prereq and other targets (#9797) 2022-04-04 09:58:18 -07:00
db Fix serious FSDirectory use-after-Close bug (missing fsync) (#10460) 2022-08-02 10:54:32 -07:00
db_stress_tool Stop operating on DB in a stress test background thread (#10373) 2022-07-19 11:25:43 -07:00
docs Bump tzinfo from 1.2.9 to 1.2.10 in /docs (#10400) 2022-07-22 08:45:36 -07:00
env Fix serious FSDirectory use-after-Close bug (missing fsync) (#10460) 2022-08-02 10:54:32 -07:00
examples Add a simple example of backup and restore (#10054) 2022-06-03 23:25:31 -07:00
file Provide support for direct_reads with async_io (#10197) 2022-07-06 11:42:59 -07:00
fuzz Fix compilation errors and add fuzzers to CircleCI (#9420) 2022-02-01 10:32:15 -08:00
include/rocksdb Use EnvLogger instead of PosixLogger (#10436) 2022-08-01 14:37:18 -07:00
java Revert "Add a blob-specific cache priority (#10309)" (#10434) 2022-07-29 07:18:15 -07:00
logging Use EnvLogger instead of PosixLogger (#10436) 2022-08-01 14:37:18 -07:00
memory Revert "Add a blob-specific cache priority (#10309)" (#10434) 2022-07-29 07:18:15 -07:00
memtable Fix a bug in hash linked list (#10401) 2022-07-25 11:33:28 -07:00
microbench Fix overflow in ribbon_bench after #10184 (#10195) 2022-06-17 12:53:57 -07:00
monitoring Use EnvLogger instead of PosixLogger (#10436) 2022-08-01 14:37:18 -07:00
options Fix race conditions in GenericRateLimiter (#10374) 2022-07-19 09:31:14 -07:00
plugin Add initial CMake support to plugin (#9214) 2021-11-30 17:16:53 -08:00
port Use EnvLogger instead of PosixLogger (#10436) 2022-08-01 14:37:18 -07:00
table Fix cache metrics update when secondary cache is used (#10440) 2022-07-29 14:24:44 -07:00
test_util Add seqno to time mapping (#10338) 2022-07-14 21:49:34 -07:00
third-party Meta-internal folly integration with F14FastMap (#9546) 2022-04-13 07:34:01 -07:00
tools regression_test.sh: kill very old db_bench (and more) (#10441) 2022-08-02 09:16:17 -07:00
trace_replay Use std::numeric_limits<> (#9954) 2022-05-05 13:08:21 -07:00
util db_bench -use_stderr_info_logger to print timestamp (#10435) 2022-07-29 11:24:52 -07:00
utilities Fix serious FSDirectory use-after-Close bug (missing fsync) (#10460) 2022-08-02 10:54:32 -07:00
.clang-format A script that automatically reformat affected lines 2014-01-14 12:21:24 -08:00
.gitignore Snapshots with user-specified timestamps (#9879) 2022-06-10 16:07:03 -07:00
.lgtm.yml Create lgtm.yml for LGTM.com C/C++ analysis (#4058) 2018-06-26 12:43:04 -07:00
.watchmanconfig Added .watchmanconfig file to rocksdb repo (#5593) 2019-07-19 15:00:33 -07:00
AUTHORS Update RocksDB Authors File 2017-10-18 14:42:10 -07:00
CMakeLists.txt add trace tools flags in CMakeLists (#10404) 2022-07-27 09:10:18 -07:00
CODE_OF_CONDUCT.md Adopt Contributor Covenant 2019-08-29 23:21:01 -07:00
CONTRIBUTING.md Add Code of Conduct 2017-12-05 18:42:35 -08:00
COPYING Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
DEFAULT_OPTIONS_HISTORY.md Add Options::DisableExtraChecks, clarify force_consistency_checks (#9363) 2022-01-18 17:31:03 -08:00
DUMP_FORMAT.md First version of rocksdb_dump and rocksdb_undump. 2015-06-19 16:24:36 -07:00
HISTORY.md Fix serious FSDirectory use-after-Close bug (missing fsync) (#10460) 2022-08-02 10:54:32 -07:00
INSTALL.md Update supported VS versions in INSTALL.md (#9823) 2022-04-13 13:03:40 -07:00
LANGUAGE-BINDINGS.md Update branch name to "main" in README/LANGUAGE_BINDINGS (#8727) 2021-09-01 15:26:34 -07:00
LICENSE.Apache Change RocksDB License 2017-07-15 16:11:23 -07:00
LICENSE.leveldb Add back the LevelDB license file 2017-07-16 18:42:18 -07:00
Makefile Add seqno to time mapping (#10338) 2022-07-14 21:49:34 -07:00
PLUGINS.md Add pmem-rocksdb-plugin link in PLUGINs.md (#9934) 2022-05-12 22:02:28 -07:00
README.md Remove Travis CI (#10407) 2022-07-22 20:16:45 -07:00
ROCKSDB_LITE.md Fix remaining uses of "backupable" (#9792) 2022-04-05 09:52:33 -07:00
TARGETS Charge blob cache usage against the global memory limit (#10321) 2022-07-18 23:26:57 -07:00
USERS.md Add Solana's RocksDB use case in USERS.md (#9558) 2022-02-16 09:23:01 -08:00
Vagrantfile Adding CentOS 7 Vagrantfile & build script 2018-02-26 15:27:17 -08:00
WINDOWS_PORT.md Update branch name in WINDOWS_PORT.md (#8745) 2021-09-01 19:26:39 -07:00
common.mk Clean up variables for temporary directory (#9961) 2022-05-06 16:38:06 -07:00
crash_test.mk Clean up variables for temporary directory (#9961) 2022-05-06 16:38:06 -07:00
issue_template.md Add Google Group to Issue Template 2020-01-28 14:40:37 -08:00
rocksdb.pc.in build: fix pkg-config file generation (#9953) 2022-05-30 12:46:40 -07:00
src.mk Charge blob cache usage against the global memory limit (#10321) 2022-07-18 23:26:57 -07:00
thirdparty.inc Fix build jemalloc api (#5470) 2019-06-24 17:40:32 -07:00

README.md

RocksDB: A Persistent Key-Value Store for Flash and RAM Storage

CircleCI Status Appveyor Build status PPC64le Build Status

RocksDB is developed and maintained by Facebook Database Engineering Team. It is built on earlier work on LevelDB by Sanjay Ghemawat (sanjay@google.com) and Jeff Dean (jeff@google.com)

This code is a library that forms the core building block for a fast key-value server, especially suited for storing data on flash drives. It has a Log-Structured-Merge-Database (LSM) design with flexible tradeoffs between Write-Amplification-Factor (WAF), Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF). It has multi-threaded compactions, making it especially suitable for storing multiple terabytes of data in a single database.

Start with example usage here: https://github.com/facebook/rocksdb/tree/main/examples

See the github wiki for more explanation.

The public interface is in include/. Callers should not include or rely on the details of any other header files in this package. Those internal APIs may be changed without warning.

Questions and discussions are welcome on the RocksDB Developers Public Facebook group and email list on Google Groups.

License

RocksDB is dual-licensed under both the GPLv2 (found in the COPYING file in the root directory) and Apache 2.0 License (found in the LICENSE.Apache file in the root directory). You may select, at your option, one of the above-listed licenses.