Go to file
Cheng Chang 5e794b0841 Fix a recovery corner case (#7621)
Summary:
Consider the following sequence of events:

1. Db flushed an SST with file number N, appended to MANIFEST, and tried to sync the MANIFEST.
2. Syncing MANIFEST failed and db crashed.
3. Db tried to recover with this MANIFEST. In the meantime, no entry about the newly-flushed SST was found in the MANIFEST. Therefore, RocksDB replayed WAL and tried to flush to an SST file reusing the same file number N. This failed because file system does not support overwrite. Then Db deleted this file.
4. Db crashed again.
5. Db tried to recover. When db read the MANIFEST, there was an entry referencing N.sst. This could happen probably because the append in step 1 finally reached the MANIFEST and became visible. Since N.sst had been deleted in step 3, recovery failed.

It is possible that N.sst created in step 1 is valid. Although step 3 would still fail since the MANIFEST was not synced properly in step 1 and 2, deleting N.sst would make it impossible for the db to recover even if the remaining part of MANIFEST was appended and visible after step 5.

After this PR, in step 3, immediately after recovering from MANIFEST, a new MANIFEST is created, then we find that N.sst is not referenced in the MANIFEST, so we delete it, and we'll not reuse N as file number. Then in step 5, since the new MANIFEST does not contain N.sst, the recovery failure situation in step 5 won't happen.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7621

Test Plan:
1. some tests are updated, because these tests assume that new MANIFEST is created after WAL recovery.
2. a new unit test is added in db_basic_test to simulate step 3.

Reviewed By: riversand963

Differential Revision: D24668144

Pulled By: cheng-chang

fbshipit-source-id: 90d7487fbad2bc3714f5ede46ea949895b15ae3b
2020-11-07 22:23:27 -08:00
.circleci Fix many tests to run with MEM_ENV and ENCRYPTED_ENV; Introduce a MemoryFileSystem class (#7566) 2020-10-27 10:33:09 -07:00
.github/workflows Update clang-format-diff.py (#7609) 2020-11-04 16:09:01 -08:00
buckifier Add a rocksdb lib target with link_whole=True (#7466) 2020-09-30 22:50:32 -07:00
build_tools Update clang-format-diff.py (#7609) 2020-11-04 16:09:01 -08:00
cache Remove unused includes (#7604) 2020-10-28 23:22:27 -07:00
cmake Add `find_dependency()` in cmake config file. (#6791) 2020-05-12 21:18:29 -07:00
coverage Find the correct gcov (#6904) 2020-06-01 16:33:05 -07:00
db Fix a recovery corner case (#7621) 2020-11-07 22:23:27 -08:00
db_stress_tool Fix a recovery corner case (#7621) 2020-11-07 22:23:27 -08:00
docs Update github-pages to v207 (#7235) 2020-08-12 09:26:24 -07:00
env Fix a recovery corner case (#7621) 2020-11-07 22:23:27 -08:00
examples Bring the Configurable options together (#5753) 2020-09-14 17:01:01 -07:00
file Remove unused includes (#7604) 2020-10-28 23:22:27 -07:00
hdfs fix build with 'USE_HDFS' on windows (#6950) 2020-06-12 16:21:50 -07:00
include/rocksdb Add API to verify whole sst file checksum (#7578) 2020-11-03 20:34:56 -08:00
java Fix a recovery corner case (#7621) 2020-11-07 22:23:27 -08:00
logging Remove unused includes (#7604) 2020-10-28 23:22:27 -07:00
memory slightly improve jemalloc allocator API header (#7592) 2020-10-28 13:47:12 -07:00
memtable Test for LoadLatestOptions (#7554) 2020-10-14 22:28:55 -07:00
monitoring Remove unused includes (#7604) 2020-10-28 23:22:27 -07:00
options Expand effect of dictionary settings in `ColumnFamilyOptions::compression_opts` (#7619) 2020-11-02 19:21:11 -08:00
port Remove unused includes (#7604) 2020-10-28 23:22:27 -07:00
table Fix MultiGet unable to query timestamp data issue (#7589) 2020-11-03 09:45:41 -08:00
test_util Remove unused includes (#7604) 2020-10-28 23:22:27 -07:00
third-party Fix MSVC-related build issues (#7439) 2020-10-01 09:23:04 -07:00
tools Add "max_write_buffer_size_to_maintain" to crash test (#7634) 2020-11-03 13:55:18 -08:00
trace_replay Genericize and clean up FastRange (#7436) 2020-09-28 11:35:00 -07:00
util Ribbon: major re-work of hashing, seeds, and more (#7635) 2020-11-07 17:22:54 -08:00
utilities Fix a recovery corner case (#7621) 2020-11-07 22:23:27 -08:00
.clang-format
.gitignore Ribbon: InterleavedSolutionStorage (#7598) 2020-11-03 12:46:36 -08:00
.lgtm.yml Create lgtm.yml for LGTM.com C/C++ analysis (#4058) 2018-06-26 12:43:04 -07:00
.travis.yml Update Travis config for broken snapd on ppc (#7381) 2020-09-14 14:23:13 -07:00
.watchmanconfig Added .watchmanconfig file to rocksdb repo (#5593) 2019-07-19 15:00:33 -07:00
AUTHORS Update RocksDB Authors File 2017-10-18 14:42:10 -07:00
CMakeLists.txt Ribbon: initial (general) algorithms and basic unit test (#7491) 2020-10-25 20:44:49 -07:00
CODE_OF_CONDUCT.md Adopt Contributor Covenant 2019-08-29 23:21:01 -07:00
CONTRIBUTING.md Add Code of Conduct 2017-12-05 18:42:35 -08:00
COPYING Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
DEFAULT_OPTIONS_HISTORY.md options.delayed_write_rate use the rate of rate_limiter by default. 2017-05-24 09:58:24 -07:00
DUMP_FORMAT.md
HISTORY.md Compute NeedCompact() after table builder Finish() (#7627) 2020-11-04 10:44:56 -08:00
INSTALL.md Update the version of the dependencies used by the RocksJava static build (#4761) 2018-12-18 20:25:43 -08:00
LANGUAGE-BINDINGS.md Add RestoreDBFromLatestBackup to C API, add new C# package (#7092) 2020-07-08 11:56:41 -07:00
LICENSE.Apache Change RocksDB License 2017-07-15 16:11:23 -07:00
LICENSE.leveldb Add back the LevelDB license file 2017-07-16 18:42:18 -07:00
Makefile Ribbon: initial (general) algorithms and basic unit test (#7491) 2020-10-25 20:44:49 -07:00
README.md Fix the CI badge for ppc64le Jenkins (#7561) 2020-10-16 09:00:56 -07:00
ROCKSDB_LITE.md Fix some typos in comments and docs. 2018-03-08 10:27:25 -08:00
TARGETS Ribbon: initial (general) algorithms and basic unit test (#7491) 2020-10-25 20:44:49 -07:00
USERS.md Add YugabyteDB to USERS (#6786) 2020-05-06 10:28:29 -07:00
Vagrantfile Adding CentOS 7 Vagrantfile & build script 2018-02-26 15:27:17 -08:00
WINDOWS_PORT.md #5145 , rename port/dirent.h to port/port_dirent.h to avoid compile err when use port dir as header dir output (#5152) 2019-04-04 11:38:19 -07:00
appveyor.yml Remove 2019 from appveyor (#7038) 2020-06-29 14:31:41 -07:00
defs.bzl Make testpilot recognize that these tests have coverage instrumentation 2020-03-20 11:23:23 -07:00
issue_template.md Add Google Group to Issue Template 2020-01-28 14:40:37 -08:00
src.mk Ribbon: initial (general) algorithms and basic unit test (#7491) 2020-10-25 20:44:49 -07:00
thirdparty.inc Fix build jemalloc api (#5470) 2019-06-24 17:40:32 -07:00

README.md

RocksDB: A Persistent Key-Value Store for Flash and RAM Storage

CircleCI Status TravisCI Status Appveyor Build status PPC64le Build Status

RocksDB is developed and maintained by Facebook Database Engineering Team. It is built on earlier work on LevelDB by Sanjay Ghemawat (sanjay@google.com) and Jeff Dean (jeff@google.com)

This code is a library that forms the core building block for a fast key-value server, especially suited for storing data on flash drives. It has a Log-Structured-Merge-Database (LSM) design with flexible tradeoffs between Write-Amplification-Factor (WAF), Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF). It has multi-threaded compactions, making it especially suitable for storing multiple terabytes of data in a single database.

Start with example usage here: https://github.com/facebook/rocksdb/tree/master/examples

See the github wiki for more explanation.

The public interface is in include/. Callers should not include or rely on the details of any other header files in this package. Those internal APIs may be changed without warning.

Design discussions are conducted in https://www.facebook.com/groups/rocksdb.dev/ and https://rocksdb.slack.com/

License

RocksDB is dual-licensed under both the GPLv2 (found in the COPYING file in the root directory) and Apache 2.0 License (found in the LICENSE.Apache file in the root directory). You may select, at your option, one of the above-listed licenses.