mirror of https://github.com/facebook/rocksdb.git
Deflake DBErrorHandlingFSTest.MultiCFWALWriteError (#9496)
Summary:

**Context:** As part of https://github.com/facebook/rocksdb/pull/6949, file deletion is disabled for a faulty database on IOError of MANIFEST write/sync and re-enabled during `DBImpl::Resume()` once all recovery is completed (see commit e66199d848). Before re-enabling file deletion, we `assert(versions_->io_status().ok());`, which IMO assumes `versions_` is still **the same** `versions_` used in the recovery process. However, this is not necessarily true, because `s = error_handler_.ClearBGError();` happens before that assertion and can unblock a foreground thread via `EventHelpers::NotifyOnErrorRecoveryEnd()` (db/error_handler.cc L552-L553 at 3122cb4358) as part of `ClearBGError()`. That foreground thread can then do whatever it wants, including closing/reopening the db and cleaning up that same `versions_`. As a consequence, `assert(versions_->io_status().ok());` can call `io_status()` on a nullptr, and tests like `DBErrorHandlingFSTest.MultiCFWALWriteError` become flaky. The unblocked foreground thread (in this case, the testing thread) proceeds to reopen the db (https://github.com/facebook/rocksdb/blob/6.29.fb/db/error_handler_fs_test.cc#L1494), where `versions_` gets reset to nullptr (https://github.com/facebook/rocksdb/blob/6.29.fb/db/db_impl/db_impl.cc#L678) as part of the old db's clean-up. If this happens right before `assert(versions_->io_status().ok());` gets executed in the background thread, we see an error like:
```
db/db_impl/db_impl.cc:420:5: runtime error: member call on null pointer of type 'rocksdb::VersionSet'
assert(versions_->io_status().ok());
```

**Summary:**
- Call `s = error_handler_.ClearBGError();` only after we know it is safe to wake up the foreground, i.e., right before we log `ROCKS_LOG_INFO(immutable_db_options_.info_log, "Successfully resumed DB");`.
- For context: the original https://github.com/facebook/rocksdb/pull/3997, which introduced `DBImpl::Resume()`, called `s = error_handler_.ClearBGError();` very close to `ROCKS_LOG_INFO(immutable_db_options_.info_log, "Successfully resumed DB");`, while the later https://github.com/facebook/rocksdb/pull/6949 moved the two calls apart.
- It seems fine for `s = error_handler_.ClearBGError();` to happen after `EnableFileDeletions(/*force=*/true);`, at least syntax-wise, since the two functions are orthogonal. It also seems okay to re-enable file deletion before `s = error_handler_.ClearBGError();`, which basically just resets some state variables.
- To preserve the previous behavior of https://github.com/facebook/rocksdb/pull/6949, where the status of re-enabling file deletion is not taken into account in the overall status of resuming the db, `enable_file_deletion_s` is separated from the general `s`.
- To make `ROCKS_LOG_INFO(immutable_db_options_.info_log, "Successfully resumed DB");` clearer, it now lives in its own if-block.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9496

Test Plan:
- Manually reproduce the assertion failure in `DBErrorHandlingFSTest.MultiCFWALWriteError` by injecting a sleep as below, so that `assert(versions_->io_status().ok());` is more likely to execute after the foreground (i.e., testing) thread reopens the db:
```
sleep(1);
assert(versions_->io_status().ok());
```
`python3 gtest-parallel/gtest_parallel.py -r 100 -w 100 rocksdb/error_handler_fs_test --gtest_filter=DBErrorHandlingFSTest.MultiCFWALWriteError`
```
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from DBErrorHandlingFSTest
[ RUN      ] DBErrorHandlingFSTest.MultiCFWALWriteError
Received signal 11 (Segmentation fault)
#0   rocksdb/error_handler_fs_test() [0x5818a4] rocksdb::DBImpl::ResumeImpl(rocksdb::DBRecoverContext) /data/users/huixiao/rocksdb/db/db_impl/db_impl.cc:421
#1   rocksdb/error_handler_fs_test() [0x6379ff] rocksdb::ErrorHandler::RecoverFromBGError(bool) /data/users/huixiao/rocksdb/db/error_handler.cc:600
#2   rocksdb/error_handler_fs_test() [0x7c5362] rocksdb::SstFileManagerImpl::ClearError() /data/users/huixiao/rocksdb/file/sst_file_manager_impl.cc:310
#3   rocksdb/error_handler_fs_test()
```
- The assertion failure does not happen with this PR:
`python3 gtest-parallel/gtest_parallel.py -r 100 -w 100 rocksdb/error_handler_fs_test --gtest_filter=DBErrorHandlingFSTest.MultiCFWALWriteError`
`[100/100] DBErrorHandlingFSTest.MultiCFWALWriteError (43785 ms)`

Reviewed By: riversand963, anand1976

Differential Revision: D33990099

Pulled By: hx235

fbshipit-source-id: 2e0259a471fa8892ff177da91b3e1c0792dd7bab
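To make the interleaving concrete, here is a minimal standalone sketch of the race pattern described above. It is **not** RocksDB code: `FakeVersionSet`, `Database`, `ResumeBuggy`, and `WaitAndClose` are hypothetical stand-ins for `VersionSet`, `DBImpl::ResumeImpl()`, and the waiting foreground thread, and the `sleep_for` plays the role of the `sleep(1)` injected in the test plan.
```
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <thread>

// Hypothetical stand-in for rocksdb::VersionSet.
struct FakeVersionSet {
  bool io_ok = true;
};

struct Database {
  std::mutex mu;
  std::condition_variable cv;
  bool recovered = false;
  std::unique_ptr<FakeVersionSet> versions =
      std::make_unique<FakeVersionSet>();

  // Background recovery thread with the buggy (pre-#9496) ordering: it
  // signals completion first and only afterwards touches `versions`.
  void ResumeBuggy() {
    {
      std::lock_guard<std::mutex> l(mu);
      recovered = true;
    }
    cv.notify_all();  // analogous to ClearBGError() waking the foreground
    // Widen the race window, like the sleep(1) injected in the test plan.
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
    // Analogous to assert(versions_->io_status().ok()): if the foreground
    // already closed the db, this is a member call on a null pointer.
    assert(versions->io_ok);
  }

  // Foreground thread: waits for recovery, then "closes" the db, which
  // destroys `versions` (DBImpl resets versions_ during old-db clean-up).
  void WaitAndClose() {
    std::unique_lock<std::mutex> l(mu);
    cv.wait(l, [this] { return recovered; });
    versions.reset();
  }
};

int main() {
  Database db;
  std::thread foreground(&Database::WaitAndClose, &db);
  db.ResumeBuggy();  // likely crashes: the foreground wins the race
  foreground.join();
  return 0;
}
```
Built with UBSan, this toy program should report a "member call on null pointer" much like the trace above, since the unsynchronized read of `versions` in `ResumeBuggy()` races with `versions.reset()` in `WaitAndClose()`.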
commit 87a8b3c8af
parent 21345d2823
HISTORY.md
@@ -3,6 +3,9 @@
 ### New Features
 * Allow WriteBatchWithIndex to index a WriteBatch that includes keys with user-defined timestamps. The index itself does not have timestamp.
 
+### Bug Fixes
+* Fixed a data race on `versions_` between `DBImpl::ResumeImpl()` and threads waiting for recovery to complete (#9496)
+
 ## 7.0.0 (02/20/2022)
 ### Bug Fixes
 * Fixed a major bug in which batched MultiGet could return old values for keys deleted by DeleteRange when memtable Bloom filter is enabled (memtable_prefix_bloom_size_ratio > 0). (The fix includes a substantial MultiGet performance improvement in the unusual case of both memtable_whole_key_filtering and prefix_extractor.)
db/db_impl/db_impl.cc
@@ -400,14 +400,6 @@ Status DBImpl::ResumeImpl(DBRecoverContext context) {
 
   JobContext job_context(0);
   FindObsoleteFiles(&job_context, true);
-  if (s.ok()) {
-    s = error_handler_.ClearBGError();
-  } else {
-    // NOTE: this is needed to pass ASSERT_STATUS_CHECKED
-    // in the DBSSTTest.DBWithMaxSpaceAllowedRandomized test.
-    // See https://github.com/facebook/rocksdb/pull/7715#issuecomment-754947952
-    error_handler_.GetRecoveryError().PermitUncheckedError();
-  }
   mutex_.Unlock();
 
   job_context.manifest_file_number = 1;
@@ -428,11 +420,31 @@ Status DBImpl::ResumeImpl(DBRecoverContext context) {
           immutable_db_options_.info_log,
           "DB resume requested but could not enable file deletions [%s]",
           s.ToString().c_str());
       assert(false);
     }
   }
-  ROCKS_LOG_INFO(immutable_db_options_.info_log, "Successfully resumed DB");
   }
 
   mutex_.Lock();
+  if (s.ok()) {
+    // This will notify and unblock threads waiting for error recovery to
+    // finish. Those previously waiting threads can now proceed, which may
+    // include closing the db.
+    s = error_handler_.ClearBGError();
+  } else {
+    // NOTE: this is needed to pass ASSERT_STATUS_CHECKED
+    // in the DBSSTTest.DBWithMaxSpaceAllowedRandomized test.
+    // See https://github.com/facebook/rocksdb/pull/7715#issuecomment-754947952
+    error_handler_.GetRecoveryError().PermitUncheckedError();
+  }
+
+  if (s.ok()) {
+    ROCKS_LOG_INFO(immutable_db_options_.info_log, "Successfully resumed DB");
+  } else {
+    ROCKS_LOG_INFO(immutable_db_options_.info_log, "Failed to resume DB [%s]",
+                   s.ToString().c_str());
+  }
+
   // Check for shutdown again before scheduling further compactions,
   // since we released and re-acquired the lock above
   if (shutdown_initiated_) {
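For contrast with the sketch under the commit message, here is the same toy model with the patched ordering. It is a hedged paraphrase of what the hunks above do, not RocksDB code: finish every access to state a waiter could destroy, then signal completion last, mirroring how the diff moves `ClearBGError()` to run only after `EnableFileDeletions()` and the re-acquisition of `mutex_`.
```
// Member of the hypothetical Database struct from the earlier sketch.
void ResumeFixed() {
  // Safe: no waiter has been unblocked yet, so `versions` cannot be freed
  // out from under us (analogous to using versions_ before ClearBGError()).
  assert(versions->io_ok);
  {
    std::lock_guard<std::mutex> l(mu);
    recovered = true;  // analogous to error_handler_.ClearBGError()
  }
  cv.notify_all();  // only now may the foreground close the db
}
```
The invariant the patch establishes is simple: nothing in `ResumeImpl()` may touch `versions_` (or any other state a closing thread can destroy) after `ClearBGError()` has published the recovery result to waiting threads.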