Fix race condition between file purge and backup/checkpoint (#10187)

Summary:
Resolves https://github.com/facebook/rocksdb/issues/10129

I extracted this fix from https://github.com/facebook/rocksdb/issues/7516 since it's also already a bug in main branch, and we want to
separate it from the main part of the PR.

There can be a race condition between two threads. Thread 1 executes
`DBImpl::FindObsoleteFiles()` while thread 2 executes `GetSortedWals()`.
```
Time   thread 1                                thread 2
  |  mutex_.lock
  |  read disable_delete_obsolete_files_
  |  ...
  |  wait on log_sync_cv and release mutex_
  |                                          mutex_.lock
  |                                          ++disable_delete_obsolete_files_
  |                                          mutex_.unlock
  |                                          mutex_.lock
  |                                          while (pending_purge_obsolete_files > 0) { bg_cv.wait;}
  |                                          wake up with mutex_ locked
  |                                          compute WALs tracked by MANIFEST
  |                                          mutex_.unlock
  |  wake up with mutex_ locked
  |  ++pending_purge_obsolete_files_
  |  mutex_.unlock
  |
  |  delete obsolete WAL
  |                                          WAL missing but tracked in MANIFEST.
  V
```

The fix proposed eliminates the possibility of the above by increasing
`pending_purge_obsolete_files_` before `FindObsoleteFiles()` can possibly release the mutex.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10187

Test Plan: make check

Reviewed By: ltamasi

Differential Revision: D37214235

Pulled By: riversand963

fbshipit-source-id: 556ab1b58ae6d19150169dfac4db08195c797184
This commit is contained in:
Yanqin Jin 2022-06-23 18:32:25 -07:00 committed by Facebook GitHub Bot
parent 6061905790
commit 725df120e9
2 changed files with 20 additions and 3 deletions

View file

@ -11,6 +11,9 @@
* `rocksdb_file_metadata_t` and its and get functions & destroy functions. * `rocksdb_file_metadata_t` and its and get functions & destroy functions.
* Add suggest_compact_range() and suggest_compact_range_cf() to C API. * Add suggest_compact_range() and suggest_compact_range_cf() to C API.
### Bug Fixes
* Fix a bug in which backup/checkpoint can include a WAL deleted by RocksDB.
## 7.4.0 (06/19/2022) ## 7.4.0 (06/19/2022)
### Bug Fixes ### Bug Fixes
* Fixed a bug in calculating key-value integrity protection for users of in-place memtable updates. In particular, the affected users would be those who configure `protection_bytes_per_key > 0` on `WriteBatch` or `WriteOptions`, and configure `inplace_callback != nullptr`. * Fixed a bug in calculating key-value integrity protection for users of in-place memtable updates. In particular, the affected users would be those who configure `protection_bytes_per_key > 0` on `WriteBatch` or `WriteOptions`, and configure `inplace_callback != nullptr`.

View file

@ -19,6 +19,7 @@
#include "logging/logging.h" #include "logging/logging.h"
#include "port/port.h" #include "port/port.h"
#include "util/autovector.h" #include "util/autovector.h"
#include "util/defer.h"
namespace ROCKSDB_NAMESPACE { namespace ROCKSDB_NAMESPACE {
@ -252,6 +253,22 @@ void DBImpl::FindObsoleteFiles(JobContext* job_context, bool force,
job_context->blob_delete_files); job_context->blob_delete_files);
} }
// Before potentially releasing mutex and waiting on condvar, increment
// pending_purge_obsolete_files_ so that another thread executing
// `GetSortedWals` will wait until this thread finishes execution since the
// other thread will be waiting for `pending_purge_obsolete_files_`.
// pending_purge_obsolete_files_ MUST be decremented if there is nothing to
// delete.
++pending_purge_obsolete_files_;
Defer cleanup([job_context, this]() {
assert(job_context != nullptr);
if (!job_context->HaveSomethingToDelete()) {
mutex_.AssertHeld();
--pending_purge_obsolete_files_;
}
});
// logs_ is empty when called during recovery, in which case there can't yet // logs_ is empty when called during recovery, in which case there can't yet
// be any tracked obsolete logs // be any tracked obsolete logs
if (!alive_log_files_.empty() && !logs_.empty()) { if (!alive_log_files_.empty() && !logs_.empty()) {
@ -308,9 +325,6 @@ void DBImpl::FindObsoleteFiles(JobContext* job_context, bool force,
job_context->logs_to_free = logs_to_free_; job_context->logs_to_free = logs_to_free_;
job_context->log_recycle_files.assign(log_recycle_files_.begin(), job_context->log_recycle_files.assign(log_recycle_files_.begin(),
log_recycle_files_.end()); log_recycle_files_.end());
if (job_context->HaveSomethingToDelete()) {
++pending_purge_obsolete_files_;
}
logs_to_free_.clear(); logs_to_free_.clear();
} }