rocksdb/db/db_impl
Yanqin Jin 717749f4c0 Fail point-in-time WAL recovery upon IOError reading WAL (#6963)
Summary:
If `options.wal_recovery_mode == WALRecoveryMode::kPointInTimeRecovery`, RocksDB stops replaying WAL once hitting an error and discards the rest of the WAL. This can lead to data loss if the error occurs at an offset smaller than the last sync'ed offset.
Ideally, RocksDB point-in-time recovery should permit recovery if the error occurs after last synced offset while fail recovery if error occurs before the last synced offset. However, RocksDB does not track the synced offset of WALs. Consequently, RocksDB does not know whether an error occurs before or after the last synced offset. An error can be one of the following.
- WAL record checksum mismatch. This can result from both corruption of synced data and dropping of unsynced data during shutdown. We cannot be sure which one. In order not to defeat the original motivation to permit the latter case, we keep the original behavior of point-in-time WAL recovery.
- IOError. This means the WAL can be bad, an indicator of whole file becoming unavailable, not to mention synced part of the WAL. Therefore, we choose to modify the behavior of point-in-time recovery and fail the database recovery.

Test plan (devserver):
make check
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6963

Reviewed By: ajkr

Differential Revision: D22011083

Pulled By: riversand963

fbshipit-source-id: f9cbf29a37dc5cc40d3fa62f89eed1ad67ca1536
2020-06-11 18:42:10 -07:00
..
db_impl.cc Ingest SST files with checksum information (#6891) 2020-06-11 14:27:36 -07:00
db_impl.h Find/purge obsolete blob files (#6807) 2020-05-07 09:32:51 -07:00
db_impl_compaction_flush.cc Fix a typo (bug) when setting error during Flush (#6928) 2020-06-04 08:30:42 -07:00
db_impl_debug.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
db_impl_experimental.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
db_impl_files.cc Fix a few bugs in best-efforts recovery (#6824) 2020-05-08 13:01:42 -07:00
db_impl_open.cc Fail point-in-time WAL recovery upon IOError reading WAL (#6963) 2020-06-11 18:42:10 -07:00
db_impl_readonly.cc API change: DB::OpenForReadOnly will not write to the file system unless create_if_missing is true (#6900) 2020-06-03 18:57:49 -07:00
db_impl_readonly.h API change: DB::OpenForReadOnly will not write to the file system unless create_if_missing is true (#6900) 2020-06-03 18:57:49 -07:00
db_impl_secondary.cc Properly report IO errors when IndexType::kBinarySearchWithFirstKey is used (#6621) 2020-04-15 17:40:44 -07:00
db_impl_secondary.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
db_impl_write.cc Add timestamp to delete (#6253) 2020-05-28 10:40:03 -07:00
db_secondary_test.cc Do not swallow error returned from SaveTo() (#6801) 2020-05-05 10:46:20 -07:00