rocksdb/db/db_impl
Hui Xiao 2f76ab150d Fix missing WAL in new manifest by rolling over the WAL deletion record from prev manifest (#10892)
Summary:
**Context**
`Options::track_and_verify_wals_in_manifest = true` verifies each of the WALs tracked in manifest indeed presents in the WAL folder. If not, a corruption "Missing WAL with log number" will be thrown.

`DB::SyncWAL()` called at a specific timing (i.e, at the `TEST_SYNC_POINT("FindObsoleteFiles::PostMutexUnlock")`) can record in a new manifest the WAL addition of a WAL file that already had a WAL deletion recorded in the previous manifest.
And the WAL deletion record is not rollover-ed to the new manifest. So the new manifest creates the illusion of such WAL never gets deleted and should presents at db re/open.
- Such WAL deletion record can be caused by flushing the memtable associated with that WAL and such WAL deletion can actually happen in` PurgeObsoleteFiles()`.

As a consequence, upon `DB::Reopen()`, this WAL file can be deleted while manifest still has its WAL addition record , which causes a false alarm of corruption "Missing WAL with log number" to be thrown.

**Summary**
This PR fixes this false alarm by rolling over the WAL deletion record from prev manifest to the new manifest by adding the WAL deletion record to the new manifest.

**Test**
- Make check
- Added new unit test `TEST_F(DBWALTest, FixSyncWalOnObseletedWalWithNewManifestCausingMissingWAL)` that failed before the fix and passed after
- [Ongoing]CI stress test + aggressive value as in https://github.com/facebook/rocksdb/pull/10761 , which is how this false alarm was first surfaced, to confirm such false alarm disappears
- [Ongoing]Regular CI stress test to confirm such fix didn't harm anything

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10892

Reviewed By: ajkr

Differential Revision: D40778965

Pulled By: hx235

fbshipit-source-id: a512364bfdeb0b1a55c171890e60d856c528f37f
2022-11-29 14:14:43 -08:00
..
compacted_db_impl.cc
compacted_db_impl.h
db_impl.cc Prevent iterating over range tombstones beyond iterate_upper_bound (#10966) 2022-11-23 14:27:14 -08:00
db_impl.h
db_impl_compaction_flush.cc Revert PR 10777 "Fix FIFO causing overlapping seqnos in L0 files due to overla…" (#10999) 2022-11-29 10:56:42 -08:00
db_impl_debug.cc
db_impl_experimental.cc
db_impl_files.cc Fix missing WAL in new manifest by rolling over the WAL deletion record from prev manifest (#10892) 2022-11-29 14:14:43 -08:00
db_impl_open.cc
db_impl_readonly.cc
db_impl_readonly.h
db_impl_secondary.cc
db_impl_secondary.h
db_impl_write.cc