rocksdb/db/db_impl
Peter Dillinger 98393f0139 Fix Checkpoint hard link of inactive but unsynced WAL (#12731)
Summary:
Background: there is one active WAL file but there can be
several more WAL files in various states. Those other WALs are always
in a "flushed" state but could be on the `logs_` list not yet fully
synced. We currently allow any WAL that is not the active WAL to be
hard-linked when creating a Checkpoint, as although it might still be
open for write, we are not appending any more data to it.

The problem is that a created Checkpoint is supposed to be fully synced
on return of that function, and a hard-linked WAL in the state described
above might not be fully synced. (Through some prudence in https://github.com/facebook/rocksdb/issues/10083,
it would synced if using track_and_verify_wals_in_manifest=true.)

The fix is a step toward a long term goal of removing the need to query
the filesystem to determine WAL files and their state. (I consider it
dubious any time we independently read from or query metadata from a
file we have open for writing, as this makes us more susceptible to
FileSystem deficiencies or races.) More specifically:
* Detect which WALs might not be fully synced, according to our DBImpl
  metadata, and prevent hard linking those (with `trim_to_size=true`
  from `GetLiveFilesStorageInfo()`. And while we're at it, use our known
  flushed sizes for those WALs.
* To avoid a race between that and GetSortedWalFiles(), track a maximum
  needed WAL number for the Checkpoint/GetLiveFilesStorageInfo.
* Because of the level of consistency provided by those two, we no
  longer need to consider syncing as part of the FlushWAL in
  GetLiveFilesStorageInfo. (We determine the max WAL number consistent
  with the manifest file size, while holding DB mutex. Should make
  track_and_verify_wals_in_manifest happy.) This makes the premise of
  test PutRaceWithCheckpointTrackedWalSync obsolete (sync point callback
  no longer hit) so the test is removed, with crash test as backstop for
  related issues. See https://github.com/facebook/rocksdb/issues/10185

Stacked on https://github.com/facebook/rocksdb/issues/12729

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12731

Test Plan:
Expanded an existing test, which now fails before fix.
Also long runs of blackbox_crash_test with amplified checkpoint frequency.

Reviewed By: cbi42

Differential Revision: D58199629

Pulled By: pdillinger

fbshipit-source-id: 376e55f4a2b082cd2adb6408a41209de14422382
2024-06-05 17:40:09 -07:00
..
compacted_db_impl.cc Run internal cpp modernizer on RocksDB repo (#12398) 2024-03-04 10:08:32 -08:00
compacted_db_impl.h Deprecate some variants of Get and MultiGet (#12327) 2024-02-16 09:21:06 -08:00
db_impl.cc Fix Checkpoint hard link of inactive but unsynced WAL (#12731) 2024-06-05 17:40:09 -07:00
db_impl.h Fix Checkpoint hard link of inactive but unsynced WAL (#12731) 2024-06-05 17:40:09 -07:00
db_impl_compaction_flush.cc Refactor SyncWAL and SyncClosedLogs for code sharing (#12707) 2024-05-30 14:53:13 -07:00
db_impl_debug.cc Prefer static_cast in place of most reinterpret_cast (#12308) 2024-02-07 10:44:11 -08:00
db_impl_experimental.cc Run internal cpp modernizer on RocksDB repo (#12398) 2024-03-04 10:08:32 -08:00
db_impl_files.cc Fix Checkpoint hard link of inactive but unsynced WAL (#12731) 2024-06-05 17:40:09 -07:00
db_impl_follower.cc Add LDB command and option for follower instances (#12682) 2024-05-28 23:21:32 -07:00
db_impl_follower.h Implement obsolete file deletion (GC) in follower (#12657) 2024-05-17 19:13:33 -07:00
db_impl_open.cc Fix delete obsolete files on recovery not rate limited (#12590) 2024-05-01 12:26:54 -07:00
db_impl_readonly.cc Run internal cpp modernizer on RocksDB repo (#12398) 2024-03-04 10:08:32 -08:00
db_impl_readonly.h Remove the force mode for EnableFileDeletions API (#12337) 2024-02-13 18:36:25 -08:00
db_impl_secondary.cc Implement obsolete file deletion (GC) in follower (#12657) 2024-05-17 19:13:33 -07:00
db_impl_secondary.h Basic RocksDB follower implementation (#12540) 2024-04-19 19:13:31 -07:00
db_impl_write.cc Add public API WriteWithCallback to support custom callbacks (#12603) 2024-05-31 19:30:19 -07:00