mirror of
https://github.com/facebook/rocksdb.git
synced 2024-11-28 05:43:50 +00:00
98393f0139
Summary: Background: there is one active WAL file but there can be several more WAL files in various states. Those other WALs are always in a "flushed" state but could be on the `logs_` list not yet fully synced. We currently allow any WAL that is not the active WAL to be hard-linked when creating a Checkpoint, as although it might still be open for write, we are not appending any more data to it. The problem is that a created Checkpoint is supposed to be fully synced on return of that function, and a hard-linked WAL in the state described above might not be fully synced. (Through some prudence in https://github.com/facebook/rocksdb/issues/10083, it would synced if using track_and_verify_wals_in_manifest=true.) The fix is a step toward a long term goal of removing the need to query the filesystem to determine WAL files and their state. (I consider it dubious any time we independently read from or query metadata from a file we have open for writing, as this makes us more susceptible to FileSystem deficiencies or races.) More specifically: * Detect which WALs might not be fully synced, according to our DBImpl metadata, and prevent hard linking those (with `trim_to_size=true` from `GetLiveFilesStorageInfo()`. And while we're at it, use our known flushed sizes for those WALs. * To avoid a race between that and GetSortedWalFiles(), track a maximum needed WAL number for the Checkpoint/GetLiveFilesStorageInfo. * Because of the level of consistency provided by those two, we no longer need to consider syncing as part of the FlushWAL in GetLiveFilesStorageInfo. (We determine the max WAL number consistent with the manifest file size, while holding DB mutex. Should make track_and_verify_wals_in_manifest happy.) This makes the premise of test PutRaceWithCheckpointTrackedWalSync obsolete (sync point callback no longer hit) so the test is removed, with crash test as backstop for related issues. See https://github.com/facebook/rocksdb/issues/10185 Stacked on https://github.com/facebook/rocksdb/issues/12729 Pull Request resolved: https://github.com/facebook/rocksdb/pull/12731 Test Plan: Expanded an existing test, which now fails before fix. Also long runs of blackbox_crash_test with amplified checkpoint frequency. Reviewed By: cbi42 Differential Revision: D58199629 Pulled By: pdillinger fbshipit-source-id: 376e55f4a2b082cd2adb6408a41209de14422382 |
||
---|---|---|
.. | ||
delete_scheduler.cc | ||
delete_scheduler.h | ||
delete_scheduler_test.cc | ||
file_prefetch_buffer.cc | ||
file_prefetch_buffer.h | ||
file_util.cc | ||
file_util.h | ||
filename.cc | ||
filename.h | ||
line_file_reader.cc | ||
line_file_reader.h | ||
prefetch_test.cc | ||
random_access_file_reader.cc | ||
random_access_file_reader.h | ||
random_access_file_reader_test.cc | ||
read_write_util.cc | ||
read_write_util.h | ||
readahead_file_info.h | ||
readahead_raf.cc | ||
readahead_raf.h | ||
sequence_file_reader.cc | ||
sequence_file_reader.h | ||
sst_file_manager_impl.cc | ||
sst_file_manager_impl.h | ||
writable_file_writer.cc | ||
writable_file_writer.h |