rocksdb/db
Zhichao Cao 09a9ec3ac0 Fix the false positive alert of CF consistency check in WAL recovery (#8207)
Summary:
In current RocksDB, in recover the information form WAL, we do the consistency check for each column family when one WAL file is corrupted and PointInTimeRecovery is set. However, it will report a false positive alert on "SST file is ahead of WALs" when one of the CF current log number is greater than the corrupted WAL number (CF contains the data beyond the corrupted WAl) due to a new column family creation during flush. In this case, a new WAL is created (it is empty) during a flush. Also, due to some reason (e.g., storage issue or crash happens before SyncCloseLog is called), the old WAL is corrupted. The new CF has no data, therefore, it does not have the consistency issue.

Fix: when checking cfd->GetLogNumber() > corrupted_wal_number also check cfd->GetLiveSstFilesSize() > 0. So the CFs with no SST file data will skip the check here.

Note potential ignored inconsistency caused due to fix: empty CF can also be caused by write+delete. In this case, after flush, there is no SST files being generated. However, this CF still have the log in the WAL. When the WAL is corrupted, the DB might be inconsistent.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8207

Test Plan: added unit test, make crash_test

Reviewed By: riversand963

Differential Revision: D27898839

Pulled By: zhichao-cao

fbshipit-source-id: 931fc2d8b92dd00b4169bf84b94e712fd688a83e
2021-04-22 10:28:37 -07:00
..
blob Handle blob files when options.best_efforts_recovery is true (#8180) 2021-04-19 11:56:14 -07:00
compaction Compaction should not move data to up level (#8116) 2021-03-29 17:10:42 -07:00
db_impl Fix the false positive alert of CF consistency check in WAL recovery (#8207) 2021-04-22 10:28:37 -07:00
arena_wrapped_db_iter.cc Add blob support to DBIter (#7731) 2020-12-04 21:29:38 -08:00
arena_wrapped_db_iter.h Add blob support to DBIter (#7731) 2020-12-04 21:29:38 -08:00
builder.cc Apply sample_for_compression to all block-based tables (#8105) 2021-03-25 15:00:45 -07:00
builder.h Apply sample_for_compression to all block-based tables (#8105) 2021-03-25 15:00:45 -07:00
c.cc Add Blob Options to C API (#8148) 2021-04-16 05:56:00 -07:00
c_test.c Add Blob Options to C API (#8148) 2021-04-16 05:56:00 -07:00
column_family.cc Assert unlimited max_open_files for FIFO compaction. (#8172) 2021-04-14 12:05:47 -07:00
column_family.h Fix some typos in comments (#8066) 2021-03-25 21:18:08 -07:00
column_family_test.cc Using emplace_back replace push_back (#7568) 2021-01-15 16:56:41 -08:00
compact_files_test.cc Compaction should not move data to up level (#8116) 2021-03-29 17:10:42 -07:00
comparator_db_test.cc Remove unused includes (#7604) 2020-10-28 23:22:27 -07:00
convenience.cc
corruption_test.cc Extend VerifyFileChecksums API for blob files (#7979) 2021-02-22 22:09:22 -08:00
cuckoo_table_db_test.cc Revert "Turn on memtable bloom filter by default. (#6584)" (#7939) 2021-02-06 22:34:30 -08:00
db_basic_test.cc Create a CustomEnv class; Add WinFileSystem; Make LegacyFileSystemWrapper private (#7703) 2021-01-06 10:49:32 -08:00
db_block_cache_test.cc Add further tests to ASSERT_STATUS_CHECKED (2) (#7698) 2020-12-09 21:21:16 -08:00
db_bloom_filter_test.cc Assert unlimited max_open_files for FIFO compaction. (#8172) 2021-04-14 12:05:47 -07:00
db_compaction_filter_test.cc Add more tests to ASSERT_STATUS_CHECKED (3), API change (#7715) 2021-01-06 14:15:02 -08:00
db_compaction_test.cc Use SST file manager to track blob files as well (#8037) 2021-03-17 20:44:49 -07:00
db_dynamic_level_test.cc Add a SystemClock class to capture the time functions of an Env (#7858) 2021-01-25 22:09:11 -08:00
db_encryption_test.cc Improvements to Env::GetChildren (#7819) 2021-01-09 09:44:34 -08:00
db_filesnapshot.cc Fix checkpoint file deletion race with avoid_unnecessary_blocking_io (#7369) 2020-09-10 22:35:25 -07:00
db_flush_test.cc Fix possible hang issue in ~DBImpl() when flush is scheduled in LOW pool (#8125) 2021-03-30 18:35:20 -07:00
db_info_dumper.cc Fix write-ahead log file size overflow (#7870) 2021-01-19 13:47:48 -08:00
db_info_dumper.h
db_inplace_update_test.cc
db_io_failure_test.cc Add more tests to ASSERT_STATUS_CHECKED (3), API change (#7715) 2021-01-06 14:15:02 -08:00
db_iter.cc Fix some typos in comments (#8066) 2021-03-25 21:18:08 -07:00
db_iter.h Fix some typos in comments (#8066) 2021-03-25 21:18:08 -07:00
db_iter_stress_test.cc Add blob support to DBIter (#7731) 2020-12-04 21:29:38 -08:00
db_iter_test.cc Add more tests for assert status checked (#7524) 2020-12-22 23:45:58 -08:00
db_iterator_test.cc Instantiate tests DBIteratorTestForPinnedData (#8051) 2021-03-12 12:31:29 -08:00
db_kv_checksum_test.cc Integrity protection for live updates to WriteBatch (#7748) 2021-01-29 12:18:58 -08:00
db_log_iter_test.cc Add a SystemClock class to capture the time functions of an Env (#7858) 2021-01-25 22:09:11 -08:00
db_logical_block_size_cache_test.cc Add further tests to ASSERT_STATUS_CHECKED (2) (#7698) 2020-12-09 21:21:16 -08:00
db_memtable_test.cc Integrity protection for live updates to WriteBatch (#7748) 2021-01-29 12:18:58 -08:00
db_merge_operand_test.cc Add further tests to ASSERT_STATUS_CHECKED (1) (#7679) 2020-12-08 15:55:04 -08:00
db_merge_operator_test.cc Add further tests to ASSERT_STATUS_CHECKED (1) (#7679) 2020-12-08 15:55:04 -08:00
db_options_test.cc Fix handling of Mutable options; Allow DB::SetOptions to update mutable TableFactory Options (#7936) 2021-02-19 10:29:02 -08:00
db_properties_test.cc Assert unlimited max_open_files for FIFO compaction. (#8172) 2021-04-14 12:05:47 -07:00
db_range_del_test.cc Fix compression dictionary sampling with dedicated range tombstone SSTs (#8141) 2021-04-01 05:08:17 -07:00
db_secondary_test.cc Move a test file to a better location (#8054) 2021-03-15 15:03:27 -07:00
db_sst_test.cc Fix flaky failure in DBSSTest.DBWithSstFileManagerForBlobFilesWithGC (#8196) 2021-04-15 20:18:57 -07:00
db_statistics_test.cc Add more tests for assert status checked (#7524) 2020-12-22 23:45:58 -08:00
db_table_properties_test.cc Add more tests for assert status checked (#7524) 2020-12-22 23:45:58 -08:00
db_tailing_iter_test.cc Add more tests to ASSERT_STATUS_CHECKED (3), API change (#7715) 2021-01-06 14:15:02 -08:00
db_test.cc Fix a data race related to DB properties (#8206) 2021-04-19 16:38:02 -07:00
db_test2.cc Fix the false positive alert of CF consistency check in WAL recovery (#8207) 2021-04-22 10:28:37 -07:00
db_test_util.cc Handle rename() failure in non-local FS (#8192) 2021-04-19 18:11:13 -07:00
db_test_util.h Fix the false positive alert of CF consistency check in WAL recovery (#8207) 2021-04-22 10:28:37 -07:00
db_universal_compaction_test.cc Revert "Turn on memtable bloom filter by default. (#6584)" (#7939) 2021-02-06 22:34:30 -08:00
db_wal_test.cc Fix flush reason attribution (#8150) 2021-04-07 23:18:37 -07:00
db_with_timestamp_basic_test.cc Fix a bug in key comparison when index type is kBinarySearchWithFirstKey (#8062) 2021-03-15 17:44:52 -07:00
db_with_timestamp_compaction_test.cc
db_write_buffer_manager_test.cc Stall writes in WriteBufferManager when memory_usage exceeds buffer_size (#7898) 2021-04-21 13:54:02 -07:00
db_write_test.cc Add more tests to ASSERT_STATUS_CHECKED (4) (#7718) 2020-12-22 15:09:39 -08:00
dbformat.cc Make CompactRange and GetApproximateSizes work with timestamp (#7684) 2020-12-02 13:00:53 -08:00
dbformat.h Fix some typos in comments (#8066) 2021-03-25 21:18:08 -07:00
dbformat_test.cc Remove unused includes (#7604) 2020-10-28 23:22:27 -07:00
deletefile_test.cc Add more tests to ASSERT_STATUS_CHECKED (4) (#7718) 2020-12-22 15:09:39 -08:00
error_handler.cc Add the statistics and info log for Error handler (#8050) 2021-03-17 22:38:13 -07:00
error_handler.h Fix some typos in comments (#8066) 2021-03-25 21:18:08 -07:00
error_handler_fs_test.cc Fix error_handler_fs_test failure due to statistics (#8136) 2021-03-30 21:44:44 -07:00
event_helpers.cc Add sample_for_compression results to table properties (#8139) 2021-03-31 18:21:50 -07:00
event_helpers.h
experimental.cc
external_sst_file_basic_test.cc Fix seqno in ingested file boundary key metadata (#8209) 2021-04-20 14:00:21 -07:00
external_sst_file_ingestion_job.cc Fix seqno in ingested file boundary key metadata (#8209) 2021-04-20 14:00:21 -07:00
external_sst_file_ingestion_job.h Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
external_sst_file_test.cc Introduce a ThreadGuard class and use it in ExternalSSTFileTest.PickedLevelBug (#8112) 2021-03-25 22:08:58 -07:00
fault_injection_test.cc Add more tests for assert status checked (#7524) 2020-12-22 23:45:58 -08:00
file_indexer.cc
file_indexer.h
file_indexer_test.cc
filename_test.cc Remove unused includes (#7604) 2020-10-28 23:22:27 -07:00
flush_job.cc Fix flush reason attribution (#8150) 2021-04-07 23:18:37 -07:00
flush_job.h Use SST file manager to track blob files as well (#8037) 2021-03-17 20:44:49 -07:00
flush_job_test.cc Fix testcase failures on windows (#7992) 2021-02-23 14:35:06 -08:00
flush_scheduler.cc
flush_scheduler.h Include C++ standard library headers instead of C compatibility headers (#8068) 2021-03-19 12:09:47 -07:00
forward_iterator.cc Fix some typos in comments (#8066) 2021-03-25 21:18:08 -07:00
forward_iterator.h
forward_iterator_bench.cc
import_column_family_job.cc Add a SystemClock class to capture the time functions of an Env (#7858) 2021-01-25 22:09:11 -08:00
import_column_family_job.h Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
import_column_family_test.cc No elide constructors (#7798) 2020-12-23 16:55:53 -08:00
internal_stats.cc Fix a data race related to DB properties (#8206) 2021-04-19 16:38:02 -07:00
internal_stats.h Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
job_context.h
kv_checksum.h Integrity protection for live updates to WriteBatch (#7748) 2021-01-29 12:18:58 -08:00
listener_test.cc Add more tests for assert status checked (#7524) 2020-12-22 23:45:58 -08:00
log_format.h
log_reader.cc Fix kPointInTimeRecovery handling of truncated WAL (#7701) 2020-11-30 18:11:38 -08:00
log_reader.h
log_test.cc Make StringEnv, StringSink, StringSource use FS classes (#7786) 2021-01-04 16:01:01 -08:00
log_writer.cc No elide constructors (#7798) 2020-12-23 16:55:53 -08:00
log_writer.h Include C++ standard library headers instead of C compatibility headers (#8068) 2021-03-19 12:09:47 -07:00
logs_with_prep_tracker.cc
logs_with_prep_tracker.h Include C++ standard library headers instead of C compatibility headers (#8068) 2021-03-19 12:09:47 -07:00
lookup_key.h
malloc_stats.cc
malloc_stats.h
manual_compaction_test.cc Add more tests for assert status checked (#7524) 2020-12-22 23:45:58 -08:00
memtable.cc Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
memtable.h Fix some typos in comments (#8066) 2021-03-25 21:18:08 -07:00
memtable_list.cc Fix some typos in comments (#8066) 2021-03-25 21:18:08 -07:00
memtable_list.h Add initial blob support to batched MultiGet (#7766) 2020-12-14 13:48:22 -08:00
memtable_list_test.cc Revert "Turn on memtable bloom filter by default. (#6584)" (#7939) 2021-02-06 22:34:30 -08:00
merge_context.h
merge_helper.cc Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
merge_helper.h Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
merge_helper_test.cc
merge_operator.cc
merge_test.cc MergeHelper::FilterMerge() calling ElapsedNanosSafe() upon exit even … (#7867) 2021-01-21 13:13:02 -08:00
obsolete_files_test.cc Fix a harmless data race affecting two test cases (#8055) 2021-03-12 16:44:35 -08:00
options_file_test.cc No elide constructors (#7798) 2020-12-23 16:55:53 -08:00
output_validator.cc Use NPHash64 in more places (#7632) 2020-11-10 23:42:13 -08:00
output_validator.h Use NPHash64 in more places (#7632) 2020-11-10 23:42:13 -08:00
perf_context_test.cc Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
periodic_work_scheduler.cc Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
periodic_work_scheduler.h Add a SystemClock class to capture the time functions of an Env (#7858) 2021-01-25 22:09:11 -08:00
periodic_work_scheduler_test.cc Add a SystemClock class to capture the time functions of an Env (#7858) 2021-01-25 22:09:11 -08:00
pinned_iterators_manager.h
plain_table_db_test.cc Make StringEnv, StringSink, StringSource use FS classes (#7786) 2021-01-04 16:01:01 -08:00
pre_release_callback.h
prefix_test.cc Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
range_del_aggregator.cc In ParseInternalKey(), include corrupt key info in Status (#7515) 2020-10-28 10:12:58 -07:00
range_del_aggregator.h Fix some typos in comments (#8066) 2021-03-25 21:18:08 -07:00
range_del_aggregator_bench.cc Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
range_del_aggregator_test.cc
range_tombstone_fragmenter.cc Include C++ standard library headers instead of C compatibility headers (#8068) 2021-03-19 12:09:47 -07:00
range_tombstone_fragmenter.h
range_tombstone_fragmenter_test.cc
read_callback.h
repair.cc Apply sample_for_compression to all block-based tables (#8105) 2021-03-25 15:00:45 -07:00
repair_test.cc Use SST file manager to track blob files as well (#8037) 2021-03-17 20:44:49 -07:00
snapshot_checker.h
snapshot_impl.cc
snapshot_impl.h Fix some typos in comments (#8066) 2021-03-25 21:18:08 -07:00
table_cache.cc Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
table_cache.h Fix some typos in comments (#8066) 2021-03-25 21:18:08 -07:00
table_properties_collector.cc Apply sample_for_compression to all block-based tables (#8105) 2021-03-25 15:00:45 -07:00
table_properties_collector.h Apply sample_for_compression to all block-based tables (#8105) 2021-03-25 15:00:45 -07:00
table_properties_collector_test.cc Apply sample_for_compression to all block-based tables (#8105) 2021-03-25 15:00:45 -07:00
transaction_log_impl.cc Add more tests for assert status checked (#7524) 2020-12-22 23:45:58 -08:00
transaction_log_impl.h
trim_history_scheduler.cc
trim_history_scheduler.h
version_builder.cc Handle blob files when options.best_efforts_recovery is true (#8180) 2021-04-19 11:56:14 -07:00
version_builder.h Handle blob files when options.best_efforts_recovery is true (#8180) 2021-04-19 11:56:14 -07:00
version_builder_test.cc Remove unused includes (#7604) 2020-10-28 23:22:27 -07:00
version_edit.cc Make blob related VersionEdit tags unignorable (#7886) 2021-01-20 20:29:04 -08:00
version_edit.h Fix some typos in comments (#8066) 2021-03-25 21:18:08 -07:00
version_edit_handler.cc Ignore comparator name mismatch in ldb manifest dump (#8216) 2021-04-21 20:43:10 -07:00
version_edit_handler.h Ignore comparator name mismatch in ldb manifest dump (#8216) 2021-04-21 20:43:10 -07:00
version_edit_test.cc Make it able to ignore WAL related VersionEdits in older versions (#7873) 2021-01-19 19:27:53 -08:00
version_set.cc Handle rename() failure in non-local FS (#8192) 2021-04-19 18:11:13 -07:00
version_set.h Handle blob files when options.best_efforts_recovery is true (#8180) 2021-04-19 11:56:14 -07:00
version_set_test.cc version_set_test.cc: remove a redundent obj copy (#7880) 2021-04-01 11:28:54 -07:00
wal_edit.cc Always track WAL obsoletion (#7759) 2020-12-09 16:02:12 -08:00
wal_edit.h Always track WAL obsoletion (#7759) 2020-12-09 16:02:12 -08:00
wal_edit_test.cc Always track WAL obsoletion (#7759) 2020-12-09 16:02:12 -08:00
wal_manager.cc Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
wal_manager.h
wal_manager_test.cc Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
write_batch.cc Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
write_batch_base.cc
write_batch_internal.h Integrity protection for live updates to WriteBatch (#7748) 2021-01-29 12:18:58 -08:00
write_batch_test.cc Add further tests to ASSERT_STATUS_CHECKED (1) (#7679) 2020-12-08 15:55:04 -08:00
write_callback.h
write_callback_test.cc Add more tests for assert status checked (#7524) 2020-12-22 23:45:58 -08:00
write_controller.cc Revamp WriteController (#8064) 2021-03-18 09:47:31 -07:00
write_controller.h Revamp WriteController (#8064) 2021-03-18 09:47:31 -07:00
write_controller_test.cc Revamp WriteController (#8064) 2021-03-18 09:47:31 -07:00
write_thread.cc Stall writes in WriteBufferManager when memory_usage exceeds buffer_size (#7898) 2021-04-21 13:54:02 -07:00
write_thread.h Include C++ standard library headers instead of C compatibility headers (#8068) 2021-03-19 12:09:47 -07:00