rocksdb/file
Andrew Kryczka d5d8920f2c Fix race condition with WAL tracking and FlushWAL(true /* sync */) (#10185)
Summary:
`FlushWAL(true /* sync */)` is used internally and for manual WAL sync. It had a bug when used together with `track_and_verify_wals_in_manifest` where the synced size tracked in MANIFEST was larger than the number of bytes actually synced.

The bug could be repro'd almost immediately with the following crash test command: `python3 tools/db_crashtest.py blackbox --simple --write_buffer_size=524288 --max_bytes_for_level_base=2097152 --target_file_size_base=524288 --duration=3600 --interval=10 --sync_fault_injection=1 --disable_wal=0 --checkpoint_one_in=1000 --max_key=10000 --value_size_mult=33`.

An example error message produced by the above command is shown below. The error sometimes arose from the checkpoint and other times arose from the main stress test DB.

```
Corruption: Size mismatch: WAL (log number: 119) in MANIFEST is 27938 bytes , but actually is 27859 bytes on disk.
```

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10185

Test Plan:
- repro unit test
- the above crash test command no longer finds the error. It does find a different error after a while longer such as "Corruption: WAL file 481 required by manifest but not in directory list"

Reviewed By: riversand963

Differential Revision: D37200993

Pulled By: ajkr

fbshipit-source-id: 98e0071c1a89f4d009888512ed89f9219779ae5f
2022-06-17 16:45:28 -07:00
..
delete_scheduler.cc
delete_scheduler.h
delete_scheduler_test.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
file_prefetch_buffer.cc Add few optimizations in async_io for short scans (#10140) 2022-06-15 20:17:35 -07:00
file_prefetch_buffer.h Add few optimizations in async_io for short scans (#10140) 2022-06-15 20:17:35 -07:00
file_util.cc Support read rate-limiting in SequentialFileReader (#9973) 2022-05-24 10:28:57 -07:00
file_util.h Set Read rate limiter priority dynamically and pass it to FS (#9996) 2022-05-18 19:41:44 -07:00
filename.cc Handle "NotSupported" status by default implementation of Close() in … (#10127) 2022-06-07 09:49:31 -07:00
filename.h
line_file_reader.cc Support read rate-limiting in SequentialFileReader (#9973) 2022-05-24 10:28:57 -07:00
line_file_reader.h Support read rate-limiting in SequentialFileReader (#9973) 2022-05-24 10:28:57 -07:00
prefetch_test.cc Add few optimizations in async_io for short scans (#10140) 2022-06-15 20:17:35 -07:00
random_access_file_reader.cc Add rate-limiting support to batched MultiGet() (#10159) 2022-06-17 16:40:47 -07:00
random_access_file_reader.h Set Read rate limiter priority dynamically and pass it to FS (#9996) 2022-05-18 19:41:44 -07:00
random_access_file_reader_test.cc Add rate limiter priority to ReadOptions (#9424) 2022-02-16 23:18:14 -08:00
read_write_util.cc
read_write_util.h
readahead_file_info.h Reuse internal auto readhead_size at each Level (expect L0) for Iterations (#9056) 2021-11-10 16:20:04 -08:00
readahead_raf.cc
readahead_raf.h
sequence_file_reader.cc Support read rate-limiting in SequentialFileReader (#9973) 2022-05-24 10:28:57 -07:00
sequence_file_reader.h Support read rate-limiting in SequentialFileReader (#9973) 2022-05-24 10:28:57 -07:00
sst_file_manager_impl.cc Fix race condition in SstFileManagerImpl error recovery code (#9435) 2022-01-25 23:22:58 -08:00
sst_file_manager_impl.h Fix typo about file/sst_file_manager_impl.h (#9799) 2022-04-04 09:57:33 -07:00
writable_file_writer.cc Fix race condition with WAL tracking and FlushWAL(true /* sync */) (#10185) 2022-06-17 16:45:28 -07:00
writable_file_writer.h Fix race condition with WAL tracking and FlushWAL(true /* sync */) (#10185) 2022-06-17 16:45:28 -07:00