rocksdb/file
Akanksha Mahajan 33f8a08af2 Fix some errors in async prefetching in FilePrefetchBuffer (#9734)
Summary:
In ReadOption `async_io` which prefetches the data asynchronously, db_bench and db_stress runs were failing  because wrong data was prefetched which resulted in Error: Checksum mismatched. Wrong data was copied because capacity was less than actual size needed. It has been fixed in this PR.

Since there are two separate methods for async and sync prefetching, these changes are in async prefetching methods and any changes would not effect normal prefetching. I ran the regressions to make sure normal prefetching is fine.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9734

Test Plan:
1. CircleCI jobs

2.  Ran db_bench
```
. /db_bench -use_existing_db=true
-db=/tmp/prefix_scan_prefetch_main -benchmarks="seekrandom" -key_size=32
-value_size=512 -num=5000000 -use_direct_reads=true -seek_nexts=327680
-duration=120 -ops_between_duration_checks=1 -async_io=1 -adaptive_readahead=1

```
3. Ran db_stress test
```
export CRASH_TEST_EXT_ARGS=" --async_io=1 --adaptive_readahead=1"
make crash_test -j
```

4. Run regressions for async_io disabled.

Old flow without any async changes:
```
./db_bench -use_existing_db=true -db=/tmp/prefix_scan_prefetch_main -benchmarks="seekrandom" -key_size=32 -value_size=512 -num=5000000 -use_direct_reads=true -seek_nexts=327680 -duration=120 -ops_between_duration_checks=1
Initializing RocksDB Options from the specified file
Initializing RocksDB Options from command-line flags
RocksDB:    version 7.0
Date:       Thu Mar 17 13:11:34 2022
CPU:        24 * Intel Core Processor (Broadwell)
CPUCache:   16384 KB
Keys:       32 bytes each (+ 0 bytes user-defined timestamp)
Values:     512 bytes each (256 bytes after compression)
Entries:    5000000
Prefix:    0 bytes
Keys per prefix:    0
RawSize:    2594.0 MB (estimated)
FileSize:   1373.3 MB (estimated)
Write rate: 0 bytes/second
Read rate: 0 ops/second
Compression: Snappy
Compression sampling rate: 0
Memtablerep: SkipListFactory
Perf Level: 1
------------------------------------------------
DB path: [/tmp/prefix_scan_prefetch_main]
seekrandom   :  483618.390 micros/op 2 ops/sec;  338.9 MB/s (249 of 249 found)
```

With async prefetching changes and async_io disabled to make sure in normal prefetching there is no regression.
 ```
 ./db_bench -use_existing_db=true -db=/tmp/prefix_scan_prefetch_main -benchmarks="seekrandom" -key_size=32 -value_size=512 -num=5000000 -use_direct_reads=true -seek_nexts=327680 -duration=120 -ops_between_duration_checks=1 --async_io=0
Initializing RocksDB Options from the specified file
Initializing RocksDB Options from command-line flags
RocksDB:    version 7.1
Date:       Wed Mar 23 15:56:37 2022
CPU:        24 * Intel Core Processor (Broadwell)
CPUCache:   16384 KB
Keys:       32 bytes each (+ 0 bytes user-defined timestamp)
Values:     512 bytes each (256 bytes after compression)
Entries:    5000000
Prefix:    0 bytes
Keys per prefix:    0
RawSize:    2594.0 MB (estimated)
FileSize:   1373.3 MB (estimated)
Write rate: 0 bytes/second
Read rate: 0 ops/second
Compression: Snappy
Compression sampling rate: 0
Memtablerep: SkipListFactory
Perf Level: 1
------------------------------------------------
DB path: [/tmp/prefix_scan_prefetch_main]
seekrandom   :  481819.816 micros/op 2 ops/sec;  340.2 MB/s (250 of 250 found)
```

Reviewed By: riversand963

Differential Revision: D35058471

Pulled By: akankshamahajan15

fbshipit-source-id: 9233a1e6d97cea0c7a8111bfb9e8ac3251c341ce
2022-03-25 18:26:22 -07:00
..
delete_scheduler.cc Skip directory fsync for filesystem btrfs (#8903) 2021-11-03 12:21:27 -07:00
delete_scheduler.h Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
delete_scheduler_test.cc Deflake DeleteSchedulerTest.StartBGEmptyTrashMultipleTimes (#9706) 2022-03-17 13:30:28 -07:00
file_prefetch_buffer.cc Fix some errors in async prefetching in FilePrefetchBuffer (#9734) 2022-03-25 18:26:22 -07:00
file_prefetch_buffer.h Provide implementation to prefetch data asynchronously in FilePrefetchBuffer (#9674) 2022-03-21 07:12:43 -07:00
file_util.cc Avoid usage of ReopenWritableFile in db_stress (#9649) 2022-03-04 10:30:10 -08:00
file_util.h Avoid usage of ReopenWritableFile in db_stress (#9649) 2022-03-04 10:30:10 -08:00
filename.cc Skip directory fsync for filesystem btrfs (#8903) 2021-11-03 12:21:27 -07:00
filename.h Add (Live)FileStorageInfo API (#8968) 2021-10-16 10:04:32 -07:00
line_file_reader.cc Replace Status with IOStatus in the backupable_db (#8820) 2021-09-15 15:09:48 -07:00
line_file_reader.h Replace Status with IOStatus in the backupable_db (#8820) 2021-09-15 15:09:48 -07:00
prefetch_test.cc Provide implementation to prefetch data asynchronously in FilePrefetchBuffer (#9674) 2022-03-21 07:12:43 -07:00
random_access_file_reader.cc Provide implementation to prefetch data asynchronously in FilePrefetchBuffer (#9674) 2022-03-21 07:12:43 -07:00
random_access_file_reader.h Provide implementation to prefetch data asynchronously in FilePrefetchBuffer (#9674) 2022-03-21 07:12:43 -07:00
random_access_file_reader_test.cc Add rate limiter priority to ReadOptions (#9424) 2022-02-16 23:18:14 -08:00
read_write_util.cc Move old files to warm tier in FIFO compactions (#8310) 2021-08-09 12:51:14 -07:00
read_write_util.h Refactor: add LineFileReader and Status::MustCheck (#8026) 2021-03-09 20:12:38 -08:00
readahead_file_info.h Reuse internal auto readhead_size at each Level (expect L0) for Iterations (#9056) 2021-11-10 16:20:04 -08:00
readahead_raf.cc Make StringEnv, StringSink, StringSource use FS classes (#7786) 2021-01-04 16:01:01 -08:00
readahead_raf.h Make StringEnv, StringSink, StringSource use FS classes (#7786) 2021-01-04 16:01:01 -08:00
sequence_file_reader.cc Add file operation callbacks to SequentialFileReader (#8982) 2021-10-05 10:51:59 -07:00
sequence_file_reader.h Add listener API that notifies on IOError (#9177) 2021-11-18 17:11:19 -08:00
sst_file_manager_impl.cc Fix race condition in SstFileManagerImpl error recovery code (#9435) 2022-01-25 23:22:58 -08:00
sst_file_manager_impl.h Use SST file manager to track blob files as well (#8037) 2021-03-17 20:44:49 -07:00
writable_file_writer.cc Rate-limit automatic WAL flush after each user write (#9607) 2022-03-08 13:19:39 -08:00
writable_file_writer.h Rate-limit automatic WAL flush after each user write (#9607) 2022-03-08 13:19:39 -08:00