Fix a failure to propagate ReadOptions (#12757)

Summary:
The crash test revealed a case in which the uncache functionality in ~BlockBasedTableReader could initiate an block read (IO), despite setting ReadOptions::read_tier = kBlockCacheTier.

The root cause is a place in the code where many people have over time decided to opt-in propagating ReadOptions and no one took the initiative to propagate ReadOptions by default (opt out / override only as needed). The fix is in partitioned_index_reader.cc. Here,
ReadOptions::readahead_size is opted-out to avoid churn in prefetch_test that is not clearly an improvement or regression. It's hard to tell given the poor state of relevant documentation https://github.com/facebook/rocksdb/issues/12756. The affected unit test was added in https://github.com/facebook/rocksdb/issues/10602.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12757

Test Plan: (Now postponed to a follow-up diff) I have added some new infrastructure to DEBUG builds to catch this specific kind of violation in unit tests and in the stress/crash test. `EnforceReadOpts` establishes a thread-local context under which we assert no IOs are performed if ReadOptions said it should be forbidden. With this new checking, the Uncache unit test would catch the critical step toward a violation (inner ReadOptions allowing IO, even if no IO is actually performed), which is fixed with the production code change.

Reviewed By: hx235

Differential Revision: D58421526

Pulled By: pdillinger

fbshipit-source-id: 9e9917a0e320c78967e751bd887926a2ed231d37
This commit is contained in:
Peter Dillinger 2024-06-11 21:41:21 -07:00 committed by Facebook GitHub Bot
parent 961468f92e
commit d64eac28d3
1 changed files with 4 additions and 9 deletions

View File

@ -78,15 +78,10 @@ InternalIteratorBase<IndexValue>* PartitionIndexReader::NewIterator(
index_value_is_full(), false /* block_contents_pinned */,
user_defined_timestamps_persisted()));
} else {
ReadOptions ro;
ro.fill_cache = read_options.fill_cache;
ro.deadline = read_options.deadline;
ro.io_timeout = read_options.io_timeout;
ro.adaptive_readahead = read_options.adaptive_readahead;
ro.async_io = read_options.async_io;
ro.rate_limiter_priority = read_options.rate_limiter_priority;
ro.verify_checksums = read_options.verify_checksums;
ro.io_activity = read_options.io_activity;
ReadOptions ro{read_options};
// FIXME? Possible regression seen in prefetch_test if this field is
// propagated
ro.readahead_size = ReadOptions{}.readahead_size;
// We don't return pinned data from index blocks, so no need
// to set `block_contents_pinned`.