rocksdb/db_stress_tool
Hui Xiao 56f7ef50d7 Fix nullptr access and race to fault_fs_guard (#12799)
Summary:
**Context/Summary:**

There are a couple of places where we forgot to check fault_fs_guard before accessing it, so we occasionally see something like this:

```
==138831==Hint: address points to the zero page.
SCARINESS: 10 (null-deref)
AddressSanitizer:DEADLYSIGNAL
    #0 0x18b9e0b in rocksdb::ThreadLocalPtr::Get() const fbcode/internal_repo_rocksdb/repo/util/thread_local.cc:503
    #1 0x83d8b7 in rocksdb::StressTest::TestCompactRange(rocksdb::ThreadState*, long, rocksdb::Slice const&, rocksdb::ColumnFamilyHandle*) fbcode/internal_repo_rocksdb/repo/utilities/fault_injection_fs.h
```
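
A minimal sketch of the kind of guard check the null dereference above calls for, assuming the guard may be left unset when fault injection is not configured. `FakeFaultFs`, `fake_fault_fs_guard`, and `DisableInjectionIfEnabled` are hypothetical stand-ins for illustration, not the names or code used in db_stress:

```cpp
#include <memory>

// Hypothetical stand-in for the fault-injection file system wrapper.
struct FakeFaultFs {
  int disable_calls = 0;
  void DisableInjection() { ++disable_calls; }
};

// Hypothetical stand-in for the global guard. Assumption: it remains null
// whenever fault injection has not been set up.
std::shared_ptr<FakeFaultFs> fake_fault_fs_guard;

// Touches the guard only after confirming it is non-null.
// Returns true if the call went through, false if the guard was unset.
bool DisableInjectionIfEnabled() {
  if (!fake_fault_fs_guard) {
    return false;  // nothing to do; avoids the null dereference
  }
  fake_fault_fs_guard->DisableInjection();
  return true;
}
```

With the guard unset, `DisableInjectionIfEnabled()` returns `false` instead of crashing; once the guard is assigned, the call is forwarded.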
Also, access to `io_activties_exempted_from_fault_injection.find` is not fully synchronized, so we see the following:
```
WARNING: ThreadSanitizer: data race (pid=90939)
  Write of size 8 at 0x7b4c000004d0 by thread T762 (mutexes: write M0):
    #0 std::_Rb_tree<rocksdb::Env::IOActivity, rocksdb::Env::IOActivity, std::_Identity<rocksdb::Env::IOActivity>, std::less<rocksdb::Env::IOActivity>, std::allocator<rocksdb::Env::IOActivity>>::operator=(std::_Rb_tree<rocksdb::Env::IOActivity, rocksdb::Env::IOActivity, std::_Identity<rocksdb::Env::IOActivity>, std::less<rocksdb::Env::IOActivity>, std::allocator<rocksdb::Env::IOActivity>> const&) fbcode/third-party-buck/platform010/build/libgcc/include/c++/trunk/bits/stl_tree.h:208 (db_stress+0x411c32) (BuildId: b803e5aca22c6b080defed8e85b7bfec)
    #1 rocksdb::DbStressListener::OnErrorRecoveryCompleted(rocksdb::Status) fbcode/third-party-buck/platform010/build/libgcc/include/c++/trunk/bits/stl_set.h:298 (db_stress+0x4112e5) (BuildId: b803e5aca22c6b080defed8e85b7bfec)
    #2 rocksdb::EventHelpers::NotifyOnErrorRecoveryEnd(std::vector<std::shared_ptr<rocksdb::EventListener>, std::allocator<std::shared_ptr<rocksdb::EventListener>>> const&, rocksdb::Status const&, rocksdb::Status const&, rocksdb::InstrumentedMutex*) fbcode/internal_repo_rocksdb/repo/db/event_helpers.cc:239 (db_stress+0xa09d60) (BuildId: b803e5aca22c6b080defed8e85b7bfec)

  Previous read of size 8 at 0x7b4c000004d0 by thread T131 (mutexes: write M1):
    #0 rocksdb::FaultInjectionTestFS::MaybeInjectThreadLocalError(rocksdb::FaultInjectionIOType, rocksdb::IOOptions const&, rocksdb::FaultInjectionTestFS::ErrorOperation, rocksdb::Slice*, bool, char*, bool, bool*) fbcode/third-party-buck/platform010/build/libgcc/include/c++/trunk/bits/stl_tree.h:798 (db_stress+0xf7d0f3) (BuildId: b803e5aca22c6b080defed8e85b7bfec)
```
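
In the report above, the writer and the reader hold different mutexes (M0 and M1), so the set assignment and the `find` are not ordered with respect to each other. One common way to close such a race is to route both through a single lock. The sketch below illustrates that idea only; `Activity` and `ExemptedActivities` are hypothetical names, and this is not necessarily how the PR fixes it:

```cpp
#include <mutex>
#include <set>

// Hypothetical stand-in for an IO activity enum.
enum class Activity { kFlush, kCompaction };

// Wraps the set so that every read and write takes the same mutex.
class ExemptedActivities {
 public:
  // Replaces the whole set under the lock (the "write" side of the race).
  void Set(const std::set<Activity>& activities) {
    std::lock_guard<std::mutex> lock(mu_);
    exempted_ = activities;
  }

  // Looks up one activity under the same lock (the "read" side).
  bool Contains(Activity a) const {
    std::lock_guard<std::mutex> lock(mu_);
    return exempted_.find(a) != exempted_.end();
  }

 private:
  mutable std::mutex mu_;  // mutable so const readers can lock it
  std::set<Activity> exempted_;
};
```

Because `Set` and `Contains` share `mu_`, a lookup can never observe the tree in the middle of a reassignment.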

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12799

Test Plan: CI

Reviewed By: jowlyzhang

Differential Revision: D58917449

Pulled By: hx235

fbshipit-source-id: f24fc1acc2a7d91f9f285447a97ba41397f48dbd
2024-06-24 16:10:36 -07:00
batched_ops_stress.cc Add experimental range filters to stress/crash test (#12769) 2024-06-18 16:16:09 -07:00
cf_consistency_stress.cc Fix nullptr access and race to fault_fs_guard (#12799) 2024-06-24 16:10:36 -07:00
CMakeLists.txt Add experimental range filters to stress/crash test (#12769) 2024-06-18 16:16:09 -07:00
db_stress.cc Disable tiered storage + BlobDB stress test (#10699) 2022-09-19 15:39:31 -07:00
db_stress_common.cc Fix nullptr access and race to fault_fs_guard (#12799) 2024-06-24 16:10:36 -07:00
db_stress_common.h Inject more errors to more files in stress test (#12713) 2024-06-19 08:42:00 -07:00
db_stress_compaction_filter.h Enable compaction filter for db_stress with user-defined timestamp (#10259) 2022-06-27 11:53:09 -07:00
db_stress_driver.cc Inject more errors to more files in stress test (#12713) 2024-06-19 08:42:00 -07:00
db_stress_driver.h Remove unnecessary, confusing 'extern' (#12300) 2024-01-29 10:38:08 -08:00
db_stress_env_wrapper.h Change ReadAsync callback API to remove const from FSReadRequest (#11649) 2024-02-16 09:14:55 -08:00
db_stress_filters.cc Add experimental range filters to stress/crash test (#12769) 2024-06-18 16:16:09 -07:00
db_stress_filters.h Add experimental range filters to stress/crash test (#12769) 2024-06-18 16:16:09 -07:00
db_stress_gflags.cc Inject more errors to more files in stress test (#12713) 2024-06-19 08:42:00 -07:00
db_stress_listener.cc Fix/improve temperature handling for file ingestion (#12402) 2024-03-05 16:56:08 -08:00
db_stress_listener.h Inject more errors to more files in stress test (#12713) 2024-06-19 08:42:00 -07:00
db_stress_shared_state.cc Remove ROCKSDB_SUPPORT_THREAD_LOCAL define because it's a part of C++11 (#10015) 2022-05-18 15:25:19 -07:00
db_stress_shared_state.h Inject more errors to more files in stress test (#12713) 2024-06-19 08:42:00 -07:00
db_stress_stat.cc Fix Statistics in db_stress (#9260) 2021-12-07 16:24:22 -08:00
db_stress_stat.h Fix Statistics in db_stress (#9260) 2021-12-07 16:24:22 -08:00
db_stress_table_properties_collector.h Remove 'virtual' when implied by 'override' (#12319) 2024-01-31 13:14:42 -08:00
db_stress_test_base.cc Fix nullptr access and race to fault_fs_guard (#12799) 2024-06-24 16:10:36 -07:00
db_stress_test_base.h Inject more errors to more files in stress test (#12713) 2024-06-19 08:42:00 -07:00
db_stress_tool.cc Inject more errors to more files in stress test (#12713) 2024-06-19 08:42:00 -07:00
db_stress_wide_merge_operator.cc Add the wide-column aware merge API to the stress tests (#11906) 2023-09-29 08:54:50 -07:00
db_stress_wide_merge_operator.h Add the wide-column aware merge API to the stress tests (#11906) 2023-09-29 08:54:50 -07:00
expected_state.cc Add TimedPut to stress test (#12559) 2024-04-30 15:40:35 -07:00
expected_state.h Improve comment of ExpectedValue in db stress (#11456) 2023-05-18 09:44:15 -07:00
expected_value.cc Improve comment of ExpectedValue in db stress (#11456) 2023-05-18 09:44:15 -07:00
expected_value.h Ensures PendingExpectedValue either Commit or Rollback (#12244) 2024-01-24 11:04:40 -08:00
multi_ops_txns_stress.cc Add experimental range filters to stress/crash test (#12769) 2024-06-18 16:16:09 -07:00
multi_ops_txns_stress.h Disable AttributeGroup in multiops txn test (#12781) 2024-06-18 16:05:18 -07:00
no_batched_ops_stress.cc Fix nullptr access and race to fault_fs_guard (#12799) 2024-06-24 16:10:36 -07:00