mirror of https://github.com/facebook/rocksdb.git
9ded9f789f
Summary: We saw frequent stress test failures with error messages like: ``` Verification failed for column family 0 key ...: value_from_db: , value_from_expected: ..., msg: GetEntity verification: Value not found: NotFound: ``` One cause for this is that data in WAL is lost after a crash. We initialize FaultInjectionTestFS to be not direct writable when write_fault_injection is enabled (see code change). This can cause the first WAL created during DB open to be lost if a db_stress is killed before the first WAL is synced. This PR initializes FaultInjectionTestFS to be direct writable. Note that FaultInjectionTestFS will be configured propertly for write fault injection after DB open in `RunStressTestImpl()`. So this change should not affect write fault injection coverage. Pull Request resolved: https://github.com/facebook/rocksdb/pull/11958 Test Plan: a repro for the above bug: ``` Simulate crash before first WAL is sealed: --- a/db_stress_tool/db_stress_driver.cc +++ b/db_stress_tool/db_stress_driver.cc @@ -256,6 +256,7 @@ bool RunStressTestImpl(SharedState* shared) { fprintf(stderr, "Verification failed :(\n"); return false; } + exit(1); return true; } ./db_stress --clear_column_family_one_in=0 --column_families=1 --preserve_internal_time_seconds=60 --destroy_db_initially=0 --db=/dev/shm/rocksdb_crashtest_blackbox --db_write_buffer_size=2097152 --destroy_db_initially=0 --expected_values_dir=/dev/shm/rocksdb_crashtest_expected --reopen=0 --test_batches_snapshots=0 --threads=1 --ops_per_thread=100 --write_fault_one_in=1000 --sync_fault_injection=0 ./db_stress_main --clear_column_family_one_in=0 --column_families=1 --preserve_internal_time_seconds=60 --destroy_db_initially=0 --db=/dev/shm/rocksdb_crashtest_blackbox --db_write_buffer_size=2097152 --destroy_db_initially=0 --expected_values_dir=/dev/shm/rocksdb_crashtest_expected --reopen=0 --test_batches_snapshots=0 --sync_fault_injection=1 ``` Reviewed By: akankshamahajan15 Differential Revision: D50300347 Pulled By: cbi42 fbshipit-source-id: 3a4881d72197f5ece82364382a0100912e16c2d6 |
||
---|---|---|
.. | ||
CMakeLists.txt | ||
batched_ops_stress.cc | ||
cf_consistency_stress.cc | ||
db_stress.cc | ||
db_stress_common.cc | ||
db_stress_common.h | ||
db_stress_compaction_filter.h | ||
db_stress_driver.cc | ||
db_stress_driver.h | ||
db_stress_env_wrapper.h | ||
db_stress_gflags.cc | ||
db_stress_listener.cc | ||
db_stress_listener.h | ||
db_stress_shared_state.cc | ||
db_stress_shared_state.h | ||
db_stress_stat.cc | ||
db_stress_stat.h | ||
db_stress_table_properties_collector.h | ||
db_stress_test_base.cc | ||
db_stress_test_base.h | ||
db_stress_tool.cc | ||
db_stress_wide_merge_operator.cc | ||
db_stress_wide_merge_operator.h | ||
expected_state.cc | ||
expected_state.h | ||
expected_value.cc | ||
expected_value.h | ||
multi_ops_txns_stress.cc | ||
multi_ops_txns_stress.h | ||
no_batched_ops_stress.cc |