Do not add unprep_seqs when WriteImpl() fails in unprepared txn (#12927)

Summary:
With the following scenario, we see assertion failure in write_unprepared_txn

1. The write is the first one in the transaction (`log_number_` is not set yet - https://github.com/facebook/rocksdb/blob/main/utilities/transactions/write_unprepared_txn.cc#L376-L379)
2. `WriteToWAL()` fails inside `db_impl_->WriteImpl()` due to fault injection. `last_log_number_`is 0. https://github.com/facebook/rocksdb/blob/main/utilities/transactions/write_unprepared_txn.cc#L386
3. `prepare_batch_cnt_` is still added to `unprep_seqs_` https://github.com/facebook/rocksdb/blob/main/utilities/transactions/write_unprepared_txn.cc#L395-L398
4. When the transaction gets destructed after failed, it expects `log_number_ > 0` if `unprep_seqs_` is not empty - https://github.com/facebook/rocksdb/blob/main/utilities/transactions/write_unprepared_txn.cc#L54-L55.

Step 3 is wrong. `unprep_seqs_` is for the ordered list of unprep sequence numbers that we have already written to the DB. If `db_impl_->WriteImpl()` failed, `unprep_seqs_` shouldn't be set. `prepare_seq` value would be wrong anyway.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12927

Test Plan:
Following command steadily repros the issue. This no longer fails after the fix
```
./db_stress \
  --WAL_size_limit_MB=0 \
  --WAL_ttl_seconds=60 \
  --acquire_snapshot_one_in=10000 \
  --adaptive_readahead=1 \
  --adm_policy=2 \
  --advise_random_on_open=0 \
  --allow_data_in_errors=True \
  --allow_fallocate=1 \
  --async_io=1 \
  --auto_readahead_size=0 \
  --avoid_flush_during_recovery=0 \
  --avoid_flush_during_shutdown=1 \
  --avoid_unnecessary_blocking_io=0 \
  --backup_max_size=104857600 \
  --backup_one_in=100000 \
  --batch_protection_bytes_per_key=8 \
  --bgerror_resume_retry_interval=100 \
  --block_align=1 \
  --block_protection_bytes_per_key=8 \
  --block_size=16384 \
  --bloom_before_level=7 \
  --bloom_bits=4.1280979878467345 \
  --bottommost_compression_type=none \
  --bottommost_file_compaction_delay=600 \
  --bytes_per_sync=262144 \
  --cache_index_and_filter_blocks=1 \
  --cache_index_and_filter_blocks_with_high_priority=0 \
  --cache_size=33554432 \
  --cache_type=lru_cache \
  --charge_compression_dictionary_building_buffer=1 \
  --charge_file_metadata=1 \
  --charge_filter_construction=1 \
  --charge_table_reader=1 \
  --check_multiget_consistency=0 \
  --check_multiget_entity_consistency=1 \
  --checkpoint_one_in=0 \
  --checksum_type=kXXH3 \
  --clear_column_family_one_in=0 \
  --compact_files_one_in=1000000 \
  --compact_range_one_in=1000000 \
  --compaction_pri=4 \
  --compaction_readahead_size=1048576 \
  --compaction_ttl=0 \
  --compress_format_version=1 \
  --compressed_secondary_cache_ratio=0.0 \
  --compressed_secondary_cache_size=0 \
  --compression_checksum=1 \
  --compression_max_dict_buffer_bytes=4294967295 \
  --compression_max_dict_bytes=16384 \
  --compression_parallel_threads=8 \
  --compression_type=none \
  --compression_use_zstd_dict_trainer=1 \
  --compression_zstd_max_train_bytes=0 \
  --continuous_verification_interval=0 \
  --create_timestamped_snapshot_one_in=0 \
  --daily_offpeak_time_utc=04:00-08:00 \
  --data_block_index_type=0 \
  --db=$db \
  --db_write_buffer_size=8388608 \
  --default_temperature=kHot \
  --default_write_temperature=kUnknown \
  --delete_obsolete_files_period_micros=21600000000 \
  --delpercent=5 \
  --delrangepercent=0 \
  --destroy_db_initially=0 \
  --detect_filter_construct_corruption=0 \
  --disable_file_deletions_one_in=10000 \
  --disable_manual_compaction_one_in=10000 \
  --disable_wal=0 \
  --dump_malloc_stats=1 \
  --enable_checksum_handoff=1 \
  --enable_compaction_filter=0 \
  --enable_custom_split_merge=1 \
  --enable_do_not_compress_roles=0 \
  --enable_index_compression=1 \
  --enable_memtable_insert_with_hint_prefix_extractor=0 \
  --enable_pipelined_write=0 \
  --enable_sst_partitioner_factory=0 \
  --enable_thread_tracking=1 \
  --enable_write_thread_adaptive_yield=1 \
  --error_recovery_with_no_fault_injection=1 \
  --exclude_wal_from_write_fault_injection=0 \
  --expected_values_dir=$exp \
  --fail_if_options_file_error=0 \
  --fifo_allow_compaction=1 \
  --file_checksum_impl=crc32c \
  --fill_cache=1 \
  --flush_one_in=1000 \
  --format_version=2 \
  --get_all_column_family_metadata_one_in=1000000 \
  --get_current_wal_file_one_in=0 \
  --get_live_files_apis_one_in=1000000 \
  --get_properties_of_all_tables_one_in=100000 \
  --get_property_one_in=1000000 \
  --get_sorted_wal_files_one_in=0 \
  --hard_pending_compaction_bytes_limit=274877906944 \
  --high_pri_pool_ratio=0 \
  --index_block_restart_interval=15 \
  --index_shortening=0 \
  --index_type=2 \
  --ingest_external_file_one_in=0 \
  --initial_auto_readahead_size=524288 \
  --inplace_update_support=0 \
  --iterpercent=10 \
  --key_len_percent_dist=1,30,69 \
  --key_may_exist_one_in=100000 \
  --kill_random_test=888887 \
  --last_level_temperature=kCold \
  --level_compaction_dynamic_level_bytes=1 \
  --lock_wal_one_in=1000000 \
  --log2_keys_per_lock=10 \
  --log_file_time_to_roll=0 \
  --log_readahead_size=0 \
  --long_running_snapshots=1 \
  --low_pri_pool_ratio=0 \
  --lowest_used_cache_tier=1 \
  --manifest_preallocation_size=5120 \
  --manual_wal_flush_one_in=0 \
  --mark_for_compaction_one_file_in=10 \
  --max_auto_readahead_size=0 \
  --max_background_compactions=20 \
  --max_bytes_for_level_base=10485760 \
  --max_key=100000 \
  --max_key_len=3 \
  --max_log_file_size=1048576 \
  --max_manifest_file_size=1073741824 \
  --max_sequential_skip_in_iterations=2 \
  --max_total_wal_size=0 \
  --max_write_batch_group_size_bytes=16 \
  --max_write_buffer_number=10 \
  --max_write_buffer_size_to_maintain=0 \
  --memtable_insert_hint_per_batch=0 \
  --memtable_max_range_deletions=1000 \
  --memtable_prefix_bloom_size_ratio=0.1 \
  --memtable_protection_bytes_per_key=8 \
  --memtable_whole_key_filtering=1 \
  --memtablerep=skip_list \
  --metadata_charge_policy=0 \
  --metadata_read_fault_one_in=32 \
  --metadata_write_fault_one_in=0 \
  --min_write_buffer_number_to_merge=2 \
  --mmap_read=1 \
  --mock_direct_io=False \
  --nooverwritepercent=1 \
  --num_file_reads_for_auto_readahead=2 \
  --open_files=-1 \
  --open_metadata_read_fault_one_in=0 \
  --open_metadata_write_fault_one_in=0 \
  --open_read_fault_one_in=0 \
  --open_write_fault_one_in=0 \
  --ops_per_thread=20000000 \
  --optimize_filters_for_hits=1 \
  --optimize_filters_for_memory=0 \
  --optimize_multiget_for_io=0 \
  --paranoid_file_checks=1 \
  --partition_filters=0 \
  --partition_pinning=0 \
  --pause_background_one_in=1000000 \
  --periodic_compaction_seconds=1 \
  --prefix_size=5 \
  --prefixpercent=5 \
  --prepopulate_block_cache=0 \
  --preserve_internal_time_seconds=60 \
  --progress_reports=0 \
  --promote_l0_one_in=0 \
  --read_amp_bytes_per_bit=32 \
  --read_fault_one_in=1000 \
  --readahead_size=0 \
  --readpercent=45 \
  --recycle_log_file_num=1 \
  --reopen=20 \
  --report_bg_io_stats=1 \
  --reset_stats_one_in=10000 \
  --sample_for_compression=0 \
  --secondary_cache_fault_one_in=0 \
  --sync=0 \
  --sync_fault_injection=0 \
  --table_cache_numshardbits=6 \
  --target_file_size_base=524288 \
  --target_file_size_multiplier=2 \
  --test_batches_snapshots=0 \
  --top_level_index_pinning=0 \
  --txn_write_policy=2 \
  --uncache_aggressiveness=126 \
  --universal_max_read_amp=4 \
  --unordered_write=0 \
  --unpartitioned_pinning=1 \
  --use_adaptive_mutex=0 \
  --use_adaptive_mutex_lru=1 \
  --use_attribute_group=1 \
  --use_delta_encoding=1 \
  --use_direct_io_for_flush_and_compaction=0 \
  --use_direct_reads=0 \
  --use_full_merge_v1=0 \
  --use_get_entity=0 \
  --use_merge=1 \
  --use_multi_cf_iterator=0 \
  --use_multi_get_entity=0 \
  --use_multiget=1 \
  --use_optimistic_txn=0 \
  --use_put_entity_one_in=0 \
  --use_timed_put_one_in=0 \
  --use_txn=1 \
  --use_write_buffer_manager=0 \
  --user_timestamp_size=0 \
  --value_size_mult=32 \
  --verification_only=0 \
  --verify_checksum=1 \
  --verify_checksum_one_in=1000000 \
  --verify_compression=0 \
  --verify_db_one_in=10000 \
  --verify_file_checksums_one_in=1000000 \
  --verify_iterator_with_expected_state_one_in=5 \
  --verify_sst_unique_id_in_manifest=1 \
  --wal_bytes_per_sync=0 \
  --wal_compression=none \
  --write_buffer_size=4194304 \
  --write_dbid_to_manifest=1 \
  --write_fault_one_in=128 \
  --writepercent=35
```

Reviewed By: cbi42

Differential Revision: D61048774

Pulled By: jaykorean

fbshipit-source-id: 22200d55fd0b22b68732b12516e681a6c6e2c601
This commit is contained in:
Jay Huh 2024-08-15 09:16:29 -07:00 committed by Facebook GitHub Bot
parent f63428bcc7
commit b99baa5ae7
2 changed files with 4 additions and 1 deletions

View file

@ -0,0 +1 @@
Fixed a bug where we set unprep_seqs_ even when WriteImpl() fails. This was caught by stress test write fault injection in WriteImpl(). This may have incorrectly caused iteration creation failure for unvalidated writes or returned wrong result for WriteUnpreparedTxn::GetUnpreparedSequenceNumbers().

View file

@ -395,7 +395,9 @@ Status WriteUnpreparedTxn::FlushWriteBatchToDBInternal(bool prepared) {
// unprep_seqs_ will also contain prepared seqnos since they are treated in // unprep_seqs_ will also contain prepared seqnos since they are treated in
// the same way in the prepare/commit callbacks. See the comment on the // the same way in the prepare/commit callbacks. See the comment on the
// definition of unprep_seqs_. // definition of unprep_seqs_.
unprep_seqs_[prepare_seq] = prepare_batch_cnt_; if (s.ok()) {
unprep_seqs_[prepare_seq] = prepare_batch_cnt_;
}
// Reset transaction state. // Reset transaction state.
if (!prepared) { if (!prepared) {