Commit graph

12280 commits

Author SHA1 Message Date
Levi Tamasi 51b3b7e08c Remove a now-unnecessary WideColumnSerialization::Serialize variant (#11864)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/11864

Reviewed By: jaykorean

Differential Revision: D49445163

fbshipit-source-id: a1275112b9f87a3652df5f155bd215db5819326c
2023-09-20 08:04:35 -07:00
Levi Tamasi f42e70bf56 Integrate FullMergeV3 into the query and compaction paths (#11858)
Summary:
Pull Request resolved: https://github.com/facebook/rocksdb/pull/11858

The patch builds on https://github.com/facebook/rocksdb/pull/11807 and integrates the `FullMergeV3` API into the read and compaction code paths by updating and extending the logic in `MergeHelper`.

In particular, when it comes to merge inputs, the existing `TimedFullMergeWithEntity` is folded into `TimedFullMerge`, since wide-column base values are now handled the same way as plain base values (or no base values for that matter), e.g. they are passed directly to the `MergeOperator`. On the other hand, there is some new differentiation on the output side. Namely, there are now two sets of `TimedFullMerge` variants: one set for contexts where the complete merge result and its value type are needed (used by iterators and compactions), and another set where the merge result is needed in a form determined by the client (used by the point lookup APIs, where e.g. for `Get` we have to extract the value of the default column of any wide-column results).

Implementation-wise, the two sets of overloads use different visitors to process the `std::variant` produced by `FullMergeV3`. This has the benefit of eliminating some repeated code e.g. in the point lookup paths, since `TimedFullMerge` now populates the application's result object (`PinnableSlice`/`string` or `PinnableWideColumns`) directly. Moreover, within each set of variants, there is a separate overload for the no base value/plain base value/wide-column base value cases, which eliminates some repeated branching w/r/t to the type of the base value if any.

Reviewed By: jaykorean

Differential Revision: D49352562

fbshipit-source-id: c2fb9853dba3fbbc6918665bde4195c4ea150a0c
2023-09-19 17:27:04 -07:00
Changyu Bi cfe0e0b037 Initialize FaultInjectionTestFS DirectWritable field (#11862)
Summary:
FaultInjectionTestFS is not directly writable by default. Should set it to direct writable if there is no write fault injection.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11862

Test Plan: internal stress test failure reduces.

Reviewed By: jaykorean

Differential Revision: D49428108

Pulled By: cbi42

fbshipit-source-id: 5dfa1fbb454272a14f8228a5c496d480d7138ef1
2023-09-19 12:23:38 -07:00
Peter Dillinger 44e4ffd60d Release note for #11738 AutoHCC (#11855)
Summary:
Neglected in original PR

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11855

Test Plan: no code change

Reviewed By: jowlyzhang

Differential Revision: D49424759

Pulled By: pdillinger

fbshipit-source-id: 9a2ef2c0992a65f2a8f04dce9cefe978f8087407
2023-09-19 11:35:43 -07:00
Changyu Bi ba5897ada8 Fix stress test failure due to write fault injections and disable write fault injection (#11859)
Summary:
This PR contains two fixes:

1. disable write fault injection since it caused several other kinds of internal stress test failures. I'll try to fix those separately before enabling it again.
2. Fix segfault like
```
https://github.com/facebook/rocksdb/issues/5  0x000000000083dc43 in rocksdb::port::Mutex::Lock (this=0x30) at internal_repo_rocksdb/repo/port/port_posix.cc:80
80	internal_repo_rocksdb/repo/port/port_posix.cc: No such file or directory.
https://github.com/facebook/rocksdb/issues/6  0x0000000000465142 in rocksdb::MutexLock::MutexLock (mu=0x30, this=<optimized out>) at internal_repo_rocksdb/repo/util/mutexlock.h:37
37	internal_repo_rocksdb/repo/util/mutexlock.h: No such file or directory.
https://github.com/facebook/rocksdb/issues/7  rocksdb::FaultInjectionTestFS::DisableWriteErrorInjection (this=0x0) at internal_repo_rocksdb/repo/utilities/fault_injection_fs.h:505
505	internal_repo_rocksdb/repo/utilities/fault_injection_fs.h: No such file or directory.
```

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11859

Test Plan: db_stress with no fault injection: `./db_stress --write_fault_one_in=0 --read_fault_one_in=0 --open_metadata_write_fault_one_in=0 --open_read_fault_one_in=0 --open_write_fault_one_in=0 --sync_fault_injection=0`

Reviewed By: jaykorean

Differential Revision: D49408247

Pulled By: cbi42

fbshipit-source-id: 0ca01f20e6e81bf52af77818b50d562ef7462165
2023-09-19 08:33:05 -07:00
Jay Huh 20dbf51247 DB Stress Fix - Commit pending value after checking for status (#11856)
Summary:
We've seen occasional crash test failure in optimistic transaction DB with the following error message.

```
stderr:
WARNING: prefix_size is non-zero but memtablerep != prefix_hash
Verification failed for column family 0 key 0000000000001EDB0000000000000036787878787878 (789064): value_from_db: 010000000504070609080B0A0D0C0F0E111013121514171619181B1A1D1C1F1E212023222524272629282B2A2D2C2F2E313033323534373639383B3A3D3C3F3E, value_from_expected: , msg: MultiGet verification: Unexpected value found
Crash-recovery verification failed :(
No writes or ops?
Verification failed :(
```

There was a possibility if the db stress crashes after `pending_expected_value.Commit()` but before `thread->shared->SafeTerminate();`, we may have expected value committed while actual DB value was not.

Moving the `pending_expected_value.Commit()` after `s.ok()` check to fix the test.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11856

Test Plan:
Ran the following in a script with while loop. (It doesn't always repro the issue even without this fix, though..)
```
./db_stress --acquire_snapshot_one_in=10000 --adaptive_readahead=0 --allow_data_in_errors=True --async_io=1 --atomic_flush=1 --auto_readahead_size=1 --avoid_flush_during_recovery=0 --avoid_unnecessary_blocking_io=1 --backup_max_size=104857600 --backup_one_in=100000 --batch_protection_bytes_per_key=0 --block_protection_bytes_per_key=0 --block_size=16384 --bloom_before_level=8 --bloom_bits=11 --bottommost_compression_type=lz4 --bottommost_file_compaction_delay=0 --bytes_per_sync=0 --cache_index_and_filter_blocks=0 --cache_size=8388608 --cache_type=auto_hyper_clock_cache --charge_compression_dictionary_building_buffer=0 --charge_file_metadata=0 --charge_filter_construction=0 --charge_table_reader=0 --checkpoint_one_in=1000000 --checksum_type=kxxHash --clear_column_family_one_in=0 --compact_files_one_in=1000000 --compact_range_one_in=1000000 --compaction_pri=4 --compaction_readahead_size=0 --compaction_ttl=100 --compression_checksum=0 --compression_max_dict_buffer_bytes=2097151 --compression_max_dict_bytes=16384 --compression_parallel_threads=1 --compression_type=lz4 --compression_use_zstd_dict_trainer=0 --compression_zstd_max_train_bytes=0 --continuous_verification_interval=0 --data_block_index_type=0 --db=/dev/shm/rocksdb_test/rocksdb_crashtest_blackbox --db_write_buffer_size=1048576 --delpercent=5 --delrangepercent=0 --destroy_db_initially=0 --detect_filter_construct_corruption=0 --disable_wal=1 --enable_compaction_filter=0 --enable_pipelined_write=0 --enable_thread_tracking=1 --expected_values_dir=/dev/shm/rocksdb_test/rocksdb_crashtest_expected --fail_if_options_file_error=1 --fifo_allow_compaction=1 --file_checksum_impl=crc32c --flush_one_in=1000000 --format_version=5 --get_current_wal_file_one_in=0 --get_live_files_one_in=1000000 --get_property_one_in=1000000 --get_sorted_wal_files_one_in=0 --index_block_restart_interval=3 --index_type=2 --ingest_external_file_one_in=0 --initial_auto_readahead_size=524288 --iterpercent=10 --key_len_percent_dist=1,30,69 --level_compaction_dynamic_level_bytes=0 --lock_wal_one_in=1000000 --long_running_snapshots=0 --manual_wal_flush_one_in=0 --mark_for_compaction_one_file_in=10 --max_auto_readahead_size=524288 --max_background_compactions=20 --max_bytes_for_level_base=10485760 --max_key=25000000 --max_key_len=3 --max_manifest_file_size=1073741824 --max_write_batch_group_size_bytes=16777216 --max_write_buffer_number=3 --max_write_buffer_size_to_maintain=2097152 --memtable_max_range_deletions=0 --memtable_prefix_bloom_size_ratio=0.001 --memtable_protection_bytes_per_key=2 --memtable_whole_key_filtering=0 --memtablerep=skip_list --min_write_buffer_number_to_merge=1 --mmap_read=1 --mock_direct_io=False --nooverwritepercent=1 --num_file_reads_for_auto_readahead=1 --occ_lock_bucket_count=500 --occ_validation_policy=0 --open_files=-1 --open_metadata_write_fault_one_in=8 --open_read_fault_one_in=0 --open_write_fault_one_in=0 --ops_per_thread=100000000 --optimize_filters_for_memory=1 --paranoid_file_checks=1 --partition_filters=0 --partition_pinning=1 --pause_background_one_in=1000000 --periodic_compaction_seconds=2 --prefix_size=5 --prefixpercent=5 --prepopulate_block_cache=1 --preserve_internal_time_seconds=36000 --progress_reports=0 --read_fault_one_in=0 --readahead_size=0 --readpercent=45 --recycle_log_file_num=1 --reopen=0 --secondary_cache_fault_one_in=0 --secondary_cache_uri=compressed_secondary_cache://capacity=8388608;enable_custom_split_merge=true --set_options_one_in=10000 --share_occ_lock_buckets=0 --snapshot_hold_ops=100000 --sst_file_manager_bytes_per_sec=0 --sst_file_manager_bytes_per_truncate=0 --stats_dump_period_sec=0 --subcompactions=3 --sync=0 --sync_fault_injection=0 --target_file_size_base=2097152 --target_file_size_multiplier=2 --test_batches_snapshots=0 --top_level_index_pinning=3 --unpartitioned_pinning=2 --use_direct_io_for_flush_and_compaction=0 --use_direct_reads=0 --use_full_merge_v1=0 --use_get_entity=0 --use_merge=0 --use_multi_get_entity=0 --use_multiget=1 --use_optimistic_txn=1 --use_put_entity_one_in=0 --use_txn=1 --user_timestamp_size=0 --value_size_mult=32 --verification_only=0 --verify_checksum=1 --verify_checksum_one_in=1000000 --verify_db_one_in=100000 --verify_file_checksums_one_in=1000000 --verify_sst_unique_id_in_manifest=1 --wal_bytes_per_sync=0 --wal_compression=zstd --write_buffer_size=4194304 --write_dbid_to_manifest=1 --write_fault_one_in=0 --writepercent=35&
```
```
pid=$!
sleep 10
kill -9 $pid
sleep 1
```
```
db_stress --acquire_snapshot_one_in=10000 --adaptive_readahead=1 --allow_data_in_errors=True --async_io=0 --atomic_flush=1 --auto_readahead_size=1 --avoid_flush_during_recovery=0 --avoid_unnecessary_blocking_io=1 --backup_max_size=104857600 --backup_one_in=100000 --batch_protection_bytes_per_key=8 --block_protection_bytes_per_key=0 --block_size=16384 --bloom_before_level=2147483647 --bloom_bits=75.01353068032098 --bottommost_compression_type=xpress --bottommost_file_compaction_delay=3600 --bytes_per_sync=262144 --cache_index_and_filter_blocks=0 --cache_size=8388608 --cache_type=auto_hyper_clock_cache --charge_compression_dictionary_building_buffer=1 --charge_file_metadata=0 --charge_filter_construction=1 --charge_table_reader=0 --checkpoint_one_in=1000000 --checksum_type=kCRC32c --clear_column_family_one_in=0 --compact_files_one_in=1000000 --compact_range_one_in=1000000 --compaction_pri=4 --compaction_readahead_size=0 --compaction_ttl=0 --compression_checksum=0 --compression_max_dict_buffer_bytes=0 --compression_max_dict_bytes=0 --compression_parallel_threads=1 --compression_type=xpress --compression_use_zstd_dict_trainer=0 --compression_zstd_max_train_bytes=0 --continuous_verification_interval=0 --data_block_index_type=1 --db=/dev/shm/rocksdb_test/rocksdb_crashtest_blackbox --db_write_buffer_size=134217728 --delpercent=5 --delrangepercent=0 --destroy_db_initially=0 --detect_filter_construct_corruption=1 --disable_wal=1 --enable_compaction_filter=0 --enable_pipelined_write=0 --enable_thread_tracking=1 --expected_values_dir=/dev/shm/rocksdb_test/rocksdb_crashtest_expected --fail_if_options_file_error=1 --fifo_allow_compaction=0 --file_checksum_impl=none --flush_one_in=1000000 --format_version=5 --get_current_wal_file_one_in=0 --get_live_files_one_in=1000000 --get_property_one_in=1000000 --get_sorted_wal_files_one_in=0 --index_block_restart_interval=10 --index_type=0 --ingest_external_file_one_in=0 --initial_auto_readahead_size=524288 --iterpercent=10 --key_len_percent_dist=1,30,69 --level_compaction_dynamic_level_bytes=1 --lock_wal_one_in=1000000 --long_running_snapshots=0 --manual_wal_flush_one_in=0 --mark_for_compaction_one_file_in=10 --max_auto_readahead_size=524288 --max_background_compactions=20 --max_bytes_for_level_base=10485760 --max_key=25000000 --max_key_len=3 --max_manifest_file_size=1073741824 --max_write_batch_group_size_bytes=1048576 --max_write_buffer_number=3 --max_write_buffer_size_to_maintain=1048576 --memtable_max_range_deletions=100 --memtable_prefix_bloom_size_ratio=0.01 --memtable_protection_bytes_per_key=8 --memtable_whole_key_filtering=0 --memtablerep=skip_list --min_write_buffer_number_to_merge=2 --mmap_read=0 --mock_direct_io=True --nooverwritepercent=1 --num_file_reads_for_auto_readahead=2 --occ_lock_bucket_count=10 --occ_validation_policy=0 --open_files=-1 --open_metadata_write_fault_one_in=0 --open_read_fault_one_in=32 --open_write_fault_one_in=16 --ops_per_thread=100000000 --optimize_filters_for_memory=0 --paranoid_file_checks=1 --partition_filters=0 --partition_pinning=0 --pause_background_one_in=1000000 --periodic_compaction_seconds=0 --prefix_size=1 --prefixpercent=5 --prepopulate_block_cache=0 --preserve_internal_time_seconds=36000 --progress_reports=0 --read_fault_one_in=32 --readahead_size=0 --readpercent=45 --recycle_log_file_num=1 --reopen=0 --secondary_cache_fault_one_in=0 --secondary_cache_uri=compressed_secondary_cache://capacity=8388608;enable_custom_split_merge=true --set_options_one_in=10000 --share_occ_lock_buckets=0 --snapshot_hold_ops=100000 --sst_file_manager_bytes_per_sec=104857600 --sst_file_manager_bytes_per_truncate=0 --stats_dump_period_sec=0 --subcompactions=4 --sync=0 --sync_fault_injection=0 --target_file_size_base=2097152 --target_file_size_multiplier=2 --test_batches_snapshots=0 --top_level_index_pinning=1 --unpartitioned_pinning=1 --use_direct_io_for_flush_and_compaction=1 --use_direct_reads=0 --use_full_merge_v1=1 --use_get_entity=0 --use_merge=1 --use_multi_get_entity=0 --use_multiget=1 --use_optimistic_txn=1 --use_put_entity_one_in=0 --use_txn=1 --user_timestamp_size=0 --value_size_mult=32 --verification_only=0 --verify_checksum=1 --verify_checksum_one_in=1000000 --verify_db_one_in=100000 --verify_file_checksums_one_in=0 --verify_sst_unique_id_in_manifest=1 --wal_bytes_per_sync=524288 --wal_compression=zstd --write_buffer_size=4194304 --write_dbid_to_manifest=1 --write_fault_one_in=0 --writepercent=35
```

Reviewed By: cbi42

Differential Revision: D49403091

Pulled By: jaykorean

fbshipit-source-id: 5ee6136133bbdc46aa733e5101c1f998f658c200
2023-09-18 23:04:38 -07:00
Changyu Bi c90807d103 Inject retryable write IOError when writing to SST files in stress test (#11829)
Summary:
* db_crashtest.py now may set `write_fault_one_in` to 500 for blackbox and whitebox simple test.
* Error injection only applies to writing to SST files. Flush error will cause DB to pause background operations and auto-resume. Compaction error will just re-schedule later.
* File ingestion and back up tests are updated to check if the result status is due to an injected error.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11829

Test Plan:
a full round of whitebox simple and blackbox simple crash test
*  `python3 ./tools/db_crashtest.py whitebox/blackbox --simple  --write_fault_one_in=500`

Reviewed By: ajkr

Differential Revision: D49256962

Pulled By: cbi42

fbshipit-source-id: 68e0c9648d8e03bad39c7672b25d5500fc286d97
2023-09-18 16:23:26 -07:00
Changyu Bi cc254efea6 Release compaction files in manifest write callback (#11764)
Summary:
Fixes https://github.com/facebook/rocksdb/issues/10257 (also see [here](https://github.com/facebook/rocksdb/pull/10355#issuecomment-1684308556)) by releasing compaction files earlier when writing to manifest in LogAndApply().  This is done by passing in a [callback](ba59751430/db/version_set.h (L1199)) to LogAndApply(). The new Version is created in the same critical section where compaction files are released. When compaction picker is picking compaction based on the new version, these compaction files will already be released.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11764

Test Plan:
* Existing unit tests
* A repro unit test to validate that compaction files are released: `./db_compaction_test --gtest_filter=DBCompactionTest.ReleaseCompactionDuringManifestWrite`
* `python3 ./tools/db_crashtest.py --simple whitebox` with some assertions to check compaction files are released

Reviewed By: ajkr

Differential Revision: D48742152

Pulled By: cbi42

fbshipit-source-id: 7560fd0e723a63fe692234015d2b96850f8b5d77
2023-09-18 13:11:53 -07:00
Kefu Chai 920d72e6fa cmake: check PORTABLE for well-known boolean representations (#11724)
Summary:
before 459969e9, we were using CMake `option()` to represent `PORTABLE`. so the CMake boolean representations like ON, OFF, 0 and 1 are supported. this is also the downstream package maintainers were using before v8.3.2.

in 459969e9, this option is expanded to specify the argument of `-march` passed to compiler in order to be more flexible and hence allows user to specify CPU type directly. but in the typical use cases, user would just want to use "ON" for the best performance on the building host, and "OFF" for a portable build when it comes to a distro package maintainer.

so, in this change, let's check for the boolean representations supported by CMake.

Fixes https://github.com/facebook/rocksdb/issues/11558
Signed-off-by: Kefu Chai <tchaikov@gmail.com>

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11724

Reviewed By: anand1976

Differential Revision: D48827447

Pulled By: ajkr

fbshipit-source-id: b2fef7076b2e90ad13a1fbec80e197841fa06d38
2023-09-18 12:11:15 -07:00
dengyan 0dac75d542 Fix a bug in MultiGet when skip_memtable is true (#11700)
Summary:
When skip_memtable is true in MultiGetImpl, The lookup_current is always false, Causes data to be unable to be queried in super_version->current。

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11700

Reviewed By: anand1976

Differential Revision: D49342877

Pulled By: jowlyzhang

fbshipit-source-id: 270a36d049b4cb7fd151a1fa3080300310111271
2023-09-18 12:06:58 -07:00
akankshamahajan 5b5b011cdd Avoid double block cache lookup during Seek with async_io option (#11616)
Summary:
With the async_io option, the Seek happens in 2 phases. Phase 1 starts an asynchronous read on a block cache miss, and phase 2 waits for it to complete and finishes the seek. In both phases, BlockBasedTable::NewDataBlockIterator is called, which tries to lookup the block cache for the data block first before looking in the prefetch buffer. It's optimized by doing the block cache lookup only in the first phase and save some CPU.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11616

Test Plan: Added unit test

Reviewed By: jaykorean

Differential Revision: D47477887

Pulled By: akankshamahajan15

fbshipit-source-id: 0355e0a68fc0ea2eb92340ae42735afcdbcbfd79
2023-09-18 11:32:30 -07:00
Changyu Bi 6997a06c63 Invalidate threadlocal SV before incrementing super_version_number_ (#11848)
Summary:
CI has been hitting assertion error like
```
https://github.com/facebook/rocksdb/issues/8  0x00007fafd9294fd6 in __GI___assert_fail (assertion=assertion@entry=0x7fafda270300 "!*memtable_range_tombstone_iter_ || sv_number_ != cfd_->GetSuperVersionNumber()", file=file@entry=0x7fafda270350 "db/arena_wrapped_db_iter.cc", line=line@entry=124, function=function@entry=0x7fafda270288 "virtual rocksdb::Status rocksdb::ArenaWrappedDBIter::Refresh(const rocksdb::Snapshot*)") at assert.c:101
```

This is due to
* Iterator::Refresh() passing in `cur_sv_number` instead of `sv->version_number` here: 1c6faf3587/db/arena_wrapped_db_iter.cc (L94-L96)

* `super_version_number_` can be incremented before thread local SV is installed: https://github.com/facebook/rocksdb/blob/main/db/column_family.cc#L1287-L1306

* The optimization in https://github.com/facebook/rocksdb/issues/11452 removed the check for SV number, such that `cur_sv_number > sv.version_number` is possible in the following code.
```
uint64_t cur_sv_number = cfd_->GetSuperVersionNumber();
SuperVersion* sv = cfd_->GetReferencedSuperVersion(db_impl_);
```

Not sure why assertion only started failing after https://github.com/facebook/rocksdb/issues/10594, maybe it's because Refresh() is called more often in stress test.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11848

Test Plan:
* This repros hits the assertion pretty consistently before this change:
```
./db_stress --acquire_snapshot_one_in=10000 --adaptive_readahead=1 --allow_data_in_errors=True --async_io=0 --atomic_flush=1 --auto_readahead_size=1 --avoid_flush_during_recovery=0 --avoid_unnecessary_blocking_io=1 --backup_one_in=0 --block_size=16384 --bloom_bits=0.7161318870366848 --cache_index_and_filter_blocks=0 --cache_size=8388608 --charge_table_reader=0 --checkpoint_one_in=1000000 --checksum_type=kxxHash --clear_column_family_one_in=0 --compact_files_one_in=1000000 --compact_range_one_in=1000000 --compaction_pri=3 --compaction_readahead_size=0 --compaction_ttl=0 --compression_checksum=0 --compression_max_dict_buffer_bytes=0 --compression_max_dict_bytes=0 --compression_parallel_threads=1 --compression_type=zlib --compression_use_zstd_dict_trainer=0 --compression_zstd_max_train_bytes=0 --continuous_verification_interval=0 --data_block_index_type=0 --db_write_buffer_size=8388608 --delpercent=4 --delrangepercent=1 --destroy_db_initially=1 --detect_filter_construct_corruption=0 --disable_wal=1 --enable_compaction_filter=0 --enable_pipelined_write=0 --enable_thread_tracking=1 --fail_if_options_file_error=0 --fifo_allow_compaction=1 --file_checksum_impl=none --flush_one_in=1000000 --format_version=5 --get_current_wal_file_one_in=0 --get_live_files_one_in=1000000 --get_property_one_in=1000000 --get_sorted_wal_files_one_in=0 --index_block_restart_interval=14 --index_type=2 --ingest_external_file_one_in=0 --initial_auto_readahead_size=524288 --iterpercent=30 --key_len_percent_dist=1,30,69 --level_compaction_dynamic_level_bytes=1 --lock_wal_one_in=1000000 --long_running_snapshots=1 --manual_wal_flush_one_in=0 --mark_for_compaction_one_file_in=0 --max_auto_readahead_size=524288 --max_background_compactions=20 --max_bytes_for_level_base=10485760 --max_key=2500000 --max_key_len=3 --max_manifest_file_size=1073741824 --max_write_batch_group_size_bytes=16777216 --max_write_buffer_number=3 --max_write_buffer_size_to_maintain=1048576 --memtable_max_range_deletions=0 --memtable_prefix_bloom_size_ratio=0.5 --memtable_protection_bytes_per_key=0 --memtable_whole_key_filtering=1 --memtablerep=skip_list --min_write_buffer_number_to_merge=1 --mmap_read=1 --mock_direct_io=False --nooverwritepercent=1 --num_file_reads_for_auto_readahead=1 --open_files=500000 --ops_per_thread=100000000 --optimize_filters_for_memory=1 --paranoid_file_checks=0 --partition_filters=0 --partition_pinning=3 --pause_background_one_in=1000000 --periodic_compaction_seconds=0 --prefix_size=-1 --prefixpercent=0 --prepopulate_block_cache=0 --preserve_internal_time_seconds=0 --progress_reports=0 --read_fault_one_in=32 --readahead_size=16384 --readpercent=30 --recycle_log_file_num=1 --reopen=0 --ribbon_starting_level=999 --secondary_cache_fault_one_in=0 --secondary_cache_uri= --set_options_one_in=10000 --snapshot_hold_ops=100000 --sst_file_manager_bytes_per_sec=104857600 --sst_file_manager_bytes_per_truncate=0 --stats_dump_period_sec=600 --subcompactions=1 --sync=0 --sync_fault_injection=1 --target_file_size_base=2097152 --target_file_size_multiplier=2 --test_batches_snapshots=0 --test_cf_consistency=1 --top_level_index_pinning=3 --unpartitioned_pinning=3 --use_direct_io_for_flush_and_compaction=0 --use_direct_reads=0 --use_full_merge_v1=0 --use_get_entity=0 --use_merge=0 --use_multi_get_entity=0 --use_multiget=1 --use_put_entity_one_in=0 --user_timestamp_size=0 --value_size_mult=32 --verification_only=0 --verify_checksum=1 --verify_checksum_one_in=1000000 --verify_db_one_in=100000 --verify_file_checksums_one_in=0 --verify_sst_unique_id_in_manifest=1 --wal_bytes_per_sync=0 --wal_compression=none --write_buffer_size=1048576 --write_dbid_to_manifest=1 --write_fault_one_in=0 --writepercent=35 --use_io_uring=0 --db=/tmp/rocksdb_crashtest_blackboxnf3pyv_0 --expected_values_dir=/tmp/rocksdb_crashtest_expected_6opy9nqg
```

Reviewed By: ajkr

Differential Revision: D49344066

Pulled By: cbi42

fbshipit-source-id: d5373ddb48d933acb42a5dd8fae3f3019b0241e5
2023-09-18 09:37:40 -07:00
Andrew Kryczka 4196ad81e3 LZ4 set acceleration parameter (#11844)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/11844

Test Plan:
Command:
```
for level in 1 1234 32767 -1 -10 -100 -1000 -10000 -100000 -1000000; do echo -n "level=$level " && ./db_bench -benchmarks=compress -compression_type=lz4 -compression_level=$level |& awk '/^compress / {print $5, $6}' ; done
```

Output:
```
level=1 181340 ops/sec
level=1234 183197 ops/sec
level=32767 181480 ops/sec
level=-1 181053 ops/sec
level=-10 662858 ops/sec
level=-100 2611516 ops/sec
level=-1000 3043125 ops/sec
level=-10000 3001351 ops/sec
level=-100000 2861834 ops/sec
level=-1000000 2906413 ops/sec
```

Reviewed By: cbi42

Differential Revision: D49331443

Pulled By: ajkr

fbshipit-source-id: c8909708c3b2b9b83bf2bda2d3f24b8a92d4c2ea
2023-09-18 09:26:29 -07:00
Niklas Fiekas 1a9b42bbdd Add C API for ReadOptions::auto_readahead_size (#11837)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/11837

Reviewed By: cbi42

Differential Revision: D49303447

Pulled By: ajkr

fbshipit-source-id: 7debf722339f4fd551760ef8d6801b7a41498565
2023-09-17 19:51:28 -07:00
Changyu Bi 60de713e15 Use uint64_t for compaction_readahead_size in stress test (#11849)
Summary:
Internal clang check complains: `tools/db_bench_tool.cc:722:43: error: implicit conversion loses integer precision: 'size_t' (aka 'unsigned long') to 'const gflags::int32' (aka 'const int') [-Werror,-Wshorten-64-to-32]
             ROCKSDB_NAMESPACE::Options().compaction_readahead_size,`

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11849

Test Plan: `make -C internal_repo_rocksdb/repo -j64 USE_CLANG=1 J=40 check`, I can only repro when using on-demand devserver.

Reviewed By: hx235

Differential Revision: D49344491

Pulled By: cbi42

fbshipit-source-id: 8c2c0bf2a075c3190b8b91f14f64e26ee252f20f
2023-09-16 12:08:55 -07:00
Jay Huh 99f8820054 Fix test on IOActivity check for MultiGetEntity (#11850)
Summary:
After https://github.com/facebook/rocksdb/issues/11842  merged, we started to see some crash_test failures.

There is a flow inside `TestMultiGetEntity()` that it calls `GetEntity()` to compare the result between `MultiGetEntity()` and `GetEntity()` 1c6faf3587/db_stress_tool/no_batched_ops_stress.cc (L1068-L1072)

However, IOActivity check inside DbStressRandomAccessFileWrapper was expecting IOActivity::MultiGet when GetEntity() was called. We are fixing the test by setting expected operation to be GetEntity before calling GetEntity()

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11850

Test Plan:
Error repro'ed by the following run before fix and no more error after the fix.

```
./db_stress --acquire_snapshot_one_in=10000 --adaptive_readahead=0 --allow_concurrent_memtable_write=0 --allow_data_in_errors=True --async_io=1 --auto_readahead_size=1 --avoid_flush_during_recovery=0 --avoid_unnecessary_blocking_io=1 --backup_max_size=104857600 --backup_one_in=100000 --batch_protection_bytes_per_key=8 --block_protection_bytes_per_key=0 --block_size=16384 --bloom_before_level=1 --bloom_bits=9.880688060667444 --bottommost_compression_type=zstd --bottommost_file_compaction_delay=86400 --bytes_per_sync=262144 --cache_index_and_filter_blocks=0 --cache_size=8388608 --cache_type=auto_hyper_clock_cache --charge_compression_dictionary_building_buffer=1 --charge_file_metadata=0 --charge_filter_construction=0 --charge_table_reader=1 --checkpoint_one_in=1000000 --checksum_type=kxxHash64 --clear_column_family_one_in=0 --column_families=1 --compact_files_one_in=1000000 --compact_range_one_in=1000000 --compaction_pri=3 --compaction_readahead_size=1048576 --compaction_ttl=0 --compression_checksum=0 --compression_max_dict_buffer_bytes=0 --compression_max_dict_bytes=0 --compression_parallel_threads=1 --compression_type=none --compression_use_zstd_dict_trainer=1 --compression_zstd_max_train_bytes=0 --continuous_verification_interval=0 --data_block_index_type=1 --db=/dev/shm/rocksdb_test/rocksdb_crashtest_blackbox --db_write_buffer_size=0 --delpercent=4 --delrangepercent=1 --destroy_db_initially=0 --detect_filter_construct_corruption=1 --disable_wal=0 --enable_compaction_filter=0 --enable_pipelined_write=0 --enable_thread_tracking=1 --expected_values_dir=/dev/shm/rocksdb_test/rocksdb_crashtest_expected --fail_if_options_file_error=0 --fifo_allow_compaction=1 --file_checksum_impl=big --flush_one_in=1000000 --format_version=6 --get_current_wal_file_one_in=0 --get_live_files_one_in=1000000 --get_property_one_in=1000000 --get_sorted_wal_files_one_in=0 --index_block_restart_interval=4 --index_type=0 --ingest_external_file_one_in=0 --initial_auto_readahead_size=0 --iterpercent=10 --key_len_percent_dist=1,30,69 --level_compaction_dynamic_level_bytes=0 --lock_wal_one_in=1000000 --long_running_snapshots=0 --manual_wal_flush_one_in=0 --mark_for_compaction_one_file_in=0 --max_auto_readahead_size=524288 --max_background_compactions=1 --max_bytes_for_level_base=67108864 --max_key=25000000 --max_key_len=3 --max_manifest_file_size=1073741824 --max_write_batch_group_size_bytes=16777216 --max_write_buffer_number=3 --max_write_buffer_size_to_maintain=2097152 --memtable_max_range_deletions=0 --memtable_prefix_bloom_size_ratio=0 --memtable_protection_bytes_per_key=2 --memtable_whole_key_filtering=0 --memtablerep=skip_list --min_write_buffer_number_to_merge=2 --mmap_read=1 --mock_direct_io=False --nooverwritepercent=1 --num_file_reads_for_auto_readahead=0 --open_files=-1 --open_metadata_write_fault_one_in=0 --open_read_fault_one_in=0 --open_write_fault_one_in=0 --ops_per_thread=100000000 --optimize_filters_for_memory=0 --paranoid_file_checks=1 --partition_filters=0 --partition_pinning=3 --pause_background_one_in=1000000 --periodic_compaction_seconds=1 --prefix_size=-1 --prefixpercent=0 --prepopulate_block_cache=0 --preserve_internal_time_seconds=0 --progress_reports=0 --read_fault_one_in=32 --readahead_size=16384 --readpercent=50 --recycle_log_file_num=1 --reopen=0 --secondary_cache_fault_one_in=0 --secondary_cache_uri=compressed_secondary_cache://capacity=8388608;enable_custom_split_merge=true --set_options_one_in=0 --snapshot_hold_ops=100000 --sst_file_manager_bytes_per_sec=104857600 --sst_file_manager_bytes_per_truncate=0 --stats_dump_period_sec=0 --subcompactions=2 --sync=0 --sync_fault_injection=1 --target_file_size_base=16777216 --target_file_size_multiplier=1 --test_batches_snapshots=0 --top_level_index_pinning=3 --unpartitioned_pinning=0 --use_direct_io_for_flush_and_compaction=0 --use_direct_reads=0 --use_full_merge_v1=0 --use_get_entity=0 --use_merge=1 --use_multi_get_entity=1 --use_multiget=1 --use_put_entity_one_in=0 --user_timestamp_size=0 --value_size_mult=32 --verification_only=0 --verify_checksum=1 --verify_checksum_one_in=1000000 --verify_db_one_in=100000 --verify_file_checksums_one_in=1000000 --verify_iterator_with_expected_state_one_in=5 --verify_sst_unique_id_in_manifest=1 --wal_bytes_per_sync=524288 --wal_compression=none --write_buffer_size=33554432 --write_dbid_to_manifest=0 --writepercent=35
```

Reviewed By: cbi42

Differential Revision: D49344996

Pulled By: jaykorean

fbshipit-source-id: 8059b8127c0e3cb8af96cf222f47398413c92c50
2023-09-15 22:50:49 -07:00
Peter Dillinger 1c6faf3587 Make RibbonFilterPolicy::bloom_before_level mutable (SetOptions()) (#11838)
Summary:
An internal user wants to be able to dynamically switch between Bloom and Ribbon filters, without a custom FilterPolicy. Making `filter_policy` mutable would actually make issue https://github.com/facebook/rocksdb/issues/10079 worse, because it would be a race on a pointer field, not just on scalars.

As a reasonable compromise until that is fixed, I am enabling dynamic control over Bloom vs. Ribbon choice by making
RibbonFilterPolicy::bloom_before_level mutable, and doing that safely by using an atomic.

I've also slightly tweaked the interpretation of that field so that setting it to INT_MAX really means "always Bloom."

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11838

Test Plan: unit tests added/extended. crash test updated for SetOptions call and tested under TSAN with amplified probability (lower set_options_one_in).

Reviewed By: ajkr

Differential Revision: D49296284

Pulled By: pdillinger

fbshipit-source-id: e4251c077510df9a9c719876f482448c0d15402a
2023-09-15 15:46:10 -07:00
Jay Huh cff6490bc4 Add IOActivity.kMultiGetEntity (#11842)
Summary:
- As a follow up from https://github.com/facebook/rocksdb/issues/11799, adding `Env::IOActivity::kMultiGetEntity` support to `DBImpl::MultiGetEntity()`.

## Minor Refactor
- Because both `DBImpl::MultiGet()` and `DBImpl::MultiGetEntity()` call `DBImpl::MultiGetCommon()` which later calls `DBImpl::MultiGetWithCallback()` where we check `Env::IOActivity::kMultiGet`, minor refactor was needed so that we don't check `Env::IOActivity::kMultiGet` for  `DBImpl::MultiGetEntity()`.
- I still see more areas for refactoring to avoid duplicate code of checking IOActivity and setting it when Unknown, but this will be addressed separately.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11842

Test Plan:
- Added the `ThreadStatus::OperationType::OP_MULTIGETENTITY` in `db_stress` to verify the pass-down IOActivity in a thread aligns with the actual activity the thread is doing.
```
python3 tools/db_crashtest.py blackbox --max_key=25000000 --write_buffer_size=4194304 --max_bytes_for_level_base=2097152 --target_file_size_base=2097152 --periodic_compaction_seconds=0 --use_put_entity_one_in=10 --use_get_entity=1 --duration=60 --interval=10
python3 tools/db_crashtest.py blackbox --simple --max_key=25000000 --write_buffer_size=4194304 --max_bytes_for_level_base=2097152 --target_file_size_base=2097152 --periodic_compaction_seconds=0 --use_put_entity_one_in=10 --use_get_entity=1 --duration=60 --interval=10
python3 tools/db_crashtest.py blackbox --cf_consistency --max_key=25000000 --write_buffer_size=4194304 --max_bytes_for_level_base=2097152 --target_file_size_base=2097152 --periodic_compaction_seconds=0 --use_put_entity_one_in=10 --use_get_entity=1 --duration=60 --interval=10
```

Reviewed By: ltamasi

Differential Revision: D49329575

Pulled By: jaykorean

fbshipit-source-id: 05198f1d3f92e6be3d42a3d184bacb3ab2ce6923
2023-09-15 15:34:04 -07:00
Changyu Bi 051cad3867 Fix CI failure due to transaction_test (#11843)
Summary:
Test ` build-linux-static_lib-alt_namespace-status_checked` has been failing in main branch.

```
utilities/transactions/transaction_test.cc:6777:3: error: 'rocksdb' has not been declared
 6777 |   rocksdb::GetMergeOperandsOptions mergeOperandOptions;
      |   ^~~~~~~
```

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11843

Test Plan: `ASSERT_STATUS_CHECKED=1 TEST_UINT128_COMPAT=1 ROCKSDB_MODIFY_NPHASH=1 LIB_MODE=static OPT="-DROCKSDB_NAMESPACE=alternative_rocksdb_ns" make V=1 -j24 J=24 transaction_test`

Reviewed By: sarangbh

Differential Revision: D49330210

Pulled By: cbi42

fbshipit-source-id: 85c99236eeca6a777af0101684fbab5a33cca1c9
2023-09-15 13:05:23 -07:00
leipeng 68ce5d84f6 Add new Iterator API Refresh(const snapshot*) (#10594)
Summary:
This PR resolves https://github.com/facebook/rocksdb/issues/10487 & https://github.com/facebook/rocksdb/issues/10536, user code needs to call Refresh() periodically.

The main code change is to support range deletions. A range tombstone iterator uses a sequence number as upper bound to decide which range tombstones are effective. During Iterator refresh, this sequence number upper bound needs to be updated for all range tombstone iterators under DBIter and LevelIterator. LevelIterator may create new table iterators and range tombstone iterator during scanning, so it needs to be aware of iterator refresh. The code path that propagates this change is `db_iter_->set_sequence(read_seq)  -> MergingIterator::SetRangeDelReadSeqno() -> TruncatedRangeDelIterator::SetRangeDelReadSeqno() and LevelIterator::SetRangeDelReadSeqno()`.

This change also fixes an issue where range tombstone iterators created by LevelIterator may access ReadOptions::snapshot, even though we do not explicitly require users to keep a snapshot alive after creating an Iterator.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10594

Test Plan:
* New unit tests.
* Add Iterator::Refresh(snapshot) to stress test. Note that this change only adds tests for refreshing to the same snapshot since this is the main target use case.

TODO in a following PR:
* Stress test Iterator::Refresh() to different snapshots or no snapshot.

Reviewed By: ajkr

Differential Revision: D48456896

Pulled By: cbi42

fbshipit-source-id: 2e642c04e91235cc9542ef4cd37b3c20823bd779
2023-09-15 10:44:43 -07:00
Hui Xiao b050751f76 Use default value instead of hard-coded 0 for compaction_readhead_size in db bench (#11831)
Summary:
**Context/Summary:**
It allows db bench reflect the default behavior of this option. For example, we recently changed its default value.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11831

Test Plan: No code change

Reviewed By: cbi42

Differential Revision: D49253690

Pulled By: hx235

fbshipit-source-id: 445d4e54f62b4b538626e301a3014d2f00849d30
2023-09-15 10:38:37 -07:00
Hui Xiao 3ebf10e0ac Info-log stats level on db open (#11840)
Summary:
**Context/Summary:**
It is useful to ensure users set the stats level right for enable detailed timers like ``rocksdb.file.read.{get|multiget|db.iterator|verify.checksum|verify.file.checksums}.micros`

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11840

Test Plan:
- Manually checking LOG with db bench
```
./db_bench --benchmarks="fillrandom" --file_checksum=1 --num=100 --db=/dev/shm/rocksdb --statistics=0 --stats_level=2

2023/09/14-15:30:17.139022 2353133                              Options.statistics: (nil)
2023/09/14-15:30:17.139025 2353133                              Options.use_fsync: 0

./db_bench --benchmarks="fillrandom" --file_checksum=1 --num=100 --db=/dev/shm/rocksdb --statistics=1 --stats_level=0

2023/09/14-15:30:44.390827 2355026                              Options.statistics: 0x7f7c6d449290
2023/09/14-15:30:44.390830 2355026                              Options.statistics stats level: 0
2023/09/14-15:30:44.390833 2355026                              Options.use_fsync: 0

./db_bench --benchmarks="fillrandom" --file_checksum=1 --num=100 --db=/dev/shm/rocksdb --statistics=1 --stats_level=4

2023/09/14-15:31:04.466116 2356374                              Options.statistics: 0x7f84c8649290
2023/09/14-15:31:04.466119 2356374                              Options.statistics stats level: 4
2023/09/14-15:31:04.466122 2356374                              Options.use_fsync: 0
```

Reviewed By: ajkr

Differential Revision: D49296354

Pulled By: hx235

fbshipit-source-id: b1b4b911544b6fa8c3fe1dbbd65c3bedfef4b50a
2023-09-15 10:37:25 -07:00
Hui Xiao ed913513bd Fix a bug of rocksdb.file.read.verify.file.checksums.micros not being populated (#11836)
Summary:
**Context/Summary:**
`rocksdb.file.read.verify.file.checksums.micros ` was added in https://github.com/facebook/rocksdb/pull/11444 but the related path was not populated with statistics and clock object correctly so the actual statistics collection didn't happen. This PR fixed it.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11836

Test Plan:
Setup:
```
./db_bench --benchmarks="fillrandom" --file_checksum=1 --num=100 --db=/dev/shm/rocksdb
```
Run:
```
./db_bench --use_existing_db=1  --benchmarks="verifyfilechecksums" --file_checksum=1 --num=100 --db=/dev/shm/rocksdb --statistics=1 --stats_level=4
```
Post-PR
```
rocksdb.file.read.verify.file.checksums.micros P50 : 9.000000 P95 : 9.000000 P99 : 9.000000 P100 : 9.000000 COUNT : 1 SUM : 9
```

Pre-PR
```
rocksdb.file.read.verify.file.checksums.micros P50 : 0.000000 P95 : 0.000000 P99 : 0.000000 P100 : 0.000000 COUNT : 0 SUM : 0
```

Reviewed By: ajkr

Differential Revision: D49293378

Pulled By: hx235

fbshipit-source-id: 1acd8b828c28e088d0c5d63897f53cd180b82f42
2023-09-15 10:36:14 -07:00
Sarang Masti c4a19ed399 Add Transaction::CollapseKey to collapse merge op chains ondemand (#11815)
Summary:
Application using rocksdb today dont have much control over the cost of reads when merge-ops are enabled, expect for waiting for compactions to kick in or using max_successive_merges hint, which only applies to memtable. This change adds Transaction::CollapseKey api giving applications the ability to request merge chain collapse on-demand, when they detect high read costs due to merges. Currently, this only supported on PessimisticTransactions.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11815

Test Plan: Add a unit test

Reviewed By: ajkr

Differential Revision: D49309387

Pulled By: sarangbh

fbshipit-source-id: a1eb5cc9e3bd4b3206a36150aacead770318e3e1
2023-09-15 10:25:57 -07:00
akankshamahajan 1e2fd343bb Update upper_bound_offset when reseek changes iterate_upper_bound dynamically (#11775)
Summary:
Update the logic in FilePrefetchBuffer to update `upper_bound_offset_` during reseek. During Reseek, `iterate_upper_bound` can be changed dynamically. So added an API to update that in FilePrefetchBuffer.
Added unit test to confirm the behavior.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11775

Test Plan:
- Check stress tests in case there is any failure after this diff.
- make crash_test -j32 with auto_readahead_size=1 passed locally

Reviewed By: anand1976

Differential Revision: D48815177

Pulled By: akankshamahajan15

fbshipit-source-id: 5f44fbb3af06c86a1c38f139c5fa4543891837f4
2023-09-15 10:05:56 -07:00
Yu Zhang e1fd348b92 Fix a bug in multiget for cleaning up SuperVersion (#11830)
Summary:
When `MultiGet` acquires `SuperVersion` via locking the db mutex and get the current `ColumnFamilyData::super_version_`, its corresponding cleanup logic is not correctly done.

It's currently doing this:
`MultiGetColumnFamilyData::cfd->GetSuperVersion().Unref()`

This operates on the most recent `SuperVersion` without locking db mutex , which is not thread safe by itself. And this unref operation is intended for the originally acquired `SuperVersion` instead of the current one. Because a race condition could happen where a new `SuperVersion` is installed in between this `MultiGet`'s ref and unref. When this race condition does happen, it's not sufficient to just unref the `SuperVersion`, `DBImpl::CleanupSuperVersion` should be called instead to properly clean up the `SuperVersion` had this `MultiGet` call be its last reference holder.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11830

Test Plan:
`make all check`
Added a unit test that would originally fail

Reviewed By: ltamasi

Differential Revision: D49287715

Pulled By: jowlyzhang

fbshipit-source-id: 8353636ee11b2e90d85c677a96a92360072644b0
2023-09-15 09:50:39 -07:00
Jay Huh f2b623bcc1 GetEntity Support for ReadOnlyDB and SecondaryDB (#11799)
Summary:
`GetEntity` API support for ReadOnly DB and Secondary DB.
- Introduced `GetImpl()` with `GetImplOptions` in `db_impl_readonly` and refactored current `Get()` logic into `GetImpl()` so that look up logic can be reused for `GetEntity()` (Following the same pattern as `DBImpl::Get()` and `DBImpl::GetEntity()`)
- Introduced `GetImpl()` with `GetImplOptions` in `db_impl_secondary` and refactored current `GetImpl()` logic. This is to make `DBImplSecondary::Get/GetEntity` consistent with `DBImpl::Get/GetEntity` and `DBImplReadOnly::Get/GetEntity`
- `GetImpl()` in `db_impl` is now virtual. both `db_impl_readonly` and `db_impl_secondary`'s `Get()` override are no longer needed since all three dbs now have the same `Get()` which calls `GetImpl()` internally.
- `GetImpl()` in `DBImplReadOnly` and `DBImplSecondary` now pass in `columns` instead of `nullptr` in lookup functions like `memtable->get()`
- Introduced `GetEntity()` API in `DBImplReadOnly` and `DBImplSecondary` which simply calls `GetImpl()` with `columns` set in `GetImplOptions`.
- Introduced `Env::IOActivity::kGetEntity` and set read_options.io_activity to `Env::IOActivity::kGetEntity` for `GetEntity()` operations (in db_impl)

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11799

Test Plan:
**Unit Tests**
- Added verification in `DBWideBasicTest::PutEntity` by Reopening DB as ReadOnly with the same setup.
- Added verification in `DBSecondaryTest::ReopenAsSecondary` by calling `PutEntity()` and `GetEntity()` on top of existing `Put()` and `Get()`
- `make -j64 check`

**Crash Tests**
- `python3 tools/db_crashtest.py blackbox --max_key=25000000 --write_buffer_size=4194304 --max_bytes_for_level_base=2097152 --target_file_size_base=2097152 --periodic_compaction_seconds=0 --use_put_entity_one_in=10 --use_get_entity=1 --duration=60 --inter
val=10`
- `python3 tools/db_crashtest.py blackbox --simple --max_key=25000000 --write_buffer_size=4194304 --max_bytes_for_level_base=2097152 --target_file_size_base=2097152 --periodic_compaction_seconds=0 --use_put_entity_one_in=10 --use_get_entity=1 `
- `python3 tools/db_crashtest.py blackbox --cf_consistency --max_key=25000000 --write_buffer_size=4194304 --max_bytes_for_level_base=2097152 --target_file_size_base=2097152 --periodic_compaction_seconds=0 --use_put_entity_one_in=10 --use_get_entity=1 --duration=60 --inter
val=10`

Reviewed By: ltamasi

Differential Revision: D49037040

Pulled By: jaykorean

fbshipit-source-id: a0648253ded6e91af7953de364ed3c6bf163626b
2023-09-15 08:30:44 -07:00
Hui Xiao 47e023abbd Remove "rocksdb.file.read.db.open.micros" typo from 8.6 HISTORY (#11839)
Summary:
**Context/Summary:** `rocksdb.file.read.db.open.micros` is landed in 8.3 but not 8.6. It was included in the HISTORY due to an error.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11839

Test Plan: no code change; Will cherry pick this to 8.6 branch when landed.

Reviewed By: anand1976

Differential Revision: D49294250

Pulled By: hx235

fbshipit-source-id: b2ac10758a15eadd5c129d80e93e1c3d0aa569cb
2023-09-14 16:07:59 -07:00
马越 3c27f56d0b Fix the problem that some keys of ClipColumnFamily may not be deleted (#11811)
Summary:
When executing ClipColumnFamily, if end_key is equal to largest_user_key in a file, this key will not be deleted. So we need to change less than to less than or equal to

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11811

Reviewed By: ajkr

Differential Revision: D49206936

Pulled By: cbi42

fbshipit-source-id: 3e8bcb7b52040a9b4d1176de727616cc298d3445
2023-09-14 13:36:39 -07:00
Jonah Gao 84d335b619 Remove an unused variable: last_stats_dump_time_microsec_ (#11824)
Summary:
`last_stats_dump_time_microsec_`  is not used after initialization.

I guess that it was previously used to implement periodically dumping stats,
but this functionality has now been delegated to the `PeriodicTaskScheduler`.
4b79e8c003/db/db_impl/db_impl.cc (L770-L778)

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11824

Reviewed By: cbi42

Differential Revision: D49278311

Pulled By: jowlyzhang

fbshipit-source-id: 5856245580afc026e6b490755a45c5436a2375c9
2023-09-14 11:25:33 -07:00
Yu Zhang 39a4ff2cab Track full_history_ts_low per SuperVersion (#11784)
Summary:
As discussed in https://github.com/facebook/rocksdb/issues/11730 , this PR tracks the effective `full_history_ts_low` per SuperVersion and update existing sanity checks for `ReadOptions.timestamp >= full_history_ts_low` to use this per SuperVersion `full_history_ts_low` instead. This also means the check is moved to happen after acquiring SuperVersion.

There are two motivations for this: 1) Each time `full_history_ts_low` really come into effect to collapse history, a new SuperVersion is always installed, because it would involve either a Flush or Compaction, both of which change the LSM tree shape. We can take advantage of this to ensure that as long as this sanity check is passed, even if `full_history_ts_low` can be concurrently increased and collapse some history above the requested `ReadOptions.timestamp`, a read request won’t have visibility to that part of history through this SuperVersion that it already acquired.  2) the existing sanity check uses `ColumnFamilyData::GetFullHistoryTsLow` without locking the db mutex, which is the mutex all `IncreaseFullHistoryTsLow` operation is using when mutating this field. So there is a race condition. This also solve the race condition on the read path.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11784

Test Plan:
`make all check`

// Checks success scenario really provide the read consistency attribute as mentioned above.
`./db_with_timestamp_basic_test --gtest_filter=*FullHistoryTsLowSanityCheckPassReadIsConsistent*`

// Checks failure scenario cleans up SuperVersion properly.
`./db_with_timestamp_basic_test --gtest_filter=*FullHistoryTsLowSanityCheckFail*`
`./db_secondary_test --gtest_filter=*FullHistoryTsLowSanityCheckFail*`
`./db_readonly_with_timestamp_test --gtest_filter=*FullHistoryTsLowSanitchCheckFail*`

Reviewed By: ltamasi

Differential Revision: D48894795

Pulled By: jowlyzhang

fbshipit-source-id: 1f801fe8e1bc8e63ca76c03cbdbd0974e5ff5bf6
2023-09-13 16:34:18 -07:00
Changyu Bi 3285ba7a29 Fix unit test tsan failure (#11828)
Summary:
The test DBCompactionWaitForCompactTest.WaitForCompactWithOptionToFlushAndCloseDB failed tsan in https://app.circleci.com/pipelines/github/facebook/rocksdb/32009/workflows/577e4e1f-a909-4e80-8ef4-af98b5ff7446/jobs/660989. I cannot repro locally, but this should be the fix.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11828

Test Plan: `./db_compaction_test --gtest_filter="*DBCompactionWaitForCompactTest/DBCompactionWaitForCompactTest.*"`

Reviewed By: jaykorean

Differential Revision: D49241904

Pulled By: cbi42

fbshipit-source-id: 68714c836d982dcb3946da104533d5c0594980de
2023-09-13 15:53:05 -07:00
Levi Tamasi 3db2cf113d Fix copyright header in util/overload.h (#11826)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/11826

Reviewed By: jaykorean

Differential Revision: D49233043

fbshipit-source-id: cadf6cda3b9720789609e3d3d9404822c6681da2
2023-09-13 09:50:44 -07:00
Jay Huh 4b79e8c003 GetEntity and PutEntity Support in ldb (#11796)
Summary:
- `get_entity` and `put_entity` command support in ldb
- Input Format for `put_entity`: `ldb --db=<DB_PATH> put_entity <KEY> <COLUMN_1_NAME>:<COLUMN_1_VALUE> <COLUMN_2_NAME>:<COLUMN_2_VALUE> ...`
- Output Format for `get_entity`: `<COLUMN_1_NAME>:<COLUMN_1_VALUE> <COLUMN_2_NAME>:<COLUMN_2_VALUE>`
- If `get_entity` is called against non-wide column value (existing behavior), empty key (kDefaultWideColumnName) will be printed, appended by `:<COLUMN_VALUE>`
- If `get` is called against wide column value (existing behavior), first column value is printed if the first column name is kDefaultWideColumnName.

# Test

Checks for `put_entity` and `get_entity` added in `ldb_test.py`
```
❯ python3 tools/ldb_test.py                                                                                                                                                                                                                                                                                                                                                                    took 45s at 10:45:44 AM
Running testBlobBatchPut...
.Running testBlobDump
.Running testBlobPut...
.Running testBlobStartingLevel...
.Running testCheckConsistency...
.Running testColumnFamilies...
.Running testCountDelimDump...
.Running testCountDelimIDump...
.Running testDumpLiveFiles...
.Running testDumpLoad...
Warning: 7 bad lines ignored.
.Running testGetProperty...
.Running testHexPutGet...
.Running testIDumpBasics...
.Running testIDumpDecodeBlobIndex...
.Running testIngestExternalSst...
.Running testInvalidCmdLines...
.Running testListColumnFamilies...
.Running testListLiveFilesMetadata...
.Running testManifestDump...
.Running testMiscAdminTask...
Compacting the db...
Sequence,Count,ByteSize,Physical Offset,Key(s)
.Running testSSTDump...
.Running testSimpleStringPutGet...
.Running testStringBatchPut...
.Running testTtlPutGet...
.Running testWALDump...
.
----------------------------------------------------------------------
Ran 25 tests in 57.742s
```

Manual Test
```
# Invalid format for wide columns
❯ ./ldb --db=/tmp/test_db put_entity x4 x5
Failed: wide column format needs to be <column_name>:<column_value> (did you mean put <key> <value>?)

# empty column name (kDefaultWideColumnName)
❯ ./ldb --db=/tmp/test_db put_entity x4 :x5
OK
❯ ./ldb --db=/tmp/test_db get_entity x4
:x5
❯ ./ldb --db=/tmp/test_db get x4
x5

❯ ./ldb --db=/tmp/test_db put_entity a1 :z1 b1:c1 b2:f1
OK
❯ ./ldb --db=/tmp/test_db get_entity a1
:z1 b1:c1 b2:f1

# Keeping the existing behavior if `get` was called on wide column values
❯ ./ldb --db=/tmp/test_db get a1
z1

# Scan
❯ ./ldb --db=/tmp/test_db scan
a1 ==> b1:c1 b2:f1
x4 ==> x5
x5 ==> cn1:cv1 cn2:cv2

# Scan hex
❯ ./ldb --db=/tmp/test_db scan --hex
0x6131 ==> 0x6231:0x6331 0x6232:0x6631
0x7834 ==> 0x7835
0x7835 ==> 0x636E31:0x637631 0x636E32:0x637632

# More testing with hex values
❯ ./ldb --db=/tmp/test_db get_entity 0x6131 --hex
0x6231:0x6331 0x6232:0x6631

❯ ./ldb --db=/tmp/test_db get_entity 0x78 --hex
Failed: GetEntity failed: NotFound:

❯ ./ldb --db=/tmp/test_db get_entity 0x7834 --hex
:0x7835

❯ ./ldb --db=/tmp/test_db put_entity 0x7834 0x6234:0x6635 --hex
OK

❯ ./ldb --db=/tmp/test_db get_entity 0x7834 --hex
0x6234:0x6635

❯ ./ldb --db=/tmp/test_db get_entity 0x7834 --key_hex
b4:f5

❯ ./ldb --db=/tmp/test_db get_entity x4 --value_hex
0x6234:0x6635
```

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11796

Reviewed By: jowlyzhang

Differential Revision: D48978141

Pulled By: jaykorean

fbshipit-source-id: 4f87c222417ed90a6dbf39bd7b0f068b01e68393
2023-09-12 16:32:40 -07:00
Hui Xiao ef3e289b2d Conditionally exclude some L0 input files in size amp compaction (#11749)
Summary:
**Context/Summary:**
A size amp compaction can select and prevent a large number of L0 files from being selected by other compaction. If such compaction is running long or being queued behind, these L0 files will exist for long. With a few more flushes, we can run into write stop triggered by # L0 files. We've seen this happen on a host with many DBs sharing same thread pool, each of these DBs submits a size amp compaction with (110-180)+ files to the pool upon reopen and with a few more flushes, they hit the 200 L0 write stop condition.

The idea is to exclude some L0 input files in size amp compaction that are harmless to size amp reduction but improve the situation described above.

The exclusion algorithm is in `MightExcludeNewL0sToReduceWriteStop()` with two elements:

1.  #L0 to exclude + (level0_stop_writes_trigger - num_l0_input_pre_exclusion) should be in the range of [min_merge_width, max_merge_width].
    -  This is to ensure we are excluding enough L0 input files but not too many to be qualified to picked for another compaction along with the incoming future L0 files before write stop.
2. Based on (1), further constrain #L0 to exclude based on the post-exclusion compaction score. The goal is to ensure our exclusion will not disqualify the size amp compaction from being a size amp compaction after exclusion.

**Tets plan:** New unit test

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11749

Reviewed By: ajkr

Differential Revision: D48850631

Pulled By: hx235

fbshipit-source-id: 2c321036e164087c36319dd5645cbbf6b6152092
2023-09-12 15:53:15 -07:00
Changyu Bi 9d71682d1b Add statistics COMPACTION_CPU_TOTAL_TIME for total compaction time (#11741)
Summary:
Existing compaction statistics are `COMPACTION_TIME` and `COMPACTION_CPU_TIME` which are histogram and are logged at the end of a compaction. The new statistics `COMPACTION_CPU_TOTAL_TIME` is for cumulative total compaction time which is updated regularly during a compaction. This allows user to more closely track compaction cpu usage.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11741

Test Plan: * new unit test `DBTestWithParam.CompactionTotalTimeTest`

Reviewed By: ajkr

Differential Revision: D48608094

Pulled By: cbi42

fbshipit-source-id: b597109f3e4bf2237fb5a216b6fd036e5363b4c0
2023-09-12 15:48:36 -07:00
Levi Tamasi 1e63fc9925 Add a helper method WideColumnsHelper::SortColumns (#11823)
Summary:
Pull Request resolved: https://github.com/facebook/rocksdb/pull/11823

Similarly to https://github.com/facebook/rocksdb/pull/11813, the patch is a small refactoring that eliminates some copy-paste around sorting the columns of entities by column name.

Reviewed By: jaykorean

Differential Revision: D49195504

fbshipit-source-id: d48c9f290e3203f838cc5949856c469ecf730008
2023-09-12 12:36:07 -07:00
Hui Xiao 4b123f3a54 Change file size related variables type to uint64_t in PickCompactionToReduceSizeAmp() (#11814)
Summary:
**Context/Summary:**
size_t is not most likely not needed as SortedRun::size/compensated_file_size is uint64_t. This is a pre-requisite to addressing https://github.com/facebook/rocksdb/pull/11749/files#r1321828933.  Other places already uses uint64_t e.g, https://github.com/facebook/rocksdb/blob/8.6.fb/db/compaction/compaction_picker_universal.cc#L349-L353

**Test**
CI

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11814

Reviewed By: ajkr

Differential Revision: D49169155

Pulled By: hx235

fbshipit-source-id: 2b3ad70e6f18aa360e94ed8907c8534ad2797e62
2023-09-12 10:00:57 -07:00
Levi Tamasi 8fc78a3a9e Add helper methods WideColumnsHelper::{Has,Get}DefaultColumn (#11813)
Summary:
Pull Request resolved: https://github.com/facebook/rocksdb/pull/11813

The patch adds a couple of helper methods `WideColumnsHelper::{Has,Get}DefaultColumn` to eliminate some code duplication.

Reviewed By: jaykorean

Differential Revision: D49166682

fbshipit-source-id: f229ca5b94599f7445a0112b10f8317292505c82
2023-09-11 16:32:32 -07:00
Andrew Kryczka 694e49cbb1 Add a unit test for the fix in #11763 (#11810)
Summary:
The unit test depended on https://github.com/facebook/rocksdb/issues/11753, which landed after the bug fix

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11810

Reviewed By: jaykorean

Differential Revision: D49128695

Pulled By: ajkr

fbshipit-source-id: e0a98bd65a292a7c7bd03913650f73c26d0864c7
2023-09-11 12:54:50 -07:00
Levi Tamasi 760ea373a8 Introduce a wide column aware MergeOperator API (#11807)
Summary:
Pull Request resolved: https://github.com/facebook/rocksdb/pull/11807

For now, RocksDB has limited support for using merge with wide columns: when a bunch of merge operands have to be applied to a wide-column base value, RocksDB currently passes only the value of the default column to the application's `MergeOperator`, which means there is no way to update any other columns during a merge. As a first step in making this more general, the patch adds a new API `FullMergeV3` to `MergeOperator`.

`FullMergeV3`'s interface enables applications to receive a plain, wide-column, or non-existent base value as merge input, and to produce a new plain value, a new wide-column value, or an existing operand as merge output. Note that there are no limitations on the column names and values if the merge result is a wide-column entity. Also, the interface is general in the sense that it makes it possible e.g. for a merge that takes a plain base value and some deltas to produce a wide-column entity as a result.

For backward compatibility, the default implementation of `FullMergeV3` falls back to `FullMergeV2` and implements the current logic where merge operands are applied to the default column of the base entity and any other columns are unchanged. (Note that with `FullMergeV3` in the `MergeOperator` interface, this behavior will become customizable.)

This patch just introduces the new API and the default backward compatible implementation. I plan to integrate `FullMergeV3` into the query and compaction logic in subsequent diffs.

Reviewed By: jaykorean

Differential Revision: D49117253

fbshipit-source-id: 109e016f25cd130fc504790818d927bae7fec6bd
2023-09-11 12:13:58 -07:00
akankshamahajan ed5b6c0d99 Avoid alignment in FilePrefetchBuffer during seek with async_io (#11793)
Summary:
During Seek, the iterator seeks every file on L0. In async_io, it submit the requests to seek on every file on L0 asynchronously using RocksDB FilePrefetchBuffer.
However, FilePrefetchBuffer does alignment and reads extra bytes then needed that can increase the throughput.
In case of non direct io, the alignment can be avoided.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11793

Test Plan:
- Added a unit test that fails without this PR.
- make crash_test -j32 completed successfully

Reviewed By: anand1976

Differential Revision: D48985051

Pulled By: akankshamahajan15

fbshipit-source-id: 2d130a9e7c3df9c4fcd0408406e6277ab75a4389
2023-09-11 11:41:44 -07:00
Peter Dillinger e67ee46642 Suppress TSAN reports on AutoHyperClockTable::Lookup (#11806)
Summary:
This function uses racing reads for heuristic performance improvement. My change in https://github.com/facebook/rocksdb/issues/11792 only worked for clang, not gcc, and gcc does not accurately handle TSAN suppressions. I would have to mark much more code as suppressed than I want to.

So I've taken a different approach: TSAN build does not use the racing reads but substitutes random results, as an extra test that a "correct" value is not needed for correct overall behavior.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11806

Test Plan: manual TSAN builds & tests with cache_bench

Reviewed By: ajkr

Differential Revision: D49100115

Pulled By: pdillinger

fbshipit-source-id: d6d0dfb796d710b953212dd3fc171b6e88fadea1
2023-09-08 10:50:47 -07:00
Jay Huh 05183bedcc Add .arcconfig to .gitignore (fb internal use) (#11803)
Summary:
This is for fb internal use. Please see the comment in internal Phabricator for details.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11803

Reviewed By: hx235

Differential Revision: D49065093

Pulled By: jaykorean

fbshipit-source-id: acd71d7c1163f3c95c59c427caf944dacfe58ef6
2023-09-07 14:57:39 -07:00
Changyu Bi 195f35c08b Add a unit test for the fix in #11786 (#11790)
Summary:
Tests a scenario where range tombstone reseek used to cause MergingIterator to discard non-ok status.

Ran on main without https://github.com/facebook/rocksdb/issues/11786:
```
./db_range_del_test --gtest_filter="*RangeDelReseekAfterFileReadError*"
Note: Google Test filter = *RangeDelReseekAfterFileReadError*
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from DBRangeDelTest
[ RUN      ] DBRangeDelTest.RangeDelReseekAfterFileReadError
db/db_range_del_test.cc:3577: Failure
Value of: iter->Valid()
  Actual: true
Expected: false
[  FAILED  ] DBRangeDelTest.RangeDelReseekAfterFileReadError (64 ms)
```

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11790

Reviewed By: ajkr

Differential Revision: D48972869

Pulled By: cbi42

fbshipit-source-id: b1a71867533b0fb60af86f8ce8a9e391ba84dd57
2023-09-06 15:22:39 -07:00
anand76 137cd4bb75 Disable error injection after compaction completion (#11798)
Summary:
https://github.com/facebook/rocksdb/issues/11789 added error injection during compaction to db_stress. However, error injection was not disabled after compaction completion, which resulted in some test failures due to stale errors.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11798

Reviewed By: cbi42

Differential Revision: D49022821

Pulled By: anand1976

fbshipit-source-id: 3cbfe18d55bee393697e063d05e7a7a7f88b7635
2023-09-06 12:57:10 -07:00
Changyu Bi 458acf8169 Add some unit tests when file read returns error during compaction/scanning (#11788)
Summary:
Some repro unit tests for the bug fixed in https://github.com/facebook/rocksdb/pull/11782.

Ran on main without https://github.com/facebook/rocksdb/pull/11782:
```
./db_compaction_test --gtest_filter='*ErrorWhenReadFileHead'
Note: Google Test filter = *ErrorWhenReadFileHead
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from DBCompactionTest
[ RUN      ] DBCompactionTest.ErrorWhenReadFileHead
db/db_compaction_test.cc:10105: Failure
Value of: s.IsIOError()
  Actual: false
Expected: true
[  FAILED  ] DBCompactionTest.ErrorWhenReadFileHead (3960 ms)

./db_iterator_test --gtest_filter="*ErrorWhenReadFile*"
Note: Google Test filter = *ErrorWhenReadFile*
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from DBIteratorTest
[ RUN      ] DBIteratorTest.ErrorWhenReadFile
db/db_iterator_test.cc:3399: Failure
Value of: (iter->status()).ok()
  Actual: true
Expected: false
[  FAILED  ] DBIteratorTest.ErrorWhenReadFile (280 ms)
[----------] 1 test from DBIteratorTest (280 ms total)
```

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11788

Reviewed By: ajkr

Differential Revision: D48940284

Pulled By: cbi42

fbshipit-source-id: 06f3c5963f576db3f85d305ffb2745ee13d209bb
2023-09-06 10:23:41 -07:00
git-hulk 3f54b9696c Export GetCompactionReasonString/GetFlushReasonString by moving them into listener.h (#11778)
Summary:
Currently, rocksdb users would use the event listener to catch the compaction/flush event and log them if any. But now the reason is an integer type instead of a human-readable string, so we would like to convert them into a human-readable string.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11778

Reviewed By: jaykorean

Differential Revision: D49012934

Pulled By: ajkr

fbshipit-source-id: a4935b95d70c1be02aec65da7bf1c98a8cf8b933
2023-09-06 10:07:21 -07:00
Peter Dillinger 6a98471ae5 Fix bad include (#11797)
Summary:
There was a `#include "port/lang.h"` situated inside an `extern "C" {` which just started causing the header to be unusuable in some contexts. This was a regression on the CircleCI job build-linux-unity-and-headers in https://github.com/facebook/rocksdb/issues/11792

The include, and another like it, now appears obsolete so removed.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11797

Test Plan: local `make check-headers` and `make`, CI

Reviewed By: jaykorean

Differential Revision: D48976826

Pulled By: pdillinger

fbshipit-source-id: 131d66969e045f2ded0f8936924ee30c9ef2655a
2023-09-05 14:44:17 -07:00
Andrew Kryczka 392d6957cd Added compaction read errors to db_stress (#11789)
Summary:
- Fixed misspellings of "inject"
- Made user read errors retryable when `FLAGS_inject_error_severity == 1`
- Added compaction read errors when `FLAGS_read_fault_one_in > 0`. These are always retryable so that the DB will keep accepting writes
- Reenabled setting `compaction_readahead_size` in crash test. The reason for disabling it was to "keep the test clean", which is not a good enough reason to skip testing it

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11789

Test Plan:
With https://github.com/facebook/rocksdb/issues/11782 reverted, reproduced the bug:
- Build: `make -j56 db_stress`
- Command: `TEST_TMPDIR=/dev/shm python3 tools/db_crashtest.py blackbox --simple --write_buffer_size=524288 --target_file_size_base=524288 --max_bytes_for_level_base=2097152 --interval=10 --max_key=1000000`
- Output:
```
stderr has error message:
***put or merge error: Corruption: Compaction number of input keys does not match number of keys processed.***
```

Reviewed By: cbi42

Differential Revision: D48939994

Pulled By: ajkr

fbshipit-source-id: a1efb799efecdfd5d9cfd185e4a6321db8fccfbb
2023-09-05 10:41:29 -07:00