mirror of https://github.com/facebook/rocksdb.git
283 Commits
Author | SHA1 | Message | Date |
---|---|---|---|
Yu Zhang | 701a19cc83 |
Enable crash test for user-defined timestamp and BlobDB combination (#11163)
Summary: Enable the set of crash test for when user defined timestamp is enabled in combination with BlobDB. Pull Request resolved: https://github.com/facebook/rocksdb/pull/11163 Test Plan: `make check` and `db_stress`/`db_crashtest.py` with various combinations. Reviewed By: ltamasi Differential Revision: D42906457 Pulled By: jowlyzhang fbshipit-source-id: 6bec6449a4213b536c787420ff30a7d17b676deb |
|
Peter Dillinger | 94e3beec77 |
Cleanup, improve, stress test LockWAL() (#11143)
Summary: The previous API comments for LockWAL didn't provide much about why you might want to use it, and didn't really meet what one would infer its contract was. Also, LockWAL was not in db_stress / crash test. In this change: * Implement a counting semantics for LockWAL()+UnlockWAL(), so that they can safely be used concurrently across threads or recursively within a thread. This should make the API much less bug-prone and easier to use. * Make sure no UnlockWAL() is needed after non-OK LockWAL() (to match RocksDB conventions) * Make UnlockWAL() reliably return non-OK when there's no matching LockWAL() (for debug-ability) * Clarify API comments on LockWAL(), UnlockWAL(), FlushWAL(), and SyncWAL(). Their exact meanings are not obvious, and I don't think it's appropriate to talk about implementation mutexes in the API comments, but about what operations might block each other. * Add LockWAL()/UnlockWAL() to db_stress and crash test, mostly to check for assertion failures, but also checks that latest seqno doesn't change while WAL is locked. This is simpler to add when LockWAL() is allowed in multiple threads. * Remove unnecessary use of sync points in test DBWALTest::LockWal. There was a bug during development of above changes that caused this test to fail sporadically, with and without this sync point change. Pull Request resolved: https://github.com/facebook/rocksdb/pull/11143 Test Plan: unit tests added / updated, added to stress/crash test Reviewed By: ajkr Differential Revision: D42848627 Pulled By: pdillinger fbshipit-source-id: 6d976c51791941a31fd8fbf28b0f82e888d9f4b4 |
|
Hui Xiao | b965a5a80e |
Add back Options::CompactionOptionsFIFO::allow_compaction to stress/crash test (#11063)
Summary: **Context/Summary:** https://github.com/facebook/rocksdb/pull/10777 was reverted (https://github.com/facebook/rocksdb/pull/10999) due to internal blocker and replaced with a better fix https://github.com/facebook/rocksdb/pull/10922. However, the revert also reverted the `Options::CompactionOptionsFIFO::allow_compaction` stress/crash coverage added by the PR. It's an useful coverage cuz setting `Options::CompactionOptionsFIFO::allow_compaction=true` will [increase](https://github.com/facebook/rocksdb/blob/7.8.fb/db/version_set.cc#L3255) the compaction score of L0 files for FIFO and then trigger more FIFO compaction. This speed up discovery of bug related to FIFO compaction like https://github.com/facebook/rocksdb/pull/10955. To see the speedup, compare the failure occurrence in following commands with `Options::CompactionOptionsFIFO::allow_compaction=true/false` ``` --fifo_allow_compaction=1 --acquire_snapshot_one_in=10000 --adaptive_readahead=0 --allow_concurrent_memtable_write=0 --allow_data_in_errors=True --async_io=1 --avoid_flush_during_recovery=0 --avoid_unnecessary_blocking_io=1 --backup_max_size=104857600 --backup_one_in=100000 --batch_protection_bytes_per_key=0 --block_size=16384 --bloom_bits=8.869062094789008 --bottommost_compression_type=none --bytes_per_sync=0 --cache_index_and_filter_blocks=1 --cache_size=8388608 --cache_type=lru_cache --charge_compression_dictionary_building_buffer=1 --charge_file_metadata=1 --charge_filter_construction=0 --charge_table_reader=1 --checkpoint_one_in=1000000 --checksum_type=kxxHash --clear_column_family_one_in=0 --column_families=1 --compact_files_one_in=1000000 --compact_range_one_in=1000000 --compaction_pri=3 --compaction_style=2 --compaction_ttl=0 --compression_max_dict_buffer_bytes=8589934591 --compression_max_dict_bytes=16384 --compression_parallel_threads=1 --compression_type=xpress --compression_use_zstd_dict_trainer=1 --compression_zstd_max_train_bytes=0 --continuous_verification_interval=0 --data_block_index_type=0 --db=/dev/shm/rocksdb_test/rocksdb_crashtest_whitebox --db_write_buffer_size=1048576 --delpercent=4 --delrangepercent=1 --destroy_db_initially=1 --detect_filter_construct_corruption=0 --disable_wal=0 --enable_compaction_filter=0 --enable_pipelined_write=1 --expected_values_dir=/dev/shm/rocksdb_test/rocksdb_crashtest_expected --fail_if_options_file_error=1 --file_checksum_impl=xxh64 --flush_one_in=1000000 --format_version=4 --get_current_wal_file_one_in=0 --get_live_files_one_in=1000000 --get_property_one_in=1000000 --get_sorted_wal_files_one_in=0 --index_block_restart_interval=10 --index_type=2 --ingest_external_file_one_in=100 --initial_auto_readahead_size=16384 --iterpercent=10 --key_len_percent_dist=1,30,69 --level_compaction_dynamic_level_bytes=False --log2_keys_per_lock=10 --long_running_snapshots=0 --manual_wal_flush_one_in=0 --mark_for_compaction_one_file_in=10 --max_auto_readahead_size=524288 --max_background_compactions=1 --max_bytes_for_level_base=67108864 --max_key=25000000 --max_key_len=3 --max_manifest_file_size=1073741824 --max_write_batch_group_size_bytes=1048576 --max_write_buffer_number=3 --max_write_buffer_size_to_maintain=0 --memtable_prefix_bloom_size_ratio=0.01 --memtable_protection_bytes_per_key=1 --memtable_whole_key_filtering=1 --memtablerep=skip_list --min_write_buffer_number_to_merge=2 --mmap_read=0 --mock_direct_io=True --nooverwritepercent=1 --num_file_reads_for_auto_readahead=2 --open_files=-1 --open_metadata_write_fault_one_in=0 --open_read_fault_one_in=0 --open_write_fault_one_in=0 --ops_per_thread=40000 --optimize_filters_for_memory=0 --paranoid_file_checks=1 --partition_filters=0 --partition_pinning=3 --pause_background_one_in=1000000 --periodic_compaction_seconds=0 --prefix_size=7 --prefixpercent=5 --prepopulate_block_cache=0 --preserve_internal_time_seconds=3600 --progress_reports=0 --read_fault_one_in=1000 --readahead_size=0 --readpercent=15 --recycle_log_file_num=1 --reopen=0 --ribbon_starting_level=999 --secondary_cache_fault_one_in=0 --snapshot_hold_ops=100000 --sst_file_manager_bytes_per_sec=0 --sst_file_manager_bytes_per_truncate=0 --stats_dump_period_sec=0 --subcompactions=2 --sync=0 --sync_fault_injection=0 --target_file_size_base=16777216 --target_file_size_multiplier=1 --test_batches_snapshots=0 --top_level_index_pinning=1 --unpartitioned_pinning=1 --use_direct_io_for_flush_and_compaction=1 --use_direct_reads=1 --use_full_merge_v1=1 --use_merge=0 --use_multiget=0 --use_put_entity_one_in=0 --user_timestamp_size=0 --value_size_mult=32 --verify_checksum=1 --verify_checksum_one_in=1000000 --verify_db_one_in=100000 --verify_iterator_with_expected_state_one_in=5 --verify_sst_unique_id_in_manifest=1 --wal_bytes_per_sync=0 --wal_compression=none --write_buffer_size=33554432 --write_dbid_to_manifest=1 --writepercent=65 ``` Therefore this PR is adding it back to stress/crash test. Pull Request resolved: https://github.com/facebook/rocksdb/pull/11063 Test Plan: Rehearsal stress test to make sure stress/crash test is stable Reviewed By: ajkr Differential Revision: D42283650 Pulled By: hx235 fbshipit-source-id: 132e6396ab6e24d8dcb8fe51c62dd5211cdf53ef |
|
Hui Xiao | f1574a20ff |
Revert PR 10777 "Fix FIFO causing overlapping seqnos in L0 files due to overla…" (#10999)
Summary:
**Context/Summary:**
This reverts commit
|
|
Peter Dillinger | 32520df1d9 |
Remove prototype FastLRUCache (#10954)
Summary: This was just a stepping stone to what eventually became HyperClockCache, and is now just more code to maintain. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10954 Test Plan: tests updated Reviewed By: akankshamahajan15 Differential Revision: D41310123 Pulled By: pdillinger fbshipit-source-id: 618ee148a1a0a29ee756ba8fe28359617b7cd67c |
|
anand76 | bf497e91ad |
Allow a custom DB cleanup command to be passed to db_crashtest.py (#10883)
Summary: This option allows a custom cleanup command line for a non-Posix file system to be used by db_crashtest.py to cleanup between runs. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10883 Test Plan: Run the whitebox crash test Reviewed By: pdillinger Differential Revision: D40726424 Pulled By: anand1976 fbshipit-source-id: b827f6b583ff78f9ca75ced2d96f7e58f5200432 |
|
Hui Xiao | fc74abb436 |
Fix FIFO causing overlapping seqnos in L0 files due to overlapped seqnos between ingested files and memtable's (#10777)
Summary: **Context:** Same as https://github.com/facebook/rocksdb/pull/5958#issue-511150930 but apply the fix to FIFO Compaction case Repro: ``` COERCE_CONTEXT_SWICH=1 make -j56 db_stress ./db_stress --acquire_snapshot_one_in=0 --adaptive_readahead=0 --allow_data_in_errors=True --async_io=1 --avoid_flush_during_recovery=0 --avoid_unnecessary_blocking_io=0 --backup_max_size=104857600 --backup_one_in=0 --batch_protection_bytes_per_key=0 --block_size=16384 --bloom_bits=18 --bottommost_compression_type=disable --bytes_per_sync=262144 --cache_index_and_filter_blocks=0 --cache_size=8388608 --cache_type=lru_cache --charge_compression_dictionary_building_buffer=0 --charge_file_metadata=1 --charge_filter_construction=1 --charge_table_reader=1 --checkpoint_one_in=0 --checksum_type=kCRC32c --clear_column_family_one_in=0 --column_families=1 --compact_files_one_in=0 --compact_range_one_in=1000 --compaction_pri=3 --open_files=-1 --compaction_style=2 --fifo_allow_compaction=1 --compaction_ttl=0 --compression_max_dict_buffer_bytes=8388607 --compression_max_dict_bytes=16384 --compression_parallel_threads=1 --compression_type=zlib --compression_use_zstd_dict_trainer=1 --compression_zstd_max_train_bytes=0 --continuous_verification_interval=0 --data_block_index_type=0 --db=/dev/shm/rocksdb_test0/rocksdb_crashtest_whitebox --db_write_buffer_size=8388608 --delpercent=4 --delrangepercent=1 --destroy_db_initially=1 --detect_filter_construct_corruption=0 --disable_wal=0 --enable_compaction_filter=0 --enable_pipelined_write=1 --fail_if_options_file_error=1 --file_checksum_impl=none --flush_one_in=1000 --format_version=5 --get_current_wal_file_one_in=0 --get_live_files_one_in=0 --get_property_one_in=0 --get_sorted_wal_files_one_in=0 --index_block_restart_interval=15 --index_type=3 --ingest_external_file_one_in=100 --initial_auto_readahead_size=0 --iterpercent=10 --key_len_percent_dist=1,30,69 --level_compaction_dynamic_level_bytes=True --log2_keys_per_lock=10 --long_running_snapshots=0 --mark_for_compaction_one_file_in=10 --max_auto_readahead_size=16384 --max_background_compactions=20 --max_bytes_for_level_base=10485760 --max_key=100000 --max_key_len=3 --max_manifest_file_size=1073741824 --max_write_batch_group_size_bytes=1048576 --max_write_buffer_number=3 --max_write_buffer_size_to_maintain=4194304 --memtable_prefix_bloom_size_ratio=0.5 --memtable_protection_bytes_per_key=1 --memtable_whole_key_filtering=1 --memtablerep=skip_list --mmap_read=1 --mock_direct_io=False --nooverwritepercent=1 --num_file_reads_for_auto_readahead=0 --num_levels=1 --open_metadata_write_fault_one_in=0 --open_read_fault_one_in=32 --open_write_fault_one_in=0 --ops_per_thread=200000 --optimize_filters_for_memory=0 --paranoid_file_checks=1 --partition_filters=0 --partition_pinning=1 --pause_background_one_in=0 --periodic_compaction_seconds=0 --prefix_size=8 --prefixpercent=5 --prepopulate_block_cache=0 --progress_reports=0 --read_fault_one_in=0 --readahead_size=16384 --readpercent=45 --recycle_log_file_num=1 --reopen=20 --ribbon_starting_level=999 --snapshot_hold_ops=1000 --sst_file_manager_bytes_per_sec=0 --sst_file_manager_bytes_per_truncate=0 --subcompactions=2 --sync=0 --sync_fault_injection=0 --target_file_size_base=524288 --target_file_size_multiplier=2 --test_batches_snapshots=0 --top_level_index_pinning=3 --unpartitioned_pinning=0 --use_direct_io_for_flush_and_compaction=0 --use_direct_reads=0 --use_full_merge_v1=1 --use_merge=0 --use_multiget=1 --user_timestamp_size=0 --value_size_mult=32 --verify_checksum=1 --verify_checksum_one_in=0 --verify_db_one_in=1000 --verify_sst_unique_id_in_manifest=1 --wal_bytes_per_sync=0 --wal_compression=zstd --write_buffer_size=524288 --write_dbid_to_manifest=0 --writepercent=35 put or merge error: Corruption: force_consistency_checks(DEBUG): VersionBuilder: L0 file https://github.com/facebook/rocksdb/issues/479 with seqno 23711 29070 vs. file https://github.com/facebook/rocksdb/issues/482 with seqno 27138 29049 ``` **Summary:** FIFO only does intra-L0 compaction in the following four cases. For other cases, FIFO drops data instead of compacting on data, which is irrelevant to the overlapping seqno issue we are solving. - [FIFOCompactionPicker::PickSizeCompaction](https://github.com/facebook/rocksdb/blob/7.6.fb/db/compaction/compaction_picker_fifo.cc#L155) when `total size < compaction_options_fifo.max_table_files_size` and `compaction_options_fifo.allow_compaction == true` - For this path, we simply reuse the fix in `FindIntraL0Compaction` https://github.com/facebook/rocksdb/pull/5958/files#diff-c261f77d6dd2134333c4a955c311cf4a196a08d3c2bb6ce24fd6801407877c89R56 - This path was not stress-tested at all. Therefore we covered `fifo.allow_compaction` in stress test to surface the overlapping seqno issue we are fixing here. - [FIFOCompactionPicker::PickCompactionToWarm](https://github.com/facebook/rocksdb/blob/7.6.fb/db/compaction/compaction_picker_fifo.cc#L313) when `compaction_options_fifo.age_for_warm > 0` - For this path, we simply replicate the idea in https://github.com/facebook/rocksdb/pull/5958#issue-511150930 and skip files of largest seqno greater than `earliest_mem_seqno` - This path was not stress-tested at all. However covering `age_for_warm` option worths a separate PR to deal with db stress compatibility. Therefore we manually tested this path for this PR - [FIFOCompactionPicker::CompactRange](https://github.com/facebook/rocksdb/blob/7.6.fb/db/compaction/compaction_picker_fifo.cc#L365) that ends up picking one of the above two compactions - [CompactionPicker::CompactFiles](https://github.com/facebook/rocksdb/blob/7.6.fb/db/compaction/compaction_picker.cc#L378) - Since `SanitizeCompactionInputFiles()` will be called [before](https://github.com/facebook/rocksdb/blob/7.6.fb/db/compaction/compaction_picker.h#L111-L113) `CompactionPicker::CompactFiles` , we simply replicate the idea in https://github.com/facebook/rocksdb/pull/5958#issue-511150930 in `SanitizeCompactionInputFiles()`. To simplify implementation, we return `Stats::Abort()` on encountering seqno-overlapped file when doing compaction to L0 instead of skipping the file and proceed with the compaction. Some additional clean-up included in this PR: - Renamed `earliest_memtable_seqno` to `earliest_mem_seqno` for consistent naming - Added comment about `earliest_memtable_seqno` in related APIs - Made parameter `earliest_memtable_seqno` constant and required Pull Request resolved: https://github.com/facebook/rocksdb/pull/10777 Test Plan: - make check - New unit test `TEST_P(DBCompactionTestFIFOCheckConsistencyWithParam, FlushAfterIntraL0CompactionWithIngestedFile)`corresponding to the above 4 cases, which will fail accordingly without the fix - Regular CI stress run on this PR + stress test with aggressive value https://github.com/facebook/rocksdb/pull/10761 and on FIFO compaction only Reviewed By: ajkr Differential Revision: D40090485 Pulled By: hx235 fbshipit-source-id: 52624186952ee7109117788741aeeac86b624a4f |
|
Levi Tamasi | 11c0d1310e |
Do not adjust test_batches_snapshots to avoid mixing runs (#10830)
Summary: This is a small follow-up to https://github.com/facebook/rocksdb/pull/10821. The goal of that PR was to hold `test_batches_snapshots` fixed across all `db_stress` invocations; however, that patch didn't address the case when `test_batches_snapshots` is unset due to a conflicting `enable_compaction_filter` or `prefix_size` setting. This PR updates the logic so the other parameter is sanitized instead in the case of such conflicts. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10830 Reviewed By: riversand963 Differential Revision: D40444548 Pulled By: ltamasi fbshipit-source-id: 0331265704904b729262adec37139292fcbb7805 |
|
Jay Zhuang | 8124bc3526 |
Enable preclude_last_level_data_seconds in stress test (#10824)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/10824 Reviewed By: siying Differential Revision: D40390535 Pulled By: jay-zhuang fbshipit-source-id: 700803a1aff8a1e77c038740d87931577e79bcf6 |
|
Levi Tamasi | 3cd78bce1e |
Temporarily disable mixing batched and non-batched runs (#10821)
Summary: We have recently made some stress test improvements that rely on decoding the "value base" from the values stored in the database. This logic does not currently support the case when some KVs are written by a non-batched ops run and some by a batched ops run. The patch temporarily disables mixing these two. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10821 Reviewed By: riversand963 Differential Revision: D40367326 Pulled By: ltamasi fbshipit-source-id: 66f2e0cbc097ab6b1f9e4b39b833bd466f1aaab5 |
|
Yanqin Jin | 943247b76e |
Expand stress test coverage for min_write_buffer_number_to_merge (#10785)
Summary: As title. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10785 Test Plan: CI Reviewed By: ltamasi Differential Revision: D40162583 Pulled By: ltamasi fbshipit-source-id: 4e01f9b682f397130e286cf5d82190b7973fa3c1 |
|
Peter Dillinger | b205c6d029 |
Fix bug in HyperClockCache ApplyToEntries; cleanup (#10768)
Summary: We have seen some rare crash test failures in HyperClockCache, and the source could certainly be a bug fixed in this change, in ClockHandleTable::ConstApplyToEntriesRange. It wasn't properly accounting for the fact that incrementing the acquire counter could be ineffective, due to parallel updates. (When incrementing the acquire counter is ineffective, it is incorrect to then decrement it.) This change includes some other minor clean-up in HyperClockCache, and adds stats_dump_period_sec with a much lower period to the crash test. This should be the primary caller of ApplyToEntries, in collecting cache entry stats. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10768 Test Plan: haven't been able to reproduce the failure, but should be in a better state (bug fix and improved crash test) Reviewed By: anand1976 Differential Revision: D40034747 Pulled By: anand1976 fbshipit-source-id: a06fcefe146e17ee35001984445cedcf3b63eb68 |
|
Levi Tamasi | 3ae00dec90 |
Disable ingestion in stress tests when PutEntity is used (#10769)
Summary: `SstFileWriter` currently does not support the `PutEntity` API, so in `TestIngestExternalFile` all key-values are written using regular `Put`s. This violates the assumption that whether or not a key corresponds to a plain old key-value or a wide-column entity can be determined by solely looking at the "value base" used when generating the value. The patch fixes this issue by disabling ingestion when `PutEntity` is enabled in the stress tests. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10769 Test Plan: Ran a simple blackbox stress test. Reviewed By: akankshamahajan15 Differential Revision: D40042132 Pulled By: ltamasi fbshipit-source-id: 93e75ff55545b7b69fa4ddef1d96093c961158a0 |
|
Changyu Bi | 9f2363f4c4 |
User-defined timestamp support for `DeleteRange()` (#10661)
Summary: Add user-defined timestamp support for range deletion. The new API is `DeleteRange(opt, cf, begin_key, end_key, ts)`. Most of the change is to update the comparator to compare without timestamp. Other than that, major changes are - internal range tombstone data structures (`FragmentedRangeTombstoneList`, `RangeTombstone`, etc.) to store timestamps. - Garbage collection of range tombstones and range tombstone covered keys during compaction. - Get()/MultiGet() to return the timestamp of a range tombstone when needed. - Get/Iterator with range tombstones bounded by readoptions.timestamp. - timestamp crash test now issues DeleteRange by default. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10661 Test Plan: - Added unit test: `make check` - Stress test: `python3 tools/db_crashtest.py --enable_ts whitebox --readpercent=57 --prefixpercent=4 --writepercent=25 -delpercent=5 --iterpercent=5 --delrangepercent=4` - Ran `db_bench` to measure regression when timestamp is not enabled. The tests are for write (with some range deletion) and iterate with DB fitting in memory: `./db_bench--benchmarks=fillrandom,seekrandom --writes_per_range_tombstone=200 --max_write_buffer_number=100 --min_write_buffer_number_to_merge=100 --writes=500000 --reads=500000 --seek_nexts=10 --disable_auto_compactions -disable_wal=true --max_num_range_tombstones=1000`. Did not see consistent regression in no timestamp case. | micros/op | fillrandom | seekrandom | | --- | --- | --- | |main| 2.58 |10.96| |PR 10661| 2.68 |10.63| Reviewed By: riversand963 Differential Revision: D39441192 Pulled By: cbi42 fbshipit-source-id: f05aca3c41605caf110daf0ff405919f300ddec2 |
|
Hui Xiao | 3b8164912e |
Add manual_wal_flush, FlushWAL() to stress/crash test (#10698)
Summary: **Context/Summary:** Introduce `manual_wal_flush_one_in` as titled. - When `manual_wal_flush_one_in > 0`, we also need tracing to correctly verify recovery because WAL data can be lost in this case when `FlushWAL()` is not explicitly called by users of RocksDB (in our case, db stress) and the recovery from such potential WAL data loss is a prefix recovery that requires tracing to verify. As another consequence, we need to disable features can't run under unsync data loss with `manual_wal_flush_one_in` Incompatibilities fixed along the way: ``` db_stress: db/db_impl/db_impl_open.cc:2063: static rocksdb::Status rocksdb::DBImpl::Open(const rocksdb::DBOptions&, const string&, const std::vector<rocksdb::ColumnFamilyDescriptor>&, std::vector<rocksdb::ColumnFamilyHandle*>*, rocksdb::DB**, bool, bool): Assertion `impl->TEST_WALBufferIsEmpty()' failed. ``` - It turns out that `Writer::AddCompressionTypeRecord` before this assertion `EmitPhysicalRecord(kSetCompressionType, encode.data(), encode.size());` but do not trigger flush if `manual_wal_flush` is set . This leads to `impl->TEST_WALBufferIsEmpty()' is false. - As suggested, assertion is removed and violation case is handled by `FlushWAL(sync=true)` along with refactoring `TEST_WALBufferIsEmpty()` to be `WALBufferIsEmpty()` since it is used in prod code now. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10698 Test Plan: - Locally running `python3 tools/db_crashtest.py blackbox --manual_wal_flush_one_in=1 --manual_wal_flush=1 --sync_wal_one_in=100 --atomic_flush=1 --flush_one_in=100 --column_families=3` - Joined https://github.com/facebook/rocksdb/pull/10624 in auto CI testings with all RocksDB stress/crash test jobs Reviewed By: ajkr Differential Revision: D39593752 Pulled By: ajkr fbshipit-source-id: 3a2135bb792c52d2ffa60257d4fbc557fb04d2ce |
|
Levi Tamasi | 9078fcccee |
Add the PutEntity API to the stress/crash tests (#10760)
Summary: The patch adds the `PutEntity` API to the non-batched, batched, and CF consistency stress tests. Namely, when the new `db_stress` command line parameter `use_put_entity_one_in` is greater than zero, one in N writes on average is performed using `PutEntity` rather than `Put`. The wide-column entity written has the generated value in its default column; in addition, it contains up to three additional columns where the original generated value is divided up between the column name and the column value (with the column name containing the first k characters of the generated value, and the column value containing the rest). Whether `PutEntity` is used (and if so, how many columns the entity has) is completely determined by the "value base" used to generate the value (that is, there is no randomness involved). Assuming the same `use_put_entity_one_in` setting is used across `db_stress` invocations, this enables us to reconstruct and validate the entity during subsequent `db_stress` runs. Note that `PutEntity` is currently incompatible with `Merge`, transactions, and user-defined timestamps; these combinations are currently disabled/disallowed. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10760 Test Plan: Ran some batched, non-batched, and CF consistency stress tests using the script. Reviewed By: riversand963 Differential Revision: D39939032 Pulled By: ltamasi fbshipit-source-id: eafdf124e95993fb7d73158e3b006d11819f7fa9 |
|
Hui Xiao | aa71464410 |
Remove and recreate expected values dir in white-box testing 2nd half (#10743)
Summary: **Context:** https://github.com/facebook/rocksdb/pull/10732#pullrequestreview-1121076205 Pull Request resolved: https://github.com/facebook/rocksdb/pull/10743 Test Plan: - Locally run `python3 ./tools/db_crashtest.py whitebox --simple -max_key=1000000 -value_size_mult=33 -write_buffer_size=524288 -target_file_size_base=524288 -max_bytes_for_level_base=2097152 --duration=120 --interval=10 --ops_per_thread=1000 --random_kill_odd=887` - CI jobs testing Reviewed By: ajkr Differential Revision: D39838733 Pulled By: ajkr fbshipit-source-id: 9e819b66b0293dfc7a31a908a9d42c6baca4aeaa |
|
Hui Xiao | f3b359a549 |
Set options.num_levels in db_stress_test_base (#10732)
Summary: An add-on to https://github.com/facebook/rocksdb/pull/6818 to complete adding single-level universal compaction to stress/crash testing. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10732 Test Plan: - Locally run for 10 min `python3 ./tools/db_crashtest.py whitebox --simple --compaction_style=1 --num_levels=1 -max_key=1000000 -value_size_mult=33 -write_buffer_size=524288 -target_file_size_base=524288 -max_bytes_for_level_base=2097152 --duration=120 --interval=10 --ops_per_thread=1000 --random_kill_odd=887` - Check LOG to confirm single-level universal compaction is called - Manual testing and log checking to ensure destroy_db_initially=1 is correctly set across runs with different compaction styles (i.e, in the second half of whitebox testing). - [ongoing]CI jobs stress test Reviewed By: ajkr Differential Revision: D39797612 Pulled By: ajkr fbshipit-source-id: 16f5c40c3464c57360c06c8305f92118e426149c |
|
Hui Xiao | aed30ddf21 |
Support WriteCommit policy with sync_fault_injection=1 (#10624)
Summary:
**Context:**
Prior to this PR, correctness testing with un-sync data loss [disabled](https://github.com/facebook/rocksdb/pull/10605) transaction (`use_txn=1`) thus all of the `txn_write_policy` . This PR improved that by adding support for one policy - WriteCommit (`txn_write_policy=0`).
**Summary:**
They key to this support is (a) handle Mark{Begin, End}Prepare/MarkCommit/MarkRollback in constructing ExpectedState under WriteCommit policy correctly and (b) monitor CI jobs and solve any test incompatibility issue till jobs are stable. (b) will be part of the test plan.
For (a)
- During prepare (i.e, between `MarkBeginPrepare()` and `MarkEndPrepare(xid)`), `ExpectedStateTraceRecordHandler` will buffer all writes by adding all writes to an internal `WriteBatch`.
- On `MarkEndPrepare()`, that `WriteBatch` will be associated with the transaction's `xid`.
- During the commit (i.e, on `MarkCommit(xid)`), `ExpectedStateTraceRecordHandler` will retrieve and iterate the internal `WriteBatch` and finally apply those writes to `ExpectedState`
- During the rollback (i.e, on `MarkRollback(xid)`), `ExpectedStateTraceRecordHandler` will erase the internal `WriteBatch` from the map.
For (b) - one major issue described below:
- TransactionsDB in db stress recovers prepared-but-not-committed txns from the previous crashed run by randomly committing or rolling back it at the start of the current run, see a historical [PR](
|
|
Bo Wang | dd40f83e95 |
Fix lint issues after enable BLACK (#10717)
Summary: As title Pull Request resolved: https://github.com/facebook/rocksdb/pull/10717 Test Plan: Unit Tests CI Reviewed By: riversand963 Differential Revision: D39700707 Pulled By: gitbw95 fbshipit-source-id: 54de27e695535a50159f5f6467da36aaf21bebae |
|
Bo Wang | 9e01de9066 |
Enable BLACK for internal_repo_rocksdb (#10710)
Summary: Enable BLACK for internal_repo_rocksdb. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10710 Reviewed By: riversand963, zsol Differential Revision: D39666245 Pulled By: gitbw95 fbshipit-source-id: ef364318d2bbba66e96f3211dd6a975174d52c21 |
|
Jay Zhuang | 00050d4634 |
Disable tiered storage + BlobDB stress test (#10699)
Summary: There're 2 knobs to disable blobdb, adding that. Also print call stack when there's assert failure. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10699 Reviewed By: gitbw95 Differential Revision: D39596448 Pulled By: jay-zhuang fbshipit-source-id: 5ce9fd0630d8b6ff1e157a2685a1e80a99997098 |
|
gitbw95 | 2cc5b39560 |
Add enable_split_merge option for CompressedSecondaryCache (#10690)
Summary: `enable_custom_split_merge` is added for enabling the custom split and merge feature, which split the compressed value into chunks so that they may better fit jemalloc bins. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10690 Test Plan: Unit Tests Stress Tests Reviewed By: anand1976 Differential Revision: D39567604 Pulled By: gitbw95 fbshipit-source-id: f6d1d46200f365220055f793514601dcb0edc4b7 |
|
Peter Dillinger | 0f91c72adc |
Call experimental new clock cache HyperClockCache (#10684)
Summary: This change establishes a distinctive name for the experimental new lock-free clock cache (originally developed by guidotag and revamped in PR https://github.com/facebook/rocksdb/issues/10626). A few reasons: * We want to make it clear that this is a fundamentally different implementation vs. the old clock cache, to avoid people saying "I already tried clock cache." * We want to highlight the key feature: it's fast (especially under parallel load) * Because it requires an estimated charge per entry, it is not drop-in API compatible with old clock cache. This estimate might always be required for highest performance, and giving it a distinct name should reduce confusion about the distinct API requirements. * We might develop a variant requiring the same estimate parameter but with LRU eviction. In that case, using the name HyperLRUCache should make things more clear. (FastLRUCache is just a prototype that might soon be removed.) Some API detail: * To reduce copy-pasting parameter lists, etc. as in LRUCache construction, I have a `MakeSharedCache()` function on `HyperClockCacheOptions` instead of `NewHyperClockCache()`. * Changes -cache_type=clock_cache to -cache_type=hyper_clock_cache for applicable tools. I think this is more consistent / sustainable for reasons already stated. For performance tests see https://github.com/facebook/rocksdb/pull/10626 Pull Request resolved: https://github.com/facebook/rocksdb/pull/10684 Test Plan: no interesting functional changes; tests updated Reviewed By: anand1976 Differential Revision: D39547800 Pulled By: pdillinger fbshipit-source-id: 5c0fe1b5cf3cb680ab369b928c8569682b9795bf |
|
Yanqin Jin | 088b9844d4 |
Re-enable user-defined timestamp and subcompactions (#10689)
Summary: Hopefully, we can re-enable the combination of user-defined timestamp and subcompactions after https://github.com/facebook/rocksdb/issues/10658. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10689 Test Plan: Make sure the following succeeds on devserver. make crash_test_with_ts Reviewed By: ltamasi Differential Revision: D39556558 Pulled By: riversand963 fbshipit-source-id: 4695f420b1bc9ebf3b24640b693746f4db82c149 |
|
Jay Zhuang | 1cdc84114f |
Tiered Storage feature doesn't support BlobDB yet (#10681)
Summary: Disable the tiered storage + BlobDB test. Also enable different hot data setting for Tiered compaction Pull Request resolved: https://github.com/facebook/rocksdb/pull/10681 Reviewed By: ajkr Differential Revision: D39531941 Pulled By: jay-zhuang fbshipit-source-id: aa0595eb38d03f17638d300d2e4cc9061429bf61 |
|
Akanksha Mahajan | 7a9ecdac3c |
Add auto prefetching parameters to db_bench and db_stress (#10632)
Summary: Same as title Pull Request resolved: https://github.com/facebook/rocksdb/pull/10632 Test Plan: make crash_test -j32 Reviewed By: anand1976 Differential Revision: D39241479 Pulled By: akankshamahajan15 fbshipit-source-id: 5db5b0c007da786bacc1b30d8926d36d6d029b87 |
|
Andrew Kryczka | 4100eb3053 |
minor cleanups to db_crashtest.py (#10654)
Summary: Expanded `all_params` to include all parameters crash test may set. Previously, `atomic_flush` was not included in `all_params` and thus was not visible to `finalize_and_sanitize()`. The consequence was manual crash test runs could provide unsafe combinations of parameters to `db_stress`. For example, running `db_crashtest.py` with `-atomic_flush=0` could cause `db_stress` to run with `-atomic_flush=0 -disable_wal=1`, which is known to produce inconsistencies across column families. While expanding `all_params`, I found we cannot have an entry in it for both `db_stress` and `db_crashtest.py`. So I renamed `enable_tiered_storage` to `test_tiered_storage` for `db_crashtest.py`, which appears more conventional anyways. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10654 Reviewed By: hx235 Differential Revision: D39369349 Pulled By: ajkr fbshipit-source-id: 31d9010c760c868b20d5e9bd78ba75c8ff3ce348 |
|
Andrew Kryczka | ccf822492f |
Reenable sync_fault_injection in crash test (#10172)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/10172 Reviewed By: riversand963 Differential Revision: D37164671 Pulled By: ajkr fbshipit-source-id: 40eb919b8dc261d502510e878ee8ac7874ab35d0 |
|
Hui Xiao | e7525a1fff |
Disable use_txn=1 with sync_fault_injection=1 in db_crashtest.py (#10605)
Summary: **Context/Summary:** `ExpectedState` is not aware of transaction-related concept so `use_txn=1 ` is not compatible with `sync_fault_injection=1`. Therefore this PR disabled this combination until we expand our correctness testing to transaction related features. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10605 Test Plan: - Run the following commands to verify `--use_txn` is correctly sanitized - `python3 ./tools/db_crashtest.py blackbox --use_txn=1 --sync_fault_injection=1 ` - `python3 ./tools/db_crashtest.py blackbox --use_txn=0 --sync_fault_injection=1 ` Reviewed By: ajkr Differential Revision: D39121287 Pulled By: hx235 fbshipit-source-id: 7d5d6dd32479ea1c07df4f38322650f3a60def9c |
|
Andrew Kryczka | d95e376368 |
Disable db_stress features incompatible with unsynced data dropping when sync_fault_injection=1 (#10559)
Summary: The features that cannot work with disable_wal=1 due to unsynced data dropping (ingest_external_file_one_in and enable_compaction_filter) similarly cannot work with sync_fault_injection=1. This PR prevents those features from being used together with sync_fault_injection=1. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10559 Reviewed By: hx235 Differential Revision: D38953019 Pulled By: ajkr fbshipit-source-id: 7e2c7644ec84d7323f632cf976bcee00502d0ed7 |
|
Changyu Bi | d140fbfd7d |
Add Iterator test against expected state to stress test (#10538)
Summary: As mentioned in https://github.com/facebook/rocksdb/pull/5506#issuecomment-506021913, `db_stress` does not have much verification for iterator correctness. It has a `TestIterate()` function, but that is mainly for comparing results between two iterators, one with `total_order_seek` and the other optionally sets auto_prefix, upper/lower bounds. Commit 49a0581ad2462e31aa3f768afa769e0d33390f33 added a new `TestIterateAgainstExpected()` function that compares iterator against expected state. It locks a range of keys, creates an iterator, does a random sequence of `Next/Prev` and compares against expected state. This PR is based on that commit, the main changes include some logs (for easier debugging if a test fails), a forward and backward scan to cover the entire locked key range, and a flag for optionally turning on this version of Iterator testing. Added constraint that the checks against expected state in `TestIterateAgainstExpected()` and in `TestGet()` are only turned on when `--skip_verifydb` flag is not set. Remove the change log introduced in https://github.com/facebook/rocksdb/issues/10553. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10538 Test Plan: Run `db_stress` with `--verify_iterator_with_expected_state_one_in=1`, and a large `--iterpercent` and `--num_iterations`. Checked `op_logs` manually to ensure expected coverage. Tweaked part of the code in https://github.com/facebook/rocksdb/issues/10449 and stress test was able to catch it. - internally run various flavor of crash test Reviewed By: ajkr Differential Revision: D38847269 Pulled By: cbi42 fbshipit-source-id: 8b4402a9bba9f6cfa08051943cd672579d489599 |
|
Bo Wang | 13cb7a84b6 |
Fix the memory leak in db_stress tests that are caused by `FaultInjectionSecondaryCache` and add `CompressedSecondaryCache` into stress tests. (#10523)
Summary: 1. Fix the memory leak in db_stress tests that are caused by `FaultInjectionSecondaryCache`. To address the test requirements for both CompressedSecondaryCache and CachlibWrapper, a new class variable `base_is_compressed_sec_cache_` is added to determine the different behaviors in `Lookup()` and `WaitAll()`. 2. Add `CompressedSecondaryCache` into stress tests. Before this PR, memory leak is reported during crash tests if `CompressedSecondaryCache` is in stress tests. One example is shown as follows: ``` ==70722==ERROR: LeakSanitizer: detected memory leaks Direct leak of 6648240 byte(s) in 83103 object(s) allocated from: #0 0x13de9d7 in operator new(unsigned long) (/data/sandcastle/boxes/eden-trunk-hg-fbcode-fbsource/fbcode/buck-out/dbgo/gen/aab7ed39/internal_repo_rocksdb/repo/db_stress+0x13de9d7) https://github.com/facebook/rocksdb/issues/1 0x9084c7 in rocksdb::BlocklikeTraits<rocksdb::Block>::Create(rocksdb::BlockContents&&, unsigned long, rocksdb::Statistics*, bool, rocksdb::FilterPolicy const*) internal_repo_rocksdb/repo/table/block_based/block_like_traits.h:128 https://github.com/facebook/rocksdb/issues/2 0x9084c7 in std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)> rocksdb::GetCreateCallback<rocksdb::Block>(unsigned long, rocksdb::Statistics*, bool, rocksdb::FilterPolicy const*)::'lambda'(void const*, unsigned long, void**, unsigned long*)::operator()(void const*, unsigned long, void**, unsigned long*) const internal_repo_rocksdb/repo/table/block_based/block_like_traits.h:34 https://github.com/facebook/rocksdb/issues/3 0x9082c9 in rocksdb::Block std::__invoke_impl<rocksdb::Status, std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)> rocksdb::GetCreateCallback<rocksdb::Block>(unsigned long, rocksdb::Statistics*, bool, rocksdb::FilterPolicy const*)::'lambda'(void const*, unsigned long, void**, unsigned long*)&, void const*, unsigned long, void**, unsigned long*>(std::__invoke_other, std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)> rocksdb::GetCreateCallback<rocksdb::Block>(unsigned long, rocksdb::Statistics*, bool, rocksdb::FilterPolicy const*)::'lambda'(void const*, unsigned long, void**, unsigned long*)&, void const*&&, unsigned long&&, void**&&, unsigned long*&&) third-party-buck/platform010/build/libgcc/include/c++/trunk/bits/invoke.h:61 https://github.com/facebook/rocksdb/issues/4 0x90825d in std::enable_if<is_invocable_r_v<rocksdb::Block, std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)> rocksdb::GetCreateCallback<rocksdb::Block>(unsigned long, rocksdb::Statistics*, bool, rocksdb::FilterPolicy const*)::'lambda'(void const*, unsigned long, void**, unsigned long*)&, void const*, unsigned long, void**, unsigned long*>, rocksdb::Block>::type std::__invoke_r<rocksdb::Status, std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)> rocksdb::GetCreateCallback<rocksdb::Block>(unsigned long, rocksdb::Statistics*, bool, rocksdb::FilterPolicy const*)::'lambda'(void const*, unsigned long, void**, unsigned long*)&, void const*, unsigned long, void**, unsigned long*>(std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)> rocksdb::GetCreateCallback<rocksdb::Block>(unsigned long, rocksdb::Statistics*, bool, rocksdb::FilterPolicy const*)::'lambda'(void const*, unsigned long, void**, unsigned long*)&, void const*&&, unsigned long&&, void**&&, unsigned long*&&) third-party-buck/platform010/build/libgcc/include/c++/trunk/bits/invoke.h:114 https://github.com/facebook/rocksdb/issues/5 0x9081b0 in std::_Function_handler<rocksdb::Status (void const*, unsigned long, void**, unsigned long*), std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)> rocksdb::GetCreateCallback<rocksdb::Block>(unsigned long, rocksdb::Statistics*, bool, rocksdb::FilterPolicy const*)::'lambda'(void const*, unsigned long, void**, unsigned long*)>::_M_invoke(std::_Any_data const&, void const*&&, unsigned long&&, void**&&, unsigned long*&&) third-party-buck/platform010/build/libgcc/include/c++/trunk/bits/std_function.h:291 https://github.com/facebook/rocksdb/issues/6 0x991f2c in std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)>::operator()(void const*, unsigned long, void**, unsigned long*) const third-party-buck/platform010/build/libgcc/include/c++/trunk/bits/std_function.h:560 https://github.com/facebook/rocksdb/issues/7 0x990277 in rocksdb::CompressedSecondaryCache::Lookup(rocksdb::Slice const&, std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)> const&, bool, bool&) internal_repo_rocksdb/repo/cache/compressed_secondary_cache.cc:77 https://github.com/facebook/rocksdb/issues/8 0xd3aa4d in rocksdb::FaultInjectionSecondaryCache::Lookup(rocksdb::Slice const&, std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)> const&, bool, bool&) internal_repo_rocksdb/repo/utilities/fault_injection_secondary_cache.cc:92 https://github.com/facebook/rocksdb/issues/9 0xeadaab in rocksdb::lru_cache::LRUCacheShard::Lookup(rocksdb::Slice const&, unsigned int, rocksdb::Cache::CacheItemHelper const*, std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)> const&, rocksdb::Cache::Priority, bool, rocksdb::Statistics*) internal_repo_rocksdb/repo/cache/lru_cache.cc:445 https://github.com/facebook/rocksdb/issues/10 0x1064573 in rocksdb::ShardedCache::Lookup(rocksdb::Slice const&, rocksdb::Cache::CacheItemHelper const*, std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)> const&, rocksdb::Cache::Priority, bool, rocksdb::Statistics*) internal_repo_rocksdb/repo/cache/sharded_cache.cc:89 https://github.com/facebook/rocksdb/issues/11 0x8be0df in rocksdb::BlockBasedTable::GetEntryFromCache(rocksdb::CacheTier const&, rocksdb::Cache*, rocksdb::Slice const&, rocksdb::BlockType, bool, rocksdb::GetContext*, rocksdb::Cache::CacheItemHelper const*, std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)> const&, rocksdb::Cache::Priority) const internal_repo_rocksdb/repo/table/block_based/block_based_table_reader.cc:389 https://github.com/facebook/rocksdb/issues/12 0x905790 in rocksdb::Status rocksdb::BlockBasedTable::GetDataBlockFromCache<rocksdb::Block>(rocksdb::Slice const&, rocksdb::Cache*, rocksdb::Cache*, rocksdb::ReadOptions const&, rocksdb::CachableEntry<rocksdb::Block>*, rocksdb::UncompressionDict const&, rocksdb::BlockType, bool, rocksdb::GetContext*) const internal_repo_rocksdb/repo/table/block_based/block_based_table_reader.cc:1263 https://github.com/facebook/rocksdb/issues/13 0x8b9259 in rocksdb::Status rocksdb::BlockBasedTable::MaybeReadBlockAndLoadToCache<rocksdb::Block>(rocksdb::FilePrefetchBuffer*, rocksdb::ReadOptions const&, rocksdb::BlockHandle const&, rocksdb::UncompressionDict const&, bool, bool, rocksdb::CachableEntry<rocksdb::Block>*, rocksdb::BlockType, rocksdb::GetContext*, rocksdb::BlockCacheLookupContext*, rocksdb::BlockContents*, bool) const internal_repo_rocksdb/repo/table/block_based/block_based_table_reader.cc:1559 https://github.com/facebook/rocksdb/issues/14 0x8b710c in rocksdb::Status rocksdb::BlockBasedTable::RetrieveBlock<rocksdb::Block>(rocksdb::FilePrefetchBuffer*, rocksdb::ReadOptions const&, rocksdb::BlockHandle const&, rocksdb::UncompressionDict const&, rocksdb::CachableEntry<rocksdb::Block>*, rocksdb::BlockType, rocksdb::GetContext*, rocksdb::BlockCacheLookupContext*, bool, bool, bool, bool) const internal_repo_rocksdb/repo/table/block_based/block_based_table_reader.cc:1726 https://github.com/facebook/rocksdb/issues/15 0x8c329f in rocksdb::DataBlockIter* rocksdb::BlockBasedTable::NewDataBlockIterator<rocksdb::DataBlockIter>(rocksdb::ReadOptions const&, rocksdb::BlockHandle const&, rocksdb::DataBlockIter*, rocksdb::BlockType, rocksdb::GetContext*, rocksdb::BlockCacheLookupContext*, rocksdb::FilePrefetchBuffer*, bool, bool, rocksdb::Status&) const internal_repo_rocksdb/repo/table/block_based/block_based_table_reader_impl.h:58 https://github.com/facebook/rocksdb/issues/16 0x920117 in rocksdb::BlockBasedTableIterator::InitDataBlock() internal_repo_rocksdb/repo/table/block_based/block_based_table_iterator.cc:262 https://github.com/facebook/rocksdb/issues/17 0x920d42 in rocksdb::BlockBasedTableIterator::MaterializeCurrentBlock() internal_repo_rocksdb/repo/table/block_based/block_based_table_iterator.cc:332 https://github.com/facebook/rocksdb/issues/18 0xc6a201 in rocksdb::IteratorWrapperBase<rocksdb::Slice>::PrepareValue() internal_repo_rocksdb/repo/table/iterator_wrapper.h:78 https://github.com/facebook/rocksdb/issues/19 0xc6a201 in rocksdb::IteratorWrapperBase<rocksdb::Slice>::PrepareValue() internal_repo_rocksdb/repo/table/iterator_wrapper.h:78 https://github.com/facebook/rocksdb/issues/20 0xef9f6c in rocksdb::MergingIterator::PrepareValue() internal_repo_rocksdb/repo/table/merging_iterator.cc:260 https://github.com/facebook/rocksdb/issues/21 0xc6a201 in rocksdb::IteratorWrapperBase<rocksdb::Slice>::PrepareValue() internal_repo_rocksdb/repo/table/iterator_wrapper.h:78 https://github.com/facebook/rocksdb/issues/22 0xc67bcd in rocksdb::DBIter::FindNextUserEntryInternal(bool, rocksdb::Slice const*) internal_repo_rocksdb/repo/db/db_iter.cc:326 https://github.com/facebook/rocksdb/issues/23 0xc66d36 in rocksdb::DBIter::FindNextUserEntry(bool, rocksdb::Slice const*) internal_repo_rocksdb/repo/db/db_iter.cc:234 https://github.com/facebook/rocksdb/issues/24 0xc7ab47 in rocksdb::DBIter::Next() internal_repo_rocksdb/repo/db/db_iter.cc:161 https://github.com/facebook/rocksdb/issues/25 0x70d938 in rocksdb::BatchedOpsStressTest::TestPrefixScan(rocksdb::ThreadState*, rocksdb::ReadOptions const&, std::vector<int, std::allocator<int> > const&, std::vector<long, std::allocator<long> > const&) internal_repo_rocksdb/repo/db_stress_tool/batched_ops_stress.cc:320 https://github.com/facebook/rocksdb/issues/26 0x6dc6a8 in rocksdb::StressTest::OperateDb(rocksdb::ThreadState*) internal_repo_rocksdb/repo/db_stress_tool/db_stress_test_base.cc:907 https://github.com/facebook/rocksdb/issues/27 0x6867de in rocksdb::ThreadBody(void*) internal_repo_rocksdb/repo/db_stress_tool/db_stress_driver.cc:33 https://github.com/facebook/rocksdb/issues/28 0xce4cc2 in rocksdb::(anonymous namespace)::StartThreadWrapper(void*) internal_repo_rocksdb/repo/env/env_posix.cc:461 https://github.com/facebook/rocksdb/issues/29 0x7f23f9068c0e in start_thread /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/nptl/pthread_create.c:434:8 ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/10523 Test Plan: ``` $COMPILE_WITH_ASAN=1 make -j 24 $db_stress J=40 crash_test_with_txn ``` Reviewed By: anand1976 Differential Revision: D38646839 Pulled By: gitbw95 fbshipit-source-id: 9452895c7dc95481a9d7afe83b15193cf5b1c43e |
|
Changyu Bi | fd165c869d |
Add memtable per key-value checksum (#10281)
Summary: Append per key-value checksum to internal key. These checksums are verified on read paths including Get, Iterator and during Flush. Get and Iterator will return `Corruption` status if there is a checksum verification failure. Flush will make DB become read-only upon memtable entry checksum verification failure. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10281 Test Plan: - Added new unit test cases: `make check` - Benchmark on memtable insert ``` TEST_TMPDIR=/dev/shm/memtable_write ./db_bench -benchmarks=fillseq -disable_wal=true -max_write_buffer_number=100 -num=10000000 -min_write_buffer_number_to_merge=100 # avg over 10 runs Baseline: 1166936 ops/sec memtable 2 bytes kv checksum : 1.11674e+06 ops/sec (-4%) memtable 2 bytes kv checksum + write batch 8 bytes kv checksum: 1.08579e+06 ops/sec (-6.95%) write batch 8 bytes kv checksum: 1.17979e+06 ops/sec (+1.1%) ``` - Benchmark on only memtable read: ops/sec dropped 31% for `readseq` due to time spend on verifying checksum. ops/sec for `readrandom` dropped ~6.8%. ``` # Readseq sudo TEST_TMPDIR=/dev/shm/memtable_read ./db_bench -benchmarks=fillseq,readseq"[-X20]" -disable_wal=true -max_write_buffer_number=100 -num=10000000 -min_write_buffer_number_to_merge=100 readseq [AVG 20 runs] : 7432840 (± 212005) ops/sec; 822.3 (± 23.5) MB/sec readseq [MEDIAN 20 runs] : 7573878 ops/sec; 837.9 MB/sec With -memtable_protection_bytes_per_key=2: readseq [AVG 20 runs] : 5134607 (± 119596) ops/sec; 568.0 (± 13.2) MB/sec readseq [MEDIAN 20 runs] : 5232946 ops/sec; 578.9 MB/sec # Readrandom sudo TEST_TMPDIR=/dev/shm/memtable_read ./db_bench -benchmarks=fillrandom,readrandom"[-X10]" -disable_wal=true -max_write_buffer_number=100 -num=1000000 -min_write_buffer_number_to_merge=100 readrandom [AVG 10 runs] : 140236 (± 3938) ops/sec; 9.8 (± 0.3) MB/sec readrandom [MEDIAN 10 runs] : 140545 ops/sec; 9.8 MB/sec With -memtable_protection_bytes_per_key=2: readrandom [AVG 10 runs] : 130632 (± 2738) ops/sec; 9.1 (± 0.2) MB/sec readrandom [MEDIAN 10 runs] : 130341 ops/sec; 9.1 MB/sec ``` - Stress test: `python3 -u tools/db_crashtest.py whitebox --duration=1800` Reviewed By: ajkr Differential Revision: D37607896 Pulled By: cbi42 fbshipit-source-id: fdaefb475629d2471780d4a5f5bf81b44ee56113 |
|
gitbw95 | b57155a0bd |
Revert "Add CompressedSecondaryCache into stress test" #10442 (#10509)
Summary: Revert https://github.com/facebook/rocksdb/pull/10442 before I find the root cause and fix the memory leak in db_stress tests that are caused by `FaultInjectionSecondaryCache`. Memory leak is reported during crash tests and one example is shown as follows: ``` ==70722==ERROR: LeakSanitizer: detected memory leaks Direct leak of 6648240 byte(s) in 83103 object(s) allocated from: #0 0x13de9d7 in operator new(unsigned long) (/data/sandcastle/boxes/eden-trunk-hg-fbcode-fbsource/fbcode/buck-out/dbgo/gen/aab7ed39/internal_repo_rocksdb/repo/db_stress+0x13de9d7) https://github.com/facebook/rocksdb/issues/1 0x9084c7 in rocksdb::BlocklikeTraits<rocksdb::Block>::Create(rocksdb::BlockContents&&, unsigned long, rocksdb::Statistics*, bool, rocksdb::FilterPolicy const*) internal_repo_rocksdb/repo/table/block_based/block_like_traits.h:128 https://github.com/facebook/rocksdb/issues/2 0x9084c7 in std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)> rocksdb::GetCreateCallback<rocksdb::Block>(unsigned long, rocksdb::Statistics*, bool, rocksdb::FilterPolicy const*)::'lambda'(void const*, unsigned long, void**, unsigned long*)::operator()(void const*, unsigned long, void**, unsigned long*) const internal_repo_rocksdb/repo/table/block_based/block_like_traits.h:34 https://github.com/facebook/rocksdb/issues/3 0x9082c9 in rocksdb::Block std::__invoke_impl<rocksdb::Status, std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)> rocksdb::GetCreateCallback<rocksdb::Block>(unsigned long, rocksdb::Statistics*, bool, rocksdb::FilterPolicy const*)::'lambda'(void const*, unsigned long, void**, unsigned long*)&, void const*, unsigned long, void**, unsigned long*>(std::__invoke_other, std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)> rocksdb::GetCreateCallback<rocksdb::Block>(unsigned long, rocksdb::Statistics*, bool, rocksdb::FilterPolicy const*)::'lambda'(void const*, unsigned long, void**, unsigned long*)&, void const*&&, unsigned long&&, void**&&, unsigned long*&&) third-party-buck/platform010/build/libgcc/include/c++/trunk/bits/invoke.h:61 https://github.com/facebook/rocksdb/issues/4 0x90825d in std::enable_if<is_invocable_r_v<rocksdb::Block, std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)> rocksdb::GetCreateCallback<rocksdb::Block>(unsigned long, rocksdb::Statistics*, bool, rocksdb::FilterPolicy const*)::'lambda'(void const*, unsigned long, void**, unsigned long*)&, void const*, unsigned long, void**, unsigned long*>, rocksdb::Block>::type std::__invoke_r<rocksdb::Status, std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)> rocksdb::GetCreateCallback<rocksdb::Block>(unsigned long, rocksdb::Statistics*, bool, rocksdb::FilterPolicy const*)::'lambda'(void const*, unsigned long, void**, unsigned long*)&, void const*, unsigned long, void**, unsigned long*>(std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)> rocksdb::GetCreateCallback<rocksdb::Block>(unsigned long, rocksdb::Statistics*, bool, rocksdb::FilterPolicy const*)::'lambda'(void const*, unsigned long, void**, unsigned long*)&, void const*&&, unsigned long&&, void**&&, unsigned long*&&) third-party-buck/platform010/build/libgcc/include/c++/trunk/bits/invoke.h:114 https://github.com/facebook/rocksdb/issues/5 0x9081b0 in std::_Function_handler<rocksdb::Status (void const*, unsigned long, void**, unsigned long*), std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)> rocksdb::GetCreateCallback<rocksdb::Block>(unsigned long, rocksdb::Statistics*, bool, rocksdb::FilterPolicy const*)::'lambda'(void const*, unsigned long, void**, unsigned long*)>::_M_invoke(std::_Any_data const&, void const*&&, unsigned long&&, void**&&, unsigned long*&&) third-party-buck/platform010/build/libgcc/include/c++/trunk/bits/std_function.h:291 https://github.com/facebook/rocksdb/issues/6 0x991f2c in std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)>::operator()(void const*, unsigned long, void**, unsigned long*) const third-party-buck/platform010/build/libgcc/include/c++/trunk/bits/std_function.h:560 https://github.com/facebook/rocksdb/issues/7 0x990277 in rocksdb::CompressedSecondaryCache::Lookup(rocksdb::Slice const&, std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)> const&, bool, bool&) internal_repo_rocksdb/repo/cache/compressed_secondary_cache.cc:77 https://github.com/facebook/rocksdb/issues/8 0xd3aa4d in rocksdb::FaultInjectionSecondaryCache::Lookup(rocksdb::Slice const&, std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)> const&, bool, bool&) internal_repo_rocksdb/repo/utilities/fault_injection_secondary_cache.cc:92 https://github.com/facebook/rocksdb/issues/9 0xeadaab in rocksdb::lru_cache::LRUCacheShard::Lookup(rocksdb::Slice const&, unsigned int, rocksdb::Cache::CacheItemHelper const*, std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)> const&, rocksdb::Cache::Priority, bool, rocksdb::Statistics*) internal_repo_rocksdb/repo/cache/lru_cache.cc:445 https://github.com/facebook/rocksdb/issues/10 0x1064573 in rocksdb::ShardedCache::Lookup(rocksdb::Slice const&, rocksdb::Cache::CacheItemHelper const*, std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)> const&, rocksdb::Cache::Priority, bool, rocksdb::Statistics*) internal_repo_rocksdb/repo/cache/sharded_cache.cc:89 https://github.com/facebook/rocksdb/issues/11 0x8be0df in rocksdb::BlockBasedTable::GetEntryFromCache(rocksdb::CacheTier const&, rocksdb::Cache*, rocksdb::Slice const&, rocksdb::BlockType, bool, rocksdb::GetContext*, rocksdb::Cache::CacheItemHelper const*, std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)> const&, rocksdb::Cache::Priority) const internal_repo_rocksdb/repo/table/block_based/block_based_table_reader.cc:389 https://github.com/facebook/rocksdb/issues/12 0x905790 in rocksdb::Status rocksdb::BlockBasedTable::GetDataBlockFromCache<rocksdb::Block>(rocksdb::Slice const&, rocksdb::Cache*, rocksdb::Cache*, rocksdb::ReadOptions const&, rocksdb::CachableEntry<rocksdb::Block>*, rocksdb::UncompressionDict const&, rocksdb::BlockType, bool, rocksdb::GetContext*) const internal_repo_rocksdb/repo/table/block_based/block_based_table_reader.cc:1263 https://github.com/facebook/rocksdb/issues/13 0x8b9259 in rocksdb::Status rocksdb::BlockBasedTable::MaybeReadBlockAndLoadToCache<rocksdb::Block>(rocksdb::FilePrefetchBuffer*, rocksdb::ReadOptions const&, rocksdb::BlockHandle const&, rocksdb::UncompressionDict const&, bool, bool, rocksdb::CachableEntry<rocksdb::Block>*, rocksdb::BlockType, rocksdb::GetContext*, rocksdb::BlockCacheLookupContext*, rocksdb::BlockContents*, bool) const internal_repo_rocksdb/repo/table/block_based/block_based_table_reader.cc:1559 https://github.com/facebook/rocksdb/issues/14 0x8b710c in rocksdb::Status rocksdb::BlockBasedTable::RetrieveBlock<rocksdb::Block>(rocksdb::FilePrefetchBuffer*, rocksdb::ReadOptions const&, rocksdb::BlockHandle const&, rocksdb::UncompressionDict const&, rocksdb::CachableEntry<rocksdb::Block>*, rocksdb::BlockType, rocksdb::GetContext*, rocksdb::BlockCacheLookupContext*, bool, bool, bool, bool) const internal_repo_rocksdb/repo/table/block_based/block_based_table_reader.cc:1726 https://github.com/facebook/rocksdb/issues/15 0x8c329f in rocksdb::DataBlockIter* rocksdb::BlockBasedTable::NewDataBlockIterator<rocksdb::DataBlockIter>(rocksdb::ReadOptions const&, rocksdb::BlockHandle const&, rocksdb::DataBlockIter*, rocksdb::BlockType, rocksdb::GetContext*, rocksdb::BlockCacheLookupContext*, rocksdb::FilePrefetchBuffer*, bool, bool, rocksdb::Status&) const internal_repo_rocksdb/repo/table/block_based/block_based_table_reader_impl.h:58 https://github.com/facebook/rocksdb/issues/16 0x920117 in rocksdb::BlockBasedTableIterator::InitDataBlock() internal_repo_rocksdb/repo/table/block_based/block_based_table_iterator.cc:262 https://github.com/facebook/rocksdb/issues/17 0x920d42 in rocksdb::BlockBasedTableIterator::MaterializeCurrentBlock() internal_repo_rocksdb/repo/table/block_based/block_based_table_iterator.cc:332 https://github.com/facebook/rocksdb/issues/18 0xc6a201 in rocksdb::IteratorWrapperBase<rocksdb::Slice>::PrepareValue() internal_repo_rocksdb/repo/table/iterator_wrapper.h:78 https://github.com/facebook/rocksdb/issues/19 0xc6a201 in rocksdb::IteratorWrapperBase<rocksdb::Slice>::PrepareValue() internal_repo_rocksdb/repo/table/iterator_wrapper.h:78 https://github.com/facebook/rocksdb/issues/20 0xef9f6c in rocksdb::MergingIterator::PrepareValue() internal_repo_rocksdb/repo/table/merging_iterator.cc:260 https://github.com/facebook/rocksdb/issues/21 0xc6a201 in rocksdb::IteratorWrapperBase<rocksdb::Slice>::PrepareValue() internal_repo_rocksdb/repo/table/iterator_wrapper.h:78 https://github.com/facebook/rocksdb/issues/22 0xc67bcd in rocksdb::DBIter::FindNextUserEntryInternal(bool, rocksdb::Slice const*) internal_repo_rocksdb/repo/db/db_iter.cc:326 https://github.com/facebook/rocksdb/issues/23 0xc66d36 in rocksdb::DBIter::FindNextUserEntry(bool, rocksdb::Slice const*) internal_repo_rocksdb/repo/db/db_iter.cc:234 https://github.com/facebook/rocksdb/issues/24 0xc7ab47 in rocksdb::DBIter::Next() internal_repo_rocksdb/repo/db/db_iter.cc:161 https://github.com/facebook/rocksdb/issues/25 0x70d938 in rocksdb::BatchedOpsStressTest::TestPrefixScan(rocksdb::ThreadState*, rocksdb::ReadOptions const&, std::vector<int, std::allocator<int> > const&, std::vector<long, std::allocator<long> > const&) internal_repo_rocksdb/repo/db_stress_tool/batched_ops_stress.cc:320 https://github.com/facebook/rocksdb/issues/26 0x6dc6a8 in rocksdb::StressTest::OperateDb(rocksdb::ThreadState*) internal_repo_rocksdb/repo/db_stress_tool/db_stress_test_base.cc:907 https://github.com/facebook/rocksdb/issues/27 0x6867de in rocksdb::ThreadBody(void*) internal_repo_rocksdb/repo/db_stress_tool/db_stress_driver.cc:33 https://github.com/facebook/rocksdb/issues/28 0xce4cc2 in rocksdb::(anonymous namespace)::StartThreadWrapper(void*) internal_repo_rocksdb/repo/env/env_posix.cc:461 https://github.com/facebook/rocksdb/issues/29 0x7f23f9068c0e in start_thread /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/nptl/pthread_create.c:434:8 ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/10509 Test Plan: ``` $COMPILE_WITH_ASAN=1 make -j 24 $db_stress J=40 crash_test_with_txn ``` Reviewed By: siying Differential Revision: D38540648 Pulled By: gitbw95 fbshipit-source-id: 703948e3a7ba40828a6445d00f3e73c184e34bf7 |
|
Akanksha Mahajan | 563f574372 |
Disable subcompactions for user_defined_timestamp (#10503)
Summary: Currently user_defined_timestamp is failing in stress test with subcompactions. So disabling it for now and will re enable it once its fixed. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10503 Test Plan: make crash_test_with_ts -j32 Reviewed By: riversand963 Differential Revision: D38510485 Pulled By: akankshamahajan15 fbshipit-source-id: 82fd0ec8cf86a96ff6653edd5bad7623cb9e0a15 |
|
Jay Zhuang | 1e86d424e4 |
Tiered storage stress test (#10493)
Summary: Add Tiered storage stress test and db_bench option Pull Request resolved: https://github.com/facebook/rocksdb/pull/10493 Test Plan: new crashtest: https://app.circleci.com/pipelines/github/facebook/rocksdb/16905/workflows/68c2967c-9274-434f-8506-1403cf441ead Reviewed By: ajkr Differential Revision: D38481892 Pulled By: jay-zhuang fbshipit-source-id: 217a0be4acb93d420222e6ede2a1290d9f464776 |
|
gitbw95 | e1b176d274 |
Add CompressedSecondaryCache into stress test (#10442)
Summary: The secondary cache is randomly disabled or enabled with CompressedSecondaryCache. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10442 Test Plan: - To test that the CompressedSecondaryCache is used and the stress test runs successfully, run `make -j24 CRASH_TEST_EXT_ARGS=—duration=960 blackbox_crash_test ` Reviewed By: anand1976 Differential Revision: D38290796 Pulled By: gitbw95 fbshipit-source-id: bb7027b39e0ed9c0c62835abe09e759898130ec8 |
|
Guido Tagliavini Ponce | 9d7de6517c |
Towards a production-quality ClockCache (#10418)
Summary: In this PR we bring ClockCache closer to production quality. We implement the following changes: 1. Fixed a few bugs in ClockCache. 2. ClockCache now fully supports ``strict_capacity_limit == false``: When an insertion over capacity is commanded, we allocate a handle separately from the hash table. 3. ClockCache now runs on almost every test in cache_test. The only exceptions are a test where either the LRU policy is required, and a test that dynamically increases the table capacity. 4. ClockCache now supports dynamically decreasing capacity via SetCapacity. (This is easy: we shrink the capacity upper bound and run the clock algorithm.) 5. Old FastLRUCache tests in lru_cache_test.cc are now also used on ClockCache. As a byproduct of 1. and 2. we are able to turn on ClockCache in the stress tests. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10418 Test Plan: - ``make -j24 USE_CLANG=1 COMPILE_WITH_ASAN=1 COMPILE_WITH_UBSAN=1 check`` - ``make -j24 USE_CLANG=1 COMPILE_WITH_TSAN=1 check`` - ``make -j24 USE_CLANG=1 COMPILE_WITH_ASAN=1 COMPILE_WITH_UBSAN=1 CRASH_TEST_EXT_ARGS="--duration=960 --cache_type=clock_cache" blackbox_crash_test_with_atomic_flush`` - ``make -j24 USE_CLANG=1 COMPILE_WITH_TSAN=1 CRASH_TEST_EXT_ARGS="--duration=960 --cache_type=clock_cache" blackbox_crash_test_with_atomic_flush`` Reviewed By: pdillinger Differential Revision: D38170673 Pulled By: guidotag fbshipit-source-id: 508987b9dc9d9d68f1a03eefac769820b680340a |
|
Gang Liao | ec4ebeff30 |
Support prepopulating/warming the blob cache (#10298)
Summary: Many workloads have temporal locality, where recently written items are read back in a short period of time. When using remote file systems, this is inefficient since it involves network traffic and higher latencies. Because of this, we would like to support prepopulating the blob cache during flush. This task is a part of https://github.com/facebook/rocksdb/issues/10156 Pull Request resolved: https://github.com/facebook/rocksdb/pull/10298 Reviewed By: ltamasi Differential Revision: D37908743 Pulled By: gangliao fbshipit-source-id: 9feaed234bc719d38f0c02975c1ad19fa4bb37d1 |
|
Yanqin Jin | 7679f22a89 |
Add coverage for the combination of write-prepared and WAL recycling (#10350)
Summary: as title. Test plan - make check - CI on PR - TEST_TMPDIR=/dev/shm make crash_test_with_multiops_wp_txn (tested with successful run) Pull Request resolved: https://github.com/facebook/rocksdb/pull/10350 Reviewed By: ajkr Differential Revision: D37792872 Pulled By: riversand963 fbshipit-source-id: ff064093b7f715d0acf387af2e3ae87b1278b52b |
|
Yanqin Jin | 2f13f5f7d0 |
Add coverage for timestamped snapshot to MultiOpsTxnsStressTest (#10325)
Summary: As title. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10325 Test Plan: ```bash TEST_TMPDIR=/dev/shm/rocksdb/ make crash_test_with_multiops_wc_txn TEST_TMPDIR=/dev/shm/rocksdb/ make crash_test_with_txn ``` Reviewed By: akankshamahajan15 Differential Revision: D37688742 Pulled By: riversand963 fbshipit-source-id: e198ace921898af63f99e869568c1a7bbf69f1a4 |
|
zczhu | e716bda010 |
Add FLAGS_compaction_pri into crash_test (#10255)
Summary: Add FLAGS_compaction_pri into correctness test Pull Request resolved: https://github.com/facebook/rocksdb/pull/10255 Test Plan: run crash_test with FLAGS_compaction_pri Reviewed By: ajkr Differential Revision: D37510372 Pulled By: littlepig2013 fbshipit-source-id: 73d93a0a047d0c3993c8a512383dd6ee6acef641 |
|
Guido Tagliavini Ponce | 57a0e2f304 |
Clock cache (#10273)
Summary: This is the initial step in the development of a lock-free clock cache. This PR includes the base hash table design (which we mostly ported over from FastLRUCache) and the clock eviction algorithm. Importantly, it's still _not_ lock-free---all operations use a shard lock. Besides the locking, there are other features left as future work: - Remove keys from the handles. Instead, use 128-bit bijective hashes of them for handle comparisons, probing (we need two 32-bit hashes of the key for double hashing) and sharding (we need one 6-bit hash). - Remove the clock_usage_ field, which is updated on every lookup. Even if it were atomically updated, it could cause memory invalidations across cores. - Middle insertions into the clock list. - A test that exercises the clock eviction policy. - Update the Java API of ClockCache and Java calls to C++. Along the way, we improved the code and comments quality of FastLRUCache. These changes are relatively minor. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10273 Test Plan: ``make -j24 check`` Reviewed By: pdillinger Differential Revision: D37522461 Pulled By: guidotag fbshipit-source-id: 3d70b737dbb70dcf662f00cef8c609750f083943 |
|
Yanqin Jin | d3de59255a |
Enable compaction filter for db_stress with user-defined timestamp (#10259)
Summary: Before this PR, when user-defined timestamp is enabled, db_stress disables compaction filter. This is no longer necessary after this PR, since the `DbStressCompactionFilter` is now aware of the presence of timestamps. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10259 Test Plan: TEST_TMPDIR=/dev/shm make crash_test_with_ts Reviewed By: ajkr Differential Revision: D37459692 Pulled By: riversand963 fbshipit-source-id: 8fe62e90a63bd9317fe1bb95a2b4984080c9e5ef |
|
Andrew Kryczka | f322f273b0 |
Temporarily disable mempurge in crash test (#10252)
Summary: Need to disable it for now as CI is failing, particularly `MultiOpsTxnsStressTest`. Investigation details in internal task T124324915. This PR disables mempurge more widely than `MultiOpsTxnsStressTest` until we know the issue is contained to that particular test. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10252 Reviewed By: riversand963 Differential Revision: D37432948 Pulled By: ajkr fbshipit-source-id: d0cf5b0e0ec7c3142c382a0347f35a4c34f4607a |
|
Gang Liao | 2352e2dfda |
Add the blob cache to the stress tests and the benchmarking tool (#10202)
Summary: In order to facilitate correctness and performance testing, we would like to add the new blob cache to our stress test tool `db_stress` and our continuously running crash test script `db_crashtest.py`, as well as our synthetic benchmarking tool `db_bench` and the BlobDB performance testing script `run_blob_bench.sh`. As part of this task, we would also like to utilize these benchmarking tools to get some initial performance numbers about the effectiveness of caching blobs. This PR is a part of https://github.com/facebook/rocksdb/issues/10156 Pull Request resolved: https://github.com/facebook/rocksdb/pull/10202 Reviewed By: ltamasi Differential Revision: D37325739 Pulled By: gangliao fbshipit-source-id: deb65d0d414502270dd4c324d987fd5469869fa8 |
|
Peter Dillinger | 84210c9489 |
Add data block hash index to crash test, fix MultiGet issue (#10220)
Summary: There was a bug in the MultiGet enhancement in https://github.com/facebook/rocksdb/issues/9899 with data block hash index, which was not caught because data block hash index was never added to stress tests. This change fixes both issues. Fixes https://github.com/facebook/rocksdb/issues/10186 I intend to pick this into the 7.4.0 release candidate Pull Request resolved: https://github.com/facebook/rocksdb/pull/10220 Test Plan: Failure quickly reproduces in crash test with kDataBlockBinaryAndHash, and does not seem to with the fix. Reproducing the failure with a unit test I believe would be too tricky and fragile to be worthwhile. Reviewed By: anand1976 Differential Revision: D37315647 Pulled By: pdillinger fbshipit-source-id: 9f648265bba867275edc752f7a56611a59401cba |
|
Guido Tagliavini Ponce | 3afed7408c |
Replace per-shard chained hash tables with open-addressing scheme (#10194)
Summary: In FastLRUCache, we replace the current chained per-shard hash table by an open-addressing hash table. In particular, this allows us to preallocate all handles. Because all handles are preallocated, this implementation doesn't support strict_capacity_limit = false (i.e., allowing insertions beyond the predefined capacity). This clashes with current assumptions of some tests, namely two tests in cache_test and the crash tests. We have disabled these for now. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10194 Test Plan: ``make -j24 check`` Reviewed By: pdillinger Differential Revision: D37296770 Pulled By: guidotag fbshipit-source-id: 232ff1b8260331d868ebf4e3e5d8ad709390b0ad |
|
Andrew Kryczka | 5d6005c780 |
Add WriteOptions::protection_bytes_per_key (#10037)
Summary: Added an option, `WriteOptions::protection_bytes_per_key`, that controls how many bytes per key we use for integrity protection in `WriteBatch`. It takes effect when `WriteBatch::GetProtectionBytesPerKey() == 0`. Currently the only supported value is eight. Invoking a user API with it set to any other nonzero value will result in `Status::NotSupported` returned to the user. There is also a bug fix for integrity protection with `inplace_callback`, where we forgot to take into account the possible change in varint length when calculating KV checksum for the final encoded buffer. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10037 Test Plan: - Manual - Set default value of `WriteOptions::protection_bytes_per_key` to eight and ran `make check -j24` - Enabled in MyShadow for 1+ week - Automated - Unit tests have a `WriteMode` that enables the integrity protection via `WriteOptions` - Crash test - in most cases, use `WriteOptions::protection_bytes_per_key` to enable integrity protection Reviewed By: cbi42 Differential Revision: D36614569 Pulled By: ajkr fbshipit-source-id: 8650087ceac9b61b560f1e5fafe5e1baf9c725fb |