2019-12-09 07:49:32 +00:00
|
|
|
// Copyright (c) 2011-present, Facebook, Inc. All rights reserved.
|
|
|
|
// This source code is licensed under both the GPLv2 (found in the
|
|
|
|
// COPYING file in the root directory) and Apache 2.0 License
|
|
|
|
// (found in the LICENSE.Apache file in the root directory).
|
|
|
|
//
|
|
|
|
// Copyright (c) 2011 The LevelDB Authors. All rights reserved.
|
|
|
|
// Use of this source code is governed by a BSD-style license that can be
|
|
|
|
// found in the LICENSE file. See the AUTHORS file for names of contributors.
|
|
|
|
|
Inject more errors to more files in stress test (#12713)
Summary:
**Context:**
We currently have partial error injection:
- DB operation: all read, SST write
- DB open: all read, SST write, all metadata write.
This PR completes the error injection (with some limitations below):
- DB operation & open: all read, all write, all metadata write, all metadata read
**Summary:**
- Inject retryable metadata read, metadata write error concerning directory (e.g, dir sync, ) or file metadata (e.g, name, size, file creation/deletion...)
- Inject retryable errors to all major file types: random access file, sequential file, writable file
- Allow db stress test operations to handle above injected errors gracefully without crashing
- Change all error injection to thread-local implementation for easier disabling and enabling in the same thread. For example, we can control error handling thread to have no error injection. It's also cleaner in code.
- Limitation: compared to before, we now don't have write fault injection for backup/restore CopyOrCreateFiles work threads since they use anonymous background threads as well as read injection for db open bg thread
- Add a new flag to test error recovery without error injection so we can test the path where error recovery actually succeeds
- Some Refactory & fix to db stress test framework (see PR review comments)
- Fix some minor bugs surfaced (see PR review comments)
- Limitation: had to disable backup restore with metadata read/write injection since it surfaces too many testing issues. Will add it back later to focus on surfacing actual code/internal bugs first.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/12713
Test Plan:
- Existing UT
- CI with no trivial error failure
Reviewed By: pdillinger
Differential Revision: D58326608
Pulled By: hx235
fbshipit-source-id: 011b5195aaeb6011641ae0a9194f7f2a0e325ad7
2024-06-19 15:42:00 +00:00
|
|
|
#include "rocksdb/io_status.h"
|
2019-12-09 07:49:32 +00:00
|
|
|
#ifdef GFLAGS
|
|
|
|
#pragma once
|
2022-10-06 22:07:16 +00:00
|
|
|
|
2019-12-09 07:49:32 +00:00
|
|
|
#include "db_stress_tool/db_stress_common.h"
|
|
|
|
#include "db_stress_tool/db_stress_shared_state.h"
|
2024-06-18 23:16:09 +00:00
|
|
|
#include "rocksdb/experimental.h"
|
2019-12-09 07:49:32 +00:00
|
|
|
|
2020-02-20 20:07:53 +00:00
|
|
|
namespace ROCKSDB_NAMESPACE {
|
2021-03-15 11:32:24 +00:00
|
|
|
class SystemClock;
|
2019-12-09 07:49:32 +00:00
|
|
|
class Transaction;
|
|
|
|
class TransactionDB;
|
2023-06-17 23:27:37 +00:00
|
|
|
class OptimisticTransactionDB;
|
2022-04-28 00:50:54 +00:00
|
|
|
struct TransactionDBOptions;
|
2024-06-18 23:16:09 +00:00
|
|
|
using experimental::SstQueryFilterConfigsManager;
|
2019-12-09 07:49:32 +00:00
|
|
|
|
|
|
|
class StressTest {
|
|
|
|
public:
|
|
|
|
StressTest();
|
|
|
|
|
2024-07-09 23:51:38 +00:00
|
|
|
virtual ~StressTest() {}
|
2019-12-09 07:49:32 +00:00
|
|
|
|
2021-06-28 06:53:47 +00:00
|
|
|
std::shared_ptr<Cache> NewCache(size_t capacity, int32_t num_shard_bits);
|
2019-12-09 07:49:32 +00:00
|
|
|
|
2021-02-02 19:39:20 +00:00
|
|
|
static std::vector<std::string> GetBlobCompressionTags();
|
|
|
|
|
2019-12-09 07:49:32 +00:00
|
|
|
bool BuildOptionsTable();
|
|
|
|
|
2022-05-02 20:25:45 +00:00
|
|
|
void InitDb(SharedState*);
|
2020-06-18 16:51:14 +00:00
|
|
|
// The initialization work is split into two parts to avoid a circular
|
|
|
|
// dependency with `SharedState`.
|
2021-12-14 21:33:16 +00:00
|
|
|
virtual void FinishInitDb(SharedState*);
|
2022-01-31 21:31:00 +00:00
|
|
|
void TrackExpectedState(SharedState* shared);
|
2019-12-09 07:49:32 +00:00
|
|
|
void OperateDb(ThreadState* thread);
|
|
|
|
virtual void VerifyDb(ThreadState* thread) const = 0;
|
2022-06-08 04:07:47 +00:00
|
|
|
virtual void ContinuouslyVerifyDb(ThreadState* /*thread*/) const = 0;
|
2019-12-09 07:49:32 +00:00
|
|
|
void PrintStatistics();
|
Decouple sync fault and write injection in FaultInjectionTestFS & fix tracing issue under WAL write error injection (#12797)
Summary:
**Context/Summary:**
After injecting write error to WAL, we started to see crash recovery verification failure in prefix recovery. That's because the current tracing implementation traces every write before it writes to WAL even when the WAL write can fail with write error injection. One consequence of that is the traced writes in trace files does not corresponding to write sequence sequence anymore e.g, it has more traced writes that the actual assigned sequence number to successful writes. Therefore https://github.com/facebook/rocksdb/blob/b4a84efb4e842b782e976de5b22a4554c2f76edd/db_stress_tool/expected_state.cc#L674 won't restore the ExpectedState to the correct sequence number we want.
Ideally, we should have a prepare-commit mechanism for tracing just like our ExpectedState so we can ignore the traced write if the write fails later. But for now, to simplify, we simply don't inject WAL error (and metadata write error cuz it could fail write when sync WAL dir fails)
To do so, we need to be able to exclude WAL from write injection but still allow sync fault injection in it to maintain its original sync fault testing coverage. This prompts us to decouple sync fault and write injection in FaultInjectionTestFS. And this is what this PR mainly about.
So now `FaultInjectionTestFS` works as the following:
- If direct_writable is true, then `FaultInjectionTestFS` is bypassed for writable file
- Otherwise, FaultInjectionTestFS` can buffer data for sync fault injection (if inject_unsynced_data_loss_ == true, global settings) and/or inject write error (if MaybeInjectThreadLocalError(), thread-local settings). WAL file can be optionally excluded from write injection
Bonus: better naming of relevant variables
Pull Request resolved: https://github.com/facebook/rocksdb/pull/12797
Test Plan:
- The follow commands failed before this fix but passes after
```
python3 tools/db_crashtest.py --simple blackbox \
--interval=5 \
--preserve_unverified_changes=1 \
--threads=32 \
--disable_auto_compactions=1 \
--WAL_size_limit_MB=0 --WAL_ttl_seconds=0 --acquire_snapshot_one_in=0 --adaptive_readahead=0 --adm_policy=0 --advise_random_on_open=1 --allow_concurrent_memtable_write=0 --allow_data_in_errors=True --allow_fallocate=1 --async_io=0 --auto_readahead_size=0 --avoid_flush_during_recovery=1 --avoid_flush_during_shutdown=0 --avoid_unnecessary_blocking_io=0 --backup_max_size=104857600 --backup_one_in=0 --batch_protection_bytes_per_key=0 --bgerror_resume_retry_interval=1000000 --block_align=0 --block_protection_bytes_per_key=4 --block_size=16384 --bloom_before_level=2147483646 --bloom_bits=3.2003682301518492 --bottommost_compression_type=zlib --bottommost_file_compaction_delay=600 --bytes_per_sync=0 --cache_index_and_filter_blocks=1 --cache_index_and_filter_blocks_with_high_priority=1 --cache_size=33554432 --cache_type=fixed_hyper_clock_cache --charge_compression_dictionary_building_buffer=0 --charge_file_metadata=0 --charge_filter_construction=0 --charge_table_reader=1 --check_multiget_consistency=0 --check_multiget_entity_consistency=0 --checkpoint_one_in=0 --checksum_type=kxxHash64 --clear_column_family_one_in=0 --column_families=1 --compact_files_one_in=0 --compact_range_one_in=0 --compaction_pri=2 --compaction_readahead_size=0 --compaction_ttl=0 --compress_format_version=1 --compressed_secondary_cache_size=16777216 --compression_checksum=1 --compression_max_dict_buffer_bytes=549755813887 --compression_max_dict_bytes=16384 --compression_parallel_threads=1 --compression_type=none --compression_use_zstd_dict_trainer=1 --compression_zstd_max_train_bytes=0 --continuous_verification_interval=0 --daily_offpeak_time_utc=00:00-23:59 --data_block_index_type=0 \
--db_write_buffer_size=0 --delete_obsolete_files_period_micros=0 --delpercent=0 --delrangepercent=0 --destroy_db_initially=0 --detect_filter_construct_corruption=0 --disable_file_deletions_one_in=0 --disable_manual_compaction_one_in=0 --disable_wal=0 --dump_malloc_stats=0 --enable_checksum_handoff=0 --enable_compaction_filter=0 --enable_custom_split_merge=0 --enable_do_not_compress_roles=1 --enable_index_compression=0 --enable_memtable_insert_with_hint_prefix_extractor=0 --enable_pipelined_write=0 --enable_sst_partitioner_factory=0 --enable_thread_tracking=0 --enable_write_thread_adaptive_yield=0 --error_recovery_with_no_fault_injection=0 --fail_if_options_file_error=0 --fifo_allow_compaction=1 --file_checksum_impl=xxh64 --fill_cache=0 --flush_one_in=100 --format_version=4 --get_all_column_family_metadata_one_in=0 --get_current_wal_file_one_in=0 --get_live_files_apis_one_in=0 --get_properties_of_all_tables_one_in=0 --get_property_one_in=0 --get_sorted_wal_files_one_in=0 --hard_pending_compaction_bytes_limit=274877906944 --high_pri_pool_ratio=0.5 --index_block_restart_interval=9 --index_shortening=1 --index_type=0 --ingest_external_file_one_in=0 --initial_auto_readahead_size=0 --inplace_update_support=0 --iterpercent=0 --key_len_percent_dist=1,30,69 --key_may_exist_one_in=0 --last_level_temperature=kUnknown --level_compaction_dynamic_level_bytes=1 --lock_wal_one_in=0 --log2_keys_per_lock=10 --log_file_time_to_roll=0 --log_readahead_size=16777216 --long_running_snapshots=0 --low_pri_pool_ratio=0 --lowest_used_cache_tier=2 --manifest_preallocation_size=0 --manual_wal_flush_one_in=0 --mark_for_compaction_one_file_in=0 --max_auto_readahead_size=524288 --max_background_compactions=1 --max_bytes_for_level_base=67108864 --max_key=1000 --max_key_len=3 --memtable_insert_hint_per_batch=0 --memtable_max_range_deletions=0 --memtable_prefix_bloom_size_ratio=0.5 --memtable_protection_bytes_per_key=8 --memtable_whole_key_filtering=0 --memtablerep=skip_list --metadata_charge_policy=0 --metadata_read_fault_one_in=0 --metadata_write_fault_one_in=0 --min_write_buffer_number_to_merge=1 --mmap_read=0 --mock_direct_io=False --nooverwritepercent=1 --num_file_reads_for_auto_readahead=0 --open_files=-1 --open_metadata_read_fault_one_in=0 --open_metadata_write_fault_one_in=0 --open_read_fault_one_in=0 --open_write_fault_one_in=0 --ops_per_thread=20000000 \
--optimize_filters_for_hits=1 --optimize_filters_for_memory=1 --optimize_multiget_for_io=0 --paranoid_file_checks=1 --partition_filters=0 --partition_pinning=3 --pause_background_one_in=0 --periodic_compaction_seconds=0 --prefix_size=1 --prefixpercent=0 --prepopulate_block_cache=0 --preserve_internal_time_seconds=0 --progress_reports=0 --promote_l0_one_in=0 --read_amp_bytes_per_bit=0 --read_fault_one_in=0 --readahead_size=0 --readpercent=0 --recycle_log_file_num=0 --reopen=0 --report_bg_io_stats=0 --reset_stats_one_in=1000000 --sample_for_compression=5 --secondary_cache_fault_one_in=0 --secondary_cache_uri= --skip_stats_update_on_db_open=0 --snapshot_hold_ops=100000 --soft_pending_compaction_bytes_limit=68719476736 --sqfc_name=bar --sqfc_version=1 --sst_file_manager_bytes_per_sec=0 --sst_file_manager_bytes_per_truncate=0 --stats_dump_period_sec=10 --stats_history_buffer_size=0 --strict_bytes_per_sync=0 --subcompactions=1 --sync=0 --sync_fault_injection=1 --table_cache_numshardbits=0 --target_file_size_base=16777216 --target_file_size_multiplier=1 --test_batches_snapshots=0 --top_level_index_pinning=3 --uncache_aggressiveness=9890 --universal_max_read_amp=-1 --unpartitioned_pinning=3 --use_adaptive_mutex=0 --use_adaptive_mutex_lru=1 --use_attribute_group=0 --use_delta_encoding=0 --use_direct_io_for_flush_and_compaction=0 --use_direct_reads=0 --use_full_merge_v1=0 --use_get_entity=0 --use_merge=0 --use_multi_cf_iterator=0 --use_multi_get_entity=0 --use_multiget=0 --use_put_entity_one_in=0 --use_sqfc_for_range_queries=0 --use_timed_put_one_in=0 --use_write_buffer_manager=0 --user_timestamp_size=0 --value_size_mult=32 --verification_only=0 --verify_checksum=0 --verify_checksum_one_in=0 --verify_compression=1 --verify_db_one_in=0 --verify_file_checksums_one_in=0 --verify_iterator_with_expected_state_one_in=5 --verify_sst_unique_id_in_manifest=1 --wal_bytes_per_sync=0 --wal_compression=zstd --write_buffer_size=335544320 --write_dbid_to_manifest=1 --write_fault_one_in=100 --writepercent=100
```
- CI
Reviewed By: cbi42
Differential Revision: D58917145
Pulled By: hx235
fbshipit-source-id: b6397036bea035a92341c2b05fb01872db2153d7
2024-06-26 21:56:35 +00:00
|
|
|
bool MightHaveUnsyncedDataLoss() {
|
|
|
|
return FLAGS_sync_fault_injection || FLAGS_disable_wal ||
|
Inject more errors to more files in stress test (#12713)
Summary:
**Context:**
We currently have partial error injection:
- DB operation: all read, SST write
- DB open: all read, SST write, all metadata write.
This PR completes the error injection (with some limitations below):
- DB operation & open: all read, all write, all metadata write, all metadata read
**Summary:**
- Inject retryable metadata read, metadata write error concerning directory (e.g, dir sync, ) or file metadata (e.g, name, size, file creation/deletion...)
- Inject retryable errors to all major file types: random access file, sequential file, writable file
- Allow db stress test operations to handle above injected errors gracefully without crashing
- Change all error injection to thread-local implementation for easier disabling and enabling in the same thread. For example, we can control error handling thread to have no error injection. It's also cleaner in code.
- Limitation: compared to before, we now don't have write fault injection for backup/restore CopyOrCreateFiles work threads since they use anonymous background threads as well as read injection for db open bg thread
- Add a new flag to test error recovery without error injection so we can test the path where error recovery actually succeeds
- Some Refactory & fix to db stress test framework (see PR review comments)
- Fix some minor bugs surfaced (see PR review comments)
- Limitation: had to disable backup restore with metadata read/write injection since it surfaces too many testing issues. Will add it back later to focus on surfacing actual code/internal bugs first.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/12713
Test Plan:
- Existing UT
- CI with no trivial error failure
Reviewed By: pdillinger
Differential Revision: D58326608
Pulled By: hx235
fbshipit-source-id: 011b5195aaeb6011641ae0a9194f7f2a0e325ad7
2024-06-19 15:42:00 +00:00
|
|
|
FLAGS_manual_wal_flush_one_in > 0;
|
|
|
|
}
|
2024-08-15 19:32:59 +00:00
|
|
|
Status EnableAutoCompaction() {
|
|
|
|
assert(options_.disable_auto_compactions);
|
|
|
|
Status s = db_->EnableAutoCompaction(column_families_);
|
|
|
|
return s;
|
|
|
|
}
|
2024-07-09 23:51:38 +00:00
|
|
|
void CleanUp();
|
|
|
|
|
2019-12-09 07:49:32 +00:00
|
|
|
protected:
|
Inject more errors to more files in stress test (#12713)
Summary:
**Context:**
We currently have partial error injection:
- DB operation: all read, SST write
- DB open: all read, SST write, all metadata write.
This PR completes the error injection (with some limitations below):
- DB operation & open: all read, all write, all metadata write, all metadata read
**Summary:**
- Inject retryable metadata read, metadata write error concerning directory (e.g, dir sync, ) or file metadata (e.g, name, size, file creation/deletion...)
- Inject retryable errors to all major file types: random access file, sequential file, writable file
- Allow db stress test operations to handle above injected errors gracefully without crashing
- Change all error injection to thread-local implementation for easier disabling and enabling in the same thread. For example, we can control error handling thread to have no error injection. It's also cleaner in code.
- Limitation: compared to before, we now don't have write fault injection for backup/restore CopyOrCreateFiles work threads since they use anonymous background threads as well as read injection for db open bg thread
- Add a new flag to test error recovery without error injection so we can test the path where error recovery actually succeeds
- Some Refactory & fix to db stress test framework (see PR review comments)
- Fix some minor bugs surfaced (see PR review comments)
- Limitation: had to disable backup restore with metadata read/write injection since it surfaces too many testing issues. Will add it back later to focus on surfacing actual code/internal bugs first.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/12713
Test Plan:
- Existing UT
- CI with no trivial error failure
Reviewed By: pdillinger
Differential Revision: D58326608
Pulled By: hx235
fbshipit-source-id: 011b5195aaeb6011641ae0a9194f7f2a0e325ad7
2024-06-19 15:42:00 +00:00
|
|
|
static int GetMinInjectedErrorCount(int error_count_1, int error_count_2) {
|
|
|
|
if (error_count_1 > 0 && error_count_2 > 0) {
|
|
|
|
return std::min(error_count_1, error_count_2);
|
|
|
|
} else if (error_count_1 > 0) {
|
|
|
|
return error_count_1;
|
|
|
|
} else if (error_count_2 > 0) {
|
|
|
|
return error_count_2;
|
|
|
|
} else {
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
}
|
2024-07-29 20:51:49 +00:00
|
|
|
|
Fix false-positive TestBackupRestore corruption (#12917)
Summary:
**Context:**
https://github.com/facebook/rocksdb/pull/12838 allows a write thread encountered certain injected error to release the lock and sleep before retrying write in order to reduce performance cost. This requires adding checks like [this](https://github.com/facebook/rocksdb/blob/b26b395e0a15255d322be08110db551976188745/db_stress_tool/expected_value.cc#L29-L31) to prevent writing to the same key from another thread.
The added check causes a false-positive failure when delete range + file ingestion + backup is used. Consider the following scenario:
(1) Issue a delete range covering some key that do not exist and a key does exist (named as k1). k1 will have "pending delete" state while the keys that does not exit will have whatever state they already have since we don't delete a key that does not exist already.
(2) After https://github.com/facebook/rocksdb/pull/12838, `PrepareDeleteRange(... &prepared)` will return `prepared = false`. So below logic will be executed and k1's "pending delete" won't get roll-backed nor committed.
```
std::vector<PendingExpectedValue> pending_expected_values =
shared->PrepareDeleteRange(rand_column_family, rand_key,
rand_key + FLAGS_range_deletion_width,
&prepared);
if (!prepared) {
for (PendingExpectedValue& pending_expected_value :
pending_expected_values) {
pending_expected_value.PermitUnclosedPendingState();
}
return s;
}
```
(3) Issue an file ingestion covering k1 and another key k2. Similar to (2), we will have `shared->PreparePut(column_family, key, &prepared)` return `prepared = false` for k1 while k2 will have a "pending put" state. So below logic will be executed and k2's "pending put" state won't get roll-backed nor committed.
```
for (int64_t key = key_base;
s.ok() && key < shared->GetMaxKey() &&
static_cast<int32_t>(keys.size()) < FLAGS_ingest_external_file_width;
++key)
PendingExpectedValue pending_expected_value =
shared->PreparePut(column_family, key, &prepared);
if (!prepared) {
pending_expected_value.PermitUnclosedPendingState();
for (PendingExpectedValue& pev : pending_expected_values) {
pev.PermitUnclosedPendingState();
}
return;
}
}
```
(4) Issue a backup and verify on k2. Below logic decides that k2 should exist in restored DB since it has a pending write state while k2 is never ingested into the original DB as (3) returns early.
```
bool Exists() const { return PendingPut() || !IsDeleted(); }
TestBackupRestore() {
...
Status get_status = restored_db->Get(
read_opts, restored_cf_handles[rand_column_families[i]], key,
&restored_value);
bool exists = thread->shared->Exists(rand_column_families[i], rand_keys[0]);
if (get_status.ok()) {
if (!exists && from_latest && ShouldAcquireMutexOnKey()) {
std::ostringstream oss;
oss << "0x" << key.ToString(true)
<< " exists in restore but not in original db";
s = Status::Corruption(oss.str());
}
} else if (get_status.IsNotFound()) {
if (exists && from_latest && ShouldAcquireMutexOnKey()) {
std::ostringstream oss;
oss << "0x" << key.ToString(true)
<< " exists in original db but not in restore";
s = Status::Corruption(oss.str());
}
}
...
}
```
So we see false-positive corruption like `Failure in a backup/restore operation with: Corruption: 0x000000000000017B0000000000000073787878 exists in original db but not in restore`
A simple fix is to remove `PendingPut()` from `bool Exists() ` since it's called under a lock and should never see a pending write. However, in order for "under a lock and should never see a pending write" to be true, we need to remove the logic of releasing the lock during sleep in the write thread, which expose pending write to other thread that can call Exists() like back up thread.
The downside of holding lock during sleep is blocking other write thread of the same key to proceed cuz they need to wait for the lock. This should happen rarely as the key of a thread is selected randomly in crash test like below.
```
void StressTest::OperateDb(ThreadState* thread) {
for (uint64_t i = 0; i < ops_per_open; i++) {
...
int64_t rand_key = GenerateOneKey(thread, i);
...
}
}
```
**Summary:**
- Removed the "lock release" part and related checks
- Printed recovery time if the write thread waited more than 10 seconds
- Reverted regression in testing coverage when deleting a non-existent key
Pull Request resolved: https://github.com/facebook/rocksdb/pull/12917
Test Plan:
Below command repro-ed frequently before the fix and not after.
```
./db_stress --WAL_size_limit_MB=1 --WAL_ttl_seconds=60 --acquire_snapshot_one_in=0 --adaptive_readahead=0 --adm_policy=1 --advise_random_on_open=1 --allow_concurrent_memtable_write=0 --allow_data_in_errors=True --allow_fallocate=0 --allow_setting_blob_options_dynamically=1 --async_io=0 --auto_readahead_size=1 --avoid_flush_during_recovery=0 --avoid_flush_during_shutdown=0 --avoid_unnecessary_blocking_io=0 --backup_max_size=104857600 --backup_one_in=100000 --batch_protection_bytes_per_key=0 --bgerror_resume_retry_interval=100 --blob_cache_size=8388608 --blob_compaction_readahead_size=1048576 --blob_compression_type=none --blob_file_size=1073741824 --blob_file_starting_level=1 --blob_garbage_collection_age_cutoff=0.0 --blob_garbage_collection_force_threshold=0.75 --block_align=0 --block_protection_bytes_per_key=8 --block_size=16384 --bloom_before_level=2147483647 --bloom_bits=16.216959977115277 --bottommost_compression_type=xpress --bottommost_file_compaction_delay=600 --bytes_per_sync=262144 --cache_index_and_filter_blocks=1 --cache_index_and_filter_blocks_with_high_priority=1 --cache_size=8388608 --cache_type=lru_cache --charge_compression_dictionary_building_buffer=1 --charge_file_metadata=0 --charge_filter_construction=0 --charge_table_reader=1 --check_multiget_consistency=0 --check_multiget_entity_consistency=0 --checkpoint_one_in=1000000 --checksum_type=kXXH3 --clear_column_family_one_in=0 --column_families=1 --compact_files_one_in=1000 --compact_range_one_in=0 --compaction_pri=3 --compaction_readahead_size=0 --compaction_ttl=10 --compress_format_version=2 --compressed_secondary_cache_size=8388608 --compression_checksum=0 --compression_max_dict_buffer_bytes=2097151 --compression_max_dict_bytes=16384 --compression_parallel_threads=1 --compression_type=zlib --compression_use_zstd_dict_trainer=0 --compression_zstd_max_train_bytes=0 --continuous_verification_interval=0 --daily_offpeak_time_utc=04:00-08:00 --data_block_index_type=0 --db=/dev/shm/rocksdb_test/rocksdb_crashtest_whitebox --db_write_buffer_size=0 --default_temperature=kUnknown --default_write_temperature=kWarm --delete_obsolete_files_period_micros=21600000000 --delpercent=0 --delrangepercent=5 --destroy_db_initially=0 --detect_filter_construct_corruption=1 --disable_file_deletions_one_in=10000 --disable_manual_compaction_one_in=1000000 --disable_wal=0 --dump_malloc_stats=0 --enable_blob_files=0 --enable_blob_garbage_collection=1 --enable_checksum_handoff=1 --enable_compaction_filter=1 --enable_custom_split_merge=1 --enable_do_not_compress_roles=0 --enable_index_compression=1 --enable_memtable_insert_with_hint_prefix_extractor=0 --enable_pipelined_write=1 --enable_sst_partitioner_factory=1 --enable_thread_tracking=0 --enable_write_thread_adaptive_yield=0 --error_recovery_with_no_fault_injection=1 --exclude_wal_from_write_fault_injection=1 --expected_values_dir=/dev/shm/rocksdb_test/rocksdb_crashtest_expected --fail_if_options_file_error=0 --fifo_allow_compaction=1 --file_checksum_impl=big --fill_cache=1 --flush_one_in=1000000 --format_version=2 --get_all_column_family_metadata_one_in=10000 --get_current_wal_file_one_in=0 --get_live_files_apis_one_in=1000000 --get_properties_of_all_tables_one_in=100000 --get_property_one_in=100000 --get_sorted_wal_files_one_in=0 --hard_pending_compaction_bytes_limit=2097152 --high_pri_pool_ratio=0.5 --index_block_restart_interval=1 --index_shortening=2 --index_type=0 --ingest_external_file_one_in=1000 --initial_auto_readahead_size=0 --inplace_update_support=0 --iterpercent=0 --key_len_percent_dist=1,30,69 --key_may_exist_one_in=100 --last_level_temperature=kUnknown --level_compaction_dynamic_level_bytes=0 --lock_wal_one_in=10000 --log2_keys_per_lock=10 --log_file_time_to_roll=0 --log_readahead_size=0 --long_running_snapshots=1 --low_pri_pool_ratio=0.5 --lowest_used_cache_tier=1 --manifest_preallocation_size=0 --manual_wal_flush_one_in=0 --mark_for_compaction_one_file_in=10 --max_auto_readahead_size=16384 --max_background_compactions=1 --max_bytes_for_level_base=67108864 --max_key=100000 --max_key_len=3 --max_log_file_size=1048576 --max_manifest_file_size=1073741824 --max_sequential_skip_in_iterations=16 --max_total_wal_size=0 --max_write_batch_group_size_bytes=16 --max_write_buffer_number=10 --max_write_buffer_size_to_maintain=8388608 --memtable_insert_hint_per_batch=1 --memtable_max_range_deletions=1000 --memtable_prefix_bloom_size_ratio=0.001 --memtable_protection_bytes_per_key=4 --memtable_whole_key_filtering=1 --memtablerep=skip_list --metadata_charge_policy=1 --metadata_read_fault_one_in=0 --metadata_write_fault_one_in=0 --min_blob_size=16 --min_write_buffer_number_to_merge=2 --mmap_read=0 --mock_direct_io=False --nooverwritepercent=1 --num_file_reads_for_auto_readahead=0 --open_files=-1 --open_metadata_read_fault_one_in=0 --open_metadata_write_fault_one_in=0 --open_read_fault_one_in=0 --open_write_fault_one_in=0 --ops_per_thread=20000000 --optimize_filters_for_hits=1 --optimize_filters_for_memory=0 --optimize_multiget_for_io=1 --paranoid_file_checks=1 --partition_filters=0 --partition_pinning=1 --pause_background_one_in=10000 --periodic_compaction_seconds=10 --prefix_size=8 --prefixpercent=0 --prepopulate_blob_cache=1 --prepopulate_block_cache=1 --preserve_internal_time_seconds=0 --progress_reports=0 --promote_l0_one_in=0 --read_amp_bytes_per_bit=0 --read_fault_one_in=0 --readahead_size=524288 --readpercent=60 --recycle_log_file_num=1 --reopen=20 --report_bg_io_stats=0 --reset_stats_one_in=1000000 --sample_for_compression=5 --secondary_cache_fault_one_in=0 --secondary_cache_uri= --skip_stats_update_on_db_open=1 --snapshot_hold_ops=100000 --soft_pending_compaction_bytes_limit=68719476736 --sqfc_name=foo --sqfc_version=1 --sst_file_manager_bytes_per_sec=0 --sst_file_manager_bytes_per_truncate=0 --stats_dump_period_sec=10 --stats_history_buffer_size=1048576 --strict_bytes_per_sync=1 --subcompactions=2 --sync=0 --sync_fault_injection=0 --table_cache_numshardbits=0 --target_file_size_base=16777216 --target_file_size_multiplier=1 --test_batches_snapshots=0 --top_level_index_pinning=3 --uncache_aggressiveness=118 --universal_max_read_amp=-1 --unpartitioned_pinning=0 --use_adaptive_mutex=0 --use_adaptive_mutex_lru=1 --use_attribute_group=0 --use_blob_cache=0 --use_delta_encoding=1 --use_direct_io_for_flush_and_compaction=0 --use_direct_reads=0 --use_full_merge_v1=0 --use_get_entity=0 --use_merge=0 --use_multi_cf_iterator=0 --use_multi_get_entity=0 --use_multiget=1 --use_put_entity_one_in=0 --use_shared_block_and_blob_cache=1 --use_sqfc_for_range_queries=1 --use_timed_put_one_in=0 --use_write_buffer_manager=0 --user_timestamp_size=0 --value_size_mult=32 --verification_only=0 --verify_checksum=1 --verify_checksum_one_in=1000000 --verify_compression=0 --verify_db_one_in=10000 --verify_file_checksums_one_in=1000000 --verify_iterator_with_expected_state_one_in=5 --verify_sst_unique_id_in_manifest=1 --wal_bytes_per_sync=0 --wal_compression=none --write_buffer_size=33554432 --write_dbid_to_manifest=0 --write_fault_one_in=0 --writepercent=35
```
Reviewed By: cbi42
Differential Revision: D60890580
Pulled By: hx235
fbshipit-source-id: 401f90d6d351c7ee11088cad06fb00e54062d416
2024-08-09 21:51:36 +00:00
|
|
|
void UpdateIfInitialWriteFails(Env* db_stress_env, const Status& write_s,
|
|
|
|
Status* initial_write_s,
|
|
|
|
bool* initial_wal_write_may_succeed,
|
|
|
|
uint64_t* wait_for_recover_start_time) {
|
|
|
|
assert(db_stress_env && initial_write_s && initial_wal_write_may_succeed &&
|
|
|
|
wait_for_recover_start_time);
|
|
|
|
// Only update `initial_write_s`, `initial_wal_write_may_succeed` when the
|
|
|
|
// first write fails
|
|
|
|
if (!write_s.ok() && (*initial_write_s).ok()) {
|
|
|
|
*initial_write_s = write_s;
|
|
|
|
*initial_wal_write_may_succeed =
|
|
|
|
!FaultInjectionTestFS::IsFailedToWriteToWALError(*initial_write_s);
|
|
|
|
*wait_for_recover_start_time = db_stress_env->NowMicros();
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
void PrintWriteRecoveryWaitTimeIfNeeded(Env* db_stress_env,
|
|
|
|
const Status& initial_write_s,
|
|
|
|
bool initial_wal_write_may_succeed,
|
|
|
|
uint64_t wait_for_recover_start_time,
|
|
|
|
const std::string& thread_name) {
|
|
|
|
assert(db_stress_env);
|
|
|
|
bool waited_for_recovery = !initial_write_s.ok() &&
|
|
|
|
IsErrorInjectedAndRetryable(initial_write_s) &&
|
|
|
|
initial_wal_write_may_succeed;
|
|
|
|
if (waited_for_recovery) {
|
|
|
|
uint64_t elapsed_sec =
|
|
|
|
(db_stress_env->NowMicros() - wait_for_recover_start_time) / 1000000;
|
|
|
|
if (elapsed_sec > 10) {
|
|
|
|
fprintf(stdout,
|
|
|
|
"%s thread slept to wait for write recovery for "
|
|
|
|
"%" PRIu64 " seconds\n",
|
|
|
|
thread_name.c_str(), elapsed_sec);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
2024-07-29 20:51:49 +00:00
|
|
|
void GetDeleteRangeKeyLocks(
|
|
|
|
ThreadState* thread, int rand_column_family, int64_t rand_key,
|
|
|
|
std::vector<std::unique_ptr<MutexLock>>* range_locks) {
|
|
|
|
for (int j = 0; j < FLAGS_range_deletion_width; ++j) {
|
|
|
|
if (j == 0 ||
|
|
|
|
((rand_key + j) & ((1 << FLAGS_log2_keys_per_lock) - 1)) == 0) {
|
|
|
|
range_locks->emplace_back(new MutexLock(
|
|
|
|
thread->shared->GetMutexForKey(rand_column_family, rand_key + j)));
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2019-12-09 07:49:32 +00:00
|
|
|
Status AssertSame(DB* db, ColumnFamilyHandle* cf,
|
|
|
|
ThreadState::SnapshotState& snap_state);
|
|
|
|
|
|
|
|
// Currently PreloadDb has to be single-threaded.
|
|
|
|
void PreloadDbAndReopenAsReadOnly(int64_t number_of_keys,
|
|
|
|
SharedState* shared);
|
|
|
|
|
|
|
|
Status SetOptions(ThreadState* thread);
|
|
|
|
|
Support WriteCommit policy with sync_fault_injection=1 (#10624)
Summary:
**Context:**
Prior to this PR, correctness testing with un-sync data loss [disabled](https://github.com/facebook/rocksdb/pull/10605) transaction (`use_txn=1`) thus all of the `txn_write_policy` . This PR improved that by adding support for one policy - WriteCommit (`txn_write_policy=0`).
**Summary:**
They key to this support is (a) handle Mark{Begin, End}Prepare/MarkCommit/MarkRollback in constructing ExpectedState under WriteCommit policy correctly and (b) monitor CI jobs and solve any test incompatibility issue till jobs are stable. (b) will be part of the test plan.
For (a)
- During prepare (i.e, between `MarkBeginPrepare()` and `MarkEndPrepare(xid)`), `ExpectedStateTraceRecordHandler` will buffer all writes by adding all writes to an internal `WriteBatch`.
- On `MarkEndPrepare()`, that `WriteBatch` will be associated with the transaction's `xid`.
- During the commit (i.e, on `MarkCommit(xid)`), `ExpectedStateTraceRecordHandler` will retrieve and iterate the internal `WriteBatch` and finally apply those writes to `ExpectedState`
- During the rollback (i.e, on `MarkRollback(xid)`), `ExpectedStateTraceRecordHandler` will erase the internal `WriteBatch` from the map.
For (b) - one major issue described below:
- TransactionsDB in db stress recovers prepared-but-not-committed txns from the previous crashed run by randomly committing or rolling back it at the start of the current run, see a historical [PR](https://github.com/facebook/rocksdb/commit/6d06be22c083ccf185fd38dba49fde73b644b4c1) predated correctness testing.
- And we will verify those processed keys in a recovered db against their expected state.
- However since now we turn on `sync_fault_injection=1` where the expected state is constructed from the trace instead of using the LATEST.state from previous run. The expected state now used to verify those processed keys won't contain UNKNOWN_SENTINEL as they should - see test 1 for a failed case.
- Therefore, we decided to manually update its expected state to be UNKNOWN_SENTINEL as part of the processing.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/10624
Test Plan:
1. Test exposed the major issue described above. This test will fail without setting UNKNOWN_SENTINEL in expected state during the processing and pass after
```
db=/dev/shm/rocksdb_crashtest_blackbox
exp=/dev/shm/rocksdb_crashtest_expected
dbt=$db.tmp
expt=$exp.tmp
rm -rf $db $exp
mkdir -p $exp
echo "RUN 1"
./db_stress \
--clear_column_family_one_in=0 --column_families=1 --db=$db --delpercent=10 --delrangepercent=0 --destroy_db_initially=0 --expected_values_dir=$exp --iterpercent=0 --key_len_percent_dist=1,30,69 --max_key=1000000 --max_key_len=3 --prefixpercent=0 --readpercent=0 --reopen=0 --ops_per_thread=100000000 --test_batches_snapshots=0 --value_size_mult=32 --writepercent=90 \
--use_txn=1 --txn_write_policy=0 --sync_fault_injection=1 &
pid=$!
sleep 0.2
sleep 20
kill $pid
sleep 0.2
echo "RUN 2"
./db_stress \
--clear_column_family_one_in=0 --column_families=1 --db=$db --delpercent=10 --delrangepercent=0 --destroy_db_initially=0 --expected_values_dir=$exp --iterpercent=0 --key_len_percent_dist=1,30,69 --max_key=1000000 --max_key_len=3 --prefixpercent=0 --readpercent=0 --reopen=0 --ops_per_thread=100000000 --test_batches_snapshots=0 --value_size_mult=32 --writepercent=90 \
--use_txn=1 --txn_write_policy=0 --sync_fault_injection=1 &
pid=$!
sleep 0.2
sleep 20
kill $pid
sleep 0.2
echo "RUN 3"
./db_stress \
--clear_column_family_one_in=0 --column_families=1 --db=$db --delpercent=10 --delrangepercent=0 --destroy_db_initially=0 --expected_values_dir=$exp --iterpercent=0 --key_len_percent_dist=1,30,69 --max_key=1000000 --max_key_len=3 --prefixpercent=0 --readpercent=0 --reopen=0 --ops_per_thread=100000000 --test_batches_snapshots=0 --value_size_mult=32 --writepercent=90 \
--use_txn=1 --txn_write_policy=0 --sync_fault_injection=1
```
2. Manual testing to ensure ExpectedState is constructed correctly during recovery by verifying it against previously crashed TransactionDB's WAL.
- Run the following command to crash a TransactionDB with WriteCommit policy. Then `./ldb dump_wal` on its WAL file
```
db=/dev/shm/rocksdb_crashtest_blackbox
exp=/dev/shm/rocksdb_crashtest_expected
rm -rf $db $exp
mkdir -p $exp
./db_stress \
--clear_column_family_one_in=0 --column_families=1 --db=$db --delpercent=10 --delrangepercent=0 --destroy_db_initially=0 --expected_values_dir=$exp --iterpercent=0 --key_len_percent_dist=1,30,69 --max_key=1000000 --max_key_len=3 --prefixpercent=0 --readpercent=0 --reopen=0 --ops_per_thread=100000000 --test_batches_snapshots=0 --value_size_mult=32 --writepercent=90 \
--use_txn=1 --txn_write_policy=0 --sync_fault_injection=1 &
pid=$!
sleep 30
kill $pid
sleep 1
```
- Run the following command to verify recovery of the crashed db under debugger. Compare the step-wise result with WAL records (e.g, WriteBatch content, xid, prepare/commit/rollback marker)
```
./db_stress \
--clear_column_family_one_in=0 --column_families=1 --db=$db --delpercent=10 --delrangepercent=0 --destroy_db_initially=0 --expected_values_dir=$exp --iterpercent=0 --key_len_percent_dist=1,30,69 --max_key=1000000 --max_key_len=3 --prefixpercent=0 --readpercent=0 --reopen=0 --ops_per_thread=100000000 --test_batches_snapshots=0 --value_size_mult=32 --writepercent=90 \
--use_txn=1 --txn_write_policy=0 --sync_fault_injection=1
```
3. Automatic testing by triggering all RocksDB stress/crash test jobs for 3 rounds with no failure.
Reviewed By: ajkr, riversand963
Differential Revision: D39199373
Pulled By: hx235
fbshipit-source-id: 7a1dec0e3e2ee6ea86ddf5dd19ceb5543a3d6f0c
2022-09-27 01:01:59 +00:00
|
|
|
// For transactionsDB, there can be txns prepared but not yet committeed
|
|
|
|
// right before previous stress run crash.
|
|
|
|
// They will be recovered and processed through
|
|
|
|
// ProcessRecoveredPreparedTxnsHelper on the start of current stress run.
|
|
|
|
void ProcessRecoveredPreparedTxns(SharedState* shared);
|
|
|
|
|
|
|
|
// Default implementation will first update ExpectedState to be
|
|
|
|
// `SharedState::UNKNOWN` for each keys in `txn` and then randomly
|
|
|
|
// commit or rollback `txn`.
|
|
|
|
virtual void ProcessRecoveredPreparedTxnsHelper(Transaction* txn,
|
|
|
|
SharedState* shared);
|
|
|
|
|
Allow TryAgain in db_stress with optimistic txn, and refactoring (#11653)
Summary:
In rare cases, optimistic transaction commit returns TryAgain. This change tolerates that intentional behavior in db_stress, up to a small limit in a row. This way, we don't miss a possible regression with excessive TryAgain, and trying again (rolling back the transaction) should have a well renewed chance of success as the writes will be associated with fresh sequence numbers.
Also, some of the APIs were not clear about Transaction semantics, so I have clarified:
* (Best I can tell....) Destroying a Transaction is safe without calling Rollback() (or at least should be). I don't know why it's a common pattern in our test code and examples to rollback before unconditional destruction. Stress test updated not to call Rollback unnecessarily (to test safe destruction).
* Despite essentially doing what is asked, simply trying Commit() again when it returns TryAgain does not have a chance of success, because of the transaction being bound to the DB state at the time of operations before Commit. Similar logic applies to Busy AFAIK. Commit() API comments updated, and expanded unit test in optimistic_transaction_test.
Also also, because I can't stop myself, I refactored a good portion of the transaction handling code in db_stress.
* Avoid existing and new copy-paste for most transaction interactions with a new ExecuteTransaction (higher-order) function.
* Use unique_ptr (nicely complements removing unnecessary Rollbacks)
* Abstract out a pattern for safely calling std::terminate() and use it in more places. (The TryAgain errors we saw did not have stack traces because of "terminate called recursively".)
Intended follow-up: resurrect use of `FLAGS_rollback_one_in` but also include non-trivial cases
Pull Request resolved: https://github.com/facebook/rocksdb/pull/11653
Test Plan:
this is the test :)
Also, temporarily bypassed the new retry logic and boosted the chance of hitting TryAgain. Quickly reproduced the TryAgain error. Then re-enabled the new retry logic, and was not able to hit the error after running for tens of minutes, even with the boosted chances.
Reviewed By: cbi42
Differential Revision: D47882995
Pulled By: pdillinger
fbshipit-source-id: 21eadb1525423340dbf28d17cf166b9583311a0d
2023-07-28 23:25:29 +00:00
|
|
|
// ExecuteTransaction is recommended instead
|
|
|
|
Status NewTxn(WriteOptions& write_opts,
|
|
|
|
std::unique_ptr<Transaction>* out_txn);
|
|
|
|
Status CommitTxn(Transaction& txn, ThreadState* thread = nullptr);
|
|
|
|
|
|
|
|
// Creates a transaction, executes `ops`, and tries to commit
|
|
|
|
Status ExecuteTransaction(WriteOptions& write_opts, ThreadState* thread,
|
|
|
|
std::function<Status(Transaction&)>&& ops);
|
2019-12-09 07:49:32 +00:00
|
|
|
|
|
|
|
virtual void MaybeClearOneColumnFamily(ThreadState* /* thread */) {}
|
|
|
|
|
|
|
|
virtual bool ShouldAcquireMutexOnKey() const { return false; }
|
|
|
|
|
2021-12-07 21:40:46 +00:00
|
|
|
// Returns true if DB state is tracked by the stress test.
|
|
|
|
virtual bool IsStateTracked() const = 0;
|
|
|
|
|
2019-12-09 07:49:32 +00:00
|
|
|
virtual std::vector<int> GenerateColumnFamilies(
|
|
|
|
const int /* num_column_families */, int rand_column_family) const {
|
|
|
|
return {rand_column_family};
|
|
|
|
}
|
|
|
|
|
|
|
|
virtual std::vector<int64_t> GenerateKeys(int64_t rand_key) const {
|
|
|
|
return {rand_key};
|
|
|
|
}
|
|
|
|
|
2024-05-09 22:37:38 +00:00
|
|
|
virtual void TestKeyMayExist(ThreadState*, const ReadOptions&,
|
|
|
|
const std::vector<int>&,
|
|
|
|
const std::vector<int64_t>&) {}
|
|
|
|
|
2019-12-09 07:49:32 +00:00
|
|
|
virtual Status TestGet(ThreadState* thread, const ReadOptions& read_opts,
|
|
|
|
const std::vector<int>& rand_column_families,
|
|
|
|
const std::vector<int64_t>& rand_keys) = 0;
|
|
|
|
|
|
|
|
virtual std::vector<Status> TestMultiGet(
|
|
|
|
ThreadState* thread, const ReadOptions& read_opts,
|
|
|
|
const std::vector<int>& rand_column_families,
|
|
|
|
const std::vector<int64_t>& rand_keys) = 0;
|
|
|
|
|
2023-03-17 21:47:29 +00:00
|
|
|
virtual void TestGetEntity(ThreadState* thread, const ReadOptions& read_opts,
|
|
|
|
const std::vector<int>& rand_column_families,
|
|
|
|
const std::vector<int64_t>& rand_keys) = 0;
|
|
|
|
|
2023-03-30 03:35:15 +00:00
|
|
|
virtual void TestMultiGetEntity(ThreadState* thread,
|
|
|
|
const ReadOptions& read_opts,
|
|
|
|
const std::vector<int>& rand_column_families,
|
|
|
|
const std::vector<int64_t>& rand_keys) = 0;
|
|
|
|
|
2019-12-09 07:49:32 +00:00
|
|
|
virtual Status TestPrefixScan(ThreadState* thread,
|
|
|
|
const ReadOptions& read_opts,
|
|
|
|
const std::vector<int>& rand_column_families,
|
|
|
|
const std::vector<int64_t>& rand_keys) = 0;
|
|
|
|
|
|
|
|
virtual Status TestPut(ThreadState* thread, WriteOptions& write_opts,
|
|
|
|
const ReadOptions& read_opts,
|
|
|
|
const std::vector<int>& cf_ids,
|
2022-09-15 22:55:37 +00:00
|
|
|
const std::vector<int64_t>& keys,
|
|
|
|
char (&value)[100]) = 0;
|
2019-12-09 07:49:32 +00:00
|
|
|
|
|
|
|
virtual Status TestDelete(ThreadState* thread, WriteOptions& write_opts,
|
|
|
|
const std::vector<int>& rand_column_families,
|
2022-09-15 22:55:37 +00:00
|
|
|
const std::vector<int64_t>& rand_keys) = 0;
|
2019-12-09 07:49:32 +00:00
|
|
|
|
|
|
|
virtual Status TestDeleteRange(ThreadState* thread, WriteOptions& write_opts,
|
|
|
|
const std::vector<int>& rand_column_families,
|
2022-09-15 22:55:37 +00:00
|
|
|
const std::vector<int64_t>& rand_keys) = 0;
|
2019-12-09 07:49:32 +00:00
|
|
|
|
|
|
|
virtual void TestIngestExternalFile(
|
|
|
|
ThreadState* thread, const std::vector<int>& rand_column_families,
|
2022-09-15 22:55:37 +00:00
|
|
|
const std::vector<int64_t>& rand_keys) = 0;
|
2019-12-09 07:49:32 +00:00
|
|
|
|
2019-12-10 19:40:09 +00:00
|
|
|
// Issue compact range, starting with start_key, whose integer value
|
|
|
|
// is rand_key.
|
|
|
|
virtual void TestCompactRange(ThreadState* thread, int64_t rand_key,
|
|
|
|
const Slice& start_key,
|
|
|
|
ColumnFamilyHandle* column_family);
|
|
|
|
|
2024-05-09 22:37:38 +00:00
|
|
|
virtual void TestPromoteL0(ThreadState* thread,
|
|
|
|
ColumnFamilyHandle* column_family);
|
|
|
|
|
2019-12-10 19:40:09 +00:00
|
|
|
// Calculate a hash value for all keys in range [start_key, end_key]
|
|
|
|
// at a certain snapshot.
|
|
|
|
uint32_t GetRangeHash(ThreadState* thread, const Snapshot* snapshot,
|
|
|
|
ColumnFamilyHandle* column_family,
|
|
|
|
const Slice& start_key, const Slice& end_key);
|
|
|
|
|
2019-12-09 07:49:32 +00:00
|
|
|
// Return a column family handle that mirrors what is pointed by
|
|
|
|
// `column_family_id`, which will be used to validate data to be correct.
|
|
|
|
// By default, the column family itself will be returned.
|
|
|
|
virtual ColumnFamilyHandle* GetControlCfh(ThreadState* /* thread*/,
|
|
|
|
int column_family_id) {
|
|
|
|
return column_families_[column_family_id];
|
|
|
|
}
|
|
|
|
|
|
|
|
// Generated a list of keys that close to boundaries of SST keys.
|
|
|
|
// If there isn't any SST file in the DB, return empty list.
|
|
|
|
std::vector<std::string> GetWhiteBoxKeys(ThreadState* thread, DB* db,
|
|
|
|
ColumnFamilyHandle* cfh,
|
|
|
|
size_t num_keys);
|
|
|
|
|
|
|
|
// Given a key K, this creates an iterator which scans to K and then
|
|
|
|
// does a random sequence of Next/Prev operations.
|
|
|
|
virtual Status TestIterate(ThreadState* thread, const ReadOptions& read_opts,
|
|
|
|
const std::vector<int>& rand_column_families,
|
|
|
|
const std::vector<int64_t>& rand_keys);
|
|
|
|
|
2024-06-17 18:25:30 +00:00
|
|
|
// Given a key K, this creates an attribute group iterator which scans to K
|
|
|
|
// and then does a random sequence of Next/Prev operations. Called only when
|
|
|
|
// use_attribute_group=1
|
|
|
|
virtual Status TestIterateAttributeGroups(
|
|
|
|
ThreadState* thread, const ReadOptions& read_opts,
|
|
|
|
const std::vector<int>& rand_column_families,
|
|
|
|
const std::vector<int64_t>& rand_keys);
|
|
|
|
|
|
|
|
template <typename IterType, typename NewIterFunc, typename VerifyFunc>
|
|
|
|
Status TestIterateImpl(ThreadState* thread, const ReadOptions& read_opts,
|
|
|
|
const std::vector<int>& rand_column_families,
|
|
|
|
const std::vector<int64_t>& rand_keys,
|
|
|
|
NewIterFunc new_iter_func, VerifyFunc verify_func);
|
|
|
|
|
Add Iterator test against expected state to stress test (#10538)
Summary:
As mentioned in https://github.com/facebook/rocksdb/pull/5506#issuecomment-506021913,
`db_stress` does not have much verification for iterator correctness.
It has a `TestIterate()` function, but that is mainly for comparing results
between two iterators, one with `total_order_seek` and the other optionally
sets auto_prefix, upper/lower bounds. Commit 49a0581ad2462e31aa3f768afa769e0d33390f33
added a new `TestIterateAgainstExpected()` function that compares iterator against
expected state. It locks a range of keys, creates an iterator, does
a random sequence of `Next/Prev` and compares against expected state.
This PR is based on that commit, the main changes include some logs
(for easier debugging if a test fails), a forward and backward scan to
cover the entire locked key range, and a flag for optionally turning on
this version of Iterator testing.
Added constraint that the checks against expected state in
`TestIterateAgainstExpected()` and in `TestGet()` are only turned on
when `--skip_verifydb` flag is not set.
Remove the change log introduced in https://github.com/facebook/rocksdb/issues/10553.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/10538
Test Plan:
Run `db_stress` with `--verify_iterator_with_expected_state_one_in=1`,
and a large `--iterpercent` and `--num_iterations`. Checked `op_logs`
manually to ensure expected coverage. Tweaked part of the code in
https://github.com/facebook/rocksdb/issues/10449 and stress test was able to catch it.
- internally run various flavor of crash test
Reviewed By: ajkr
Differential Revision: D38847269
Pulled By: cbi42
fbshipit-source-id: 8b4402a9bba9f6cfa08051943cd672579d489599
2022-08-24 21:59:50 +00:00
|
|
|
virtual Status TestIterateAgainstExpected(
|
|
|
|
ThreadState* /* thread */, const ReadOptions& /* read_opts */,
|
|
|
|
const std::vector<int>& /* rand_column_families */,
|
2022-09-15 22:55:37 +00:00
|
|
|
const std::vector<int64_t>& /* rand_keys */) {
|
Add Iterator test against expected state to stress test (#10538)
Summary:
As mentioned in https://github.com/facebook/rocksdb/pull/5506#issuecomment-506021913,
`db_stress` does not have much verification for iterator correctness.
It has a `TestIterate()` function, but that is mainly for comparing results
between two iterators, one with `total_order_seek` and the other optionally
sets auto_prefix, upper/lower bounds. Commit 49a0581ad2462e31aa3f768afa769e0d33390f33
added a new `TestIterateAgainstExpected()` function that compares iterator against
expected state. It locks a range of keys, creates an iterator, does
a random sequence of `Next/Prev` and compares against expected state.
This PR is based on that commit, the main changes include some logs
(for easier debugging if a test fails), a forward and backward scan to
cover the entire locked key range, and a flag for optionally turning on
this version of Iterator testing.
Added constraint that the checks against expected state in
`TestIterateAgainstExpected()` and in `TestGet()` are only turned on
when `--skip_verifydb` flag is not set.
Remove the change log introduced in https://github.com/facebook/rocksdb/issues/10553.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/10538
Test Plan:
Run `db_stress` with `--verify_iterator_with_expected_state_one_in=1`,
and a large `--iterpercent` and `--num_iterations`. Checked `op_logs`
manually to ensure expected coverage. Tweaked part of the code in
https://github.com/facebook/rocksdb/issues/10449 and stress test was able to catch it.
- internally run various flavor of crash test
Reviewed By: ajkr
Differential Revision: D38847269
Pulled By: cbi42
fbshipit-source-id: 8b4402a9bba9f6cfa08051943cd672579d489599
2022-08-24 21:59:50 +00:00
|
|
|
return Status::NotSupported();
|
|
|
|
}
|
|
|
|
|
2019-12-09 07:49:32 +00:00
|
|
|
// Enum used by VerifyIterator() to identify the mode to validate.
|
2019-12-20 22:54:30 +00:00
|
|
|
enum LastIterateOp {
|
|
|
|
kLastOpSeek,
|
|
|
|
kLastOpSeekForPrev,
|
|
|
|
kLastOpNextOrPrev,
|
|
|
|
kLastOpSeekToFirst,
|
|
|
|
kLastOpSeekToLast
|
|
|
|
};
|
2019-12-09 07:49:32 +00:00
|
|
|
|
|
|
|
// Compare the two iterator, iter and cmp_iter are in the same position,
|
|
|
|
// unless iter might be made invalidate or undefined because of
|
|
|
|
// upper or lower bounds, or prefix extractor.
|
|
|
|
// Will flag failure if the verification fails.
|
|
|
|
// diverged = true if the two iterator is already diverged.
|
|
|
|
// True if verification passed, false if not.
|
2019-12-20 22:54:30 +00:00
|
|
|
// op_logs is the information to print when validation fails.
|
2024-06-17 18:25:30 +00:00
|
|
|
template <typename IterType, typename VerifyFuncType>
|
2019-12-09 07:49:32 +00:00
|
|
|
void VerifyIterator(ThreadState* thread, ColumnFamilyHandle* cmp_cfh,
|
2024-06-17 18:25:30 +00:00
|
|
|
const ReadOptions& ro, IterType* iter, Iterator* cmp_iter,
|
2019-12-20 22:54:30 +00:00
|
|
|
LastIterateOp op, const Slice& seek_key,
|
2024-06-17 18:25:30 +00:00
|
|
|
const std::string& op_logs, VerifyFuncType verifyFunc,
|
|
|
|
bool* diverged);
|
2019-12-09 07:49:32 +00:00
|
|
|
|
|
|
|
virtual Status TestBackupRestore(ThreadState* thread,
|
|
|
|
const std::vector<int>& rand_column_families,
|
|
|
|
const std::vector<int64_t>& rand_keys);
|
|
|
|
|
2023-10-28 00:07:39 +00:00
|
|
|
virtual Status PrepareOptionsForRestoredDB(Options* options);
|
|
|
|
|
2019-12-09 07:49:32 +00:00
|
|
|
virtual Status TestCheckpoint(ThreadState* thread,
|
|
|
|
const std::vector<int>& rand_column_families,
|
|
|
|
const std::vector<int64_t>& rand_keys);
|
|
|
|
|
2019-12-19 22:03:14 +00:00
|
|
|
void TestCompactFiles(ThreadState* thread, ColumnFamilyHandle* column_family);
|
|
|
|
|
|
|
|
Status TestFlush(const std::vector<int>& rand_column_families);
|
|
|
|
|
2024-05-09 22:37:38 +00:00
|
|
|
Status TestResetStats();
|
|
|
|
|
2019-12-19 22:03:14 +00:00
|
|
|
Status TestPauseBackground(ThreadState* thread);
|
|
|
|
|
2024-04-16 22:43:26 +00:00
|
|
|
Status TestDisableFileDeletions(ThreadState* thread);
|
|
|
|
|
|
|
|
Status TestDisableManualCompaction(ThreadState* thread);
|
|
|
|
|
2019-12-19 22:03:14 +00:00
|
|
|
void TestAcquireSnapshot(ThreadState* thread, int rand_column_family,
|
|
|
|
const std::string& keystr, uint64_t i);
|
|
|
|
|
|
|
|
Status MaybeReleaseSnapshots(ThreadState* thread, uint64_t i);
|
2024-04-16 22:43:26 +00:00
|
|
|
|
Inject more errors to more files in stress test (#12713)
Summary:
**Context:**
We currently have partial error injection:
- DB operation: all read, SST write
- DB open: all read, SST write, all metadata write.
This PR completes the error injection (with some limitations below):
- DB operation & open: all read, all write, all metadata write, all metadata read
**Summary:**
- Inject retryable metadata read, metadata write error concerning directory (e.g, dir sync, ) or file metadata (e.g, name, size, file creation/deletion...)
- Inject retryable errors to all major file types: random access file, sequential file, writable file
- Allow db stress test operations to handle above injected errors gracefully without crashing
- Change all error injection to thread-local implementation for easier disabling and enabling in the same thread. For example, we can control error handling thread to have no error injection. It's also cleaner in code.
- Limitation: compared to before, we now don't have write fault injection for backup/restore CopyOrCreateFiles work threads since they use anonymous background threads as well as read injection for db open bg thread
- Add a new flag to test error recovery without error injection so we can test the path where error recovery actually succeeds
- Some Refactory & fix to db stress test framework (see PR review comments)
- Fix some minor bugs surfaced (see PR review comments)
- Limitation: had to disable backup restore with metadata read/write injection since it surfaces too many testing issues. Will add it back later to focus on surfacing actual code/internal bugs first.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/12713
Test Plan:
- Existing UT
- CI with no trivial error failure
Reviewed By: pdillinger
Differential Revision: D58326608
Pulled By: hx235
fbshipit-source-id: 011b5195aaeb6011641ae0a9194f7f2a0e325ad7
2024-06-19 15:42:00 +00:00
|
|
|
Status TestGetLiveFiles() const;
|
|
|
|
Status TestGetLiveFilesMetaData() const;
|
|
|
|
Status TestGetLiveFilesStorageInfo() const;
|
|
|
|
Status TestGetAllColumnFamilyMetaData() const;
|
2024-04-16 22:43:26 +00:00
|
|
|
|
Inject more errors to more files in stress test (#12713)
Summary:
**Context:**
We currently have partial error injection:
- DB operation: all read, SST write
- DB open: all read, SST write, all metadata write.
This PR completes the error injection (with some limitations below):
- DB operation & open: all read, all write, all metadata write, all metadata read
**Summary:**
- Inject retryable metadata read, metadata write error concerning directory (e.g, dir sync, ) or file metadata (e.g, name, size, file creation/deletion...)
- Inject retryable errors to all major file types: random access file, sequential file, writable file
- Allow db stress test operations to handle above injected errors gracefully without crashing
- Change all error injection to thread-local implementation for easier disabling and enabling in the same thread. For example, we can control error handling thread to have no error injection. It's also cleaner in code.
- Limitation: compared to before, we now don't have write fault injection for backup/restore CopyOrCreateFiles work threads since they use anonymous background threads as well as read injection for db open bg thread
- Add a new flag to test error recovery without error injection so we can test the path where error recovery actually succeeds
- Some Refactory & fix to db stress test framework (see PR review comments)
- Fix some minor bugs surfaced (see PR review comments)
- Limitation: had to disable backup restore with metadata read/write injection since it surfaces too many testing issues. Will add it back later to focus on surfacing actual code/internal bugs first.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/12713
Test Plan:
- Existing UT
- CI with no trivial error failure
Reviewed By: pdillinger
Differential Revision: D58326608
Pulled By: hx235
fbshipit-source-id: 011b5195aaeb6011641ae0a9194f7f2a0e325ad7
2024-06-19 15:42:00 +00:00
|
|
|
Status TestGetSortedWalFiles() const;
|
|
|
|
Status TestGetCurrentWalFile() const;
|
2020-07-14 19:10:56 +00:00
|
|
|
void TestGetProperty(ThreadState* thread) const;
|
2024-04-16 22:43:26 +00:00
|
|
|
Status TestGetPropertiesOfAllTables() const;
|
2020-03-19 00:11:06 +00:00
|
|
|
|
2019-12-21 05:42:19 +00:00
|
|
|
virtual Status TestApproximateSize(
|
|
|
|
ThreadState* thread, uint64_t iteration,
|
|
|
|
const std::vector<int>& rand_column_families,
|
|
|
|
const std::vector<int64_t>& rand_keys);
|
|
|
|
|
2021-12-14 21:33:16 +00:00
|
|
|
virtual Status TestCustomOperations(
|
|
|
|
ThreadState* /*thread*/,
|
|
|
|
const std::vector<int>& /*rand_column_families*/) {
|
|
|
|
return Status::NotSupported("TestCustomOperations() must be overridden");
|
|
|
|
}
|
|
|
|
|
2024-06-25 03:51:39 +00:00
|
|
|
bool IsErrorInjectedAndRetryable(const Status& error_s) const {
|
|
|
|
assert(!error_s.ok());
|
2024-07-29 20:51:49 +00:00
|
|
|
return error_s.getState() &&
|
|
|
|
FaultInjectionTestFS::IsInjectedError(error_s) &&
|
2024-06-25 03:51:39 +00:00
|
|
|
!status_to_io_status(Status(error_s)).GetDataLoss();
|
Inject more errors to more files in stress test (#12713)
Summary:
**Context:**
We currently have partial error injection:
- DB operation: all read, SST write
- DB open: all read, SST write, all metadata write.
This PR completes the error injection (with some limitations below):
- DB operation & open: all read, all write, all metadata write, all metadata read
**Summary:**
- Inject retryable metadata read, metadata write error concerning directory (e.g, dir sync, ) or file metadata (e.g, name, size, file creation/deletion...)
- Inject retryable errors to all major file types: random access file, sequential file, writable file
- Allow db stress test operations to handle above injected errors gracefully without crashing
- Change all error injection to thread-local implementation for easier disabling and enabling in the same thread. For example, we can control error handling thread to have no error injection. It's also cleaner in code.
- Limitation: compared to before, we now don't have write fault injection for backup/restore CopyOrCreateFiles work threads since they use anonymous background threads as well as read injection for db open bg thread
- Add a new flag to test error recovery without error injection so we can test the path where error recovery actually succeeds
- Some Refactory & fix to db stress test framework (see PR review comments)
- Fix some minor bugs surfaced (see PR review comments)
- Limitation: had to disable backup restore with metadata read/write injection since it surfaces too many testing issues. Will add it back later to focus on surfacing actual code/internal bugs first.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/12713
Test Plan:
- Existing UT
- CI with no trivial error failure
Reviewed By: pdillinger
Differential Revision: D58326608
Pulled By: hx235
fbshipit-source-id: 011b5195aaeb6011641ae0a9194f7f2a0e325ad7
2024-06-19 15:42:00 +00:00
|
|
|
}
|
|
|
|
|
2024-05-16 04:14:41 +00:00
|
|
|
void ProcessStatus(SharedState* shared, std::string msg, const Status& s,
|
|
|
|
bool ignore_injected_error = true) const;
|
2023-10-10 13:29:01 +00:00
|
|
|
|
|
|
|
void VerificationAbort(SharedState* shared, std::string msg) const;
|
2019-12-09 07:49:32 +00:00
|
|
|
|
|
|
|
void VerificationAbort(SharedState* shared, std::string msg, int cf,
|
|
|
|
int64_t key) const;
|
|
|
|
|
2022-08-29 21:13:06 +00:00
|
|
|
void VerificationAbort(SharedState* shared, std::string msg, int cf,
|
|
|
|
int64_t key, Slice value_from_db,
|
|
|
|
Slice value_from_expected) const;
|
|
|
|
|
2022-10-06 22:07:16 +00:00
|
|
|
void VerificationAbort(SharedState* shared, int cf, int64_t key,
|
2023-03-17 21:47:29 +00:00
|
|
|
const Slice& value, const WideColumns& columns) const;
|
2022-10-06 22:07:16 +00:00
|
|
|
|
2023-03-17 21:47:29 +00:00
|
|
|
static std::string DebugString(const Slice& value,
|
|
|
|
const WideColumns& columns);
|
2022-10-06 22:07:16 +00:00
|
|
|
|
2019-12-09 07:49:32 +00:00
|
|
|
void PrintEnv() const;
|
|
|
|
|
2023-08-22 16:47:04 +00:00
|
|
|
void Open(SharedState* shared, bool reopen = false);
|
2019-12-09 07:49:32 +00:00
|
|
|
|
2019-12-11 04:01:25 +00:00
|
|
|
void Reopen(ThreadState* thread);
|
2019-12-09 07:49:32 +00:00
|
|
|
|
2022-04-28 00:50:54 +00:00
|
|
|
virtual void RegisterAdditionalListeners() {}
|
|
|
|
|
2022-05-02 20:25:45 +00:00
|
|
|
virtual void PrepareTxnDbOptions(SharedState* /*shared*/,
|
|
|
|
TransactionDBOptions& /*txn_db_opts*/) {}
|
2022-04-28 00:50:54 +00:00
|
|
|
|
2022-09-30 23:13:03 +00:00
|
|
|
// Returns whether the timestamp of read_opts is updated.
|
|
|
|
bool MaybeUseOlderTimestampForPointLookup(ThreadState* thread,
|
2022-07-05 20:30:15 +00:00
|
|
|
std::string& ts_str,
|
|
|
|
Slice& ts_slice,
|
|
|
|
ReadOptions& read_opts);
|
|
|
|
|
|
|
|
void MaybeUseOlderTimestampForRangeScan(ThreadState* thread,
|
|
|
|
std::string& ts_str, Slice& ts_slice,
|
|
|
|
ReadOptions& read_opts);
|
|
|
|
|
2019-12-09 07:49:32 +00:00
|
|
|
std::shared_ptr<Cache> cache_;
|
|
|
|
std::shared_ptr<Cache> compressed_cache_;
|
|
|
|
std::shared_ptr<const FilterPolicy> filter_policy_;
|
|
|
|
DB* db_;
|
|
|
|
TransactionDB* txn_db_;
|
2023-06-17 23:27:37 +00:00
|
|
|
OptimisticTransactionDB* optimistic_txn_db_;
|
2022-06-14 01:54:38 +00:00
|
|
|
|
|
|
|
// Currently only used in MultiOpsTxnsStressTest
|
|
|
|
std::atomic<DB*> db_aptr_;
|
|
|
|
|
2019-12-09 07:49:32 +00:00
|
|
|
Options options_;
|
2021-03-15 11:32:24 +00:00
|
|
|
SystemClock* clock_;
|
2019-12-09 07:49:32 +00:00
|
|
|
std::vector<ColumnFamilyHandle*> column_families_;
|
|
|
|
std::vector<std::string> column_family_names_;
|
|
|
|
std::atomic<int> new_column_family_name_;
|
|
|
|
int num_times_reopened_;
|
|
|
|
std::unordered_map<std::string, std::vector<std::string>> options_table_;
|
|
|
|
std::vector<std::string> options_index_;
|
|
|
|
std::atomic<bool> db_preload_finished_;
|
2024-06-18 23:16:09 +00:00
|
|
|
std::shared_ptr<SstQueryFilterConfigsManager::Factory> sqfc_factory_;
|
2019-12-09 07:49:32 +00:00
|
|
|
|
2019-12-20 16:46:52 +00:00
|
|
|
// Fields used for continuous verification from another thread
|
|
|
|
DB* cmp_db_;
|
|
|
|
std::vector<ColumnFamilyHandle*> cmp_cfhs_;
|
2021-07-01 21:15:49 +00:00
|
|
|
bool is_db_stopped_;
|
2019-12-09 07:49:32 +00:00
|
|
|
};
|
|
|
|
|
Avoid overwriting options loaded from OPTIONS (#9943)
Summary:
This is similar to https://github.com/facebook/rocksdb/issues/9862, including the following fixes/refactoring:
1. If OPTIONS file is specified via `-options_file`, majority of options will be loaded from the file. We should not
overwrite options that have been loaded from the file. Instead, we configure only fields of options which are
shared objects and not set by the OPTIONS file. We also configure a few fields, e.g. `create_if_missing` necessary
for stress test to run.
2. Refactor options initialization into three functions, `InitializeOptionsFromFile()`, `InitializeOptionsFromFlags()`
and `InitializeOptionsGeneral()` similar to db_bench. I hope they can be shared in the future. The high-level logic is
as follows:
```cpp
if (!InitializeOptionsFromFile(...)) {
InitializeOptionsFromFlags(...);
}
InitializeOptionsGeneral(...);
```
3. Currently, the setting for `block_cache_compressed` does not seem correct because it by default specifies a
size of `numeric_limits<size_t>::max()` ((size_t)-1). According to code comments, `-1` indicates default value,
which should be referring to `num_shard_bits` argument.
4. Clarify `fail_if_options_file_error`.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/9943
Test Plan:
1. make check
2. Run stress tests, and manually check generated OPTIONS file and compare them with input OPTIONS files
Reviewed By: jay-zhuang
Differential Revision: D36133769
Pulled By: riversand963
fbshipit-source-id: 35dacdc090a0a72c922907170cd132b9ecaa073e
2022-05-18 19:43:50 +00:00
|
|
|
// Load options from OPTIONS file and populate `options`.
|
2024-01-29 18:38:08 +00:00
|
|
|
bool InitializeOptionsFromFile(Options& options);
|
Avoid overwriting options loaded from OPTIONS (#9943)
Summary:
This is similar to https://github.com/facebook/rocksdb/issues/9862, including the following fixes/refactoring:
1. If OPTIONS file is specified via `-options_file`, majority of options will be loaded from the file. We should not
overwrite options that have been loaded from the file. Instead, we configure only fields of options which are
shared objects and not set by the OPTIONS file. We also configure a few fields, e.g. `create_if_missing` necessary
for stress test to run.
2. Refactor options initialization into three functions, `InitializeOptionsFromFile()`, `InitializeOptionsFromFlags()`
and `InitializeOptionsGeneral()` similar to db_bench. I hope they can be shared in the future. The high-level logic is
as follows:
```cpp
if (!InitializeOptionsFromFile(...)) {
InitializeOptionsFromFlags(...);
}
InitializeOptionsGeneral(...);
```
3. Currently, the setting for `block_cache_compressed` does not seem correct because it by default specifies a
size of `numeric_limits<size_t>::max()` ((size_t)-1). According to code comments, `-1` indicates default value,
which should be referring to `num_shard_bits` argument.
4. Clarify `fail_if_options_file_error`.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/9943
Test Plan:
1. make check
2. Run stress tests, and manually check generated OPTIONS file and compare them with input OPTIONS files
Reviewed By: jay-zhuang
Differential Revision: D36133769
Pulled By: riversand963
fbshipit-source-id: 35dacdc090a0a72c922907170cd132b9ecaa073e
2022-05-18 19:43:50 +00:00
|
|
|
|
|
|
|
// Initialize `options` using command line arguments.
|
|
|
|
// When this function is called, `cache`, `block_cache_compressed`,
|
|
|
|
// `filter_policy` have all been initialized. Therefore, we just pass them as
|
|
|
|
// input arguments.
|
2024-01-29 18:38:08 +00:00
|
|
|
void InitializeOptionsFromFlags(
|
Avoid overwriting options loaded from OPTIONS (#9943)
Summary:
This is similar to https://github.com/facebook/rocksdb/issues/9862, including the following fixes/refactoring:
1. If OPTIONS file is specified via `-options_file`, majority of options will be loaded from the file. We should not
overwrite options that have been loaded from the file. Instead, we configure only fields of options which are
shared objects and not set by the OPTIONS file. We also configure a few fields, e.g. `create_if_missing` necessary
for stress test to run.
2. Refactor options initialization into three functions, `InitializeOptionsFromFile()`, `InitializeOptionsFromFlags()`
and `InitializeOptionsGeneral()` similar to db_bench. I hope they can be shared in the future. The high-level logic is
as follows:
```cpp
if (!InitializeOptionsFromFile(...)) {
InitializeOptionsFromFlags(...);
}
InitializeOptionsGeneral(...);
```
3. Currently, the setting for `block_cache_compressed` does not seem correct because it by default specifies a
size of `numeric_limits<size_t>::max()` ((size_t)-1). According to code comments, `-1` indicates default value,
which should be referring to `num_shard_bits` argument.
4. Clarify `fail_if_options_file_error`.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/9943
Test Plan:
1. make check
2. Run stress tests, and manually check generated OPTIONS file and compare them with input OPTIONS files
Reviewed By: jay-zhuang
Differential Revision: D36133769
Pulled By: riversand963
fbshipit-source-id: 35dacdc090a0a72c922907170cd132b9ecaa073e
2022-05-18 19:43:50 +00:00
|
|
|
const std::shared_ptr<Cache>& cache,
|
|
|
|
const std::shared_ptr<const FilterPolicy>& filter_policy, Options& options);
|
|
|
|
|
|
|
|
// Initialize `options` on which `InitializeOptionsFromFile()` and
|
|
|
|
// `InitializeOptionsFromFlags()` have both been called already.
|
|
|
|
// There are two cases.
|
|
|
|
// Case 1: OPTIONS file is not specified. Command line arguments have been used
|
|
|
|
// to initialize `options`. InitializeOptionsGeneral() will use
|
2023-01-25 23:37:50 +00:00
|
|
|
// `cache` and `filter_policy` to initialize
|
Avoid overwriting options loaded from OPTIONS (#9943)
Summary:
This is similar to https://github.com/facebook/rocksdb/issues/9862, including the following fixes/refactoring:
1. If OPTIONS file is specified via `-options_file`, majority of options will be loaded from the file. We should not
overwrite options that have been loaded from the file. Instead, we configure only fields of options which are
shared objects and not set by the OPTIONS file. We also configure a few fields, e.g. `create_if_missing` necessary
for stress test to run.
2. Refactor options initialization into three functions, `InitializeOptionsFromFile()`, `InitializeOptionsFromFlags()`
and `InitializeOptionsGeneral()` similar to db_bench. I hope they can be shared in the future. The high-level logic is
as follows:
```cpp
if (!InitializeOptionsFromFile(...)) {
InitializeOptionsFromFlags(...);
}
InitializeOptionsGeneral(...);
```
3. Currently, the setting for `block_cache_compressed` does not seem correct because it by default specifies a
size of `numeric_limits<size_t>::max()` ((size_t)-1). According to code comments, `-1` indicates default value,
which should be referring to `num_shard_bits` argument.
4. Clarify `fail_if_options_file_error`.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/9943
Test Plan:
1. make check
2. Run stress tests, and manually check generated OPTIONS file and compare them with input OPTIONS files
Reviewed By: jay-zhuang
Differential Revision: D36133769
Pulled By: riversand963
fbshipit-source-id: 35dacdc090a0a72c922907170cd132b9ecaa073e
2022-05-18 19:43:50 +00:00
|
|
|
// corresponding fields of `options`. InitializeOptionsGeneral() will
|
|
|
|
// also set up other fields of `options` so that stress test can run.
|
|
|
|
// Examples include `create_if_missing` and
|
|
|
|
// `create_missing_column_families`, etc.
|
|
|
|
// Case 2: OPTIONS file is specified. It is possible that, after loading from
|
|
|
|
// the given OPTIONS files, some shared object fields are still not
|
|
|
|
// initialized because they are not set in the OPTIONS file. In this
|
|
|
|
// case, if command line arguments indicate that the user wants to set
|
|
|
|
// up such shared objects, e.g. block cache, compressed block cache,
|
|
|
|
// row cache, filter policy, then InitializeOptionsGeneral() will honor
|
2023-01-25 23:37:50 +00:00
|
|
|
// the user's choice, thus passing `cache`,
|
Avoid overwriting options loaded from OPTIONS (#9943)
Summary:
This is similar to https://github.com/facebook/rocksdb/issues/9862, including the following fixes/refactoring:
1. If OPTIONS file is specified via `-options_file`, majority of options will be loaded from the file. We should not
overwrite options that have been loaded from the file. Instead, we configure only fields of options which are
shared objects and not set by the OPTIONS file. We also configure a few fields, e.g. `create_if_missing` necessary
for stress test to run.
2. Refactor options initialization into three functions, `InitializeOptionsFromFile()`, `InitializeOptionsFromFlags()`
and `InitializeOptionsGeneral()` similar to db_bench. I hope they can be shared in the future. The high-level logic is
as follows:
```cpp
if (!InitializeOptionsFromFile(...)) {
InitializeOptionsFromFlags(...);
}
InitializeOptionsGeneral(...);
```
3. Currently, the setting for `block_cache_compressed` does not seem correct because it by default specifies a
size of `numeric_limits<size_t>::max()` ((size_t)-1). According to code comments, `-1` indicates default value,
which should be referring to `num_shard_bits` argument.
4. Clarify `fail_if_options_file_error`.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/9943
Test Plan:
1. make check
2. Run stress tests, and manually check generated OPTIONS file and compare them with input OPTIONS files
Reviewed By: jay-zhuang
Differential Revision: D36133769
Pulled By: riversand963
fbshipit-source-id: 35dacdc090a0a72c922907170cd132b9ecaa073e
2022-05-18 19:43:50 +00:00
|
|
|
// `filter_policy` as input arguments.
|
|
|
|
//
|
|
|
|
// InitializeOptionsGeneral() must not overwrite fields of `options` loaded
|
|
|
|
// from OPTIONS file.
|
2024-01-29 18:38:08 +00:00
|
|
|
void InitializeOptionsGeneral(
|
Avoid overwriting options loaded from OPTIONS (#9943)
Summary:
This is similar to https://github.com/facebook/rocksdb/issues/9862, including the following fixes/refactoring:
1. If OPTIONS file is specified via `-options_file`, majority of options will be loaded from the file. We should not
overwrite options that have been loaded from the file. Instead, we configure only fields of options which are
shared objects and not set by the OPTIONS file. We also configure a few fields, e.g. `create_if_missing` necessary
for stress test to run.
2. Refactor options initialization into three functions, `InitializeOptionsFromFile()`, `InitializeOptionsFromFlags()`
and `InitializeOptionsGeneral()` similar to db_bench. I hope they can be shared in the future. The high-level logic is
as follows:
```cpp
if (!InitializeOptionsFromFile(...)) {
InitializeOptionsFromFlags(...);
}
InitializeOptionsGeneral(...);
```
3. Currently, the setting for `block_cache_compressed` does not seem correct because it by default specifies a
size of `numeric_limits<size_t>::max()` ((size_t)-1). According to code comments, `-1` indicates default value,
which should be referring to `num_shard_bits` argument.
4. Clarify `fail_if_options_file_error`.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/9943
Test Plan:
1. make check
2. Run stress tests, and manually check generated OPTIONS file and compare them with input OPTIONS files
Reviewed By: jay-zhuang
Differential Revision: D36133769
Pulled By: riversand963
fbshipit-source-id: 35dacdc090a0a72c922907170cd132b9ecaa073e
2022-05-18 19:43:50 +00:00
|
|
|
const std::shared_ptr<Cache>& cache,
|
2024-06-18 23:16:09 +00:00
|
|
|
const std::shared_ptr<const FilterPolicy>& filter_policy,
|
|
|
|
const std::shared_ptr<SstQueryFilterConfigsManager::Factory>& sqfc_factory,
|
|
|
|
Options& options);
|
Avoid overwriting options loaded from OPTIONS (#9943)
Summary:
This is similar to https://github.com/facebook/rocksdb/issues/9862, including the following fixes/refactoring:
1. If OPTIONS file is specified via `-options_file`, majority of options will be loaded from the file. We should not
overwrite options that have been loaded from the file. Instead, we configure only fields of options which are
shared objects and not set by the OPTIONS file. We also configure a few fields, e.g. `create_if_missing` necessary
for stress test to run.
2. Refactor options initialization into three functions, `InitializeOptionsFromFile()`, `InitializeOptionsFromFlags()`
and `InitializeOptionsGeneral()` similar to db_bench. I hope they can be shared in the future. The high-level logic is
as follows:
```cpp
if (!InitializeOptionsFromFile(...)) {
InitializeOptionsFromFlags(...);
}
InitializeOptionsGeneral(...);
```
3. Currently, the setting for `block_cache_compressed` does not seem correct because it by default specifies a
size of `numeric_limits<size_t>::max()` ((size_t)-1). According to code comments, `-1` indicates default value,
which should be referring to `num_shard_bits` argument.
4. Clarify `fail_if_options_file_error`.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/9943
Test Plan:
1. make check
2. Run stress tests, and manually check generated OPTIONS file and compare them with input OPTIONS files
Reviewed By: jay-zhuang
Differential Revision: D36133769
Pulled By: riversand963
fbshipit-source-id: 35dacdc090a0a72c922907170cd132b9ecaa073e
2022-05-18 19:43:50 +00:00
|
|
|
|
|
|
|
// If no OPTIONS file is specified, set up `options` so that we can test
|
|
|
|
// user-defined timestamp which requires `-user_timestamp_size=8`.
|
|
|
|
// This function also checks for known (currently) incompatible features with
|
|
|
|
// user-defined timestamp.
|
2024-01-29 18:38:08 +00:00
|
|
|
void CheckAndSetOptionsForUserTimestamp(Options& options);
|
Avoid overwriting options loaded from OPTIONS (#9943)
Summary:
This is similar to https://github.com/facebook/rocksdb/issues/9862, including the following fixes/refactoring:
1. If OPTIONS file is specified via `-options_file`, majority of options will be loaded from the file. We should not
overwrite options that have been loaded from the file. Instead, we configure only fields of options which are
shared objects and not set by the OPTIONS file. We also configure a few fields, e.g. `create_if_missing` necessary
for stress test to run.
2. Refactor options initialization into three functions, `InitializeOptionsFromFile()`, `InitializeOptionsFromFlags()`
and `InitializeOptionsGeneral()` similar to db_bench. I hope they can be shared in the future. The high-level logic is
as follows:
```cpp
if (!InitializeOptionsFromFile(...)) {
InitializeOptionsFromFlags(...);
}
InitializeOptionsGeneral(...);
```
3. Currently, the setting for `block_cache_compressed` does not seem correct because it by default specifies a
size of `numeric_limits<size_t>::max()` ((size_t)-1). According to code comments, `-1` indicates default value,
which should be referring to `num_shard_bits` argument.
4. Clarify `fail_if_options_file_error`.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/9943
Test Plan:
1. make check
2. Run stress tests, and manually check generated OPTIONS file and compare them with input OPTIONS files
Reviewed By: jay-zhuang
Differential Revision: D36133769
Pulled By: riversand963
fbshipit-source-id: 35dacdc090a0a72c922907170cd132b9ecaa073e
2022-05-18 19:43:50 +00:00
|
|
|
|
2024-08-15 19:32:59 +00:00
|
|
|
bool ShouldDisableAutoCompactionsBeforeVerifyDb();
|
2020-02-20 20:07:53 +00:00
|
|
|
} // namespace ROCKSDB_NAMESPACE
|
2019-12-09 07:49:32 +00:00
|
|
|
#endif // GFLAGS
|