Commit Graph

1604 Commits

Author SHA1 Message Date
Yu Zhang 8a462eefae Add user timestamp support into interactive query command (#12716)
Summary:
As titled. This PR also makes the interactive query tool more permissive by allowing the user to continue to try out a different command after the previous command received some allowed errors, such as `Status::NotFound`, `Status::InvalidArgument`.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12716

Test Plan:
Manually tested:
```
yuzhangyu@yuzhangyu-mbp rocksdb % ./ldb --db=$TEST_DB --key_hex --value_hex query
get 0x0000000000000000 --read_timestamp=1115559245398440
0x0000000000000000|timestamp:1115559245398440 ==> 0x07000000000102030C0D0E0F08090A0B14151617101112131C1D1E1F18191A1B24252627202122232C2D2E2F28292A2B34353637303132333C3D3E3F38393A3B
put 0x0000000000000000 0x0000
put 0x0000000000000000 => 0x0000 failed: Invalid argument: cannot call this method on column family default that enables timestamp
put 0x0000000000000000 aha 0x0000
put gets invalid argument: Invalid argument: user provided timestamp is not a valid uint64 value.
put 0x0000000000000000 1115559245398441 0x08000000000102030C0D0E0F08090A0B14151617101112131C1D1E1F18191A1B24252627202122232C2D2E2F28292A2B34353637303132333C3D3E3F38393A3B
put 0x0000000000000000 write_ts: 1115559245398441 => 0x08000000000102030C0D0E0F08090A0B14151617101112131C1D1E1F18191A1B24252627202122232C2D2E2F28292A2B34353637303132333C3D3E3F38393A3B succeeded
delete 0x0000000000000000
delete 0x0000000000000000 failed: Invalid argument: cannot call this method on column family default that enables timestamp
delete 0x0000000000000000 1115559245398442
delete 0x0000000000000000 write_ts: 1115559245398442 succeeded
get 0x0000000000000000 --read_timestamp=1115559245398442
get 0x0000000000000000 read_timestamp: 1115559245398442 status: NotFound:
get 0x0000000000000000 --read_timestamp=1115559245398441
0x0000000000000000|timestamp:1115559245398441 ==> 0x08000000000102030C0D0E0F08090A0B14151617101112131C1D1E1F18191A1B24252627202122232C2D2E2F28292A2B34353637303132333C3D3E3F38393A3B
count --from=0x0000000000000000 --to=0x0000000000000001
scan from 0x0000000000000000 to 0x0000000000000001failed: Invalid argument: cannot call this method on column family default that enables timestamp
count --from=0x0000000000000000 --to=0x0000000000000001 --read_timestamp=1115559245398442
0
count --from=0x0000000000000000 --to=0x0000000000000001 --read_timestamp=1115559245398441
1
```

Reviewed By: ltamasi

Differential Revision: D57992183

Pulled By: jowlyzhang

fbshipit-source-id: 720525de22412d16aa952870e088f2c371459ece
2024-05-30 17:23:38 -07:00
anand76 9cc6168c98 Add LDB command and option for follower instances (#12682)
Summary:
Add the `--leader_path` option to specify the directory path of the leader for a follower RocksDB instance. This PR also adds a `count` command to the repl shell. While not specific to followers, it is useful for testing purposes.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12682

Reviewed By: jowlyzhang

Differential Revision: D57642296

Pulled By: anand1976

fbshipit-source-id: 53767d496ecadc363ff92cd958b8e15a7bf3b151
2024-05-28 23:21:32 -07:00
Peter Dillinger d2ef70872f Rename, deprecate `LogFile` and `VectorLogPtr` (#12695)
Summary:
These names are confusing with `Logger` etc. so moving to `WalFile` etc.

Other small, related name refactorings.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12695

Test Plan: Left most unit tests using old names as an API compatibility test. Non-test code compiles with deprecated names removed. No functional changes.

Reviewed By: ajkr

Differential Revision: D57747458

Pulled By: pdillinger

fbshipit-source-id: 7b77596b9c20d865d43b9dc66c30c8bd2b3b424f
2024-05-28 09:24:49 -07:00
Changyu Bi 0ee7f8bacb Fix `max_read_amp` value in crash test (#12701)
Summary:
It should be no less than `level0_file_num_compaction_trigger`(which defaults to 4) when set to a positive value. Otherwise DB open will fail.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12701

Test Plan: crash test not failing DB open due to this option value.

Reviewed By: ajkr

Differential Revision: D57825062

Pulled By: cbi42

fbshipit-source-id: 22d8e12aeceb5cef815157845995a8448552e2d2
2024-05-26 17:26:55 -07:00
Changyu Bi fecb10c2fa Improve universal compaction sorted-run trigger (#12477)
Summary:
Universal compaction currently uses `level0_file_num_compaction_trigger` for two purposes:
1. the trigger for checking if there is any compaction to do, and
2. the limit on the number of sorted runs. RocksDB will do compaction to keep the number of sorted runs no more than the value of this option.

This can make the option inflexible. A value that is too small causes higher write amp: more compactions to reduce the number of sorted runs. A value that is too big delays potential compaction work and causes worse read performance. This PR introduce an option `CompactionOptionsUniversal::max_read_amp` for only the second purpose: to specify
the hard limit on the number of sorted runs.

For backward compatibility, `max_read_amp = -1` by default, which means to fallback to the current behavior.
When `max_read_amp > 0`,`level0_file_num_compaction_trigger` will only serve as a trigger to find potential compaction.
When `max_read_amp = 0`, RocksDB will auto-tune the limit on the number of sorted runs. The estimation is based on DB size, write_buffer_size and size_ratio, so it is adaptive to the size change of the DB. See more in `UniversalCompactionBuilder::PickCompaction()`.
Alternatively, users now can configure `max_read_amp` to a very big value and keep `level0_file_num_compaction_trigger` small. This will allow `size_ratio` and `max_size_amplification_percent` to control the number of sorted runs. This essentially disables compactions with reason kUniversalSortedRunNum.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12477

Test Plan:
* new unit test
* existing unit test for default behavior
* updated crash test with the new option
* benchmark:
  * Create a DB that is roughly 24GB in the last level. When `max_read_amp = 0`, we estimate that the DB needs 9 levels to avoid excessive compactions to reduce the number of sorted runs.
  * We then run fillrandom to ingest another 24GB data to compare write amp.
     * case 1: small level0 trigger: `level0_file_num_compaction_trigger=5, max_read_amp=-1`
       * write-amp: 4.8
     * case 2: auto-tune: `level0_file_num_compaction_trigger=5, max_read_amp=0`
       *  write-amp: 3.6
     * case 3: auto-tune with minimal trigger: `level0_file_num_compaction_trigger=1, max_read_amp=0`
       *  write-amp: 3.8
     * case 4: hard-code a good value for trigger: `level0_file_num_compaction_trigger=9`
       * write-amp: 2.8
```
Case 1:
** Compaction Stats [default] **
Level    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop Rblob(GB) Wblob(GB)
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  L0      0/0    0.00 KB   1.0      0.0     0.0      0.0      22.6     22.6       0.0   1.0      0.0    163.2    141.94            111.10       108    1.314       0      0       0.0       0.0
 L45      8/0    1.81 GB   0.0     39.6    11.1     28.5      39.3     10.8       0.0   3.5    209.0    207.3    194.25            191.29        43    4.517    348M  2498K       0.0       0.0
 L46     13/0    3.12 GB   0.0     15.3     9.5      5.8      15.0      9.3       0.0   1.6    203.1    199.3     77.13             75.88        16    4.821    134M  2362K       0.0       0.0
 L47     19/0    4.68 GB   0.0     15.4    10.5      4.9      14.7      9.8       0.0   1.4    204.0    194.9     77.38             76.15         8    9.673    135M  5920K       0.0       0.0
 L48     38/0    9.42 GB   0.0     19.6    11.7      7.9      17.3      9.4       0.0   1.5    206.5    182.3     97.15             95.02         4   24.287    172M    20M       0.0       0.0
 L49     91/0   22.70 GB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0      0.0      0.00              0.00         0    0.000       0      0       0.0       0.0
 Sum    169/0   41.74 GB   0.0     89.9    42.9     47.0     109.0     61.9       0.0   4.8    156.7    189.8    587.85            549.45       179    3.284    791M    31M       0.0       0.0

Case 2:
** Compaction Stats [default] **
Level    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop Rblob(GB) Wblob(GB)
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  L0      1/0   214.47 MB   1.2      0.0     0.0      0.0      22.6     22.6       0.0   1.0      0.0    164.5    140.81            109.98       108    1.304       0      0       0.0       0.0
 L44      0/0    0.00 KB   0.0      1.3     1.3      0.0       1.2      1.2       0.0   1.0    206.1    204.9      6.24              5.98         3    2.081     11M    51K       0.0       0.0
 L45      4/0   844.36 MB   0.0      7.1     5.4      1.7       7.0      5.4       0.0   1.3    194.6    192.9     37.41             36.00        13    2.878     62M   489K       0.0       0.0
 L46     11/0    2.57 GB   0.0     14.6     9.8      4.8      14.3      9.5       0.0   1.5    193.7    189.8     77.09             73.54        17    4.535    128M  2411K       0.0       0.0
 L47     24/0    5.81 GB   0.0     19.8    12.0      7.8      18.8     11.0       0.0   1.6    191.4    181.1    106.19            101.21         9   11.799    174M  9166K       0.0       0.0
 L48     38/0    9.42 GB   0.0     19.6    11.8      7.9      17.3      9.4       0.0   1.5    197.3    173.6    101.97             97.23         4   25.491    172M    20M       0.0       0.0
 L49     91/0   22.70 GB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0      0.0      0.00              0.00         0    0.000       0      0       0.0       0.0
 Sum    169/0   41.54 GB   0.0     62.4    40.3     22.1      81.3     59.2       0.0   3.6    136.1    177.2    469.71            423.94       154    3.050    549M    32M       0.0       0.0

Case 3:
** Compaction Stats [default] **
Level    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop Rblob(GB) Wblob(GB)
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  L0      0/0    0.00 KB   5.0      0.0     0.0      0.0      22.6     22.6       0.0   1.0      0.0    163.8    141.43            111.13       108    1.310       0      0       0.0       0.0
 L44      0/0    0.00 KB   0.0      0.8     0.8      0.0       0.8      0.8       0.0   1.0    201.4    200.2      4.26              4.19         2    2.130   7360K    33K       0.0       0.0
 L45      4/0   844.38 MB   0.0      6.3     5.0      1.2       6.2      5.0       0.0   1.2    202.0    200.3     31.81             31.50        12    2.651     55M   403K       0.0       0.0
 L46      7/0    1.62 GB   0.0     13.3     8.8      4.6      13.1      8.6       0.0   1.5    198.9    195.7     68.72             67.89        17    4.042    117M  1696K       0.0       0.0
 L47     24/0    5.81 GB   0.0     21.7    12.9      8.8      20.6     11.8       0.0   1.6    198.5    188.6    112.04            109.97        12    9.336    191M  9352K       0.0       0.0
 L48     41/0   10.14 GB   0.0     24.8    13.0     11.8      21.9     10.1       0.0   1.7    198.6    175.6    127.88            125.36         6   21.313    218M    25M       0.0       0.0
 L49     91/0   22.70 GB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0      0.0      0.00              0.00         0    0.000       0      0       0.0       0.0
 Sum    167/0   41.10 GB   0.0     67.0    40.5     26.4      85.4     58.9       0.0   3.8    141.1    179.8    486.13            450.04       157    3.096    589M    36M       0.0       0.0

Case 4:
** Compaction Stats [default] **
Level    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop Rblob(GB) Wblob(GB)
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  L0      0/0    0.00 KB   0.7      0.0     0.0      0.0      22.6     22.6       0.0   1.0      0.0    158.6    146.02            114.68       108    1.352       0      0       0.0       0.0
 L42      0/0    0.00 KB   0.0      1.7     1.7      0.0       1.7      1.7       0.0   1.0    185.4    184.3      9.25              8.96         4    2.314     14M    67K       0.0       0.0
 L43      0/0    0.00 KB   0.0      2.5     2.5      0.0       2.5      2.5       0.0   1.0    197.8    195.6     13.01             12.65         4    3.253     22M   202K       0.0       0.0
 L44      4/0   844.40 MB   0.0      4.2     4.2      0.0       4.1      4.1       0.0   1.0    188.1    185.1     22.81             21.89         5    4.562     36M   503K       0.0       0.0
 L45     13/0    3.12 GB   0.0      7.5     6.5      1.0       7.2      6.2       0.0   1.1    188.7    181.8     40.69             39.32         5    8.138     65M  2282K       0.0       0.0
 L46     17/0    4.18 GB   0.0      8.3     7.1      1.2       7.9      6.6       0.0   1.1    192.2    181.8     44.23             43.06         4   11.058     73M  3846K       0.0       0.0
 L47     22/0    5.34 GB   0.0      8.9     7.5      1.4       8.2      6.8       0.0   1.1    189.1    174.1     48.12             45.37         3   16.041     78M  6098K       0.0       0.0
 L48     27/0    6.58 GB   0.0      9.2     7.6      1.6       8.2      6.6       0.0   1.1    195.2    172.9     48.52             47.11         2   24.262     81M  9217K       0.0       0.0
 L49     91/0   22.70 GB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0      0.0      0.00              0.00         0    0.000       0      0       0.0       0.0
 Sum    174/0   42.74 GB   0.0     42.3    37.0      5.3      62.4     57.1       0.0   2.8    116.3    171.3    372.66            333.04       135    2.760    372M    22M       0.0       0.0

setup:
./db_bench --benchmarks=fillseq,compactall,waitforcompaction --num=200000000 --compression_type=none --disable_wal=1 --compaction_style=1 --num_levels=50 --target_file_size_base=268435456 --max_compaction_bytes=6710886400 --level0_file_num_compaction_trigger=10 --write_buffer_size=268435456 --seed 1708494134896523

benchmark:
./db_bench --benchmarks=overwrite,waitforcompaction,stats --num=200000000 --compression_type=none --disable_wal=1 --compaction_style=1 --write_buffer_size=268435456 --level0_file_num_compaction_trigger=5 --target_file_size_base=268435456 --use_existing_db=1 --num_levels=50 --writes=200000000 --universal_max_read_amp=-1 --seed=1716488324800233

```

Reviewed By: ajkr

Differential Revision: D55370922

Pulled By: cbi42

fbshipit-source-id: 9be69979126b840d08e93e7059260e76a878bb2a
2024-05-24 10:10:31 -07:00
Yu Zhang 9a72cf1a61 Add timestamp support in dump_wal/dump/idump (#12690)
Summary:
As titled.  For dumping wal files, since a mapping from column family id to the user comparator object is needed to print the timestamp in human readable format, option `[--db=<db_path>]` is added to `dump_wal` command to allow the user to choose to optionally open the DB as read only instance and dump the wal file with better timestamp formatting.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12690

Test Plan:
Manually tested

dump_wal:
[dump a wal file specified with --walfile]
```
>> ./ldb --walfile=$TEST_DB/000004.log dump_wal  --print_value
>>1,1,28,13,PUT(0) : 0x666F6F0100000000000000 : 0x7631
(Column family id: [0] contained in WAL are not opened in DB. Applied default hex formatting for user key. Specify --db=<db_path> to open DB for better user key formatting if it contains timestamp.)
```

[dump with --db specified for better timestamp formatting]
```
>> ./ldb --walfile=$TEST_DB/000004.log dump_wal  --db=$TEST_DB --print_value
>> 1,1,28,13,PUT(0) : 0x666F6F|timestamp:1 : 0x7631
```

dump:
[dump a file specified with --path]
```
>>./ldb --path=/tmp/rocksdbtest-501/column_family_test_75359_17910784957761284041/000004.log dump
Sequence,Count,ByteSize,Physical Offset,Key(s) : value
1,1,28,13,PUT(0) : 0x666F6F0100000000000000 : 0x7631
(Column family id: [0] contained in WAL are not opened in DB. Applied default hex formatting for user key. Specify --db=<db_path> to open DB for better user key formatting if it contains timestamp.)
```

[dump db specified with --db]
```
>> ./ldb --db=/tmp/rocksdbtest-501/column_family_test_75359_17910784957761284041 dump
>> foo|timestamp:1 ==> v1
Keys in range: 1
```

idump
```
./ldb --db=$TEST_DB idump
'foo|timestamp:1' seq:1, type:1 => v1
Internal keys in range: 1
```

Reviewed By: ltamasi

Differential Revision: D57755382

Pulled By: jowlyzhang

fbshipit-source-id: a0a2ef80c92801cbf7bfccc64769c1191824362e
2024-05-23 20:26:57 -07:00
Levi Tamasi db0960800a Add Transaction::PutEntity to the stress tests (#12688)
Summary:
Pull Request resolved: https://github.com/facebook/rocksdb/pull/12688

As a first step of covering the wide-column transaction APIs, the patch adds `PutEntity` to the optimistic and pessimistic transaction stress tests (for the latter, only when the WriteCommitted policy is utilized). Other APIs and the multi-operation transaction test will be covered by subsequent PRs.

Reviewed By: jaykorean

Differential Revision: D57675781

fbshipit-source-id: bfe062ec5f6ab48641cd99a70f239ce4aa39299c
2024-05-22 11:30:33 -07:00
Levi Tamasi ad6f6e24c8 Fix txn_write_policy check in crash test script (#12683)
Summary:
Pull Request resolved: https://github.com/facebook/rocksdb/pull/12683

With optimistic transactions, the stress test parameter `txn_write_policy` is not applicable and is thus not set. When the parameter is subsequently checked, Python's dictionary `get` method returns `None`, which is not equal to zero. The net result of this is that currently, `sync_fault_injection` and `manual_wal_flush_one_in` are always disabled in optimistic transaction mode (most likely unintentionally).

Reviewed By: cbi42

Differential Revision: D57655339

fbshipit-source-id: 8b93a788f9b02307b6ea7b2129dc012271130334
2024-05-22 00:49:18 -07:00
Peter Dillinger d89ab23bec Disallow memtable flush and sst ingest while WAL is locked (#12652)
Summary:
We recently noticed that some memtable flushed and file
ingestions could proceed during LockWAL, in violation of its stated
contract. (Note: we aren't 100% sure its actually needed by MySQL, but
we want it to be in a clean state nonetheless.)

Despite earlier skepticism that this could be done safely (https://github.com/facebook/rocksdb/issues/12666), I
found a place to wait to wait for LockWAL to be cleared before allowing
these operations to proceed: WaitForPendingWrites()

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12652

Test Plan:
Added to unit tests. Extended how db_stress validates LockWAL
and re-enabled combination of ingestion and LockWAL in crash test, in
follow-up to https://github.com/facebook/rocksdb/issues/12642

Ran blackbox_crash_test for a long while with relevant features
amplified.

Suggested follow-up: fix FaultInjectionTestFS to report file sizes
consistent with what the user has requested to be flushed.

Reviewed By: jowlyzhang

Differential Revision: D57622142

Pulled By: pdillinger

fbshipit-source-id: aef265fce69465618974b4ec47f4636257c676ce
2024-05-21 10:17:34 -07:00
Hui Xiao d7b938882e Sync WAL during db Close() (#12556)
Summary:
**Context/Summary:**
Below crash test found out we don't sync WAL upon DB close, which can lead to unsynced data loss. This PR syncs it.
```
./db_stress --threads=1 --disable_auto_compactions=1 --WAL_size_limit_MB=0 --WAL_ttl_seconds=0 --acquire_snapshot_one_in=0 --adaptive_readahead=0 --adm_policy=1 --advise_random_on_open=1 --allow_concurrent_memtable_write=1 --allow_data_in_errors=True --allow_fallocate=0 --async_io=0 --auto_readahead_size=0 --avoid_flush_during_recovery=1 --avoid_flush_during_shutdown=0 --avoid_unnecessary_blocking_io=1 --backup_max_size=104857600 --backup_one_in=0 --batch_protection_bytes_per_key=0 --bgerror_resume_retry_interval=1000000 --block_align=0 --block_protection_bytes_per_key=2 --block_size=16384 --bloom_before_level=1 --bloom_bits=29.895303579352174 --bottommost_compression_type=disable --bottommost_file_compaction_delay=0 --bytes_per_sync=0 --cache_index_and_filter_blocks=0 --cache_index_and_filter_blocks_with_high_priority=1 --cache_size=33554432 --cache_type=lru_cache --charge_compression_dictionary_building_buffer=1 --charge_file_metadata=0 --charge_filter_construction=1 --charge_table_reader=1 --checkpoint_one_in=0 --checksum_type=kxxHash64 --clear_column_family_one_in=0 --column_families=1 --compact_files_one_in=0 --compact_range_one_in=0 --compaction_pri=0 --compaction_readahead_size=0 --compaction_style=0 --compaction_ttl=0 --compress_format_version=2 --compressed_secondary_cache_ratio=0 --compressed_secondary_cache_size=0 --compression_checksum=1 --compression_max_dict_buffer_bytes=0 --compression_max_dict_bytes=0 --compression_parallel_threads=4 --compression_type=zstd --compression_use_zstd_dict_trainer=1 --compression_zstd_max_train_bytes=0 --continuous_verification_interval=0 --data_block_index_type=0 --db=/dev/shm/rocksdb_test/rocksdb_crashtest_whitebox --db_write_buffer_size=0 --default_temperature=kUnknown --default_write_temperature=kUnknown --delete_obsolete_files_period_micros=0 --delpercent=0 --delrangepercent=0 --destroy_db_initially=1 --detect_filter_construct_corruption=1 --disable_wal=0 --dump_malloc_stats=0 --enable_checksum_handoff=0 --enable_compaction_filter=0 --enable_custom_split_merge=0 --enable_do_not_compress_roles=1 --enable_index_compression=1 --enable_memtable_insert_with_hint_prefix_extractor=0 --enable_pipelined_write=0 --enable_sst_partitioner_factory=0 --enable_thread_tracking=1 --enable_write_thread_adaptive_yield=0 --expected_values_dir=/dev/shm/rocksdb_test/rocksdb_crashtest_expected --fail_if_options_file_error=0 --fifo_allow_compaction=1 --file_checksum_impl=none --fill_cache=0 --flush_one_in=1000 --format_version=5 --get_current_wal_file_one_in=0 --get_live_files_one_in=0 --get_property_one_in=0 --get_sorted_wal_files_one_in=0 --hard_pending_compaction_bytes_limit=274877906944 --high_pri_pool_ratio=0 --index_block_restart_interval=6 --index_shortening=0 --index_type=0 --ingest_external_file_one_in=0 --initial_auto_readahead_size=16384 --iterpercent=0 --key_len_percent_dist=1,30,69 --last_level_temperature=kUnknown --level_compaction_dynamic_level_bytes=1 --lock_wal_one_in=0 --log2_keys_per_lock=10 --log_file_time_to_roll=0 --log_readahead_size=16777216 --long_running_snapshots=0 --low_pri_pool_ratio=0 --lowest_used_cache_tier=0 --manifest_preallocation_size=5120 --manual_wal_flush_one_in=0 --mark_for_compaction_one_file_in=0 --max_auto_readahead_size=0 --max_background_compactions=1 --max_bytes_for_level_base=67108864 --max_key=2500000 --max_key_len=3 --max_log_file_size=0 --max_manifest_file_size=1073741824 --max_sequential_skip_in_iterations=8 --max_total_wal_size=0 --max_write_batch_group_size_bytes=64 --max_write_buffer_number=10 --max_write_buffer_size_to_maintain=0 --memtable_insert_hint_per_batch=0 --memtable_max_range_deletions=0 --memtable_prefix_bloom_size_ratio=0.5 --memtable_protection_bytes_per_key=1 --memtable_whole_key_filtering=1 --memtablerep=skip_list --metadata_charge_policy=0 --min_write_buffer_number_to_merge=1 --mmap_read=0 --mock_direct_io=True --nooverwritepercent=1 --num_file_reads_for_auto_readahead=0 --num_levels=1 --open_files=-1 --open_metadata_write_fault_one_in=0 --open_read_fault_one_in=0 --open_write_fault_one_in=0 --ops_per_thread=3 --optimize_filters_for_hits=1 --optimize_filters_for_memory=1 --optimize_multiget_for_io=0 --paranoid_file_checks=0 --partition_filters=0 --partition_pinning=1 --pause_background_one_in=0 --periodic_compaction_seconds=0 --prefix_size=1 --prefixpercent=0 --prepopulate_block_cache=0 --preserve_internal_time_seconds=3600 --progress_reports=0 --read_amp_bytes_per_bit=0 --read_fault_one_in=0 --readahead_size=16384 --readpercent=0 --recycle_log_file_num=0 --reopen=2 --report_bg_io_stats=1 --sample_for_compression=5 --secondary_cache_fault_one_in=0 --secondary_cache_uri= --skip_stats_update_on_db_open=1 --snapshot_hold_ops=0 --soft_pending_compaction_bytes_limit=68719476736 --sst_file_manager_bytes_per_sec=0 --sst_file_manager_bytes_per_truncate=0 --stats_dump_period_sec=10 --stats_history_buffer_size=1048576 --strict_bytes_per_sync=0 --subcompactions=3 --sync=0 --sync_fault_injection=1 --table_cache_numshardbits=6 --target_file_size_base=16777216 --target_file_size_multiplier=1 --test_batches_snapshots=0 --top_level_index_pinning=0 --unpartitioned_pinning=3 --use_adaptive_mutex=1 --use_adaptive_mutex_lru=0 --use_delta_encoding=1 --use_direct_io_for_flush_and_compaction=0 --use_direct_reads=0 --use_full_merge_v1=0 --use_get_entity=0 --use_merge=0 --use_multi_get_entity=0 --use_multiget=1 --use_put_entity_one_in=0 --use_write_buffer_manager=0 --user_timestamp_size=0 --value_size_mult=32 --verification_only=0 --verify_checksum=1 --verify_checksum_one_in=1000 --verify_compression=0 --verify_db_one_in=100000 --verify_file_checksums_one_in=0 --verify_iterator_with_expected_state_one_in=5 --verify_sst_unique_id_in_manifest=1 --wal_bytes_per_sync=0 --wal_compression=zstd --write_buffer_size=33554432 --write_dbid_to_manifest=0 --write_fault_one_in=0 --writepercent=100

 Verification failed for column family 0 key 000000000000B9D1000000000000012B000000000000017D (4756691): value_from_db: , value_from_expected: 010000000504070609080B0A0D0C0F0E111013121514171619181B1A1D1C1F1E212023222524272629282B2A2D2C2F2E313033323534373639383B3A3D3C3F3E, msg: Iterator verification: Value not found: NotFound:
Verification failed :(
```

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12556

Test Plan:
- New UT
- Same stress test command failed before this fix but pass after
- CI

Reviewed By: ajkr

Differential Revision: D56267964

Pulled By: hx235

fbshipit-source-id: af1b7e8769c129f64ba1c7f1ff17102f1239b929
2024-05-20 17:33:43 -07:00
Levi Tamasi ef1d4955ba Fix the output of `ldb dump_wal` for PutEntity records (#12677)
Summary:
Pull Request resolved: https://github.com/facebook/rocksdb/pull/12677

The patch contains two fixes related to printing `PutEntity` records with `ldb dump_wal`:
1) It adds the key to the printout (it was missing earlier).
2) It restores the formatting flags of the output stream after dumping the wide-column structure so that any `hex` flag that might have been set does not affect subsequent printing of e.g. sequence numbers.

Reviewed By: jaykorean, jowlyzhang

Differential Revision: D57591295

fbshipit-source-id: af4e3e219f0082ad39bbdfd26f8c5a57ebb898be
2024-05-20 17:04:14 -07:00
Changyu Bi 35985a988c Fix value of `inplace_update_support` across stress test runs (#12675)
Summary:
the value of `inplace_update_support` option need to be fixed across runs of db_stress on the same DB (https://github.com/facebook/rocksdb/issues/12577). My recent fix (https://github.com/facebook/rocksdb/issues/12673) regressed this behavior. Also fix some existing places where this does not hold.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12675

Test Plan: monitor crash tests related to `inplace_update_support`.

Reviewed By: hx235

Differential Revision: D57576375

Pulled By: cbi42

fbshipit-source-id: 75b1bd233f03e5657984f5d5234dbbb1ffc35c27
2024-05-20 13:23:34 -07:00
Changyu Bi c4782bde41 Disable `inplace_update_support` in crash test with unsynced data loss (#12673)
Summary:
With unsynced data loss, we replay traces to recover expected state to DB's latest sequence number. With `inplace_update_support`, the largest sequence number of memtable may not reflect the latest update. This is because inplace updates in memtable do not update sequence number. So we disable `inplace_update_support` where traces need to be replayed.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12673

Reviewed By: ltamasi

Differential Revision: D57512548

Pulled By: cbi42

fbshipit-source-id: 69278fe2e935874faf744d0ac4fd85263773c3ec
2024-05-18 16:48:41 -07:00
Yu Zhang 1567d50a27 Temporarily disable file ingestion if LockWAL is tested (#12642)
Summary:
As titled. A proper fix should probably be failing file ingestion if the DB is in a lock wal state as it promises to "Freezes the logical state of the DB".

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12642

Reviewed By: pdillinger

Differential Revision: D57235869

Pulled By: jowlyzhang

fbshipit-source-id: c70031463842220f865621eb6f53424df27d29e9
2024-05-14 09:27:48 -07:00
Yu Zhang c110091d36 Support read timestamp in ldb (#12641)
Summary:
As titled. Also updated sst_dump to print out user-defined timestamp separately and in human readable format.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12641

Test Plan:
manually tested:
Example success run:
./ldb --db=$TEST_DB multi_get_entity 0x00000000000000950000000000000120 --key_hex --column_family=7  --read_timestamp=1115613683797942 --value_hex
0x00000000000000950000000000000120 ==> :0x0E0000000A0B080906070405020300011E1F1C1D1A1B181916171415121310112E2F2C2D2A2B282926272425222320213E3F3C3D3A3B383936373435323330314E4F4C4D4A4B484946474445424340415E5F5C5D5A5B58595657545552535051
Example failed run:
Failed: GetEntity failed: Invalid argument: column family enables user-defined timestamp while --read_timestamp is not a valid uint64 value.

sst_dump print out:
'000000000000015D000000000000012B000000000000013B|timestamp:1113554493256671' seq:2330405, type:1 => 010000000504070609080B0A0D0C0F0E111013121514171619181B1A1D1C1F1E212023222524272629282B2A2D2C2F2E313033323534373639383B3A3D3C3F3E

Reviewed By: ltamasi

Differential Revision: D57297006

Pulled By: jowlyzhang

fbshipit-source-id: 8486d91468e4f6c0d42dca3c9629f1c45a92bf5a
2024-05-13 15:43:12 -07:00
Peter Dillinger 7747abdc15 Disable PromoteL0 in crash test (#12651)
Summary:
Seeing way too many errors likely related to PromoteL0 from https://github.com/facebook/rocksdb/issues/12617, containing
```
Cannot delete table file #N from level 0 since it is on level X
```

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12651

Test Plan: watch crash test results

Reviewed By: hx235

Differential Revision: D57286208

Pulled By: pdillinger

fbshipit-source-id: f7f0560cc0804ca297373c8d20ebc34986cc19d0
2024-05-13 12:42:01 -07:00
Jay Huh 1f2715d1d2 AttributeGroup APIs in stress test - PutEntity and GetEntity (#12605)
Summary:
Adding AttributeGroup APIs in stress test. This contains the following changes only. More PRs to follow.

- Introduce `use_attribute_group` flag
- AttributeGroup `PutEntity()` and `GetEntity()` are now used per `use_attribute_group` flag in BatchOps, NonBatchOps and CfConsistency tests

In the next PRs I plan to add
- AttributeGroup `MultiGetEntity()` in Stress Test
- AttributeGroupIterator in Stress Test (along with CoalescingIterator)

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12605

Test Plan:
NonBatchOps
```
python3 tools/db_crashtest.py blackbox --simple --max_key=25000000 --write_buffer_size=4194304 --use_attribute_group=1 --use_put_entity_one_in=1
```

BatchOps
```
python3 tools/db_crashtest.py blackbox --test_batches_snapshots=1 --max_key=25000000 --write_buffer_size=4194304 --use_attribute_group=1 --use_put_entity_one_in=1
```

CfConsistency Test
```
python3 tools/db_crashtest.py blackbox --cf_consistency --max_key=25000000 --write_buffer_size=4194304 --use_attribute_group=1 --use_put_entity_one_in=1
```

Reviewed By: ltamasi

Differential Revision: D56916768

Pulled By: jaykorean

fbshipit-source-id: 8555d9e0d05927740a10e4e8301e44beec59a6f5
2024-05-09 16:40:22 -07:00
Hui Xiao 9bddac0dcf Add more public APIs to crash test (#12617)
Summary:
**Context/Summary:**
As titled. Bonus: found that PromoteL0 called with other concurrent PromoteL0 will return non-okay error so clarify the API.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12617

Test Plan: CI

Reviewed By: ajkr

Differential Revision: D56954428

Pulled By: hx235

fbshipit-source-id: 0e056153c515003fd241ffec59b0d8a27529db4c
2024-05-09 15:37:38 -07:00
Peter Dillinger eeda54fe63 Fix db_crashtest.py for prefixpercent and enable_compaction_filter (#12627)
Summary:
After https://github.com/facebook/rocksdb/issues/12624 seeing db_stress failures due to db_crashtest.py calling it with --prefixpercent=5 --enable_compaction_filter=1

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12627

Test Plan: watch crash test

Reviewed By: ajkr

Differential Revision: D57121592

Pulled By: pdillinger

fbshipit-source-id: 55727355a7662e67efcd22d7e353153e78e24f59
2024-05-08 13:16:44 -07:00
Changyu Bi 174b8294c8 Do not change compaction style for tiered_storage crash tests (#12609)
Summary:
tiered_storage crash tests are intended to run with universal compaction only: e2ef349f56/tools/db_crashtest.py (L562) Update the test script to not change compaction style for tiered_storage crash tests.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12609

Test Plan: `python3 tools/db_crashtest.py whitebox --test_tiered_storage  --ops_per_thread=10 --max_key=100000 --duration=300 --reopen=1`

Reviewed By: ajkr

Differential Revision: D56917965

Pulled By: cbi42

fbshipit-source-id: 5ab693d7153832bae884788ff1f1762b8ec8262d
2024-05-03 10:34:56 -07:00
anand76 6cc7ad15b6 Implement secondary cache admission policy to allow all evicted blocks (#12599)
Summary:
Add a secondary cache admission policy to admit all blocks evicted from the block cache.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12599

Reviewed By: pdillinger

Differential Revision: D56891760

Pulled By: anand1976

fbshipit-source-id: 193c98c055aa3477f4e3a78e5d3daef27a5eacf4
2024-05-02 11:23:35 -07:00
anand76 6349da612b Update HISTORY.md and version to 9.3.0 (#12601)
Summary:
Update HISTORY.md for 9.2 and version to 9.3.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12601

Reviewed By: jaykorean, jowlyzhang

Differential Revision: D56845901

Pulled By: anand1976

fbshipit-source-id: 0d1137a6568e4712be2f8b705f4f7b438217dbed
2024-05-01 16:33:04 -07:00
Yu Zhang 8b3d9e6bfe Add TimedPut to stress test (#12559)
Summary:
This also updates WriteBatch's protection info to include write time since there are several places in memtable that by default protects the whole value slice.

This PR is stacked on https://github.com/facebook/rocksdb/issues/12543

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12559

Reviewed By: pdillinger

Differential Revision: D56308285

Pulled By: jowlyzhang

fbshipit-source-id: 5524339fe0dd6c918dc940ca2f0657b5f2111c56
2024-04-30 15:40:35 -07:00
Andrew Kryczka d80e1d99bc Add `ldb multi_get_entity` subcommand (#12593)
Summary:
Mixed code from `MultiGetCommand` and `GetEntityCommand` to introduce `MultiGetEntityCommand`. Some minor fixes for the related subcommands are included.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12593

Reviewed By: jaykorean

Differential Revision: D56687147

Pulled By: ajkr

fbshipit-source-id: 2ad7b7ba8e05e990b43f2d1eb4990f746ce5f1ea
2024-04-28 21:22:31 -07:00
Andrew Kryczka 2ec25a3e54 Prevent data block compression with `BlockBasedTableOptions::block_align` (#12592)
Summary:
Made `BlockBasedTableOptions::block_align` incompatible (i.e., APIs will return `Status::InvalidArgument`) with more ways of enabling compression: `CompactionOptions::compression`, `ColumnFamilyOptions::compression_per_level`, and `ColumnFamilyOptions::bottommost_compression`. Previously it was only incompatible with `ColumnFamilyOptions::compression`.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12592

Reviewed By: hx235

Differential Revision: D56650862

Pulled By: ajkr

fbshipit-source-id: f5201602c2ce436e6d8d30893caa6a161a61f141
2024-04-26 20:05:30 -07:00
Jay Huh a82ba52756 Disable inplace_update_support in OptimisticTxnDB (#12589)
Summary:
Adding OptimisticTransactionDB like https://github.com/facebook/rocksdb/issues/12586

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12589

Test Plan:
```
python3 tools/db_crashtest.py whitebox --optimistic_txn
```
```
Running db_stress with pid=773197: ./db_stress  ...
--inplace_update_support=0  ...
--use_optimistic_txn=1 ...
...
```

Reviewed By: ajkr

Differential Revision: D56635338

Pulled By: jaykorean

fbshipit-source-id: fc3ef13420a2d539c7651d3f5b7dd6c4c89c836d
2024-04-26 16:00:06 -07:00
Changyu Bi d610e14f93 Disable `inplace_update_support` in transaction stress tests (#12586)
Summary:
`MultiOpsTxnsStressTest` relies on snapshot which is incompatible with `inplace_update_support`. TransactionDB uses snapshot too so we don't expect it to be used with `inplace_update_support` either.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12586

Test Plan:
```
python3 tools/db_crashtest.py whitebox --[test_multiops_txn|txn] --txn_write_policy=1
```

Reviewed By: hx235

Differential Revision: D56602769

Pulled By: cbi42

fbshipit-source-id: 8778541295f0af71e8ce912c8f872ab5cc607fc1
2024-04-25 17:06:30 -07:00
Hui Xiao 490d11a012 Clarify `inplace_update_support` with DeleteRange and reenable `inplace_update_support` in crash test (#12577)
Summary:
**Context/Summary:**
Our crash test recently surfaced incompatibilities between DeleteRange and inplace_update_support. Incorrect read result will be returned after insertion into memtables already contain delete range data.

This PR is to clarify this in API and re-enable `inplace_update_support` in crash test with sanitization.

Ideally there should be a way to check memtable for delete range entry upon put under inplace_update_support = true

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12577

Test Plan: CI

Reviewed By: ajkr

Differential Revision: D56492556

Pulled By: hx235

fbshipit-source-id: 9e80e5c69dd708716619a266f41580959680c83b
2024-04-25 14:07:39 -07:00
Hui Xiao d72e60397f Enable block_align in crash test (#12560)
Summary:
**Context/Summary:**
After https://github.com/facebook/rocksdb/pull/12542 there should be no blocker to re-enable block_align in crash test

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12560

Test Plan: CI

Reviewed By: jowlyzhang

Differential Revision: D56479173

Pulled By: hx235

fbshipit-source-id: 7c0bf327da0bd619deb89ab706e6ccd24e5b9543
2024-04-23 15:06:56 -07:00
Hui Xiao 9d37408f9a Temporarily disable inplace_update_support in crash test (#12574)
Summary:
**Context/Summary:**

Our recent crash test failures show inplace_update_support can cause DB to return value inconsistent with expected state upon crash recovery if delete range was used in the previous run AND inplace_update_support=true is used in either previous or the current verification run. Since it's a bit hard to keep track of whether previous run has used delete range or not, I decided to temporarily disable inplace_update_support in crash test to keep crash test stabilized before figuring why these two features are incompatible and how to prevent such combination in crash test.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12574

Test Plan: Rehearsed many stress run with `inplace_update_support=0` and they passed

Reviewed By: jaykorean

Differential Revision: D56454951

Pulled By: hx235

fbshipit-source-id: 57f2ae6308bad7ed4077ddb9e658380742afa293
2024-04-23 10:02:18 -07:00
anand76 d8fb849b7e Basic RocksDB follower implementation (#12540)
Summary:
A basic implementation of RocksDB follower mode, which opens a remote database (referred to as leader) on a distributed file system by tailing its MANIFEST. It leverages the secondary instance mode, but is different in some key ways -
1. It has its own directory with links to the leader's database
2. Periodically refreshes itself
3. (Future) Snapshot support
4. (Future) Garbage collection of obsolete links
5. (Long term) Memtable replication

There are two main classes implementing this functionality - `DBImplFollower` and `OnDemandFileSystem`. The former is derived from `DBImplSecondary`. Similar to `DBImplSecondary`, it implements recovery and catch up through MANIFEST tailing using the `ReactiveVersionSet`, but does not consider logs. In a future PR, we will implement memtable replication, which will eliminate the need to catch up using logs. In addition, the recovery and catch-up tries to avoid directory listing as repeated metadata operations are expensive.

The second main piece is the `OnDemandFileSystem`, which plugs in as an `Env` for the follower instance and creates the illusion of the follower directory as a clone of the leader directory. It creates links to SSTs on first reference. When the follower tails the MANIFEST and attempts to create a new `Version`, it calls `VerifyFileMetadata` to verify the size of the file, and optionally the unique ID of the file. During this process, links are created which prevent the underlying files from getting deallocated even if the leader deletes the files.

TODOs: Deletion of obsolete links, snapshots, robust checking against misconfigurations, better observability etc.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12540

Reviewed By: jowlyzhang

Differential Revision: D56315718

Pulled By: anand1976

fbshipit-source-id: d19e1aca43a6af4000cb8622a718031b69ebd97b
2024-04-19 19:13:31 -07:00
Hui Xiao f0864d3eec Temporarily disable reopen with unsync data loss (#12567)
Summary:
**Context/Summary:**
See https://github.com/facebook/rocksdb/pull/12556 for the original problem.
The [fix](https://github.com/facebook/rocksdb/pull/12556) encountered some design [discussion](https://github.com/facebook/rocksdb/pull/12556#discussion_r1572729453) that might take longer than I expected. Temporarily disable reopen with unsync data loss now just to stablize our crash test since the original problem is root-caused already.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12567

Test Plan: CI

Reviewed By: ltamasi

Differential Revision: D56365503

Pulled By: hx235

fbshipit-source-id: 0755e82617c065f42be4c8429e86fa289b250855
2024-04-19 15:23:52 -07:00
Hui Xiao 3bbacda9b1 Disallow inplace_update_support with allow_concurrent_memtable_write (#12550)
Summary:
**Context/Summary:**
In-place memtable updates (inplace_update_support) is not compatible with concurrent writes (allow_concurrent_memtable_write). So we disallow this combination in crash test

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12550

Test Plan: CI

Reviewed By: jaykorean

Differential Revision: D56204269

Pulled By: hx235

fbshipit-source-id: 06608f2591db5e37470a1da6afcdfd2701781c2d
2024-04-16 19:41:38 -07:00
Hui Xiao 24a35b6e57 Add more public APIs to crash/stress test (#12541)
Summary:
**Context/Summary:**
This PR includes some public DB APIs not tested in crash/stress yet can be added in a straightforward way.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12541

Test Plan:
- Locally run crash test heavily stressing on these new APIs
- CI

Reviewed By: jowlyzhang

Differential Revision: D56164892

Pulled By: hx235

fbshipit-source-id: 8bb568c3e65aec39d642987033f1d76c52f69bd8
2024-04-16 15:43:26 -07:00
Jay Huh dfdc3b158e Add offpeak feature to crash test (#12549)
Summary:
As title. Add offpeak feature in stress test.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12549

Test Plan:
Ran stress test locally with the flag set
```
Running db_stress with pid=701060: ./db_stress ... --daily_offpeak_time_utc=04:00-08:00 ... --periodic_compaction_seconds=10 ...
...
KILLED 701060

stdout:
 Choosing random keys with no overwrite
Creating 6250000 locks
2024/04/16-11:38:19  Initializing db_stress
RocksDB version           : 9.2
Format version            : 5
TransactionDB             : false
Stacked BlobDB            : false
Read only mode            : false
Atomic flush              : false
Manual WAL flush          : true
Column families           : 1
Clear CFs one in          : 0
Number of threads         : 32
Ops per thread            : 100000000
Time to live(sec)         : unused
Read percentage           : 60%
Prefix percentage         : 0%
Write percentage          : 35%
Delete percentage         : 4%
Delete range percentage   : 1%
No overwrite percentage   : 1%
Iterate percentage        : 0%
Custom ops percentage     : 0%
DB-write-buffer-size      : 0
Write-buffer-size         : 4194304
Iterations                : 10
Max key                   : 25000000
Ratio #ops/#keys          : 128.000000
Num times DB reopens      : 0
Batches/snapshots         : 0
Do update in place        : 0
Num keys per lock         : 4
Compression               : LZ4
Bottommost Compression    : DisableOption
Checksum type             : kxxHash
File checksum impl        : none
Bloom bits / key          : 18.000000
Max subcompactions        : 4
Use MultiGet              : false
Use GetEntity             : false
Use MultiGetEntity        : false
Verification only         : false
Memtablerep               : skip_list
Test kill odd             : 0
Periodic Compaction Secs  : 10
Daily Offpeak UTC         : 04:00-08:00  <<<<<<<<<<<<<<< Newly added
Compaction TTL            : 0
Compaction Pri            : kMinOverlappingRatio
Background Purge          : 0
Write DB ID to manifest   : 0
Max Write Batch Group Size: 16
Use dynamic level         : 1
Read fault one in         : 0
Write fault one in        : 1000
Open metadata write fault one in:
                            8
Sync fault injection      : 0
Best efforts recovery     : 0
Fail if OPTIONS file error: 0
User timestamp size bytes : 0
Persist user defined timestamps : 1
WAL compression           : zstd
Try verify sst unique id  : 1
------------------------------------------------
```

Reviewed By: hx235

Differential Revision: D56203102

Pulled By: jaykorean

fbshipit-source-id: 11a9be7362b3b26940d74d41c8bf4ebac3f03a2d
2024-04-16 12:44:44 -07:00
Hui Xiao d41e568b1c Add inplace_update_support to crash/stress test (#12535)
Summary:
**Context/Summary:**
`inplace_update_support=true` is not tested in crash/stress test. Since it's not compatible with snapshots like compaction_filter, we need to sanitize its value in presence of snapshots-related options. A minor refactoring is added to centralize such sanitization in db_crashtest.py - see `check_multiget_consistency` and `check_multiget_entity_consistency`

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12535

Test Plan: CI

Reviewed By: ajkr

Differential Revision: D56102978

Pulled By: hx235

fbshipit-source-id: 2e2ab6685a65123b14a321b99f45f60bc6509c6b
2024-04-15 16:11:58 -07:00
Hui Xiao ef26d68e8d Renable kAdmPolicyThreeQueue in crash test (#12524)
Summary:
Context/Summary:

We need a `nvm_sec_cache` when `kAdmPolicyThreeQueue` is used otherwise a nullptr cache will be accessed causing us segfault in https://github.com/facebook/rocksdb/pull/12521

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12524

Test Plan: - Re-enabled `kAdmPolicyThreeQueue` and rehearsed stress test that failed before this fix and pass after

Reviewed By: jowlyzhang

Differential Revision: D55997093

Pulled By: hx235

fbshipit-source-id: e1c6f1015091b4cff0ce6a3fff981d5dece52a62
2024-04-11 14:53:11 -07:00
Hui Xiao 447b7aa7ed Temporarily disable `kAdmPolicyThreeQueue` in crash test (#12521)
Summary:
**Context/Summary**

This policy leads to segfault in `CompressedCacheSetCapacityThread` with some build/compilation. Before figuring out the why, disable it for now.

**Test**
Rehearse stress test that failed before the fix but passes after

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12521

Reviewed By: jowlyzhang

Differential Revision: D55942399

Pulled By: hx235

fbshipit-source-id: 85f28e50d596dcfd4a316481570b78fdce58ed0b
2024-04-09 16:15:54 -07:00
Hui Xiao ad423abbd1 Add more missing options in crash test (#12508)
Summary:
**Context/Summary:**
This is to improve our crash test coverage.

Bonus change:
- Added the missing Options string mapping for `CacheTier::kVolatileCompressedTier`
- Deprecated crash test options `enable_tiered_storage` mainly for setting `last_level_temperature` which is now covered in crash test by itself
- Intensified `verify_checksum_one_in\verify_file_checksums_one_in` as I found out these together with new coverage surface more issues

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12508

Test Plan: CI to look out for trivial failures

Reviewed By: jowlyzhang

Differential Revision: D55768594

Pulled By: hx235

fbshipit-source-id: 9b829da0309a7db3fcdb17992a524dd64498325c
2024-04-08 09:48:03 -07:00
Changyu Bi 7c28dc8beb Enable parallel compression in crash test (#12506)
Summary:
Since some internal user might be interested in using this feature.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12506

Test Plan:
The option was disabled in stress test due to causing failures.
I've ran a round of crash tests internally and there was no failure due to parallel compression. Will monitor if more runs cause failures. So we will know at least how it's broken and decide to fix them or reverse the change.

Reviewed By: jowlyzhang

Differential Revision: D55747552

Pulled By: cbi42

fbshipit-source-id: ae5cda78c338b8b58f651c557d9b70790362444d
2024-04-05 10:29:08 -07:00
Hui Xiao 8e6e8957fb Disable `wal_bytes_per_sync` at one more place (#12492)
Summary:
Summary/Context: supplement to https://github.com/facebook/rocksdb/pull/12489

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12492

Test Plan: CI

Reviewed By: jaykorean

Differential Revision: D55612747

Pulled By: hx235

fbshipit-source-id: 5c8fbda3e6c8482f2a3363a98a545f1c11e4ea27
2024-04-02 09:44:37 -07:00
Hui Xiao 21d11de761 Temporarily disable `wal_bytes_per_sync` in crash test (#12489)
Summary:
**Context/Summary:**

`wal_bytes_per_sync > 0` can sync newer WAL but not an older WAL by its nature. This creates a hole in synced WAL data. By our crash test, we recently discovered that our DB can recover past that hole. This resulted in crash-recovery-verification error. Before we fix that recovery behavior, we will temporarily disable `wal_bytes_per_sync` in crash test

Bonus: updated the API to make the nature of this option more explicitly documented

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12489

Test Plan: More stabilized crash test

Reviewed By: ajkr

Differential Revision: D55531589

Pulled By: hx235

fbshipit-source-id: 6dea6486420dc0f50550d488c15652f93972a0ea
2024-03-29 13:01:15 -07:00
Andrew Kryczka 3d4e78937a Initialize `FaultInjectionTestFS::checksum_handoff_func_type_` to `kCRC32c` (#12485)
Summary:
Previously it was uninitialized. Setting `checksum_handoff_file_types` will cause `kCRC32c` checksums to be passed down in the `DataVerificationInfo`, so it makes sense for `kCRC32c` to be the default.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12485

Test Plan:
ran `db_stress` in a way that failed before. Building with ASAN was needed to ensure the uninitialized bytes are nonzero according to `malloc_fill_byte` (default 0xbe)

```
$ COMPILE_WITH_ASAN=1 make -j28 db_stress
...
$ ./db_stress -sync_fault_injection=1 -enable_checksum_handoff=true
```

Reviewed By: jaykorean

Differential Revision: D55450587

Pulled By: ajkr

fbshipit-source-id: 53dc829b86e49b3fa80570032e83af0bb12adaad
2024-03-27 18:37:58 -07:00
akankshamahajan 1856734821 Branch cut 9.1.fb (#12476)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/12476

Reviewed By: jowlyzhang

Differential Revision: D55319508

Pulled By: akankshamahajan15

fbshipit-source-id: 2b6db671e027511282775c0fea155335d8e73cc2
2024-03-25 15:07:43 -07:00
Peter Dillinger b515a5db3f Replace ScopedArenaIterator with ScopedArenaPtr<InternalIterator> (#12470)
Summary:
ScopedArenaIterator is not an iterator. It is a pointer wrapper. And we don't need a custom implemented pointer wrapper when std::unique_ptr can be instantiated with what we want.

So this adds ScopedArenaPtr<T> to replace those uses.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12470

Test Plan: CI (including ASAN/UBSAN)

Reviewed By: jowlyzhang

Differential Revision: D55254362

Pulled By: pdillinger

fbshipit-source-id: cc96a0b9840df99aa807f417725e120802c0ae18
2024-03-22 13:40:42 -07:00
anand76 63a105a481 Enable recycle_log_file_num option for point in time recovery (#12403)
Summary:
This option was previously disabled due to a bug in the recovery logic. The recovery code in `DBImpl::RecoverLogFiles` couldn't tell if an EoF reported by the log reader was really an EoF or a possible corruption that made a record look like an old log record. To fix this, the log reader now explicitly reports when it encounters what looks like an old record. The recovery code treats it as a possible corruption, and uses the next sequence number in the WAL to determine if it should continue replaying the WAL.

This PR also fixes a couple of bugs that log file recycling exposed in the backup and checkpoint path.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12403

Test Plan:
1. Add new unit tests to verify behavior upon corruption
2. Re-enable disabled tests for verifying recycling behavior

Reviewed By: ajkr

Differential Revision: D54544824

Pulled By: anand1976

fbshipit-source-id: 12f5ce39bd6bc0d63b0bc6432dc4db510e0e802a
2024-03-21 12:29:35 -07:00
Richard Barnes fc40165614 Remove extra semi colon from internal_repo_rocksdb/repo/tools/ldb_cmd.cc
Summary:
`-Wextra-semi` or `-Wextra-semi-stmt`

If the code compiles, this is safe to land.

Reviewed By: palmje

Differential Revision: D54362227

fbshipit-source-id: ac634ba34f9351ba559c4ed96448f51d6ef33175
2024-03-18 18:51:50 -07:00
Changyu Bi 3d5be596a5 Fix a bug in iterator with UDT + `ReadOptions::pin_data` (#12451)
Summary:
with https://github.com/facebook/rocksdb/issues/12414 enabling `ReadOptions::pin_data`, this bug surfaced as corrupted per key-value checksum during crash test. `saved_key_.GetUserKey()` could be pinned user key, so DBIter should not overwrite it.

In one case, it only surfaces when iterator skips many keys of the same user key. To stress that code path, this PR also added `max_sequential_skip_in_iterations` to crash test.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12451

Test Plan:
- Set ReadOptions::pin_data to true, the bug can be reproed quickly with `./db_stress --persist_user_defined_timestamps=1 --user_timestamp_size=8 --writepercent=35 --delpercent=4 --delrangepercent=1 --iterpercent=20 --nooverwritepercent=1 --prefix_size=8 --prefixpercent=10 --readpercent=30 --memtable_protection_bytes_per_key=8 --block_protection_bytes_per_key=2 --clear_column_family_one_in=0`.
    - Set max_sequential_skip_in_iterations to 1 for the other occurrence of the bug.

Reviewed By: jowlyzhang

Differential Revision: D55003766

Pulled By: cbi42

fbshipit-source-id: 23e1049129456684dafb028b6132b70e0afc07fb
2024-03-18 09:05:11 -07:00
Changyu Bi ba022dd44c Disable `enable_checksum_handoff` in crash test (#12431)
Summary:
since it been causing a few crash tests failures, I suspect it'll be easy to repro locally. Also fixed how to print its corruption message so it does not crash with output cannot be utf-8 decoded.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12431

Reviewed By: hx235

Differential Revision: D54881023

Pulled By: cbi42

fbshipit-source-id: 47208a637cd69b30d2545154849405e37db62ed3
2024-03-13 18:03:55 -07:00
Hui Xiao 30243c6573 Add missing db crash options (#12414)
Summary:
**Context/Summary:**
We are doing a sweep in all public options, including but not limited to the `Options`, `Read/WriteOptions`, `IngestExternalFileOptions`, cache options.., to find and add the uncovered ones into db crash. The options included in this PR require minimum changes to db crash other than adding the options themselves.

A bonus change: to surface new issues by improved coverage in stderror, we decided to fail/terminate crash test for manual compactions (CompactFiles, CompactRange()) on meaningful errors. See https://github.com/facebook/rocksdb/pull/12414/files#diff-5c4ced6afb6a90e27fec18ab03b2cd89e8f99db87791b4ecc6fa2694284d50c0R2528-R2532, https://github.com/facebook/rocksdb/pull/12414/files#diff-5c4ced6afb6a90e27fec18ab03b2cd89e8f99db87791b4ecc6fa2694284d50c0R2330-R2336 for more.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12414

Test Plan:
- Run `python3 ./tools/db_crashtest.py --simple blackbox` for 10 minutes to ensure no trivial failure
- Run `python3 tools/db_crashtest.py --simple blackbox --compact_files_one_in=1 --compact_range_one_in=1 --read_fault_one_in=1 --write_fault_one_in=1 --interval=50` for a while to ensure the bonus change does not result in trivial crash/termination of stress test

Reviewed By: ajkr, jowlyzhang, cbi42

Differential Revision: D54691774

Pulled By: hx235

fbshipit-source-id: 50443dfb6aaabd8e24c79a2e42b68c6de877be88
2024-03-12 17:24:12 -07:00