Commit graph

11464 commits

Author SHA1 Message Date
Peter Dillinger 3770d6b74b Fix possible NaN StandardDeviation in Histogram (#10586)
Summary:
Appears possible after 5de98f2 introduced possible lost
updates. Could be related to 2af132c also. Simply ensure no sqrt of
negative.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10586

Test Plan: test added

Reviewed By: ajkr

Differential Revision: D39068391

Pulled By: pdillinger

fbshipit-source-id: 230b214a41e6c9ae91a1ef3e8b2a17b46bbb17c2
2022-09-01 17:46:30 -07:00
Peter Dillinger 9d5b3dabcf Increase CircleCI no_output_timeout for macos-java builds (#10627)
Summary:
... because we are frequently seeing the 10m "no output"
timeouts on these

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10627

Test Plan: CI

Reviewed By: hx235

Differential Revision: D39224922

Pulled By: pdillinger

fbshipit-source-id: f54c7adb5de87b2f57ccbc7f4e6c541b9cd37e08
2022-09-01 17:32:25 -07:00
Levi Tamasi b07217da04 Pin the newly cached blob after insert (#10625)
Summary:
With the current code, when a blob isn't found in the cache and gets read
from the blob file and then inserted into the cache, the application gets
passed the self-contained `PinnableSlice` resulting from the blob file read.
The patch changes this so that the `PinnableSlice` pins the cache entry
instead in this case.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10625

Test Plan: `make check`

Reviewed By: pdillinger

Differential Revision: D39220904

Pulled By: ltamasi

fbshipit-source-id: cb9c62881e3523b1e9f614e00bf503bac2fe3b0a
2022-09-01 16:25:46 -07:00
Akanksha Mahajan 4cd16d65ae Add new option num_file_reads_for_auto_readahead in BlockBasedTableOptions (#10556)
Summary:
RocksDB does auto-readahead for iterators on noticing more
than two reads for a table file if user doesn't provide readahead_size and reads are sequential.
A new option num_file_reads_for_auto_readahead is added which can be
configured and indicates after how many sequential reads prefetching should
be start.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10556

Test Plan: Existing and new unit test

Reviewed By: anand1976

Differential Revision: D38947147

Pulled By: akankshamahajan15

fbshipit-source-id: c9eeab495f84a8df7f701c42f04894e46440ad97
2022-09-01 11:56:00 -07:00
anand76 5fbcc8c54d Update MULTIGET_IO_BATCH_SIZE for non-async MultiGet (#10622)
Summary:
This stat was only getting updated in the async (coroutine) version of MultiGet.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10622

Reviewed By: akankshamahajan15

Differential Revision: D39188790

Pulled By: anand1976

fbshipit-source-id: 7e231507f65fc94c8a006c38f79dfba182a2c24a
2022-08-31 21:03:52 -07:00
Changyu Bi 3a75219e5d Validate option memtable_protection_bytes_per_key (#10621)
Summary:
sanity check value for option `memtable_protection_bytes_per_key` in `ColumnFamilyData::ValidateOptions()`.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10621

Test Plan: `make check`, added unit test in ColumnFamilyTest.

Reviewed By: ajkr

Differential Revision: D39180133

Pulled By: cbi42

fbshipit-source-id: 009e0da3ccb332d1c9e14d20193304610bd4eb8a
2022-08-31 17:47:07 -07:00
Andrew Kryczka ccf822492f Reenable sync_fault_injection in crash test (#10172)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/10172

Reviewed By: riversand963

Differential Revision: D37164671

Pulled By: ajkr

fbshipit-source-id: 40eb919b8dc261d502510e878ee8ac7874ab35d0
2022-08-31 14:27:23 -07:00
Hui Xiao e7525a1fff Disable use_txn=1 with sync_fault_injection=1 in db_crashtest.py (#10605)
Summary:
**Context/Summary:**
`ExpectedState` is not aware of transaction-related concept so `use_txn=1 ` is not compatible with `sync_fault_injection=1`. Therefore this PR disabled this combination until we expand our correctness testing to transaction related features.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10605

Test Plan:
- Run the following commands to verify `--use_txn` is correctly sanitized
   - `python3 ./tools/db_crashtest.py blackbox --use_txn=1 --sync_fault_injection=1 `
   - `python3 ./tools/db_crashtest.py blackbox --use_txn=0 --sync_fault_injection=1 `

Reviewed By: ajkr

Differential Revision: D39121287

Pulled By: hx235

fbshipit-source-id: 7d5d6dd32479ea1c07df4f38322650f3a60def9c
2022-08-31 13:16:39 -07:00
sdong 9509003503 Option migration tool to break down files for FIFO compaction (#10600)
Summary:
Right now, when the option migration tool migrates to FIFO compaction, it compacts all the data into one single SST file and move to L0. Although it creates a valid LSM-tree for FIFO, for any data to be deleted for FIFO, the giant file will be deleted, which might make the DB almost empty. There is not good solution for it, because usually we don't have enough information to reconstruct the FIFO LSM-tree. This change changes to a solution that compromises the FIFO condition. We hope the solution is more useable.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10600

Test Plan: Add unit tests for that.

Reviewed By: jay-zhuang

Differential Revision: D39106424

fbshipit-source-id: bdfd852c3b343373765b8d9716fefc08fd27145c
2022-08-31 12:08:23 -07:00
Levi Tamasi 228f2c5bf5 Adjust the blob cache printout in db_bench/db_stress (#10614)
Summary:
Currently, `db_bench` and `db_stress` print the blob cache options even if
a shared block/blob cache is configured, i.e. when they are not actually
in effect. The patch changes this so they are only printed when a separate blob
cache is used.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10614

Test Plan: Tested manually using `db_bench` and `db_stress`.

Reviewed By: akankshamahajan15

Differential Revision: D39144603

Pulled By: ltamasi

fbshipit-source-id: f714304c5d46186f8514746c27ee6f52aa3e4af8
2022-08-31 09:55:50 -07:00
Levi Tamasi 01e88dfeb4 Support using cache warming with the secondary blob cache (#10603)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/10603

Test Plan: `make check`

Reviewed By: riversand963

Differential Revision: D39117952

Pulled By: ltamasi

fbshipit-source-id: 5e956fa2fc18974876a5c87686acb50718e0edb7
2022-08-30 17:03:45 -07:00
Hui Xiao 8a85946f58 Add missing mutex when reading from shared variable bg_bottom_compaction_scheduled_, bg_compaction_scheduled_ (#10610)
Summary:
**Context/Summary:**
According to https://github.com/facebook/rocksdb/blob/7.6.fb/db/compaction/compaction_job.h#L328-L332, any reading in the form of `*bg_compaction_scheduled_` , `*bg_bottom_compaction_scheduled_` should be protected by mutex, which isn't the case for some assert statement. This leads to a data race that can be repro-ed by the following command (command coming soon)

```
db=/dev/shm/rocksdb_crashtest_blackbox
exp=/dev/shm/rocksdb_crashtest_expected
rm -rf $db $exp
mkdir -p $exp

./db_stress --clear_column_family_one_in=0 --column_families=1 --db=$db --delpercent=10 --delrangepercent=0 --destroy_db_initially=1 --expected_values_dir=$exp --iterpercent=0 --key_len_percent_dist=1,30,69 --max_key=1000000 --max_key_len=3 --prefixpercent=0 --readpercent=0 --reopen=0 --ops_per_thread=100000000 --value_size_mult=32 --writepercent=90  --compaction_pri=4 --use_txn=1 --level_compaction_dynamic_level_bytes=True  --compaction_ttl=0  --compact_files_one_in=1000000 --compact_range_one_in=1000000 --value_size_mult=32 --verify_db_one_in=1000  --write_buffer_size=65536 --mark_for_compaction_one_file_in=10 --max_background_compactions=20 --max_key=25000000 --max_key_len=3 --max_write_buffer_number=3 --max_write_buffer_size_to_maintain=2097152 --target_file_size_base=2097152 --target_file_size_multiplier=2
```
```
WARNING: ThreadSanitizer: data race (pid=73424)
  Read of size 4 at 0x7b8c0000151c by thread T13:
    #0 ReleaseSubcompactionResources internal_repo_rocksdb/repo/db/compaction/compaction_job.cc:390 (db_stress+0x630aa3)
    https://github.com/facebook/rocksdb/issues/1 rocksdb::CompactionJob::Run() internal_repo_rocksdb/repo/db/compaction/compaction_job.cc:741 (db_stress+0x630aa3)
    https://github.com/facebook/rocksdb/issues/2 rocksdb::DBImpl::BackgroundCompaction(bool*, rocksdb::JobContext*, rocksdb::LogBuffer*, rocksdb::DBImpl::PrepickedCompaction*, rocksdb::Env::Priority) internal_repo_rocksdb/repo/db/db_impl/db_impl_compaction_flush.cc:3436 (db_stress+0x60b2cc)
    https://github.com/facebook/rocksdb/issues/3 rocksdb::DBImpl::BackgroundCallCompaction(rocksdb::DBImpl::PrepickedCompaction*, rocksdb::Env::Priority) internal_repo_rocksdb/repo/db/db_impl/db_impl_compaction_flush.cc:2950 (db_stress+0x606d79)
    https://github.com/facebook/rocksdb/issues/4 rocksdb::DBImpl::BGWorkCompaction(void*) internal_repo_rocksdb/repo/db/db_impl/db_impl_compaction_flush.cc:2693 (db_stress+0x60356a)

  Previous write of size 4 at 0x7b8c0000151c by thread T12 (mutexes: write M438955329917552448):
    #0 rocksdb::DBImpl::BackgroundCallCompaction(rocksdb::DBImpl::PrepickedCompaction*, rocksdb::Env::Priority) internal_repo_rocksdb/repo/db/db_impl/db_impl_compaction_flush.cc:3018 (db_stress+0x6072a1)
    https://github.com/facebook/rocksdb/issues/1 rocksdb::DBImpl::BGWorkCompaction(void*) internal_repo_rocksdb/repo/db/db_impl/db_impl_compaction_flush.cc:2693 (db_stress+0x60356a)

Location is heap block of size 6720 at 0x7b8c00000000 allocated by main thread:
    #0 operator new(unsigned long, std::align_val_t) <null> (db_stress+0xbab5bb)
    https://github.com/facebook/rocksdb/issues/1 rocksdb::DBImpl::Open(rocksdb::DBOptions const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, std::vector<rocksdb::ColumnFamilyHandle*, std::allocator<rocksdb::ColumnFamilyHandle*> >*, rocksdb::DB**, bool, bool) internal_repo_rocksdb/repo/db/db_impl/db_impl_open.cc:1811 (db_stress+0x69769a)
    https://github.com/facebook/rocksdb/issues/2 rocksdb::TransactionDB::Open(rocksdb::DBOptions const&, rocksdb::TransactionDBOptions const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, std::vector<rocksdb::ColumnFamilyHandle*, std::allocator<rocksdb::ColumnFamilyHandle*> >*, rocksdb::TransactionDB**) internal_repo_rocksdb/repo/utilities/transactions/pessimistic_transaction_db.cc:258 (db_stress+0x8ae1f4)
    https://github.com/facebook/rocksdb/issues/3 rocksdb::StressTest::Open(rocksdb::SharedState*) internal_repo_rocksdb/repo/db_stress_tool/db_stress_test_base.cc:2611 (db_stress+0x32b927)
    https://github.com/facebook/rocksdb/issues/4 rocksdb::StressTest::InitDb(rocksdb::SharedState*) internal_repo_rocksdb/repo/db_stress_tool/db_stress_test_base.cc:290 (db_stress+0x34712c)
```
This PR added all the missing mutex that should've been in place

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10610

Test Plan:
- Past repro command
- Existing CI

Reviewed By: riversand963

Differential Revision: D39143016

Pulled By: hx235

fbshipit-source-id: 51dd4db55ad306f3dbda5d0dd54d6f2513cf70f2
2022-08-30 16:24:01 -07:00
gitbw95 6cd8133035 Fix an import issue in fbcode. (#10604)
Summary:
This should fix an import issue detected in meta internal tests.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10604

Test Plan: Unit Tests.

Reviewed By: hx235

Differential Revision: D39120414

Pulled By: gitbw95

fbshipit-source-id: dbd016d7f47b9f54aab5ea61e8d3cd79734f46af
2022-08-29 21:09:36 -07:00
Yanqin Jin 7c0838e65e Use std::make_unique when possible (#10578)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/10578

Test Plan: make check

Reviewed By: ajkr

Differential Revision: D39064748

Pulled By: riversand963

fbshipit-source-id: c7c135b7b713608edb14614846050ece6d4cc59d
2022-08-29 19:09:29 -07:00
Hui Xiao e484b81eee Sync dir containing CURRENT after RenameFile on CURRENT as much as possible (#10573)
Summary:
**Context:**
Below crash test revealed a bug that directory containing CURRENT file (short for `dir_contains_current_file` below) was not always get synced after a new CURRENT is created and being called with `RenameFile` as part of the creation.

This bug exposes a risk that such un-synced directory containing the updated CURRENT can’t survive a host crash (e.g, power loss) hence get corrupted. This then will be followed by a recovery from a corrupted CURRENT that we don't want.

The root-cause is that a nullptr `FSDirectory* dir_contains_current_file` sometimes gets passed-down to `SetCurrentFile()` hence in those case `dir_contains_current_file->FSDirectory::FsyncWithDirOptions()` will be skipped  (which otherwise will internally call`Env/FS::SyncDic()` )
```
./db_stress --acquire_snapshot_one_in=10000 --adaptive_readahead=1 --allow_data_in_errors=True --avoid_unnecessary_blocking_io=0 --backup_max_size=104857600 --backup_one_in=100000 --batch_protection_bytes_per_key=8 --block_size=16384 --bloom_bits=134.8015470676662 --bottommost_compression_type=disable --cache_size=8388608 --checkpoint_one_in=1000000 --checksum_type=kCRC32c --clear_column_family_one_in=0 --compact_files_one_in=1000000 --compact_range_one_in=1000000 --compaction_pri=2 --compaction_ttl=100 --compression_max_dict_buffer_bytes=511 --compression_max_dict_bytes=16384 --compression_type=zstd --compression_use_zstd_dict_trainer=1 --compression_zstd_max_train_bytes=65536 --continuous_verification_interval=0 --data_block_index_type=0 --db=$db --db_write_buffer_size=1048576 --delpercent=5 --delrangepercent=0 --destroy_db_initially=0 --disable_wal=0 --enable_compaction_filter=0 --enable_pipelined_write=1 --expected_values_dir=$exp --fail_if_options_file_error=1 --file_checksum_impl=none --flush_one_in=1000000 --get_current_wal_file_one_in=0 --get_live_files_one_in=1000000 --get_property_one_in=1000000 --get_sorted_wal_files_one_in=0 --index_block_restart_interval=4 --ingest_external_file_one_in=0 --iterpercent=10 --key_len_percent_dist=1,30,69 --level_compaction_dynamic_level_bytes=True --mark_for_compaction_one_file_in=10 --max_background_compactions=20 --max_bytes_for_level_base=10485760 --max_key=10000 --max_key_len=3 --max_manifest_file_size=16384 --max_write_batch_group_size_bytes=64 --max_write_buffer_number=3 --max_write_buffer_size_to_maintain=0 --memtable_prefix_bloom_size_ratio=0.001 --memtable_protection_bytes_per_key=1 --memtable_whole_key_filtering=1 --mmap_read=1 --nooverwritepercent=1 --open_metadata_write_fault_one_in=0 --open_read_fault_one_in=0 --open_write_fault_one_in=0 --ops_per_thread=100000000 --optimize_filters_for_memory=1 --paranoid_file_checks=1 --partition_pinning=2 --pause_background_one_in=1000000 --periodic_compaction_seconds=0 --prefix_size=5 --prefixpercent=5 --prepopulate_block_cache=1 --progress_reports=0 --read_fault_one_in=1000 --readpercent=45 --recycle_log_file_num=0 --reopen=0 --ribbon_starting_level=999 --secondary_cache_fault_one_in=32 --secondary_cache_uri=compressed_secondary_cache://capacity=8388608 --set_options_one_in=10000 --snapshot_hold_ops=100000 --sst_file_manager_bytes_per_sec=0 --sst_file_manager_bytes_per_truncate=0 --subcompactions=3 --sync_fault_injection=1 --target_file_size_base=2097 --target_file_size_multiplier=2 --test_batches_snapshots=1 --top_level_index_pinning=1 --use_full_merge_v1=1 --use_merge=1 --value_size_mult=32 --verify_checksum=1 --verify_checksum_one_in=1000000 --verify_db_one_in=100000 --verify_sst_unique_id_in_manifest=1 --wal_bytes_per_sync=524288 --write_buffer_size=4194 --writepercent=35
```

```
stderr:
WARNING: prefix_size is non-zero but memtablerep != prefix_hash
db_stress: utilities/fault_injection_fs.cc:748: virtual rocksdb::IOStatus rocksdb::FaultInjectionTestFS::RenameFile(const std::string &, const std::string &, const rocksdb::IOOptions &, rocksdb::IODebugContext *): Assertion `tlist.find(tdn.second) == tlist.end()' failed.`
```

**Summary:**
The PR ensured the non-test path pass down a non-null dir containing CURRENT (which is by current RocksDB assumption just db_dir) by doing the following:
- Renamed `directory_to_fsync` as `dir_contains_current_file` in `SetCurrentFile()` to tighten the association between this directory and CURRENT file
- Changed `SetCurrentFile()` API to require `dir_contains_current_file` being passed-in, instead of making it by default nullptr.
    -  Because `SetCurrentFile()`'s `dir_contains_current_file` is passed down from `VersionSet::LogAndApply()` then `VersionSet::ProcessManifestWrites()` (i.e, think about this as a chain of 3 functions related to MANIFEST update), these 2 functions also got refactored to require `dir_contains_current_file`
- Updated the non-test-path callers of these 3 functions to obtain and pass in non-nullptr `dir_contains_current_file`, which by current assumption of RocksDB, is the `FSDirectory* db_dir`.
    - `db_impl` path will obtain `DBImpl::directories_.getDbDir()` while others with no access to such `directories_` are obtained on the fly by creating such object `FileSystem::NewDirectory(..)` and manage it by unique pointers to ensure short life time.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10573

Test Plan:
- `make check`
- Passed the repro db_stress command
- For future improvement, since we currently don't assert dir containing CURRENT to be non-nullptr due to https://github.com/facebook/rocksdb/pull/10573#pullrequestreview-1087698899, there is still chances that future developers mistakenly pass down nullptr dir containing CURRENT thus resulting skipped sync dir and cause the bug again. Therefore a smarter test (e.g, such as quoted from ajkr  "(make) unsynced data loss to be dropping files corresponding to unsynced directory entries") is still needed.

Reviewed By: ajkr

Differential Revision: D39005886

Pulled By: hx235

fbshipit-source-id: 336fb9090d0cfa6ca3dd580db86268007dde7f5a
2022-08-29 17:35:21 -07:00
Levi Tamasi 7818560194 Add a dedicated cache entry role for blobs (#10601)
Summary:
The patch adds a dedicated cache entry role for blob values and switches
to a registered deleter so that blobs show up as a separate bucket
(as opposed to "Misc") in the cache occupancy statistics, e.g.

```
Block cache entry stats(count,size,portion): DataBlock(133515,531.73 MB,13.6866%) BlobValue(1824855,3.10 GB,81.7071%) Misc(1,0.00 KB,0%)
```

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10601

Test Plan: Ran `make check` and tested the cache occupancy statistics using `db_bench`.

Reviewed By: riversand963

Differential Revision: D39107915

Pulled By: ltamasi

fbshipit-source-id: 8446c3b190a41a144030df73f318eeda4398c125
2022-08-29 16:11:59 -07:00
anand76 72a3fb3424 Update statistics for async scan readaheads (#10585)
Summary:
Imported a fix to "rocksdb.prefetched.bytes.discarded" stat from https://github.com/facebook/rocksdb/issues/10561, and added a new stat "rocksdb.async.prefetch.abort.micros" to measure time spent waiting for async reads to abort.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10585

Reviewed By: akankshamahajan15

Differential Revision: D39067000

Pulled By: anand1976

fbshipit-source-id: d7cda71abb48017239bd5fd832345a16c7024faf
2022-08-29 14:37:44 -07:00
Yanqin Jin 3613d862ba print value when verification fails (#10587)
Summary:
When verification fails for db_stress, print more information about
value read from the db and expected state.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10587

Test Plan:
make check
./db_stress

Reviewed By: akankshamahajan15, hx235

Differential Revision: D39078511

Pulled By: riversand963

fbshipit-source-id: 77ac8ffae01fc3a9b58a02c2e7bbe141e1a18f0b
2022-08-29 14:13:06 -07:00
Peter Dillinger c5afbbfe4b Don't wait for indirect flush in read-only DB (#10569)
Summary:
Some APIs for getting live files, which are used by Checkpoint
and BackupEngine, can optionally trigger and wait for a flush. These
would deadlock when used on a read-only DB. Here we fix that by assuming
the user wants the overall operation to succeed and is OK without
flushing (because the DB is read-only).

Follow-up work: the same or other issues can be hit by directly invoking
some DB functions that are clearly not appropriate for read-only
instance, but are not covered by overrides in DBImplReadOnly and
CompactedDBImpl. These should be fixed to avoid similar problems on
accidental misuse. (Long term, it would be nice to have a DBReadOnly
class without those members, like BackupEngineReadOnly.)

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10569

Test Plan: tests updated to catch regression (hang before the fix)

Reviewed By: riversand963

Differential Revision: D38995759

Pulled By: pdillinger

fbshipit-source-id: f5f8bc7123e13cb45bd393dd974d7d6eda20bc68
2022-08-29 13:36:23 -07:00
Changyu Bi 5532b462c4 Verify Iterator/Get() against expected state in only no_batched_ops_test (#10590)
Summary:
https://github.com/facebook/rocksdb/issues/10538 added `TestIterateAgainstExpected()` in `no_batched_ops_test` to verify iterator correctness against the in memory expected state. It is not compatible when run after some other stress tests, e.g. `TestPut()` in `batched_op_stress`, that either do not set expected state when writing to DB or use keys that cannot be parsed by `GetIntVal()`. The assert [here](d17be55aab/db_stress_tool/db_stress_common.h (L520)) could fail. This PR fixed this issue by setting iterator upperbound to `max_key` when `destroy_db_initially=0` to avoid the key space that `batched_op_stress` touches.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10590

Test Plan:
```
# set up DB with batched_op_stress
./db_stress --test_batches_snapshots=1 --verify_iterator_with_expected_state_one_in=1 --max_key_len=3 --max_key=100000000 --skip_verifydb=1 --continuous_verification_interval=0 --writepercent=85 --delpercent=3 --delrangepercent=0 --iterpercent=10 --nooverwritepercent=1 --prefixpercent=0 --readpercent=2 --key_len_percent_dist=1,30,69

# Before this PR, the following test will fail the asserts with error msg like the following
# Assertion failed: (size_key <= key_gen_ctx.weights.size() * sizeof(uint64_t)), function GetIntVal, file db_stress_common.h, line 524.
./db_stress --verify_iterator_with_expected_state_one_in=1 --max_key_len=3 --max_key=100000000 --skip_verifydb=1 --continuous_verification_interval=0 --writepercent=0 --delpercent=3 --delrangepercent=0 --iterpercent=95 --nooverwritepercent=1 --prefixpercent=0 --readpercent=2 --key_len_percent_dist=1,30,69 --destroy_db_initially=0
```

Reviewed By: ajkr

Differential Revision: D39085243

Pulled By: cbi42

fbshipit-source-id: a7dfee2320c330773b623b442d730fd014ec7056
2022-08-29 09:51:40 -07:00
Levi Tamasi 64e74723f7 Use the default metadata charge policy when creating an LRU cache via the Java API (#10577)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/10577

Reviewed By: akankshamahajan15

Differential Revision: D39035884

Pulled By: ltamasi

fbshipit-source-id: 48f116f8ca172b7eb5eb3651f39ddb891a7ffade
2022-08-29 09:42:04 -07:00
Andrew Hutchings ce529a4ce1 Fix FreeBSD building (#10575)
Summary:
FreeBSD doesn't have `JEMALLOC_USABLE_SIZE_CONST` so we need to define
it.

This fixes MariaDB MDEV-20248.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10575

Reviewed By: jay-zhuang

Differential Revision: D39057665

Pulled By: ajkr

fbshipit-source-id: 3874779d12a1dd5036324947f6372e6ad57a7b08
2022-08-28 00:05:51 -07:00
zhangenming d17be55aab Make header more natural. (#10580)
Summary:
Fixed #10381 for blog's navigation bar UI.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10580

Reviewed By: hx235

Differential Revision: D39079045

Pulled By: cbi42

fbshipit-source-id: 922cf2624f201c0af42815b23d97361fc0151d93
2022-08-26 20:48:18 -07:00
Levi Tamasi 23376aa576 Improve the accounting of memory used by cached blobs (#10583)
Summary:
The patch improves the bookkeeping around the memory usage of
cached blobs in two ways: 1) it uses `malloc_usable_size`, which accounts
for allocator bin sizes etc., and 2) it also considers the memory usage
of the `BlobContents` object in addition to the blob itself. Note: some unit
tests had been relying on the cache charge being equal to the size of the
cached blob; these were updated.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10583

Test Plan: `make check`

Reviewed By: riversand963

Differential Revision: D39060680

Pulled By: ltamasi

fbshipit-source-id: 3583adce2b4ce6e84861f3fadccbfd2e5a3cc482
2022-08-26 15:53:08 -07:00
bilyz 7670fdd690 fix trace_analyzer_tool args column position (#10576)
Summary:
The column  meaning explanation is not correct according to the parsed human-readable trace file.

Following are the results data from parsed trace human-readable file format.
The key is in the first column.

```
0x00000005 6 1 0 1661317998095439
0x00000007 0 1 0 1661317998095479
0x00000008 6 1 0 1661317998095493
0x0000000300000001 1 1 6 1661317998101508
0x0000000300000000 1 1 6 1661317998101508
0x0000000300000001 0 1 0 1661317998106486
0x0000000300000000 0 1 0 1661317998106498
0x0000000A 6 1 0 1661317998106515
0x00000007 0 1 0 1661317998111887
0x00000001 6 1 0 1661317998111923
```

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10576

Reviewed By: ajkr

Differential Revision: D39039110

Pulled By: jay-zhuang

fbshipit-source-id: eade6394c7870005b717846af09a848be6f677ce
2022-08-26 08:44:52 -07:00
Jay Zhuang d9e71fb2c5 Fix periodic_task unable to re-register the same task type (#10379)
Summary:
Timer has a limitation that it cannot re-register a task with the same name,
because the cancel only mark the task as invalid and wait for the Timer thread
to clean it up later, before the task is cleaned up, the same task name cannot
be added. Which makes the task option update likely to fail, which basically
cancel and re-register the same task name. Change the periodic task name to a
random unique id and store it in periodic_task_scheduler.

Also refactor the `periodic_work` to `periodic_task` to make each job function
as a `task`.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10379

Test Plan: unittests

Reviewed By: ajkr

Differential Revision: D38000615

Pulled By: jay-zhuang

fbshipit-source-id: e4135f9422e3b53aaec8eda54f4e18ce633a279e
2022-08-25 18:52:37 -07:00
Levi Tamasi 3f57d84af4 Introduce a dedicated class to represent blob values (#10571)
Summary:
The patch introduces a new class called `BlobContents`, which represents
a single uncompressed blob value. We currently use `std::string` for this
purpose; `BlobContents` is somewhat smaller but the primary reason for a
dedicated class is that it enables certain improvements and optimizations
like eliding a copy when inserting a blob into the cache, using custom
allocators, or more control over and better accounting of the memory usage
of cached blobs (see https://github.com/facebook/rocksdb/issues/10484).
(We plan to implement these in subsequent PRs.)

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10571

Test Plan: `make check`

Reviewed By: riversand963

Differential Revision: D39000965

Pulled By: ltamasi

fbshipit-source-id: f296eddf9dec4fc3e11cad525b462bdf63c78f96
2022-08-25 16:45:48 -07:00
Brendan MacDonell 418b36a9bc Support CompactionPri::kRoundRobin in RocksJava (#10572)
Summary:
Pretty trivial — this PR just adds the new compaction priority to the Java API.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10572

Reviewed By: hx235

Differential Revision: D39006523

Pulled By: ajkr

fbshipit-source-id: ea8d665817e7b05826c397afa41c3abcda81484e
2022-08-25 13:32:03 -07:00
Brendan MacDonell 9f290a5d15 Update the javadoc for setforceConsistencyChecks (#10574)
Summary:
As of v6.14 (released in 2020), force_consistency_checks is enabled by default. However, the Java documentation does not seem to have been updated to reflect the change at the time.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10574

Reviewed By: hx235

Differential Revision: D39006566

Pulled By: ajkr

fbshipit-source-id: c7b029484d62deaa1f260ec55084049fe39eb84a
2022-08-25 13:27:13 -07:00
Andrew Kryczka 7ad4b38617 Ensure writes to WAL tail during FlushWAL(true /* sync */) will be synced (#10560)
Summary:
WAL append and switch can both happen between `FlushWAL(true /* sync */)`'s sync operations and its call to `MarkLogsSynced()`. We permit this since locks need to be released for the sync operations. Such an appended/switched WAL is both inactive and incompletely synced at the time `MarkLogsSynced()` processes it.

Prior to this PR, `MarkLogsSynced()` assumed all inactive WALs were fully synced and removed them from consideration for future syncs. That was wrong in the scenario described above and led to the latest append(s) never being synced. This PR changes `MarkLogsSynced()` to only remove inactive WALs from consideration for which all flushed data has been synced.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10560

Test Plan: repro unit test for the scenario described above. Without this PR, it fails on "key2" not found

Reviewed By: riversand963

Differential Revision: D38957391

Pulled By: ajkr

fbshipit-source-id: da77175eba97ff251a4219b227b3bb2d4843ed26
2022-08-25 12:53:46 -07:00
Alan Paxton 7fbee01f0c CI benchmarks refine configuration (#10514)
Summary:
CI benchmarks refine configuration

Run only “essential” benchmarks, but for longer
Fix (reduce) the NUM_KEYS to ensure cached behaviour
Reduce level size to try to ensure more levels

Refine test durations again, more time per test, but fewer tests.
In CI benchmark mode, the only read test is readrandom.
There are still 3 mostly-read tests.

Goal is to squeeze complete run a little bit inside 1 hour so it doesn’t clash with the next run (cron scheduled for main branch), but it gets to run as long as possible, so that results are as credible as possible.

Reduce thread count to physical capacity, in an attempt to reduce throughput variance for write heavy tests. See Mark Callaghan’s comments in related documentation..

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10514

Reviewed By: ajkr

Differential Revision: D38952469

Pulled By: jay-zhuang

fbshipit-source-id: 72fa6bba897cc47066ced65facd1fd36e28f30a8
2022-08-25 09:47:03 -07:00
Andrew Kryczka d95e376368 Disable db_stress features incompatible with unsynced data dropping when sync_fault_injection=1 (#10559)
Summary:
The features that cannot work with disable_wal=1 due to unsynced data dropping (ingest_external_file_one_in and enable_compaction_filter) similarly cannot work with sync_fault_injection=1. This PR prevents those features from being used together with sync_fault_injection=1.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10559

Reviewed By: hx235

Differential Revision: D38953019

Pulled By: ajkr

fbshipit-source-id: 7e2c7644ec84d7323f632cf976bcee00502d0ed7
2022-08-24 21:50:34 -07:00
Changyu Bi d140fbfd7d Add Iterator test against expected state to stress test (#10538)
Summary:
As mentioned in https://github.com/facebook/rocksdb/pull/5506#issuecomment-506021913,
`db_stress` does not have much verification for iterator correctness.
It has a `TestIterate()` function, but that is mainly for comparing results
between two iterators, one with `total_order_seek` and the other optionally
sets auto_prefix, upper/lower bounds. Commit 49a0581ad2462e31aa3f768afa769e0d33390f33
added a new `TestIterateAgainstExpected()` function that compares iterator against
expected state. It locks a range of keys, creates an iterator, does
a random sequence of `Next/Prev` and compares against expected state.
This PR is based on that commit, the main changes include some logs
(for easier debugging if a test fails), a forward and backward scan to
cover the entire locked key range, and a flag for optionally turning on
this version of Iterator testing.

Added constraint that the checks against expected state in
`TestIterateAgainstExpected()` and in `TestGet()` are only turned on
when `--skip_verifydb` flag is not set.
Remove the change log introduced in https://github.com/facebook/rocksdb/issues/10553.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10538

Test Plan:
Run `db_stress` with `--verify_iterator_with_expected_state_one_in=1`,
and a large `--iterpercent` and `--num_iterations`. Checked `op_logs`
manually to ensure expected coverage. Tweaked part of the code in
https://github.com/facebook/rocksdb/issues/10449 and stress test was able to catch it.
- internally run various flavor of crash test

Reviewed By: ajkr

Differential Revision: D38847269

Pulled By: cbi42

fbshipit-source-id: 8b4402a9bba9f6cfa08051943cd672579d489599
2022-08-24 14:59:50 -07:00
muthukrishnan.s 79ed4be80f Add get_name, get_id for column family handle in C API (#10499)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/10499

Reviewed By: hx235

Differential Revision: D38523859

Pulled By: ajkr

fbshipit-source-id: 268bba1fcce4a3e20c51e498a79d7b476f663aea
2022-08-24 13:49:02 -07:00
Levi Tamasi 78bbdef530 Fix a typo in BlobSecondaryCacheTest (#10566)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/10566

Test Plan: `make check`

Reviewed By: riversand963

Differential Revision: D38989926

Pulled By: ltamasi

fbshipit-source-id: 6402635fe745e4e7eb3083ef9ad9f04c0177d762
2022-08-24 13:08:43 -07:00
sdong 4915f89513 WritableFileWriter to allow operation after failure when SyncWithoutFlush() is involved (#10555)
Summary:
https://github.com/facebook/rocksdb/pull/10489 adds an assertion in most functions in WritableFileWriter to check no previous error. However, it only works without calling SyncWithoutFlush(). The nature of SyncWithoutFlush() makes two concurrent call fails to check status code of each other and causing assertion failure. Fix the problem by skipping the check after SyncWithoutFlush() is called and not check status code in SyncWithoutFlush().

Since the original change was not officially released yet, the fix isn't added to HISTORY.md.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10555

Test Plan: Make sure existing tests still pass

Reviewed By: anand1976

Differential Revision: D38946208

fbshipit-source-id: 63566732d3f25c8a8342840499cf7b7d745f27c2
2022-08-24 07:30:52 -07:00
Changyu Bi 198e5d8ee9 Update TestGet() to verify against expected state (#10553)
Summary:
updated `TestGet()` in `no_batched_op_stress` to check the result of `Get()` operations against expected state (`expected_state_manager_`). More specifically, if `Get()` finds a key, expected state should not have `DELETION_SENTINEL` for the same key, and if `Get()` returns NotFound for a key, expected state should not have the key. One intention for this change it to verify correctness of code path change regarding range tombstones.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10553

Test Plan: run db_stress with nonzero readpercent: `./db_stress_branch --readpercent=57 --prefixpercent=4 --writepercent=25 -delpercent=5 --iterpercent=5 --delrangepercent=4`. When I initially used wrong column family in `thread->shared->Get`, the test reported inconsistencies.

Reviewed By: ajkr

Differential Revision: D38927007

Pulled By: cbi42

fbshipit-source-id: f9f61b312ad0b4c21a799329609ba8526169b048
2022-08-23 17:08:14 -07:00
Mohamed Issa cbe2c6d2d2 Remove unnecessary append to PLUGINS variable in top-level CMakeLists.txt (#10494)
Summary:
The PLUGINS variable already contains a semicolon separated list of plugins to compile, so there is no need to append the space separated list in ROCKSDB_PLUGINS passed in as compile argument. Removing this unnecessary append now allows CMake based compiles for two or more plugins at a time.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10494

Reviewed By: hx235

Differential Revision: D38482094

Pulled By: ajkr

fbshipit-source-id: 61565f7cae2717e70a92132c972b25692ce6f0e8
2022-08-23 16:00:14 -07:00
muthukrishnan.s 616f3bd02e Add grocksdb in Go language bindings (#10498)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/10498

Reviewed By: hx235

Differential Revision: D38523574

Pulled By: ajkr

fbshipit-source-id: 4df46fe3bfe49335a278594dfe6fd887879e71ec
2022-08-23 15:02:10 -07:00
lhsoft 38bf569ee7 Fix build error with NIOSTATS_CONTEXT (#10506)
Summary:
Fix https://github.com/facebook/rocksdb/issues/10475

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10506

Reviewed By: hx235

Differential Revision: D38549337

Pulled By: ajkr

fbshipit-source-id: fba864fba1b584c41419ca6015d5d62051539812
2022-08-23 15:00:14 -07:00
EdvardD 6e93d24935 Expose set_checksum function to C api (#10537)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/10537

Reviewed By: hx235

Differential Revision: D38797662

Pulled By: ajkr

fbshipit-source-id: a8db723c3eb9d5592cd78f8be7e442e4826686ad
2022-08-23 14:59:27 -07:00
Ryan Mack 06f73d2575 Fix autovector::emplace_back return type for C++17 (#10542)
Summary:
C++17 changes emplace_back API to return the new object. Needed to compile rocksdb on recent compilers.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10542

Reviewed By: hx235

Differential Revision: D38896019

Pulled By: ajkr

fbshipit-source-id: cd7ddf34c0dcd449ecedc41e89a37b3a270a5603
2022-08-23 14:58:16 -07:00
Chen Lixiang 9593fd1c82 Fix wrong compression type and options in universal compaction picker (#10515)
Summary:
In UniversalCompactionBuilder::PickCompactionToReduceSortedRuns, we passed start_level to get compression type and options. I think that is wrong and we should use output_level instead.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10515

Reviewed By: hx235

Differential Revision: D38611335

Pulled By: ajkr

fbshipit-source-id: bb860caed4b6c6bbde8f75fc50cf875a9f04723d
2022-08-23 14:58:02 -07:00
Peter Dillinger db7606a41a Fix "Behavior Changes" in 7.6 HISTORY.md (#10557)
Summary:
see diff

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10557

Test Plan: no functional change

Reviewed By: gitbw95

Differential Revision: D38950531

Pulled By: pdillinger

fbshipit-source-id: af72e80a31d7df38f6e633fa7115984c2274ed60
2022-08-23 13:03:41 -07:00
Changyu Bi 7b9e970042 Optionally issue DeleteRange in *whilewriting benchmarks (#10552)
Summary:
Optionally issue DeleteRange in `*whilewriting` benchmarks. This happens in `BGWriter` and uses similar logic as in `DoWrite` to issue DeleteRange operations. I added this when I was benchmarking https://github.com/facebook/rocksdb/issues/10547, but this should be an independent PR.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10552

Test Plan: ran some benchmarks with various delete range options, e.g. `./db_bench --benchmarks=readwhilewriting --writes_per_range_tombstone=100 --writes=200000 --reads=1000000 --disable_auto_compactions --max_num_range_tombstones=10000`

Reviewed By: ajkr

Differential Revision: D38927020

Pulled By: cbi42

fbshipit-source-id: 31ee20cb8127f7173f0816ea0cc2a204ec02aad6
2022-08-23 11:06:09 -07:00
Hui Xiao b16655a547 Add missing synchronization in TestFSWritableFile (#10544)
Summary:
**Context:**
ajkr's command revealed an existing TSAN data race between `TestFSWritableFile::Append` and `TestFSWritableFile::Sync` on `TestFSWritableFile::state_`

```
$ make clean && COMPILE_WITH_TSAN=1 make -j56 db_stress
$ python3 tools/db_crashtest.py blackbox --simple --duration=3600 --interval=10 --sync_fault_injection=1 --disable_wal=0 --max_key=10000 --checkpoint_one_in=1000
```

The race is due to concurrent access from [checkpoint's WAL sync](https://github.com/facebook/rocksdb/blob/7.4.fb/utilities/fault_injection_fs.cc#L324) and [db put's WAL write when ‘sync_fault_injection=1 ‘](https://github.com/facebook/rocksdb/blob/7.4.fb/utilities/fault_injection_fs.cc#L208) to the `state_` on the same WAL `TestFSWritableFile` under the missing synchronization.

```
WARNING: ThreadSanitizer: data race (pid=11275)
Write of size 8 at 0x7b480003d850 by thread T23 (mutexes: write M69230):
#0 rocksdb::TestFSWritableFile::Sync(rocksdb::IOOptions const&, rocksdb::IODebugContext*) internal_repo_rocksdb/repo/utilities/fault_injection_fs.cc:297 (db_stress+0x716004)
https://github.com/facebook/rocksdb/issues/1 rocksdb::(anonymous namespace)::CompositeWritableFileWrapper::Sync() internal_repo_rocksdb/repo/env/composite_env.cc:154 (db_stress+0x4dfa78)
https://github.com/facebook/rocksdb/issues/2 rocksdb::(anonymous namespace)::LegacyWritableFileWrapper::Sync(rocksdb::IOOptions const&, rocksdb::IODebugContext*) internal_repo_rocksdb/repo/env/env.cc:280 (db_stress+0x6dfd24)
https://github.com/facebook/rocksdb/issues/3 rocksdb::WritableFileWriter::SyncInternal(bool) internal_repo_rocksdb/repo/file/writable_file_writer.cc:460 (db_stress+0xa1b98c)
https://github.com/facebook/rocksdb/issues/4 rocksdb::WritableFileWriter::SyncWithoutFlush(bool) internal_repo_rocksdb/repo/file/writable_file_writer.cc:435 (db_stress+0xa1e441)
https://github.com/facebook/rocksdb/issues/5 rocksdb::DBImpl::SyncWAL() internal_repo_rocksdb/repo/db/db_impl/db_impl.cc:1385 (db_stress+0x529458)
https://github.com/facebook/rocksdb/issues/6 rocksdb::DBImpl::FlushWAL(bool) internal_repo_rocksdb/repo/db/db_impl/db_impl.cc:1339 (db_stress+0x54f82a)
https://github.com/facebook/rocksdb/issues/7 rocksdb::DBImpl::GetLiveFilesStorageInfo(rocksdb::LiveFilesStorageInfoOptions const&, std::vector<rocksdb::LiveFileStorageInfo, std::allocator<rocksdb::LiveFileStorageInfo> >*) internal_repo_rocksdb/repo/db/db_filesnapshot.cc:387 (db_stress+0x5c831d)
https://github.com/facebook/rocksdb/issues/8 rocksdb::CheckpointImpl::CreateCustomCheckpoint(std::function<rocksdb::Status (std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rocksdb::FileType)>, std::function<rocksdb::Status (std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned long, rocksdb::FileType, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rocksdb::Temperature)>, std::function<rocksdb::Status (std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rocksdb::FileType)>, unsigned long*, unsigned long, bool) internal_repo_rocksdb/repo/utilities/checkpoint/checkpoint_impl.cc:214 (db_stress+0x4c0343)
https://github.com/facebook/rocksdb/issues/9 rocksdb::CheckpointImpl::CreateCheckpoint(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned long, unsigned long*) internal_repo_rocksdb/repo/utilities/checkpoint/checkpoint_impl.cc:123 (db_stress+0x4c237e)
https://github.com/facebook/rocksdb/issues/10 rocksdb::StressTest::TestCheckpoint(rocksdb::ThreadState*, std::vector<int, std::allocator<int> > const&, std::vector<long, std::allocator<long> > const&) internal_repo_rocksdb/repo/db_stress_tool/db_stress_test_base.cc:1699 (db_stress+0x328340)
https://github.com/facebook/rocksdb/issues/11 rocksdb::StressTest::OperateDb(rocksdb::ThreadState*) internal_repo_rocksdb/repo/db_stress_tool/db_stress_test_base.cc:825 (db_stress+0x33921f)
https://github.com/facebook/rocksdb/issues/12 rocksdb::ThreadBody(void*) internal_repo_rocksdb/repo/db_stress_tool/db_stress_driver.cc:33 (db_stress+0x354857)
https://github.com/facebook/rocksdb/issues/13 rocksdb::(anonymous namespace)::StartThreadWrapper(void*) internal_repo_rocksdb/repo/env/env_posix.cc:447 (db_stress+0x6eb2ad)

Previous read of size 8 at 0x7b480003d850 by thread T64 (mutexes: write M980798978697532600, write M253744503184415024, write M1262):
#0 memcpy <null> (db_stress+0xbc9696)
https://github.com/facebook/rocksdb/issues/1 operator= internal_repo_rocksdb/repo/utilities/fault_injection_fs.h:35 (db_stress+0x70d5f1)
https://github.com/facebook/rocksdb/issues/2 rocksdb::FaultInjectionTestFS::WritableFileAppended(rocksdb::FSFileState const&) internal_repo_rocksdb/repo/utilities/fault_injection_fs.cc:827 (db_stress+0x70d5f1)
https://github.com/facebook/rocksdb/issues/3 rocksdb::TestFSWritableFile::Append(rocksdb::Slice const&, rocksdb::IOOptions const&, rocksdb::IODebugContext*) internal_repo_rocksdb/repo/utilities/fault_injection_fs.cc:173 (db_stress+0x7143af)
https://github.com/facebook/rocksdb/issues/4 rocksdb::(anonymous namespace)::CompositeWritableFileWrapper::Append(rocksdb::Slice const&) internal_repo_rocksdb/repo/env/composite_env.cc:115 (db_stress+0x4de3ab)
https://github.com/facebook/rocksdb/issues/5 rocksdb::(anonymous namespace)::LegacyWritableFileWrapper::Append(rocksdb::Slice const&, rocksdb::IOOptions const&, rocksdb::IODebugContext*) internal_repo_rocksdb/repo/env/env.cc:248 (db_stress+0x6df44b)
https://github.com/facebook/rocksdb/issues/6 rocksdb::WritableFileWriter::WriteBuffered(char const*, unsigned long, rocksdb::Env::IOPriority) internal_repo_rocksdb/repo/file/writable_file_writer.cc:551 (db_stress+0xa1a953)
https://github.com/facebook/rocksdb/issues/7 rocksdb::WritableFileWriter::Flush(rocksdb::Env::IOPriority) internal_repo_rocksdb/repo/file/writable_file_writer.cc:327 (db_stress+0xa16ee8)
https://github.com/facebook/rocksdb/issues/8 rocksdb::log::Writer::AddRecord(rocksdb::Slice const&, rocksdb::Env::IOPriority) internal_repo_rocksdb/repo/db/log_writer.cc:147 (db_stress+0x7f121f)
https://github.com/facebook/rocksdb/issues/9 rocksdb::DBImpl::WriteToWAL(rocksdb::WriteBatch const&, rocksdb::log::Writer*, unsigned long*, unsigned long*, rocksdb::Env::IOPriority, rocksdb::DBImpl::LogFileNumberSize&) internal_repo_rocksdb/repo/db/db_impl/db_impl_write.cc:1285 (db_stress+0x695042)
https://github.com/facebook/rocksdb/issues/10 rocksdb::DBImpl::WriteToWAL(rocksdb::WriteThread::WriteGroup const&, rocksdb::log::Writer*, unsigned long*, bool, bool, unsigned long, rocksdb::DBImpl::LogFileNumberSize&) internal_repo_rocksdb/repo/db/db_impl/db_impl_write.cc:1328 (db_stress+0x6907e8)
https://github.com/facebook/rocksdb/issues/11 rocksdb::DBImpl::PipelinedWriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned long, bool, unsigned long*) internal_repo_rocksdb/repo/db/db_impl/db_impl_write.cc:731 (db_stress+0x68e8a7)
https://github.com/facebook/rocksdb/issues/12 rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned long, bool, unsigned long*, unsigned long, rocksdb::PreReleaseCallback*, rocksdb::PostMemTableCallback*) internal_repo_rocksdb/repo/db/db_impl/db_impl_write.cc:283 (db_stress+0x688370)
https://github.com/facebook/rocksdb/issues/13 rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, rocksdb::WriteBatch*) internal_repo_rocksdb/repo/db/db_impl/db_impl_write.cc:126 (db_stress+0x69a7b5)
https://github.com/facebook/rocksdb/issues/14 rocksdb::DB::Put(rocksdb::WriteOptions const&, rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&, rocksdb::Slice const&, rocksdb::Slice const&) internal_repo_rocksdb/repo/db/db_impl/db_impl_write.cc:2247 (db_stress+0x698634)
https://github.com/facebook/rocksdb/issues/15 rocksdb::DBImpl::Put(rocksdb::WriteOptions const&, rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&, rocksdb::Slice const&, rocksdb::Slice const&) internal_repo_rocksdb/repo/db/db_impl/db_impl_write.cc:37 (db_stress+0x699868)
https://github.com/facebook/rocksdb/issues/16 rocksdb::NonBatchedOpsStressTest::TestPut(rocksdb::ThreadState*, rocksdb::WriteOptions&, rocksdb::ReadOptions const&, std::vector<int, std::allocator<int> > const&, std::vector<long, std::allocator<long> > const&, char (&) [100], std::unique_ptr<rocksdb::MutexLock, std::default_delete<rocksdb::MutexLock> >&) internal_repo_rocksdb/repo/db_stress_tool/no_batched_ops_stress.cc:681 (db_stress+0x38d20c)
https://github.com/facebook/rocksdb/issues/17 rocksdb::StressTest::OperateDb(rocksdb::ThreadState*) internal_repo_rocksdb/repo/db_stress_tool/db_stress_test_base.cc:897 (db_stress+0x3399ec)
https://github.com/facebook/rocksdb/issues/18 rocksdb::ThreadBody(void*) internal_repo_rocksdb/repo/db_stress_tool/db_stress_driver.cc:33 (db_stress+0x354857)
https://github.com/facebook/rocksdb/issues/19 rocksdb::(anonymous namespace)::StartThreadWrapper(void*) internal_repo_rocksdb/repo/env/env_posix.cc:447 (db_stress+0x6eb2ad)

Location is heap block of size 352 at 0x7b480003d800 allocated by thread T23:
#0 operator new(unsigned long) <null> (db_stress+0xb685dc)
https://github.com/facebook/rocksdb/issues/1 rocksdb::FaultInjectionTestFS::NewWritableFile(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rocksdb::FileOptions const&, std::unique_ptr<rocksdb::FSWritableFile, std::default_delete<rocksdb::FSWritableFile> >*, rocksdb::IODebugContext*) internal_repo_rocksdb/repo/utilities/fault_injection_fs.cc:506 (db_stress+0x711192)
https://github.com/facebook/rocksdb/issues/2 rocksdb::CompositeEnv::NewWritableFile(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unique_ptr<rocksdb::WritableFile, std::default_delete<rocksdb::WritableFile> >*, rocksdb::EnvOptions const&) internal_repo_rocksdb/repo/env/composite_env.cc:329 (db_stress+0x4d33fa)
https://github.com/facebook/rocksdb/issues/3 rocksdb::EnvWrapper::NewWritableFile(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unique_ptr<rocksdb::WritableFile, std::default_delete<rocksdb::WritableFile> >*, rocksdb::EnvOptions const&) internal_repo_rocksdb/repo/include/rocksdb/env.h:1425 (db_stress+0x300662)
...
```

**Summary:**
- Added the missing lock in functions mentioned above along with three other functions with a similar need in TestFSWritableFile
- Added clarification comment

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10544

Test Plan: - Past the above race condition repro

Reviewed By: ajkr

Differential Revision: D38886634

Pulled By: hx235

fbshipit-source-id: 0571bae9615f35b16fbd8168204607e306b1b486
2022-08-22 15:50:22 -07:00
Bo Wang b0048b673c Post 7.6 branch cut changes (#10546)
Summary:
After branch 7.6.fb branch is cut, following release process, upgrade version number to 7.7 and add 7.6.fb to format compatibility check.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10546

Test Plan: Watch CI

Reviewed By: ajkr

Differential Revision: D38892023

Pulled By: gitbw95

fbshipit-source-id: 94e96dedbd973f5f9713e73d3bed336e4678565b
2022-08-21 20:42:12 -07:00
gitbw95 a9c2c7778d Update HISTORY.md for the upcoming 7.6 release (#10543)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/10543

Reviewed By: anand1976

Differential Revision: D38877168

Pulled By: gitbw95

fbshipit-source-id: d6888f7dbb1f2a5bef144ad2443429a61663c1e8
2022-08-20 20:56:40 -07:00
anand76 35cdd3e71e MultiGet async IO across multiple levels (#10535)
Summary:
This PR exploits parallelism in MultiGet across levels. It applies only to the coroutine version of MultiGet. Previously, MultiGet file reads from SST files in the same level were parallelized. With this PR, MultiGet batches with keys distributed across multiple levels are read in parallel. This is accomplished by splitting the keys not present in a level (determined by bloom filtering) into a separate batch, and processing the new batch in parallel with the original batch.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10535

Test Plan:
1. Ensure existing MultiGet unit tests pass, updating them as necessary
2. New unit tests - TODO
3. Run stress test - TODO

No noticeable regression (<1%) without async IO -
Without PR: `multireadrandom :       7.261 micros/op 1101724 ops/sec 60.007 seconds 66110936 operations;  571.6 MB/s (8168992 of 8168992 found)`
With PR: `multireadrandom :       7.305 micros/op 1095167 ops/sec 60.007 seconds 65717936 operations;  568.2 MB/s (8271992 of 8271992 found)`

For a fully cached DB, but with async IO option on, no regression observed (<1%) -
Without PR: `multireadrandom :       5.201 micros/op 1538027 ops/sec 60.005 seconds 92288936 operations;  797.9 MB/s (11540992 of 11540992 found) `
With PR: `multireadrandom :       5.249 micros/op 1524097 ops/sec 60.005 seconds 91452936 operations;  790.7 MB/s (11649992 of 11649992 found) `

Reviewed By: akankshamahajan15

Differential Revision: D38774009

Pulled By: anand1976

fbshipit-source-id: c955e259749f1c091590ade73105b3ee46cd0007
2022-08-19 16:52:52 -07:00
Levi Tamasi 81388b36e0 Add support for wide-column point lookups (#10540)
Summary:
The patch adds a new API `GetEntity` that can be used to perform
wide-column point lookups. It also extends the `Get` code path and
the `MemTable` / `MemTableList` and `Version` / `GetContext` logic
accordingly so that wide-column entities can be served from both
memtables and SSTs. If the result of a lookup is a wide-column entity
(`kTypeWideColumnEntity`), it is passed to the application in deserialized
form; if it is a plain old key-value (`kTypeValue`), it is presented as a
wide-column entity with a single default (anonymous) column.
(In contrast, regular `Get` returns plain old key-values as-is, and
returns the value of the default column for wide-column entities, see
https://github.com/facebook/rocksdb/issues/10483 .)

The result of `GetEntity` is a self-contained `PinnableWideColumns` object.
`PinnableWideColumns` contains a `PinnableSlice`, which either stores the
underlying data in its own buffer or holds on to a cache handle. It also contains
a `WideColumns` instance, which indexes the contents of the `PinnableSlice`,
so applications can access the values of columns efficiently.

There are several pieces of functionality which are currently not supported
for wide-column entities: there is currently no `MultiGetEntity` or wide-column
iterator; also, `Merge` and `GetMergeOperands` are not supported, and there
is no `GetEntity` implementation for read-only and secondary instances.
We plan to implement these in future PRs.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10540

Test Plan: `make check`

Reviewed By: akankshamahajan15

Differential Revision: D38847474

Pulled By: ltamasi

fbshipit-source-id: 42311a34ccdfe88b3775e847a5e2a5296e002b5b
2022-08-19 11:51:12 -07:00