rocksdb/table
Yanqin Jin 09528f9fa1 Fix a bug for SeekForPrev with partitioned filter and prefix (#8137)
Summary:
According to https://github.com/facebook/rocksdb/issues/5907, each filter partition "should include the bloom of the prefix of the last
key in the previous partition" so that SeekForPrev() in prefix mode can return correct result.
The prefix of the last key in the previous partition does not necessarily have the same prefix
as the first key in the current partition. Regardless of the first key in current partition, the
prefix of the last key in the previous partition should be added. The existing code, however,
does not follow this. Furthermore, there is another issue: when finishing current filter partition,
`FullFilterBlockBuilder::AddPrefix()` is called for the first key in next filter partition, which effectively
overwrites `last_prefix_str_` prematurely. Consequently, when the filter block builder proceeds
to the next partition, `last_prefix_str_` will be the prefix of its first key, leaving no way of adding
the bloom of the prefix of the last key of the previous partition.

Prefix extractor is FixedLength.2.
```
[  filter part 1   ]    [  filter part 2    ]
                  abc    d
```
When SeekForPrev("abcd"), checking the filter partition will land on filter part 2 because "abcd" > "abc"
but smaller than "d".
If the filter in filter part 2 happens to return false for the test for "ab", then SeekForPrev("abcd") will build
incorrect iterator tree in non-total-order mode.

Also fix a unit test which starts to fail following this PR. `InDomain` should not fail due to assertion
error when checking on an arbitrary key.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8137

Test Plan:
```
make check
```

Without this fix, the following command will fail pretty soon.
```
./db_stress --acquire_snapshot_one_in=10000 --avoid_flush_during_recovery=0 \
--avoid_unnecessary_blocking_io=0 --backup_max_size=104857600 --backup_one_in=0 \
--batch_protection_bytes_per_key=0 --block_size=16384 --bloom_bits=17 \
--bottommost_compression_type=disable --cache_index_and_filter_blocks=1 --cache_size=1048576 \
--checkpoint_one_in=0 --checksum_type=kxxHash64 --clear_column_family_one_in=0 \
--compact_files_one_in=1000000 --compact_range_one_in=1000000 --compaction_ttl=0 \
--compression_max_dict_buffer_bytes=0 --compression_max_dict_bytes=0 \
--compression_parallel_threads=1 --compression_type=zstd --compression_zstd_max_train_bytes=0 \
--continuous_verification_interval=0 --db=/dev/shm/rocksdb/rocksdb_crashtest_whitebox \
--db_write_buffer_size=8388608 --delpercent=5 --delrangepercent=0 --destroy_db_initially=0 --enable_blob_files=0 \
--enable_compaction_filter=0 --enable_pipelined_write=1 --file_checksum_impl=big --flush_one_in=1000000 \
--format_version=5 --get_current_wal_file_one_in=0 --get_live_files_one_in=1000000 --get_property_one_in=1000000 \
--get_sorted_wal_files_one_in=0 --index_block_restart_interval=4 --index_type=2 --ingest_external_file_one_in=0 \
--iterpercent=10 --key_len_percent_dist=1,30,69 --level_compaction_dynamic_level_bytes=True \
--log2_keys_per_lock=10 --long_running_snapshots=1 --mark_for_compaction_one_file_in=0 \
--max_background_compactions=20 --max_bytes_for_level_base=10485760 --max_key=100000000 --max_key_len=3 \
--max_manifest_file_size=1073741824 --max_write_batch_group_size_bytes=16777216 --max_write_buffer_number=3 \
--max_write_buffer_size_to_maintain=8388608 --memtablerep=skip_list --mmap_read=1 --mock_direct_io=False \
--nooverwritepercent=0 --open_files=500000 --ops_per_thread=20000000 --optimize_filters_for_memory=0 --paranoid_file_checks=1 --partition_filters=1 --partition_pinning=0 --pause_background_one_in=1000000 \
--periodic_compaction_seconds=0 --prefixpercent=5 --progress_reports=0 --read_fault_one_in=0 --read_only=0 \
--readpercent=45 --recycle_log_file_num=0 --reopen=20 --secondary_catch_up_one_in=0 \
--snapshot_hold_ops=100000 --sst_file_manager_bytes_per_sec=104857600 \
--sst_file_manager_bytes_per_truncate=0 --subcompactions=2 --sync=0 --sync_fault_injection=False \
--target_file_size_base=2097152 --target_file_size_multiplier=2 --test_batches_snapshots=0 --test_cf_consistency=0 \
--top_level_index_pinning=0 --unpartitioned_pinning=1 --use_blob_db=0 --use_block_based_filter=0 \
--use_direct_io_for_flush_and_compaction=0 --use_direct_reads=0 --use_full_merge_v1=0 --use_merge=0 \
--use_multiget=0 --use_ribbon_filter=0 --use_txn=0 --user_timestamp_size=8 --verify_checksum=1 \
--verify_checksum_one_in=1000000 --verify_db_one_in=100000 --write_buffer_size=4194304 \
--write_dbid_to_manifest=1 --writepercent=35
```

Reviewed By: pdillinger

Differential Revision: D27553054

Pulled By: riversand963

fbshipit-source-id: 60e391e4a2d8d98a9a3172ec5d6176b90ec3de98
2021-04-06 12:14:08 -07:00
..
adaptive Bring the Configurable options together (#5753) 2020-09-14 17:01:01 -07:00
block_based Fix a bug for SeekForPrev with partitioned filter and prefix (#8137) 2021-04-06 12:14:08 -07:00
cuckoo Remove Legacy and Custom FileWrapper classes from header files (#7851) 2021-01-28 22:10:32 -08:00
plain Create a Customizable class to load classes and configurations (#6590) 2020-11-11 15:10:41 -08:00
block_fetcher.cc Replace Status with IOStatus for block fetcher IO function (#8130) 2021-04-01 10:07:55 -07:00
block_fetcher.h Replace Status with IOStatus for block fetcher IO function (#8130) 2021-04-01 10:07:55 -07:00
block_fetcher_test.cc Apply sample_for_compression to all block-based tables (#8105) 2021-03-25 15:00:45 -07:00
cleanable_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
format.cc Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
format.h Add a host location property to TableProperties (#7479) 2020-10-19 11:38:48 -07:00
get_context.cc Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
get_context.h Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
internal_iterator.h Clean up InternalIterator upper bound logic a little bit (#7200) 2020-08-05 10:44:57 -07:00
iter_heap.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
iterator.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
iterator_wrapper.h Redesign block cache pinning API (#7520) 2020-10-11 14:58:24 -07:00
merger_test.cc More Makefile Cleanup (#7097) 2020-07-09 14:35:17 -07:00
merging_iterator.cc Add some simulator cache and block tracer tests to ASSERT_STATUS_CHECKED (#7305) 2020-08-24 16:43:31 -07:00
merging_iterator.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
meta_blocks.cc Add sample_for_compression results to table properties (#8139) 2021-03-31 18:21:50 -07:00
meta_blocks.h Apply sample_for_compression to all block-based tables (#8105) 2021-03-25 15:00:45 -07:00
mock_table.cc Remove Legacy and Custom FileWrapper classes from header files (#7851) 2021-01-28 22:10:32 -08:00
mock_table.h No elide constructors (#7798) 2020-12-23 16:55:53 -08:00
multiget_context.h Add initial blob support to batched MultiGet (#7766) 2020-12-14 13:48:22 -08:00
persistent_cache_helper.cc Add more tests to ASSERT_STATUS_CHECKED (#7367) 2020-09-16 15:48:07 -07:00
persistent_cache_helper.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
persistent_cache_options.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
scoped_arena_iterator.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
sst_file_dumper.cc Apply sample_for_compression to all block-based tables (#8105) 2021-03-25 15:00:45 -07:00
sst_file_dumper.h Limit buffering for collecting samples for compression dictionary (#7970) 2021-02-19 14:09:54 -08:00
sst_file_reader.cc Remove Legacy and Custom FileWrapper classes from header files (#7851) 2021-01-28 22:10:32 -08:00
sst_file_reader_test.cc Add a host location property to TableProperties (#7479) 2020-10-19 11:38:48 -07:00
sst_file_writer.cc Remove check for status returned by InvalidatePageCache (#8156) 2021-04-06 11:55:14 -07:00
sst_file_writer_collectors.h Apply sample_for_compression to all block-based tables (#8105) 2021-03-25 15:00:45 -07:00
table_builder.h Apply sample_for_compression to all block-based tables (#8105) 2021-03-25 15:00:45 -07:00
table_factory.cc Create a Customizable class to load classes and configurations (#6590) 2020-11-11 15:10:41 -08:00
table_properties.cc Add sample_for_compression results to table properties (#8139) 2021-03-31 18:21:50 -07:00
table_properties_internal.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
table_reader.h dedup ReadOptions in iterator hierarchy (#7210) 2020-08-03 15:23:04 -07:00
table_reader_bench.cc Apply sample_for_compression to all block-based tables (#8105) 2021-03-25 15:00:45 -07:00
table_reader_caller.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
table_test.cc Fix a bug for SeekForPrev with partitioned filter and prefix (#8137) 2021-04-06 12:14:08 -07:00
two_level_iterator.cc Redesign block cache pinning API (#7520) 2020-10-11 14:58:24 -07:00
two_level_iterator.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00