rocksdb/db
Aaron Gao 259a00eaca unbiase readamp bitmap
Summary:
Consider BlockReadAmpBitmap with bytes_per_bit = 32. Suppose bytes [a, b) were used, while bytes [a-32, a) and [b, b+32) weren't used; more formally, the union of ranges passed to BlockReadAmpBitmap::Mark() contains [a, b) and doesn't intersect with [a-32, a) and [b, b+32). Then bits [floor(a/32), ceil(b/32)) will be set, and so the number of useful bytes will be estimated as (ceil(b/32) - floor(a/32)) * 32, which is on average equal to b-a+31.

An extreme example: if we use 1 byte from each block, it'll be counted as 32 bytes from each block.
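
The bias described above can be checked with a small sketch. This is not RocksDB code: OldEstimate and AverageOldEstimate are hypothetical helpers that model the old semantics, where bit i stands for the byte range [i*32, (i+1)*32) and any bit touched by the used range [a, b) counts as 32 useful bytes. Averaging over all 32 alignments of the start offset should give len + 31.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of RocksDB): bytes the old bitmap semantics
// would report as useful when exactly the range [a, b) was read.
uint64_t OldEstimate(uint64_t a, uint64_t b, uint64_t bytes_per_bit = 32) {
  assert(a < b);
  uint64_t first_bit = a / bytes_per_bit;       // floor(a / 32)
  uint64_t last_bit = (b - 1) / bytes_per_bit;  // bit covering byte b - 1
  return (last_bit - first_bit + 1) * bytes_per_bit;
}

// Hypothetical helper: average of OldEstimate over every alignment of a
// range of length len relative to the bit boundaries.
double AverageOldEstimate(uint64_t len, uint64_t bytes_per_bit = 32) {
  uint64_t total = 0;
  for (uint64_t a = 0; a < bytes_per_bit; ++a) {
    total += OldEstimate(a, a + len, bytes_per_bit);
  }
  return static_cast<double>(total) / static_cast<double>(bytes_per_bit);
}
```

For a single byte, OldEstimate always returns 32, which is the extreme example mentioned above.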

It's easy to remove this bias by slightly changing the semantics of the bitmap. Currently each bit represents a byte range [i*32, (i+1)*32).

This diff makes each bit represent a single byte: i*32 + X, where X is a random number in [0, 31] generated when the bitmap is created. So, e.g., if you read a single byte at random, with probability 31/32 it won't be counted at all, and with probability 1/32 it will be counted as 32 bytes; so, on average it's counted as 1 byte.

*But there is one exception: the last bit is always set the old way.*
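
A minimal sketch of the new semantics, assuming bytes_per_bit = 32 and ignoring the last-bit exception noted above. SampledEstimate and AverageSampledEstimate are hypothetical names, not the RocksDB implementation: bit i stands for the single byte i*32 + x, a used range [a, b) sets a bit only if it contains that byte, and each set bit counts as 32 bytes. Averaged over the 32 possible values of x, the estimate should equal b - a exactly.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of RocksDB): estimate of useful bytes for a
// used range [a, b) when bit i represents the single byte i*bytes_per_bit + x.
uint64_t SampledEstimate(uint64_t a, uint64_t b, uint64_t x,
                         uint64_t bytes_per_bit = 32) {
  assert(a < b && x < bytes_per_bit);
  // Number of sampled byte positions (x, x + 32, x + 64, ...) strictly below n.
  auto below = [&](uint64_t n) -> uint64_t {
    return n <= x ? 0 : (n - x - 1) / bytes_per_bit + 1;
  };
  return (below(b) - below(a)) * bytes_per_bit;
}

// Hypothetical helper: expectation of SampledEstimate when x is drawn
// uniformly from [0, bytes_per_bit - 1].
double AverageSampledEstimate(uint64_t a, uint64_t b,
                              uint64_t bytes_per_bit = 32) {
  uint64_t total = 0;
  for (uint64_t x = 0; x < bytes_per_bit; ++x) {
    total += SampledEstimate(a, b, x, bytes_per_bit);
  }
  return static_cast<double>(total) / static_cast<double>(bytes_per_bit);
}
```

Every byte in [a, b) is the sampled byte for exactly one value of x, so the per-x counts sum to b - a, which is why the average comes out unbiased in this model.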

(*) - assuming read_amp_bytes_per_bit = 32.
Closes https://github.com/facebook/rocksdb/pull/2259

Differential Revision: D5035652

Pulled By: lightmark

fbshipit-source-id: bd98b1b9b49fbe61f9e3781d07f624e3cbd92356
2017-05-10 01:49:52 -07:00
..
builder.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
builder.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
c.cc Add C API to set max_file_opening_threads option 2017-05-08 22:49:32 -07:00
c_test.c travis: add Windows cross-compilation 2017-05-05 23:20:01 -07:00
column_family.cc Allow IntraL0 compaction in FIFO Compaction 2017-05-04 18:16:13 -07:00
column_family.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
column_family_test.cc Add bulk create/drop column family API 2017-05-07 23:20:46 -07:00
compact_files_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
compacted_db_impl.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
compacted_db_impl.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
compaction.cc Avoid calling fallocate with UINT64_MAX 2017-05-04 17:43:22 -07:00
compaction.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
compaction_iteration_stats.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
compaction_iterator.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
compaction_iterator.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
compaction_iterator_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
compaction_job.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
compaction_job.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
compaction_job_stats_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
compaction_job_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
compaction_picker.cc Allow IntraL0 compaction in FIFO Compaction 2017-05-04 18:16:13 -07:00
compaction_picker.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
compaction_picker_test.cc Set lower-bound on dynamic level sizes 2017-05-04 18:16:12 -07:00
compaction_picker_universal.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
compaction_picker_universal.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
comparator_db_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
convenience.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
corruption_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
cuckoo_table_db_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
db_basic_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
db_block_cache_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
db_bloom_filter_test.cc dont skip IO for filter blocks 2017-05-09 09:52:02 -07:00
db_compaction_filter_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
db_compaction_test.cc Roundup read bytes in ReadaheadRandomAccessFile 2017-05-05 12:14:14 -07:00
db_dynamic_level_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
db_filesnapshot.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
db_flush_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
db_impl.cc Add bulk create/drop column family API 2017-05-07 23:20:46 -07:00
db_impl.h Add bulk create/drop column family API 2017-05-07 23:20:46 -07:00
db_impl_compaction_flush.cc Fix an issue of manual / auto compaction data race 2017-05-02 15:11:59 -07:00
db_impl_debug.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
db_impl_experimental.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
db_impl_files.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
db_impl_open.cc Add bulk create/drop column family API 2017-05-07 23:20:46 -07:00
db_impl_readonly.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
db_impl_readonly.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
db_impl_write.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
db_info_dumper.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
db_info_dumper.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
db_inplace_update_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
db_io_failure_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
db_iter.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
db_iter.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
db_iter_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
db_iterator_test.cc do not read next datablock if upperbound is reached 2017-05-05 23:20:01 -07:00
db_log_iter_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
db_memtable_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
db_merge_operator_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
db_options_test.cc Max open files mutable 2017-05-03 21:13:14 -07:00
db_properties_test.cc Add DB:ResetStats() 2017-04-18 16:56:48 -07:00
db_range_del_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
db_sst_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
db_statistics_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
db_table_properties_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
db_tailing_iter_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
db_test.cc Allow IntraL0 compaction in FIFO Compaction 2017-05-04 18:16:13 -07:00
db_test2.cc unbiase readamp bitmap 2017-05-10 01:49:52 -07:00
db_test_util.cc Configure index partition size 2017-03-28 12:09:12 -07:00
db_test_util.h Avoid calling fallocate with UINT64_MAX 2017-05-04 17:43:22 -07:00
db_universal_compaction_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
db_wal_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
dbformat.cc do not read next datablock if upperbound is reached 2017-05-05 23:20:01 -07:00
dbformat.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
dbformat_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
deletefile_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
event_helpers.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
event_helpers.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
experimental.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
external_sst_file_basic_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
external_sst_file_ingestion_job.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
external_sst_file_ingestion_job.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
external_sst_file_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
fault_injection_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
file_indexer.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
file_indexer.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
file_indexer_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
filename_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
flush_job.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
flush_job.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
flush_job_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
flush_scheduler.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
flush_scheduler.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
forward_iterator.cc do not read next datablock if upperbound is reached 2017-05-05 23:20:01 -07:00
forward_iterator.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
forward_iterator_bench.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
internal_stats.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
internal_stats.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
job_context.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
listener_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
log_format.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
log_reader.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
log_reader.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
log_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
log_writer.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
log_writer.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
managed_iterator.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
managed_iterator.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
manual_compaction_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
memtable.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
memtable.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
memtable_list.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
memtable_list.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
memtable_list_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
merge_context.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
merge_helper.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
merge_helper.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
merge_helper_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
merge_operator.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
merge_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
options_file_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
perf_context_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
pinned_iterators_manager.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
plain_table_db_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
prefix_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
range_del_aggregator.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
range_del_aggregator.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
range_del_aggregator_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
repair.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
repair_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
snapshot_impl.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
snapshot_impl.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
table_cache.cc do not read next datablock if upperbound is reached 2017-05-05 23:20:01 -07:00
table_cache.h max_open_files dynamic set, follow up 2017-05-04 10:42:45 -07:00
table_properties_collector.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
table_properties_collector.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
table_properties_collector_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
transaction_log_impl.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
transaction_log_impl.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
version_builder.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
version_builder.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
version_builder_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
version_edit.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
version_edit.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
version_edit_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
version_set.cc do not read next datablock if upperbound is reached 2017-05-05 23:20:01 -07:00
version_set.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
version_set_test.cc Set lower-bound on dynamic level sizes 2017-05-04 18:16:12 -07:00
wal_manager.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
wal_manager.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
wal_manager_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
write_batch.cc support PopSavePoint for WriteBatch 2017-05-03 10:57:45 -07:00
write_batch_base.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
write_batch_internal.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
write_batch_test.cc support PopSavePoint for WriteBatch 2017-05-03 10:57:45 -07:00
write_callback.h Updated all copyright headers to the new format. 2016-02-09 15:12:00 -08:00
write_callback_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
write_controller.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
write_controller.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
write_controller_test.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
write_thread.cc Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
write_thread.h Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00