rocksdb/table
Andrew Kryczka 1ba2b8a568 Add sample_for_compression results to table properties (#8139)
Summary:
Added `TableProperties::{fast,slow}_compression_estimated_data_size`.
These properties are present in block-based tables when
`ColumnFamilyOptions::sample_for_compression > 0` and the necessary
compression library is supported when the file is generated. They
contain estimates of what `TableProperties::data_size` would be if the
"fast"/"slow" compression library had been used instead. One
limitation is we do not record exactly which "fast" (ZSTD or Zlib)
or "slow" (LZ4 or Snappy) compression library produced the result.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8139

Test Plan:
- new unit test
- ran `db_bench` with `sample_for_compression=1`; verified the `data_size` property matches the `{slow,fast}_compression_estimated_data_size` when the same compression type is used for the output file compression and the sampled compression

Reviewed By: riversand963

Differential Revision: D27454338

Pulled By: ajkr

fbshipit-source-id: 9529293de93ddac7f03b2e149d746e9f634abac4
2021-03-31 18:21:50 -07:00
..
adaptive Bring the Configurable options together (#5753) 2020-09-14 17:01:01 -07:00
block_based Add sample_for_compression results to table properties (#8139) 2021-03-31 18:21:50 -07:00
cuckoo Remove Legacy and Custom FileWrapper classes from header files (#7851) 2021-01-28 22:10:32 -08:00
plain Create a Customizable class to load classes and configurations (#6590) 2020-11-11 15:10:41 -08:00
block_fetcher.cc Add a SystemClock class to capture the time functions of an Env (#7858) 2021-01-25 22:09:11 -08:00
block_fetcher.h Reduce memory copies when fetching and uncompressing blocks from SST files (#6689) 2020-04-24 15:32:56 -07:00
block_fetcher_test.cc Apply sample_for_compression to all block-based tables (#8105) 2021-03-25 15:00:45 -07:00
cleanable_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
format.cc Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
format.h Add a host location property to TableProperties (#7479) 2020-10-19 11:38:48 -07:00
get_context.cc Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
get_context.h Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
internal_iterator.h Clean up InternalIterator upper bound logic a little bit (#7200) 2020-08-05 10:44:57 -07:00
iter_heap.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
iterator.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
iterator_wrapper.h Redesign block cache pinning API (#7520) 2020-10-11 14:58:24 -07:00
merger_test.cc More Makefile Cleanup (#7097) 2020-07-09 14:35:17 -07:00
merging_iterator.cc Add some simulator cache and block tracer tests to ASSERT_STATUS_CHECKED (#7305) 2020-08-24 16:43:31 -07:00
merging_iterator.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
meta_blocks.cc Add sample_for_compression results to table properties (#8139) 2021-03-31 18:21:50 -07:00
meta_blocks.h Apply sample_for_compression to all block-based tables (#8105) 2021-03-25 15:00:45 -07:00
mock_table.cc Remove Legacy and Custom FileWrapper classes from header files (#7851) 2021-01-28 22:10:32 -08:00
mock_table.h No elide constructors (#7798) 2020-12-23 16:55:53 -08:00
multiget_context.h Add initial blob support to batched MultiGet (#7766) 2020-12-14 13:48:22 -08:00
persistent_cache_helper.cc Add more tests to ASSERT_STATUS_CHECKED (#7367) 2020-09-16 15:48:07 -07:00
persistent_cache_helper.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
persistent_cache_options.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
scoped_arena_iterator.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
sst_file_dumper.cc Apply sample_for_compression to all block-based tables (#8105) 2021-03-25 15:00:45 -07:00
sst_file_dumper.h Limit buffering for collecting samples for compression dictionary (#7970) 2021-02-19 14:09:54 -08:00
sst_file_reader.cc Remove Legacy and Custom FileWrapper classes from header files (#7851) 2021-01-28 22:10:32 -08:00
sst_file_reader_test.cc Add a host location property to TableProperties (#7479) 2020-10-19 11:38:48 -07:00
sst_file_writer.cc Apply sample_for_compression to all block-based tables (#8105) 2021-03-25 15:00:45 -07:00
sst_file_writer_collectors.h Apply sample_for_compression to all block-based tables (#8105) 2021-03-25 15:00:45 -07:00
table_builder.h Apply sample_for_compression to all block-based tables (#8105) 2021-03-25 15:00:45 -07:00
table_factory.cc Create a Customizable class to load classes and configurations (#6590) 2020-11-11 15:10:41 -08:00
table_properties.cc Add sample_for_compression results to table properties (#8139) 2021-03-31 18:21:50 -07:00
table_properties_internal.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
table_reader.h dedup ReadOptions in iterator hierarchy (#7210) 2020-08-03 15:23:04 -07:00
table_reader_bench.cc Apply sample_for_compression to all block-based tables (#8105) 2021-03-25 15:00:45 -07:00
table_reader_caller.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
table_test.cc Apply sample_for_compression to all block-based tables (#8105) 2021-03-25 15:00:45 -07:00
two_level_iterator.cc Redesign block cache pinning API (#7520) 2020-10-11 14:58:24 -07:00
two_level_iterator.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00