rocksdb/table
Andrew Kryczka 6c40806e51 Digest ZSTD compression dictionary once per SST file (#4251)
Summary:
In RocksDB, for a given SST file, all data blocks are compressed with the same dictionary. When we compress a block using the dictionary's raw bytes, the compression library first has to digest the dictionary to get it into a usable form. This digestion work is redundant and ideally should be done once per file.

ZSTD offers APIs for the caller to create and reuse a digested dictionary object (`ZSTD_CDict`). In this PR, we call `ZSTD_createCDict` once per file to digest the raw bytes. Then we use `ZSTD_compress_usingCDict` to compress each data block using the pre-digested dictionary. Once the file's created `ZSTD_freeCDict` releases the resources held by the digested dictionary.

There are a couple other changes included in this PR:

- Changed the parameter object for (un)compression functions from `CompressionContext`/`UncompressionContext` to `CompressionInfo`/`UncompressionInfo`. This avoids the previous pattern, where `CompressionContext`/`UncompressionContext` had to be mutated before calling a (un)compression function depending on whether dictionary should be used. I felt that mutation was error-prone so eliminated it.
- Added support for digested uncompression dictionaries (`ZSTD_DDict`) as well. However, this PR does not support reusing them across uncompression calls for the same file. That work is deferred to a later PR when we will store the `ZSTD_DDict` objects in block cache.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4251

Differential Revision: D9257078

Pulled By: ajkr

fbshipit-source-id: 21b8cb6bbdd48e459f1c62343780ab66c0a64438
2018-08-23 19:28:18 -07:00
..
adaptive_table_factory.cc Comment out unused variables 2018-03-05 13:13:41 -08:00
adaptive_table_factory.h Comment out unused variables 2018-03-05 13:13:41 -08:00
block.cc DataBlockHashIndex: avoiding expensive iiter->Next when handling hash kNoEntry (#4296) 2018-08-23 10:12:58 -07:00
block.h Improve point-lookup performance using a data block hash index (#4174) 2018-08-15 14:30:03 -07:00
block_based_filter_block.cc PrefixMayMatch: remove unnecessary check for prefix_extractor_ (#4067) 2018-06-27 20:42:43 -07:00
block_based_filter_block.h Move prefix_extractor to MutableCFOptions 2018-05-21 14:43:11 -07:00
block_based_filter_block_test.cc Move prefix_extractor to MutableCFOptions 2018-05-21 14:43:11 -07:00
block_based_table_builder.cc Digest ZSTD compression dictionary once per SST file (#4251) 2018-08-23 19:28:18 -07:00
block_based_table_builder.h Digest ZSTD compression dictionary once per SST file (#4251) 2018-08-23 19:28:18 -07:00
block_based_table_factory.cc Add db_bench options of data block hash index (#4281) 2018-08-16 18:42:46 -07:00
block_based_table_factory.h Improve point-lookup performance using a data block hash index (#4174) 2018-08-15 14:30:03 -07:00
block_based_table_reader.cc Digest ZSTD compression dictionary once per SST file (#4251) 2018-08-23 19:28:18 -07:00
block_based_table_reader.h Index value delta encoding (#3983) 2018-08-09 16:58:40 -07:00
block_builder.cc DataBlockHashIndex: Remove the division from EstimateSize() (#4293) 2018-08-20 23:13:50 -07:00
block_builder.h Add path to WritableFileWriter. (#4039) 2018-08-23 10:12:58 -07:00
block_fetcher.cc Digest ZSTD compression dictionary once per SST file (#4251) 2018-08-23 19:28:18 -07:00
block_fetcher.h Pin mmap files in ReadOnlyDB (#4053) 2018-06-27 17:13:34 -07:00
block_prefix_index.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
block_prefix_index.h Change RocksDB License 2017-07-15 16:11:23 -07:00
block_test.cc Add path to WritableFileWriter. (#4039) 2018-08-23 10:12:58 -07:00
bloom_block.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
bloom_block.h Change RocksDB License 2017-07-15 16:11:23 -07:00
cleanable_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
cuckoo_table_builder.cc Enable MSVC W4 with a few exceptions. Fix warnings and bugs 2017-10-19 10:57:12 -07:00
cuckoo_table_builder.h Change RocksDB License 2017-07-15 16:11:23 -07:00
cuckoo_table_builder_test.cc Add path to WritableFileWriter. (#4039) 2018-08-23 10:12:58 -07:00
cuckoo_table_factory.cc Comment out unused variables 2018-03-05 13:13:41 -08:00
cuckoo_table_factory.h comment unused parameters to turn on -Wunused-parameter flag 2018-04-12 17:59:16 -07:00
cuckoo_table_reader.cc Index value delta encoding (#3983) 2018-08-09 16:58:40 -07:00
cuckoo_table_reader.h Index value delta encoding (#3983) 2018-08-09 16:58:40 -07:00
cuckoo_table_reader_test.cc Add path to WritableFileWriter. (#4039) 2018-08-23 10:12:58 -07:00
data_block_footer.cc Add db_bench options of data block hash index (#4281) 2018-08-16 18:42:46 -07:00
data_block_footer.h Add db_bench options of data block hash index (#4281) 2018-08-16 18:42:46 -07:00
data_block_hash_index.cc DataBlockHashIndex: Remove the division from EstimateSize() (#4293) 2018-08-20 23:13:50 -07:00
data_block_hash_index.h DataBlockHashIndex: Remove the division from EstimateSize() (#4293) 2018-08-20 23:13:50 -07:00
data_block_hash_index_test.cc Add path to WritableFileWriter. (#4039) 2018-08-23 10:12:58 -07:00
filter_block.h use user_key and iterate_upper_bound to determine compatibility of bloom filters (#3899) 2018-06-26 15:57:26 -07:00
flush_block_policy.cc Align SST file data blocks to avoid spanning multiple pages 2018-03-26 20:26:10 -07:00
format.cc Digest ZSTD compression dictionary once per SST file (#4251) 2018-08-23 19:28:18 -07:00
format.h Digest ZSTD compression dictionary once per SST file (#4251) 2018-08-23 19:28:18 -07:00
full_filter_bits_builder.h Skip duplicate bloom keys when whole_key and prefix are mixed 2018-04-24 10:58:16 -07:00
full_filter_block.cc Charging block cache more accurately (#4073) 2018-06-29 08:57:20 -07:00
full_filter_block.h Charging block cache more accurately (#4073) 2018-06-29 08:57:20 -07:00
full_filter_block_test.cc Move prefix_extractor to MutableCFOptions 2018-05-21 14:43:11 -07:00
get_context.cc #3865 followup for fix performance degression introduced by switching order of operands (#4284) 2018-08-17 10:57:25 -07:00
get_context.h Avoid unnecessary big for-loop when reporting ticker stats stored in GetContext (#3490) 2018-07-20 16:58:13 -07:00
index_builder.cc Add path to WritableFileWriter. (#4039) 2018-08-23 10:12:58 -07:00
index_builder.h Add path to WritableFileWriter. (#4039) 2018-08-23 10:12:58 -07:00
internal_iterator.h Index value delta encoding (#3983) 2018-08-09 16:58:40 -07:00
iter_heap.h Make InternalKeyComparator final and directly use it in merging iterator 2017-09-11 12:04:21 -07:00
iterator.cc Index value delta encoding (#3983) 2018-08-09 16:58:40 -07:00
iterator_wrapper.h Index value delta encoding (#3983) 2018-08-09 16:58:40 -07:00
merger_test.cc Make InternalKeyComparator final and directly use it in merging iterator 2017-09-11 12:04:21 -07:00
merging_iterator.cc Index value delta encoding (#3983) 2018-08-09 16:58:40 -07:00
merging_iterator.h Index value delta encoding (#3983) 2018-08-09 16:58:40 -07:00
meta_blocks.cc Index value delta encoding (#3983) 2018-08-09 16:58:40 -07:00
meta_blocks.h Index value delta encoding (#3983) 2018-08-09 16:58:40 -07:00
mock_table.cc Add path to WritableFileWriter. (#4039) 2018-08-23 10:12:58 -07:00
mock_table.h Make BlockBasedTableIterator compaction-aware (#4048) 2018-06-25 13:19:27 -07:00
partitioned_filter_block.cc Index value delta encoding (#3983) 2018-08-09 16:58:40 -07:00
partitioned_filter_block.h Index value delta encoding (#3983) 2018-08-09 16:58:40 -07:00
partitioned_filter_block_test.cc Index value delta encoding (#3983) 2018-08-09 16:58:40 -07:00
persistent_cache_helper.cc comment unused parameters to turn on -Wunused-parameter flag 2018-04-12 17:59:16 -07:00
persistent_cache_helper.h Change RocksDB License 2017-07-15 16:11:23 -07:00
persistent_cache_options.h Change RocksDB License 2017-07-15 16:11:23 -07:00
plain_table_builder.cc Move prefix_extractor to MutableCFOptions 2018-05-21 14:43:11 -07:00
plain_table_builder.h Move prefix_extractor to MutableCFOptions 2018-05-21 14:43:11 -07:00
plain_table_factory.cc Move prefix_extractor to MutableCFOptions 2018-05-21 14:43:11 -07:00
plain_table_factory.h Comment out unused variables 2018-03-05 13:13:41 -08:00
plain_table_index.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
plain_table_index.h Move prefix_extractor to MutableCFOptions 2018-05-21 14:43:11 -07:00
plain_table_key_coding.cc Comment out unused variables 2018-03-05 13:13:41 -08:00
plain_table_key_coding.h Change RocksDB License 2017-07-15 16:11:23 -07:00
plain_table_reader.cc Make BlockBasedTableIterator compaction-aware (#4048) 2018-06-25 13:19:27 -07:00
plain_table_reader.h Index value delta encoding (#3983) 2018-08-09 16:58:40 -07:00
scoped_arena_iterator.h Change RocksDB License 2017-07-15 16:11:23 -07:00
sst_file_writer.cc Add path to WritableFileWriter. (#4039) 2018-08-23 10:12:58 -07:00
sst_file_writer_collectors.h Comment out unused variables 2018-03-05 13:13:41 -08:00
table_builder.h Remove random writes from SST file ingestion (#4172) 2018-07-27 16:12:23 -07:00
table_properties.cc Index value delta encoding (#3983) 2018-08-09 16:58:40 -07:00
table_properties_internal.h Index value delta encoding (#3983) 2018-08-09 16:58:40 -07:00
table_reader.h Index value delta encoding (#3983) 2018-08-09 16:58:40 -07:00
table_reader_bench.cc Add path to WritableFileWriter. (#4039) 2018-08-23 10:12:58 -07:00
table_test.cc Add path to WritableFileWriter. (#4039) 2018-08-23 10:12:58 -07:00
two_level_iterator.cc Index value delta encoding (#3983) 2018-08-09 16:58:40 -07:00
two_level_iterator.h Index value delta encoding (#3983) 2018-08-09 16:58:40 -07:00