don't run ZSTD_TrainDictionary in BlockBasedTableBuilder if compression isn't needed (#12453)

Summary:
fixes https://github.com/facebook/rocksdb/issues/12409

### Issue

ZSTD_TrainDictionary [[link](a53ed91691/table/block_based/block_based_table_builder.cc (L1894))] runs during SstFileWriter::Finish even when the bottommost_compression option is set to kNoCompression. This reduces the throughput of SstFileWriter::Finish.

We construct RocksDB options with ZSTD compression for levels 2 and above, and kNoCompression for levels 0 and 1. We also set zstd_max_train_bytes to a positive value (which only applies to levels where ZSTD compression is enabled). These options are used for the database and are also passed to SstFileWriter to create SST files that are later added to that database. Since BlockBasedTableBuilder::Finish [[link](a53ed91691/table/block_based/block_based_table_builder.cc (L1892))] only checks whether zstd_max_train_bytes is positive, it runs ZSTD_TrainDictionary even when it shouldn't, since the SstFileWriter is operating at the bottommost level, where kNoCompression applies.
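
A minimal sketch of the configuration that triggers this (the option fields are real RocksDB options; the level count, sizes, and file path are illustrative assumptions):

```cpp
#include <cstdio>

#include <rocksdb/options.h>
#include <rocksdb/sst_file_writer.h>

using namespace ROCKSDB_NAMESPACE;

int main() {
  Options options;
  // kNoCompression for L0/L1, ZSTD for L2 and above.
  options.compression_per_level = {kNoCompression, kNoCompression, kZSTD,
                                   kZSTD,          kZSTD,          kZSTD,
                                   kZSTD};
  // SstFileWriter output is treated as bottommost data.
  options.bottommost_compression = kNoCompression;
  // Positive training budget; only meaningful for ZSTD levels, but the
  // builder used to check only this field before deciding to buffer.
  options.compression_opts.max_dict_bytes = 16 << 10;
  options.compression_opts.zstd_max_train_bytes = 64 << 10;

  SstFileWriter writer(EnvOptions(), options);
  Status s = writer.Open("/tmp/example.sst");
  char key[16];
  for (int i = 0; s.ok() && i < 100000; ++i) {
    std::snprintf(key, sizeof(key), "key%06d", i);  // keys must arrive sorted
    s = writer.Put(key, "value");
  }
  // Before this fix, Finish() still ran ZSTD_TrainDictionary here even
  // though every block is written uncompressed.
  if (s.ok()) {
    s = writer.Finish();
  }
  return s.ok() ? 0 : 1;
}
```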

### Fix

If compression_type is set to kNoCompression, don't run ZSTD_TrainDictionary or any other dictionary building.
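
In other words, the builder should enter its dictionary-buffering state only when both conditions hold; a sketch of the predicate (the actual change is in the block_based_table_builder.cc hunk below):

```cpp
// A positive max_dict_bytes alone no longer triggers buffering; the data
// must actually be compressed for dictionary training to be useful.
bool buffer_for_dictionary = tbo.compression_opts.max_dict_bytes > 0 &&
                             tbo.compression_type != kNoCompression;
```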

### Testing

I see we already have tests for SstFileWriter with the compression type set and unset. Let me know if this case isn't covered and I can extend them.
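
For reference, a hypothetical regression test for this case; the fixture name, the Key() helper, and the sst_files_dir_ member are assumptions modeled on the existing external SST file tests:

```cpp
TEST_F(ExternalSSTFileBasicTest, NoDictTrainingWithoutCompression) {
  Options options = CurrentOptions();
  options.compression = kNoCompression;
  options.compression_opts.max_dict_bytes = 16 << 10;
  options.compression_opts.zstd_max_train_bytes = 64 << 10;

  SstFileWriter writer(EnvOptions(), options);
  ASSERT_OK(writer.Open(sst_files_dir_ + "uncompressed.sst"));
  for (int i = 0; i < 1000; i++) {
    ASSERT_OK(writer.Put(Key(i), "value"));
  }
  // With the fix, the builder stays in State::kUnbuffered, so Finish()
  // streams blocks out directly instead of buffering them for training.
  ExternalSstFileInfo info;
  ASSERT_OK(writer.Finish(&info));
  ASSERT_GT(info.file_size, 0);
}
```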

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12453

Reviewed By: cbi42

Differential Revision: D55030484

Pulled By: ajkr

fbshipit-source-id: 834de2174c2b087d61bf045ca1ae29f337b821a7
Author: Kshitij Wadhwa
Date: 2024-03-20 11:07:32 -07:00
Committed by: Facebook GitHub Bot
Parent: 3f3f4660bd
Commit: 4ce1dc930c
2 changed files with 7 additions and 6 deletions

db/db_block_cache_test.cc

```diff
@@ -1834,6 +1834,7 @@ class DBBlockCachePinningTest
   PinningTier unpartitioned_pinning_;
 };
 
+#ifdef LZ4
 TEST_P(DBBlockCachePinningTest, TwoLevelDB) {
   // Creates one file in L0 and one file in L1. Both files have enough data that
   // their index and filter blocks are partitioned. The L1 file will also have
@@ -1845,10 +1846,7 @@ TEST_P(DBBlockCachePinningTest, TwoLevelDB) {
   const int kNumKeysPerFile = kBlockSize * kNumBlocksPerFile / kKeySize;
 
   Options options = CurrentOptions();
-  // `kNoCompression` makes the unit test more portable. But it relies on the
-  // current behavior of persisting/accessing dictionary even when there's no
-  // (de)compression happening, which seems fairly likely to change over time.
-  options.compression = kNoCompression;
+  options.compression = kLZ4Compression;
   options.compression_opts.max_dict_bytes = 4 << 10;
   options.statistics = ROCKSDB_NAMESPACE::CreateDBStatistics();
   BlockBasedTableOptions table_options;
@@ -1961,6 +1959,7 @@ TEST_P(DBBlockCachePinningTest, TwoLevelDB) {
   ASSERT_EQ(expected_compression_dict_misses,
             TestGetTickerCount(options, BLOCK_CACHE_COMPRESSION_DICT_MISS));
 }
+#endif
 
 INSTANTIATE_TEST_CASE_P(
     DBBlockCachePinningTest, DBBlockCachePinningTest,
```

table/block_based/block_based_table_builder.cc

```diff
@@ -480,8 +480,10 @@ struct BlockBasedTableBuilder::Rep {
         compression_ctxs(tbo.compression_opts.parallel_threads),
         verify_ctxs(tbo.compression_opts.parallel_threads),
         verify_dict(),
-        state((tbo.compression_opts.max_dict_bytes > 0) ? State::kBuffered
-                                                        : State::kUnbuffered),
+        state((tbo.compression_opts.max_dict_bytes > 0 &&
+               tbo.compression_type != kNoCompression)
+                  ? State::kBuffered
+                  : State::kUnbuffered),
         use_delta_encoding_for_index_values(table_opt.format_version >= 4 &&
                                             !table_opt.block_align),
         reason(tbo.reason),
```