mirror of
https://github.com/facebook/rocksdb.git
synced 2024-12-03 05:54:17 +00:00
7a1b0207e6
Summary: ## Context checksum All RocksDB checksums currently use 32 bits of checking power, which should be 1 in 4 billion false negative (FN) probability (failing to detect corruption). This is true for random corruptions, and in some cases small corruptions are guaranteed to be detected. But some possible corruptions, such as in storage metadata rather than storage payload data, would have a much higher FN rate. For example: * Data larger than one SST block is replaced by data from elsewhere in the same or another SST file. Especially with block_align=true, the probability of exact block size match is probably around 1 in 100, making the FN probability around that same. Without `block_align=true` the probability of same block start location is probably around 1 in 10,000, for FN probability around 1 in a million. To solve this problem in new format_version=6, we add "context awareness" to block checksum checks. The stored and expected checksum value is modified based on the block's position in the file and which file it is in. The modifications are cleverly chosen so that, for example * blocks within about 4GB of each other are guaranteed to use different context * blocks that are offset by exactly some multiple of 4GiB are guaranteed to use different context * files generated by the same process are guaranteed to use different context for the same offsets, until wrap-around after 2^32 - 1 files Thus, with format_version=6, if a valid SST block and checksum is misplaced, its checksum FN probability should be essentially ideal, 1 in 4B. ## Footer checksum This change also adds checksum protection to the SST footer (with format_version=6), for the first time without relying on whole file checksum. To prevent a corruption of the format_version in the footer (e.g. 6 -> 5) to defeat the footer checksum, we change much of the footer data format including an "extended magic number" in format_version 6 that would be interpreted as empty index and metaindex block handles in older footer versions. We also change the encoding of handles to free up space for other new data in footer. ## More detail: making space in footer In order to keep footer the same size in format_version=6 (avoid change to IO patterns), we have to free up some space for new data. We do this two ways: * Metaindex block handle is encoded down to 4 bytes (from 10) by assuming it immediately precedes the footer, and by assuming it is < 4GB. * Index block handle is moved into metaindex. (I don't know why it was in footer to begin with.) ## Performance In case of small performance penalty, I've made a "pay as you go" optimization to compensate: replace `MutableCFOptions` in BlockBasedTableBuilder::Rep with the only field used in that structure after construction: `prefix_extractor`. This makes the PR an overall performance improvement (results below). Nevertheless I'm seeing essentially no difference going from fv=5 to fv=6, even including that improvement for both. That's based on extreme case table write performance testing, many files with many blocks. This is relatively checksum intensive (small blocks) and salt generation intensive (small files). ``` (for I in `seq 1 100`; do TEST_TMPDIR=/dev/shm/dbbench2 ./db_bench -benchmarks=fillseq -memtablerep=vector -disable_wal=1 -allow_concurrent_memtable_write=false -num=3000000 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=0 -write_buffer_size=100000 -compression_type=none -block_size=1000; done) 2>&1 | grep micros/op | tee out awk '{ tot += $5; n += 1; } END { print int(1.0 * tot / n) }' < out ``` Each value below is ops/s averaged over 100 runs, run simultaneously with competing configuration for load fairness Before -> after (both fv=5): 483530 -> 483673 (negligible) Re-run 1: 480733 -> 485427 (1.0% faster) Re-run 2: 483821 -> 484541 (0.1% faster) Before (fv=5) -> after (fv=6): 482006 -> 485100 (0.6% faster) Re-run 1: 482212 -> 485075 (0.6% faster) Re-run 2: 483590 -> 484073 (0.1% faster) After fv=5 -> after fv=6: 483878 -> 485542 (0.3% faster) Re-run 1: 485331 -> 483385 (0.4% slower) Re-run 2: 485283 -> 483435 (0.4% slower) Re-run 3: 483647 -> 486109 (0.5% faster) Pull Request resolved: https://github.com/facebook/rocksdb/pull/9058 Test Plan: unit tests included (table_test, db_properties_test, salt in env_test). General DB tests and crash test updated to test new format_version. Also temporarily updated the default format version to 6 and saw some test failures. Almost all were due to an inadvertent additional read in VerifyChecksum to verify the index block checksum, though it's arguably a bug that VerifyChecksum does not appear to (re-)verify the index block checksum, just assuming it was verified in opening the index reader (probably *usually* true but probably not always true). Some other concerns about VerifyChecksum are left in FIXME comments. The only remaining test failure on change of default (in block_fetcher_test) now has a comment about how to upgrade the test. The format compatibility test does not need updating because we have not updated the default format_version. Reviewed By: ajkr, mrambacher Differential Revision: D33100915 Pulled By: pdillinger fbshipit-source-id: 8679e3e572fa580181a737fd6d113ed53c5422ee
524 lines
21 KiB
C++
524 lines
21 KiB
C++
// Copyright (c) 2011-present, Facebook, Inc. All rights reserved.
|
|
// This source code is licensed under both the GPLv2 (found in the
|
|
// COPYING file in the root directory) and Apache 2.0 License
|
|
// (found in the LICENSE.Apache file in the root directory).
|
|
|
|
#include "table/block_fetcher.h"
|
|
|
|
#include "db/table_properties_collector.h"
|
|
#include "file/file_util.h"
|
|
#include "options/options_helper.h"
|
|
#include "port/port.h"
|
|
#include "port/stack_trace.h"
|
|
#include "rocksdb/db.h"
|
|
#include "rocksdb/file_system.h"
|
|
#include "table/block_based/binary_search_index_reader.h"
|
|
#include "table/block_based/block_based_table_builder.h"
|
|
#include "table/block_based/block_based_table_factory.h"
|
|
#include "table/block_based/block_based_table_reader.h"
|
|
#include "table/format.h"
|
|
#include "test_util/testharness.h"
|
|
#include "utilities/memory_allocators.h"
|
|
|
|
namespace ROCKSDB_NAMESPACE {
|
|
namespace {
|
|
struct MemcpyStats {
|
|
int num_stack_buf_memcpy;
|
|
int num_heap_buf_memcpy;
|
|
int num_compressed_buf_memcpy;
|
|
};
|
|
|
|
struct BufAllocationStats {
|
|
int num_heap_buf_allocations;
|
|
int num_compressed_buf_allocations;
|
|
};
|
|
|
|
struct TestStats {
|
|
MemcpyStats memcpy_stats;
|
|
BufAllocationStats buf_allocation_stats;
|
|
};
|
|
|
|
class BlockFetcherTest : public testing::Test {
|
|
public:
|
|
enum class Mode {
|
|
kBufferedRead = 0,
|
|
kBufferedMmap,
|
|
kDirectRead,
|
|
kNumModes,
|
|
};
|
|
// use NumModes as array size to avoid "size of array '...' has non-integral
|
|
// type" errors.
|
|
const static int NumModes = static_cast<int>(Mode::kNumModes);
|
|
|
|
protected:
|
|
void SetUp() override {
|
|
SetupSyncPointsToMockDirectIO();
|
|
test_dir_ = test::PerThreadDBPath("block_fetcher_test");
|
|
env_ = Env::Default();
|
|
fs_ = FileSystem::Default();
|
|
ASSERT_OK(fs_->CreateDir(test_dir_, IOOptions(), nullptr));
|
|
}
|
|
|
|
void TearDown() override { EXPECT_OK(DestroyDir(env_, test_dir_)); }
|
|
|
|
void AssertSameBlock(const std::string& block1, const std::string& block2) {
|
|
ASSERT_EQ(block1, block2);
|
|
}
|
|
|
|
// Creates a table with kv pairs (i, i) where i ranges from 0 to 9, inclusive.
|
|
void CreateTable(const std::string& table_name,
|
|
const CompressionType& compression_type) {
|
|
std::unique_ptr<WritableFileWriter> writer;
|
|
NewFileWriter(table_name, &writer);
|
|
|
|
// Create table builder.
|
|
ImmutableOptions ioptions(options_);
|
|
InternalKeyComparator comparator(options_.comparator);
|
|
ColumnFamilyOptions cf_options(options_);
|
|
MutableCFOptions moptions(cf_options);
|
|
IntTblPropCollectorFactories factories;
|
|
std::unique_ptr<TableBuilder> table_builder(table_factory_.NewTableBuilder(
|
|
TableBuilderOptions(ioptions, moptions, comparator, &factories,
|
|
compression_type, CompressionOptions(),
|
|
0 /* column_family_id */, kDefaultColumnFamilyName,
|
|
-1 /* level */),
|
|
writer.get()));
|
|
|
|
// Build table.
|
|
for (int i = 0; i < 9; i++) {
|
|
std::string key = ToInternalKey(std::to_string(i));
|
|
// Append "00000000" to string value to enhance compression ratio
|
|
std::string value = "00000000" + std::to_string(i);
|
|
table_builder->Add(key, value);
|
|
}
|
|
ASSERT_OK(table_builder->Finish());
|
|
}
|
|
|
|
void FetchIndexBlock(const std::string& table_name,
|
|
CountedMemoryAllocator* heap_buf_allocator,
|
|
CountedMemoryAllocator* compressed_buf_allocator,
|
|
MemcpyStats* memcpy_stats, BlockContents* index_block,
|
|
std::string* result) {
|
|
FileOptions fopt(options_);
|
|
std::unique_ptr<RandomAccessFileReader> file;
|
|
NewFileReader(table_name, fopt, &file);
|
|
|
|
// Get handle of the index block.
|
|
Footer footer;
|
|
ReadFooter(file.get(), &footer);
|
|
const BlockHandle& index_handle = footer.index_handle();
|
|
// FIXME: index handle will need to come from metaindex for
|
|
// format_version >= 6 when that becomes the default
|
|
ASSERT_FALSE(index_handle.IsNull());
|
|
|
|
CompressionType compression_type;
|
|
FetchBlock(file.get(), index_handle, BlockType::kIndex,
|
|
false /* compressed */, false /* do_uncompress */,
|
|
heap_buf_allocator, compressed_buf_allocator, index_block,
|
|
memcpy_stats, &compression_type);
|
|
ASSERT_EQ(compression_type, CompressionType::kNoCompression);
|
|
result->assign(index_block->data.ToString());
|
|
}
|
|
|
|
// Fetches the first data block in both direct IO and non-direct IO mode.
|
|
//
|
|
// compressed: whether the data blocks are compressed;
|
|
// do_uncompress: whether the data blocks should be uncompressed on fetching.
|
|
// compression_type: the expected compression type.
|
|
//
|
|
// Expects:
|
|
// Block contents are the same.
|
|
// Bufferr allocation and memory copy statistics are expected.
|
|
void TestFetchDataBlock(
|
|
const std::string& table_name_prefix, bool compressed, bool do_uncompress,
|
|
std::array<TestStats, NumModes> expected_stats_by_mode) {
|
|
for (CompressionType compression_type : GetSupportedCompressions()) {
|
|
bool do_compress = compression_type != kNoCompression;
|
|
if (compressed != do_compress) continue;
|
|
std::string compression_type_str =
|
|
CompressionTypeToString(compression_type);
|
|
|
|
std::string table_name = table_name_prefix + compression_type_str;
|
|
CreateTable(table_name, compression_type);
|
|
|
|
CompressionType expected_compression_type_after_fetch =
|
|
(compressed && !do_uncompress) ? compression_type : kNoCompression;
|
|
|
|
BlockContents blocks[NumModes];
|
|
std::string block_datas[NumModes];
|
|
MemcpyStats memcpy_stats[NumModes];
|
|
CountedMemoryAllocator heap_buf_allocators[NumModes];
|
|
CountedMemoryAllocator compressed_buf_allocators[NumModes];
|
|
for (int i = 0; i < NumModes; ++i) {
|
|
SetMode(static_cast<Mode>(i));
|
|
FetchFirstDataBlock(table_name, compressed, do_uncompress,
|
|
expected_compression_type_after_fetch,
|
|
&heap_buf_allocators[i],
|
|
&compressed_buf_allocators[i], &blocks[i],
|
|
&block_datas[i], &memcpy_stats[i]);
|
|
}
|
|
|
|
for (int i = 0; i < NumModes - 1; ++i) {
|
|
AssertSameBlock(block_datas[i], block_datas[i + 1]);
|
|
}
|
|
|
|
// Check memcpy and buffer allocation statistics.
|
|
for (int i = 0; i < NumModes; ++i) {
|
|
const TestStats& expected_stats = expected_stats_by_mode[i];
|
|
|
|
ASSERT_EQ(memcpy_stats[i].num_stack_buf_memcpy,
|
|
expected_stats.memcpy_stats.num_stack_buf_memcpy);
|
|
ASSERT_EQ(memcpy_stats[i].num_heap_buf_memcpy,
|
|
expected_stats.memcpy_stats.num_heap_buf_memcpy);
|
|
ASSERT_EQ(memcpy_stats[i].num_compressed_buf_memcpy,
|
|
expected_stats.memcpy_stats.num_compressed_buf_memcpy);
|
|
|
|
if (kXpressCompression == compression_type) {
|
|
// XPRESS allocates memory internally, thus does not support for
|
|
// custom allocator verification
|
|
continue;
|
|
} else {
|
|
ASSERT_EQ(
|
|
heap_buf_allocators[i].GetNumAllocations(),
|
|
expected_stats.buf_allocation_stats.num_heap_buf_allocations);
|
|
ASSERT_EQ(compressed_buf_allocators[i].GetNumAllocations(),
|
|
expected_stats.buf_allocation_stats
|
|
.num_compressed_buf_allocations);
|
|
|
|
// The allocated buffers are not deallocated until
|
|
// the block content is deleted.
|
|
ASSERT_EQ(heap_buf_allocators[i].GetNumDeallocations(), 0);
|
|
ASSERT_EQ(compressed_buf_allocators[i].GetNumDeallocations(), 0);
|
|
blocks[i].allocation.reset();
|
|
ASSERT_EQ(
|
|
heap_buf_allocators[i].GetNumDeallocations(),
|
|
expected_stats.buf_allocation_stats.num_heap_buf_allocations);
|
|
ASSERT_EQ(compressed_buf_allocators[i].GetNumDeallocations(),
|
|
expected_stats.buf_allocation_stats
|
|
.num_compressed_buf_allocations);
|
|
}
|
|
}
|
|
}
|
|
}
|
|
|
|
void SetMode(Mode mode) {
|
|
switch (mode) {
|
|
case Mode::kBufferedRead:
|
|
options_.use_direct_reads = false;
|
|
options_.allow_mmap_reads = false;
|
|
break;
|
|
case Mode::kBufferedMmap:
|
|
options_.use_direct_reads = false;
|
|
options_.allow_mmap_reads = true;
|
|
break;
|
|
case Mode::kDirectRead:
|
|
options_.use_direct_reads = true;
|
|
options_.allow_mmap_reads = false;
|
|
break;
|
|
case Mode::kNumModes:
|
|
assert(false);
|
|
}
|
|
}
|
|
|
|
private:
|
|
std::string test_dir_;
|
|
Env* env_;
|
|
std::shared_ptr<FileSystem> fs_;
|
|
BlockBasedTableFactory table_factory_;
|
|
Options options_;
|
|
|
|
std::string Path(const std::string& fname) { return test_dir_ + "/" + fname; }
|
|
|
|
void WriteToFile(const std::string& content, const std::string& filename) {
|
|
std::unique_ptr<FSWritableFile> f;
|
|
ASSERT_OK(fs_->NewWritableFile(Path(filename), FileOptions(), &f, nullptr));
|
|
ASSERT_OK(f->Append(content, IOOptions(), nullptr));
|
|
ASSERT_OK(f->Close(IOOptions(), nullptr));
|
|
}
|
|
|
|
void NewFileWriter(const std::string& filename,
|
|
std::unique_ptr<WritableFileWriter>* writer) {
|
|
std::string path = Path(filename);
|
|
FileOptions file_options;
|
|
ASSERT_OK(WritableFileWriter::Create(env_->GetFileSystem(), path,
|
|
file_options, writer, nullptr));
|
|
}
|
|
|
|
void NewFileReader(const std::string& filename, const FileOptions& opt,
|
|
std::unique_ptr<RandomAccessFileReader>* reader) {
|
|
std::string path = Path(filename);
|
|
std::unique_ptr<FSRandomAccessFile> f;
|
|
ASSERT_OK(fs_->NewRandomAccessFile(path, opt, &f, nullptr));
|
|
reader->reset(new RandomAccessFileReader(std::move(f), path,
|
|
env_->GetSystemClock().get()));
|
|
}
|
|
|
|
void NewTableReader(const ImmutableOptions& ioptions,
|
|
const FileOptions& foptions,
|
|
const InternalKeyComparator& comparator,
|
|
const std::string& table_name,
|
|
std::unique_ptr<BlockBasedTable>* table) {
|
|
std::unique_ptr<RandomAccessFileReader> file;
|
|
NewFileReader(table_name, foptions, &file);
|
|
|
|
uint64_t file_size = 0;
|
|
ASSERT_OK(env_->GetFileSize(Path(table_name), &file_size));
|
|
|
|
std::unique_ptr<TableReader> table_reader;
|
|
ReadOptions ro;
|
|
const auto* table_options =
|
|
table_factory_.GetOptions<BlockBasedTableOptions>();
|
|
ASSERT_NE(table_options, nullptr);
|
|
ASSERT_OK(BlockBasedTable::Open(ro, ioptions, EnvOptions(), *table_options,
|
|
comparator, std::move(file), file_size,
|
|
0 /* block_protection_bytes_per_key */,
|
|
&table_reader, 0 /* tail_size */));
|
|
|
|
table->reset(reinterpret_cast<BlockBasedTable*>(table_reader.release()));
|
|
}
|
|
|
|
std::string ToInternalKey(const std::string& key) {
|
|
InternalKey internal_key(key, 0, ValueType::kTypeValue);
|
|
return internal_key.Encode().ToString();
|
|
}
|
|
|
|
void ReadFooter(RandomAccessFileReader* file, Footer* footer) {
|
|
uint64_t file_size = 0;
|
|
ASSERT_OK(env_->GetFileSize(file->file_name(), &file_size));
|
|
IOOptions opts;
|
|
ASSERT_OK(ReadFooterFromFile(opts, file, *fs_,
|
|
nullptr /* prefetch_buffer */, file_size,
|
|
footer, kBlockBasedTableMagicNumber));
|
|
}
|
|
|
|
// NOTE: compression_type returns the compression type of the fetched block
|
|
// contents, so if the block is fetched and uncompressed, then it's
|
|
// kNoCompression.
|
|
void FetchBlock(RandomAccessFileReader* file, const BlockHandle& block,
|
|
BlockType block_type, bool compressed, bool do_uncompress,
|
|
MemoryAllocator* heap_buf_allocator,
|
|
MemoryAllocator* compressed_buf_allocator,
|
|
BlockContents* contents, MemcpyStats* stats,
|
|
CompressionType* compresstion_type) {
|
|
ImmutableOptions ioptions(options_);
|
|
ReadOptions roptions;
|
|
PersistentCacheOptions persistent_cache_options;
|
|
Footer footer;
|
|
ReadFooter(file, &footer);
|
|
std::unique_ptr<BlockFetcher> fetcher(new BlockFetcher(
|
|
file, nullptr /* prefetch_buffer */, footer, roptions, block, contents,
|
|
ioptions, do_uncompress, compressed, block_type,
|
|
UncompressionDict::GetEmptyDict(), persistent_cache_options,
|
|
heap_buf_allocator, compressed_buf_allocator));
|
|
|
|
ASSERT_OK(fetcher->ReadBlockContents());
|
|
|
|
stats->num_stack_buf_memcpy = fetcher->TEST_GetNumStackBufMemcpy();
|
|
stats->num_heap_buf_memcpy = fetcher->TEST_GetNumHeapBufMemcpy();
|
|
stats->num_compressed_buf_memcpy =
|
|
fetcher->TEST_GetNumCompressedBufMemcpy();
|
|
|
|
*compresstion_type = fetcher->get_compression_type();
|
|
}
|
|
|
|
// NOTE: expected_compression_type is the expected compression
|
|
// type of the fetched block content, if the block is uncompressed,
|
|
// then the expected compression type is kNoCompression.
|
|
void FetchFirstDataBlock(const std::string& table_name, bool compressed,
|
|
bool do_uncompress,
|
|
CompressionType expected_compression_type,
|
|
MemoryAllocator* heap_buf_allocator,
|
|
MemoryAllocator* compressed_buf_allocator,
|
|
BlockContents* block, std::string* result,
|
|
MemcpyStats* memcpy_stats) {
|
|
ImmutableOptions ioptions(options_);
|
|
InternalKeyComparator comparator(options_.comparator);
|
|
FileOptions foptions(options_);
|
|
|
|
// Get block handle for the first data block.
|
|
std::unique_ptr<BlockBasedTable> table;
|
|
NewTableReader(ioptions, foptions, comparator, table_name, &table);
|
|
|
|
std::unique_ptr<BlockBasedTable::IndexReader> index_reader;
|
|
ReadOptions ro;
|
|
ASSERT_OK(BinarySearchIndexReader::Create(
|
|
table.get(), ro, nullptr /* prefetch_buffer */, false /* use_cache */,
|
|
false /* prefetch */, false /* pin */, nullptr /* lookup_context */,
|
|
&index_reader));
|
|
|
|
std::unique_ptr<InternalIteratorBase<IndexValue>> iter(
|
|
index_reader->NewIterator(
|
|
ReadOptions(), false /* disable_prefix_seek */, nullptr /* iter */,
|
|
nullptr /* get_context */, nullptr /* lookup_context */));
|
|
ASSERT_OK(iter->status());
|
|
iter->SeekToFirst();
|
|
BlockHandle first_block_handle = iter->value().handle;
|
|
|
|
// Fetch first data block.
|
|
std::unique_ptr<RandomAccessFileReader> file;
|
|
NewFileReader(table_name, foptions, &file);
|
|
CompressionType compression_type;
|
|
FetchBlock(file.get(), first_block_handle, BlockType::kData, compressed,
|
|
do_uncompress, heap_buf_allocator, compressed_buf_allocator,
|
|
block, memcpy_stats, &compression_type);
|
|
ASSERT_EQ(compression_type, expected_compression_type);
|
|
result->assign(block->data.ToString());
|
|
}
|
|
};
|
|
|
|
// Skip the following tests in lite mode since direct I/O is unsupported.
|
|
|
|
// Fetch index block under both direct IO and non-direct IO.
|
|
// Expects:
|
|
// the index block contents are the same for both read modes.
|
|
TEST_F(BlockFetcherTest, FetchIndexBlock) {
|
|
for (CompressionType compression : GetSupportedCompressions()) {
|
|
std::string table_name =
|
|
"FetchIndexBlock" + CompressionTypeToString(compression);
|
|
CreateTable(table_name, compression);
|
|
|
|
CountedMemoryAllocator allocator;
|
|
MemcpyStats memcpy_stats;
|
|
BlockContents indexes[NumModes];
|
|
std::string index_datas[NumModes];
|
|
for (int i = 0; i < NumModes; ++i) {
|
|
SetMode(static_cast<Mode>(i));
|
|
FetchIndexBlock(table_name, &allocator, &allocator, &memcpy_stats,
|
|
&indexes[i], &index_datas[i]);
|
|
}
|
|
for (int i = 0; i < NumModes - 1; ++i) {
|
|
AssertSameBlock(index_datas[i], index_datas[i + 1]);
|
|
}
|
|
}
|
|
}
|
|
|
|
// Data blocks are not compressed,
|
|
// fetch data block under direct IO, mmap IO,and non-direct IO.
|
|
// Expects:
|
|
// 1. in non-direct IO mode, allocate a heap buffer and memcpy the block
|
|
// into the buffer;
|
|
// 2. in direct IO mode, allocate a heap buffer and memcpy from the
|
|
// direct IO buffer to the heap buffer.
|
|
TEST_F(BlockFetcherTest, FetchUncompressedDataBlock) {
|
|
TestStats expected_non_mmap_stats = {
|
|
{
|
|
0 /* num_stack_buf_memcpy */,
|
|
1 /* num_heap_buf_memcpy */,
|
|
0 /* num_compressed_buf_memcpy */,
|
|
},
|
|
{
|
|
1 /* num_heap_buf_allocations */,
|
|
0 /* num_compressed_buf_allocations */,
|
|
}};
|
|
TestStats expected_mmap_stats = {{
|
|
0 /* num_stack_buf_memcpy */,
|
|
0 /* num_heap_buf_memcpy */,
|
|
0 /* num_compressed_buf_memcpy */,
|
|
},
|
|
{
|
|
0 /* num_heap_buf_allocations */,
|
|
0 /* num_compressed_buf_allocations */,
|
|
}};
|
|
std::array<TestStats, NumModes> expected_stats_by_mode{{
|
|
expected_non_mmap_stats /* kBufferedRead */,
|
|
expected_mmap_stats /* kBufferedMmap */,
|
|
expected_non_mmap_stats /* kDirectRead */,
|
|
}};
|
|
TestFetchDataBlock("FetchUncompressedDataBlock", false, false,
|
|
expected_stats_by_mode);
|
|
}
|
|
|
|
// Data blocks are compressed,
|
|
// fetch data block under both direct IO and non-direct IO,
|
|
// but do not uncompress.
|
|
// Expects:
|
|
// 1. in non-direct IO mode, allocate a compressed buffer and memcpy the block
|
|
// into the buffer;
|
|
// 2. in direct IO mode, allocate a compressed buffer and memcpy from the
|
|
// direct IO buffer to the compressed buffer.
|
|
TEST_F(BlockFetcherTest, FetchCompressedDataBlock) {
|
|
TestStats expected_non_mmap_stats = {
|
|
{
|
|
0 /* num_stack_buf_memcpy */,
|
|
0 /* num_heap_buf_memcpy */,
|
|
1 /* num_compressed_buf_memcpy */,
|
|
},
|
|
{
|
|
0 /* num_heap_buf_allocations */,
|
|
1 /* num_compressed_buf_allocations */,
|
|
}};
|
|
TestStats expected_mmap_stats = {{
|
|
0 /* num_stack_buf_memcpy */,
|
|
0 /* num_heap_buf_memcpy */,
|
|
0 /* num_compressed_buf_memcpy */,
|
|
},
|
|
{
|
|
0 /* num_heap_buf_allocations */,
|
|
0 /* num_compressed_buf_allocations */,
|
|
}};
|
|
std::array<TestStats, NumModes> expected_stats_by_mode{{
|
|
expected_non_mmap_stats /* kBufferedRead */,
|
|
expected_mmap_stats /* kBufferedMmap */,
|
|
expected_non_mmap_stats /* kDirectRead */,
|
|
}};
|
|
TestFetchDataBlock("FetchCompressedDataBlock", true, false,
|
|
expected_stats_by_mode);
|
|
}
|
|
|
|
// Data blocks are compressed,
|
|
// fetch and uncompress data block under both direct IO and non-direct IO.
|
|
// Expects:
|
|
// 1. in non-direct IO mode, since the block is small, so it's first memcpyed
|
|
// to the stack buffer, then a heap buffer is allocated and the block is
|
|
// uncompressed into the heap.
|
|
// 2. in direct IO mode mode, allocate a heap buffer, then directly uncompress
|
|
// and memcpy from the direct IO buffer to the heap buffer.
|
|
TEST_F(BlockFetcherTest, FetchAndUncompressCompressedDataBlock) {
|
|
TestStats expected_buffered_read_stats = {
|
|
{
|
|
1 /* num_stack_buf_memcpy */,
|
|
1 /* num_heap_buf_memcpy */,
|
|
0 /* num_compressed_buf_memcpy */,
|
|
},
|
|
{
|
|
1 /* num_heap_buf_allocations */,
|
|
0 /* num_compressed_buf_allocations */,
|
|
}};
|
|
TestStats expected_mmap_stats = {{
|
|
0 /* num_stack_buf_memcpy */,
|
|
1 /* num_heap_buf_memcpy */,
|
|
0 /* num_compressed_buf_memcpy */,
|
|
},
|
|
{
|
|
1 /* num_heap_buf_allocations */,
|
|
0 /* num_compressed_buf_allocations */,
|
|
}};
|
|
TestStats expected_direct_read_stats = {
|
|
{
|
|
0 /* num_stack_buf_memcpy */,
|
|
1 /* num_heap_buf_memcpy */,
|
|
0 /* num_compressed_buf_memcpy */,
|
|
},
|
|
{
|
|
1 /* num_heap_buf_allocations */,
|
|
0 /* num_compressed_buf_allocations */,
|
|
}};
|
|
std::array<TestStats, NumModes> expected_stats_by_mode{{
|
|
expected_buffered_read_stats,
|
|
expected_mmap_stats,
|
|
expected_direct_read_stats,
|
|
}};
|
|
TestFetchDataBlock("FetchAndUncompressCompressedDataBlock", true, true,
|
|
expected_stats_by_mode);
|
|
}
|
|
|
|
|
|
} // namespace
|
|
} // namespace ROCKSDB_NAMESPACE
|
|
|
|
int main(int argc, char** argv) {
|
|
ROCKSDB_NAMESPACE::port::InstallStackTraceHandler();
|
|
::testing::InitGoogleTest(&argc, argv);
|
|
return RUN_ALL_TESTS();
|
|
}
|