mirror of
https://github.com/facebook/rocksdb.git
synced 2024-12-04 20:02:50 +00:00
39f5846ec7
Summary: We want to know more about opportunities for better range filters, and the effectiveness of our own range filters. Currently the stats are very limited, essentially logging just hits and misses against prefix filters for range scans in BLOOM_FILTER_PREFIX_* without tracking the false positive rate. Perhaps confusingly, when prefix filters are used for point queries, the stats are currently going into the non-PREFIX tickers.

This change does several things:
* Introduce new stat tickers for seeks and related filtering, \*LEVEL_SEEK\*
  * Most importantly, allows us to see opportunities for range filtering. Specifically, we can count how many times a seek in an SST file accesses at least one data block, and how many times at least one value() is then accessed. If a data block was accessed but no value(), we can generally assume that the key(s) seen was(were) not of interest so could have been filtered with the right kind of filter, avoiding the data block access.
  * We can get the same level of detail when a filter (for now, prefix Bloom/ribbon) is used, or not. Specifically, we can infer a false positive rate for prefix filters (not available before) from the seek "false positive" rate: when a data block is accessed but no value() is called. (There can be other explanations for a seek false positive, but in typical iterator usage it would indicate a filter false positive.)
  * For efficiency, I wanted to avoid making additional calls to the prefix extractor (or key comparisons, etc.), which would be required if we wanted to more precisely detect filter false positives. I believe that instrumenting value() is the best balance of efficiency vs. accurately measuring what we are often interested in.
  * The stats are divided between last level and non-last levels, to help understand potential tiered storage use cases.
* The old BLOOM_FILTER_PREFIX_* stats have a different meaning: no longer referring to iterators but to point queries using prefix filters. BLOOM_FILTER_PREFIX_TRUE_POSITIVE is added for computing the prefix false positive rate on point queries, which can be due to filter false positives as well as different keys with the same prefix.
* Similarly, the non-PREFIX BLOOM_FILTER stats are now for whole key filtering only.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11460

Test Plan: unit tests updated, including updating many to pop the stat value since last read to improve test readability and maintainability.

Performance test shows a consistent small improvement with these changes, both with clang and with gcc. CPU profile indicates that RecordTick is using less CPU, and this makes sense at least for a high filter miss rate. Before, we were recording two ticks per filter miss in iterators (CHECKED & USEFUL) and now recording just one (FILTERED).

Create DB with
```
TEST_TMPDIR=/dev/shm ./db_bench -benchmarks=fillrandom -num=10000000 -disable_wal=1 -write_buffer_size=30000000 -bloom_bits=8 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=0 -prefix_size=8
```
And run simultaneous before&after with
```
TEST_TMPDIR=/dev/shm ./db_bench -readonly -benchmarks=seekrandom[-X1000] -num=10000000 -bloom_bits=8 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=0 -prefix_size=8 -seek_nexts=1 -duration=20 -seed=43 -threads=8 -cache_size=1000000000 -statistics
```
Before:
seekrandom [AVG 275 runs] : 189680 (± 222) ops/sec; 18.4 (± 0.0) MB/sec
After:
seekrandom [AVG 275 runs] : 197110 (± 208) ops/sec; 19.1 (± 0.0) MB/sec

Reviewed By: ajkr

Differential Revision: D46029177

Pulled By: pdillinger

fbshipit-source-id: cdace79a2ea548d46c5900b068c5b7c3a02e5822
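The inference described in the summary (a seek that passed the filter but never had value() called is treated as a probable filter false positive) can be sketched as simple arithmetic over counter values. This is a minimal illustration, not RocksDB code: the struct `SeekFilterCounts`, its field names, and `EstimateSeekFalsePositiveRate` are hypothetical, and the FP / (FP + TN) definition used here is one reasonable reading of the "seek false positive rate" mentioned above.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical snapshot of seek counters for one group of levels.
// These field names are illustrative assumptions, not RocksDB identifiers.
struct SeekFilterCounts {
  uint64_t filtered;      // seeks ruled out by the filter (no data block read)
  uint64_t filter_match;  // seeks the filter let through
  uint64_t useful_match;  // of those let through, seeks where value() was read
};

// Estimates the filter false positive rate as FP / (FP + TN), where
//   FP = seeks that passed the filter but never read a value, and
//   TN = seeks the filter ruled out.
// Returns 0.0 when there is nothing to measure or the snapshot is inconsistent.
double EstimateSeekFalsePositiveRate(const SeekFilterCounts& c) {
  if (c.useful_match > c.filter_match) {
    return 0.0;  // counters read at different moments can disagree
  }
  const uint64_t false_positives = c.filter_match - c.useful_match;
  const uint64_t negatives = false_positives + c.filtered;
  if (negatives == 0) {
    return 0.0;
  }
  return static_cast<double>(false_positives) /
         static_cast<double>(negatives);
}
```

As the summary notes, a seek can go unused for reasons other than a filter false positive, so a figure computed this way is best read as an upper-bound style estimate under typical iterator usage.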
160 lines
4.8 KiB
C++
//  Copyright (c) 2011-present, Facebook, Inc.  All rights reserved.
//  This source code is licensed under both the GPLv2 (found in the
//  COPYING file in the root directory) and Apache 2.0 License
//  (found in the LICENSE.Apache file in the root directory).
//
// Copyright (c) 2011 The LevelDB Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file. See the AUTHORS file for names of contributors.

#include "rocksdb/slice_transform.h"

#include <cstdint>
#include <memory>
#include <string>

#include "rocksdb/db.h"
#include "rocksdb/env.h"
#include "rocksdb/filter_policy.h"
#include "rocksdb/statistics.h"
#include "rocksdb/table.h"
#include "test_util/testharness.h"

namespace ROCKSDB_NAMESPACE {

class SliceTransformTest : public testing::Test {};

TEST_F(SliceTransformTest, CapPrefixTransform) {
  std::string s;
  s = "abcdefge";

  std::unique_ptr<const SliceTransform> transform;

  // Cap shorter than the key: the key is truncated to the cap length.
  transform.reset(NewCappedPrefixTransform(6));
  ASSERT_EQ(transform->Transform(s).ToString(), "abcdef");
  // Appending to a slice changes the result only if the slice is shorter
  // than the cap.
  ASSERT_TRUE(transform->SameResultWhenAppended("123456"));
  ASSERT_TRUE(transform->SameResultWhenAppended("1234567"));
  ASSERT_TRUE(!transform->SameResultWhenAppended("12345"));

  // Cap equal to the key length: the whole key is returned.
  transform.reset(NewCappedPrefixTransform(8));
  ASSERT_EQ(transform->Transform(s).ToString(), "abcdefge");

  // Cap longer than the key: the whole key is returned.
  transform.reset(NewCappedPrefixTransform(10));
  ASSERT_EQ(transform->Transform(s).ToString(), "abcdefge");

  // Zero cap: the prefix is always empty.
  transform.reset(NewCappedPrefixTransform(0));
  ASSERT_EQ(transform->Transform(s).ToString(), "");

  transform.reset(NewCappedPrefixTransform(0));
  ASSERT_EQ(transform->Transform("").ToString(), "");
}

class SliceTransformDBTest : public testing::Test {
 private:
  std::string dbname_;
  Env* env_;
  DB* db_;

 public:
  SliceTransformDBTest() : env_(Env::Default()), db_(nullptr) {
    dbname_ = test::PerThreadDBPath("slice_transform_db_test");
    EXPECT_OK(DestroyDB(dbname_, last_options_));
  }

  ~SliceTransformDBTest() override {
    delete db_;
    EXPECT_OK(DestroyDB(dbname_, last_options_));
  }

  DB* db() { return db_; }

  // Return the current option configuration.
  Options* GetOptions() { return &last_options_; }

  void DestroyAndReopen() {
    // Destroy using last options
    Destroy();
    ASSERT_OK(TryReopen());
  }

  void Destroy() {
    delete db_;
    db_ = nullptr;
    ASSERT_OK(DestroyDB(dbname_, last_options_));
  }

  Status TryReopen() {
    delete db_;
    db_ = nullptr;
    last_options_.create_if_missing = true;

    return DB::Open(last_options_, dbname_, &db_);
  }

  Options last_options_;
};

namespace {
uint64_t PopTicker(const Options& options, Tickers ticker_type) {
  return options.statistics->getAndResetTickerCount(ticker_type);
}
}  // namespace

TEST_F(SliceTransformDBTest, CapPrefix) {
  last_options_.prefix_extractor.reset(NewCappedPrefixTransform(8));
  last_options_.statistics = ROCKSDB_NAMESPACE::CreateDBStatistics();
  BlockBasedTableOptions bbto;
  bbto.filter_policy.reset(NewBloomFilterPolicy(10, false));
  bbto.whole_key_filtering = false;
  last_options_.table_factory.reset(NewBlockBasedTableFactory(bbto));
  ASSERT_OK(TryReopen());

  ReadOptions ro;
  FlushOptions fo;
  WriteOptions wo;

  // With an 8-byte cap, the prefixes stored in the filter are
  // "barbarba" (for both "barbarbar" keys), "foo", and "foo3".
  ASSERT_OK(db()->Put(wo, "barbarbar", "foo"));
  ASSERT_OK(db()->Put(wo, "barbarbar2", "foo2"));
  ASSERT_OK(db()->Put(wo, "foo", "bar"));
  ASSERT_OK(db()->Put(wo, "foo3", "bar3"));
  ASSERT_OK(db()->Flush(fo));

  std::unique_ptr<Iterator> iter(db()->NewIterator(ro));

  // Prefix "foo" is present: the filter matches.
  iter->Seek("foo");
  ASSERT_OK(iter->status());
  ASSERT_TRUE(iter->Valid());
  ASSERT_EQ(iter->value().ToString(), "bar");
  EXPECT_EQ(PopTicker(last_options_, NON_LAST_LEVEL_SEEK_FILTER_MATCH), 1U);
  EXPECT_EQ(PopTicker(last_options_, NON_LAST_LEVEL_SEEK_FILTERED), 0U);

  // Prefix "foo2" is absent: the seek is filtered out.
  iter->Seek("foo2");
  ASSERT_OK(iter->status());
  ASSERT_TRUE(!iter->Valid());
  EXPECT_EQ(PopTicker(last_options_, NON_LAST_LEVEL_SEEK_FILTER_MATCH), 0U);
  EXPECT_EQ(PopTicker(last_options_, NON_LAST_LEVEL_SEEK_FILTERED), 1U);

  // Prefix "barbarba" is present: the filter matches.
  iter->Seek("barbarbar");
  ASSERT_OK(iter->status());
  ASSERT_TRUE(iter->Valid());
  ASSERT_EQ(iter->value().ToString(), "foo");
  EXPECT_EQ(PopTicker(last_options_, NON_LAST_LEVEL_SEEK_FILTER_MATCH), 1U);
  EXPECT_EQ(PopTicker(last_options_, NON_LAST_LEVEL_SEEK_FILTERED), 0U);

  // Prefix "barfoofo" is absent: the seek is filtered out.
  iter->Seek("barfoofoo");
  ASSERT_OK(iter->status());
  ASSERT_TRUE(!iter->Valid());
  EXPECT_EQ(PopTicker(last_options_, NON_LAST_LEVEL_SEEK_FILTER_MATCH), 0U);
  EXPECT_EQ(PopTicker(last_options_, NON_LAST_LEVEL_SEEK_FILTERED), 1U);

  // Prefix "foobarba" is absent: the seek is filtered out.
  iter->Seek("foobarbar");
  ASSERT_OK(iter->status());
  ASSERT_TRUE(!iter->Valid());
  EXPECT_EQ(PopTicker(last_options_, NON_LAST_LEVEL_SEEK_FILTER_MATCH), 0U);
  EXPECT_EQ(PopTicker(last_options_, NON_LAST_LEVEL_SEEK_FILTERED), 1U);
}

}  // namespace ROCKSDB_NAMESPACE

int main(int argc, char** argv) {
  ROCKSDB_NAMESPACE::port::InstallStackTraceHandler();
  ::testing::InitGoogleTest(&argc, argv);
  return RUN_ALL_TESTS();
}