rocksdb/tools
Peter Dillinger 311a544c2a Use deleters to label cache entries and collect stats (#8297)
Summary:
This change gathers and publishes statistics about the
kinds of items in block cache. This is especially important for
profiling relative usage of cache by index vs. filter vs. data blocks.
It works by iterating over the cache during periodic stats dump
(InternalStats, stats_dump_period_sec) or on demand when
DB::Get(Map)Property(kBlockCacheEntryStats), except that for
efficiency and sharing among column families, saved data from
the last scan is used when the data is not considered too old.

The new information can be seen in info LOG, for example:

    Block cache LRUCache@0x7fca62229330 capacity: 95.37 MB collections: 8 last_copies: 0 last_secs: 0.00178 secs_since: 0
    Block cache entry stats(count,size,portion): DataBlock(7092,28.24 MB,29.6136%) FilterBlock(215,867.90 KB,0.888728%) FilterMetaBlock(2,5.31 KB,0.00544%) IndexBlock(217,180.11 KB,0.184432%) WriteBuffer(1,256.00 KB,0.262144%) Misc(1,0.00 KB,0%)

And also through DB::GetProperty and GetMapProperty (here using
ldb just for demonstration):

    $ ./ldb --db=/dev/shm/dbbench/ get_property rocksdb.block-cache-entry-stats
    rocksdb.block-cache-entry-stats.bytes.data-block: 0
    rocksdb.block-cache-entry-stats.bytes.deprecated-filter-block: 0
    rocksdb.block-cache-entry-stats.bytes.filter-block: 0
    rocksdb.block-cache-entry-stats.bytes.filter-meta-block: 0
    rocksdb.block-cache-entry-stats.bytes.index-block: 178992
    rocksdb.block-cache-entry-stats.bytes.misc: 0
    rocksdb.block-cache-entry-stats.bytes.other-block: 0
    rocksdb.block-cache-entry-stats.bytes.write-buffer: 0
    rocksdb.block-cache-entry-stats.capacity: 8388608
    rocksdb.block-cache-entry-stats.count.data-block: 0
    rocksdb.block-cache-entry-stats.count.deprecated-filter-block: 0
    rocksdb.block-cache-entry-stats.count.filter-block: 0
    rocksdb.block-cache-entry-stats.count.filter-meta-block: 0
    rocksdb.block-cache-entry-stats.count.index-block: 215
    rocksdb.block-cache-entry-stats.count.misc: 1
    rocksdb.block-cache-entry-stats.count.other-block: 0
    rocksdb.block-cache-entry-stats.count.write-buffer: 0
    rocksdb.block-cache-entry-stats.id: LRUCache@0x7f3636661290
    rocksdb.block-cache-entry-stats.percent.data-block: 0.000000
    rocksdb.block-cache-entry-stats.percent.deprecated-filter-block: 0.000000
    rocksdb.block-cache-entry-stats.percent.filter-block: 0.000000
    rocksdb.block-cache-entry-stats.percent.filter-meta-block: 0.000000
    rocksdb.block-cache-entry-stats.percent.index-block: 2.133751
    rocksdb.block-cache-entry-stats.percent.misc: 0.000000
    rocksdb.block-cache-entry-stats.percent.other-block: 0.000000
    rocksdb.block-cache-entry-stats.percent.write-buffer: 0.000000
    rocksdb.block-cache-entry-stats.secs_for_last_collection: 0.000052
    rocksdb.block-cache-entry-stats.secs_since_last_collection: 0

Solution detail - We need some way to flag what kind of blocks each
entry belongs to, preferably without changing the Cache API.
One of the complications is that Cache is a general interface that could
have other users that don't adhere to whichever convention we decide
on for keys and values. Or we would pay for an extra field in the Handle
that would only be used for this purpose.

This change uses a back-door approach, the deleter, to indicate the
"role" of a Cache entry (in addition to the value type, implicitly).
This has the added benefit of ensuring proper code origin whenever we
recognize a particular role for a cache entry; if the entry came from
some other part of the code, it will use an unrecognized deleter, which
we simply attribute to the "Misc" role.

An internal API makes for simple instantiation and automatic
registration of Cache deleters for a given value type and "role".

Another internal API, CacheEntryStatsCollector, solves the problem of
caching the results of a scan and sharing them, to ensure scans are
neither excessive nor redundant so as not to harm Cache performance.

Because code is added to BlocklikeTraits, it is pulled out of
block_based_table_reader.cc into its own file.

This is a reformulation of https://github.com/facebook/rocksdb/issues/8276, without the type checking option
(could still be added), and with actual stat gathering.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8297

Test Plan: manual testing with db_bench, and a couple of basic unit tests

Reviewed By: ltamasi

Differential Revision: D28488721

Pulled By: pdillinger

fbshipit-source-id: 472f524a9691b5afb107934be2d41d84f2b129fb
2021-05-19 16:51:13 -07:00
..
advisor remediation of S205607 2020-07-17 17:20:49 -07:00
block_cache_analyzer Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
dump Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
rdb Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
analyze_txn_stress_test.sh
auto_sanity_test.sh
backup_db.sh Revamp check_format_compatible.sh (#8012) 2021-03-02 11:42:27 -08:00
benchmark.sh
benchmark_leveldb.sh
blob_dump.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
check_all_python.py Allow missing "unversioned" python, as in CentOS 8 (#6883) 2020-05-29 11:29:23 -07:00
check_format_compatible.sh Add 6.18, 6.19 and 6.20 to check_format_compatible.sh (#8236) 2021-04-27 10:24:27 -07:00
CMakeLists.txt Mark dependencies as PRIVATE and fix missing dependencies in tools. (#6790) 2020-05-12 21:07:55 -07:00
db_bench.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
db_bench_tool.cc Use deleters to label cache entries and collect stats (#8297) 2021-05-19 16:51:13 -07:00
db_bench_tool_test.cc Add support for the integrated BlobDB to db_bench (#7956) 2021-02-17 11:10:18 -08:00
db_crashtest.py Revert accidental enabling broken ClockCache in stress test (#8277) 2021-05-06 16:31:51 -07:00
db_repl_stress.cc Migrate away from Travis+Linux+amd64 (#7791) 2020-12-22 00:20:57 -08:00
db_sanity_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
dbench_monitor
Dockerfile
generate_random_db.sh
ingest_external_sst.sh
io_tracer_parser.cc Add IO Tracer Parser (#7333) 2020-09-23 15:50:26 -07:00
io_tracer_parser_test.cc Add request_id in IODebugContext. (#8045) 2021-04-01 13:14:51 -07:00
io_tracer_parser_tool.cc Add request_id in IODebugContext. (#8045) 2021-04-01 13:14:51 -07:00
io_tracer_parser_tool.h Add IO Tracer Parser (#7333) 2020-09-23 15:50:26 -07:00
ldb.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
ldb_cmd.cc Check and handle failure in ldb (#8072) 2021-03-18 14:43:34 -07:00
ldb_cmd_impl.h Add support for custom file systems to ldb and sst_dump (#8010) 2021-03-09 20:49:15 -08:00
ldb_cmd_test.cc Extend file_checksum_dump ldb command and DB::GetLiveFilesChecksumInfo to blob files (#8179) 2021-04-15 09:38:13 -07:00
ldb_test.py aggregated-table-properties with GetMapProperty (#7779) 2020-12-19 08:00:14 -08:00
ldb_tool.cc Add support for custom file systems to ldb and sst_dump (#8010) 2021-03-09 20:49:15 -08:00
pflag
reduce_levels_test.cc Add more tests for assert status checked (#7524) 2020-12-22 23:45:58 -08:00
regression_test.sh Update internal build script (#7957) 2021-02-11 14:55:43 -08:00
report_lite_binary_size.sh
restore_db.sh Revamp check_format_compatible.sh (#8012) 2021-03-02 11:42:27 -08:00
rocksdb_dump_test.sh
run_flash_bench.sh
run_leveldb.sh
sample-dump.dmp
simulated_hybrid_file_system.cc Write file temperature information to manifest (#8284) 2021-05-17 15:15:23 -07:00
simulated_hybrid_file_system.h Write file temperature information to manifest (#8284) 2021-05-17 15:15:23 -07:00
sst_dump.cc Implement a new subcommand "identify" for sst_dump (#6943) 2020-06-08 13:58:28 -07:00
sst_dump_test.cc Make it possible to apply only a subrange of table property collectors (#8298) 2021-05-17 18:28:39 -07:00
sst_dump_tool.cc Add support for custom file systems to ldb and sst_dump (#8010) 2021-03-09 20:49:15 -08:00
trace_analyzer.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
trace_analyzer_test.cc Refactor: add LineFileReader and Status::MustCheck (#8026) 2021-03-09 20:12:38 -08:00
trace_analyzer_tool.cc Refactor: add LineFileReader and Status::MustCheck (#8026) 2021-03-09 20:12:38 -08:00
trace_analyzer_tool.h Introduce a new trace file format (v 0.2) for better extension (#7977) 2021-02-18 23:05:35 -08:00
verify_random_db.sh
write_external_sst.sh Revamp check_format_compatible.sh (#8012) 2021-03-02 11:42:27 -08:00
write_stress.cc Add a SystemClock class to capture the time functions of an Env (#7858) 2021-01-25 22:09:11 -08:00
write_stress_runner.py Allow missing "unversioned" python, as in CentOS 8 (#6883) 2020-05-29 11:29:23 -07:00