Peter Dillinger d79be3dca2 Changes and enhancements to compression stats, thresholds (#11388)
Summary:
## Option API updates
* Add new CompressionOptions::max_compressed_bytes_per_kb, which corresponds to 1024.0 / min allowable compression ratio. This avoids the hard-coded minimum ratio of 8/7.
* Remove unnecessary constructor for CompressionOptions.
* Document undocumented CompressionOptions. Use idiom for default values shown clearly in one place (not precariously repeated).
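As a rough illustration of the relationship described above (threshold = 1024.0 / min allowable compression ratio), the sketch below converts a minimum ratio into a bytes-per-KB threshold and checks a block against it. Both helper names are hypothetical and not part of RocksDB, and the rounding and the inclusive boundary in the acceptance check are assumptions, not the library's confirmed behavior.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper (not a RocksDB API): converts a minimum allowable
// compression ratio into a max-compressed-bytes-per-KB threshold, rounded
// to the nearest integer.
inline int MaxCompressedBytesPerKbForRatio(double min_ratio) {
  assert(min_ratio >= 1.0);  // a ratio below 1 would mean "allow growth"
  return static_cast<int>(std::lround(1024.0 / min_ratio));
}

// Hypothetical check: is a block that compressed from `raw_bytes` down to
// `compressed_bytes` within the threshold? The inclusive comparison is an
// assumption made for this sketch.
inline bool WithinThreshold(uint64_t raw_bytes, uint64_t compressed_bytes,
                            int max_compressed_bytes_per_kb) {
  // compressed / raw <= threshold / 1024, cross-multiplied to stay in
  // integer arithmetic.
  return compressed_bytes * 1024 <=
         raw_bytes * static_cast<uint64_t>(max_compressed_bytes_per_kb);
}
```

Under these assumptions, the formerly hard-coded ratio of 8/7 maps to a threshold of 896, so a 4096-byte block would need to compress to at most 3584 bytes to be kept in compressed form.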

## Stat API updates
* Deprecate the BYTES_COMPRESSED, BYTES_DECOMPRESSED histograms. Histograms incur substantial extra space & time costs compared to tickers, and the distribution of uncompressed data block sizes tends to be uninteresting. If we're interested in that distribution, I don't see why it should be limited to blocks stored as compressed.
* Deprecate the NUMBER_BLOCK_NOT_COMPRESSED ticker, because the name is very confusing.
* New or existing tickers relevant to compression:
  * BYTES_COMPRESSED_FROM
  * BYTES_COMPRESSED_TO
  * BYTES_COMPRESSION_BYPASSED
  * BYTES_COMPRESSION_REJECTED
  * COMPACT_WRITE_BYTES + FLUSH_WRITE_BYTES (both existing)
  * NUMBER_BLOCK_COMPRESSED (existing)
  * NUMBER_BLOCK_COMPRESSION_BYPASSED
  * NUMBER_BLOCK_COMPRESSION_REJECTED
  * BYTES_DECOMPRESSED_FROM
  * BYTES_DECOMPRESSED_TO

We can compute a number of things with these stats:
* "Successful" compression ratio: BYTES_COMPRESSED_FROM / BYTES_COMPRESSED_TO
* Compression ratio of data on which compression was attempted: (BYTES_COMPRESSED_FROM + BYTES_COMPRESSION_REJECTED) / (BYTES_COMPRESSED_TO + BYTES_COMPRESSION_REJECTED)
* Compression ratio of data that could be eligible for compression: (BYTES_COMPRESSED_FROM + X) / (BYTES_COMPRESSED_TO + X) where X = BYTES_COMPRESSION_REJECTED + BYTES_COMPRESSION_BYPASSED
* Overall SST compression ratio (compression disabled vs. actual): (Y - BYTES_COMPRESSED_TO + BYTES_COMPRESSED_FROM) / Y where Y = COMPACT_WRITE_BYTES + FLUSH_WRITE_BYTES
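The ratios above can be sketched as plain arithmetic over a snapshot of ticker values. The `CompressionTickers` struct and the function names are hypothetical; in practice the values would be read from the DB's statistics object. For the "eligible" ratio, this sketch assumes X is the sum of the two byte counters (rejected and bypassed), so that every term is measured in bytes.

```cpp
#include <cstdint>

// Hypothetical snapshot of ticker values, named after the tickers they hold.
struct CompressionTickers {
  uint64_t bytes_compressed_from = 0;
  uint64_t bytes_compressed_to = 0;
  uint64_t bytes_compression_bypassed = 0;
  uint64_t bytes_compression_rejected = 0;
  uint64_t compact_write_bytes = 0;
  uint64_t flush_write_bytes = 0;
};

// Returns num / den as a double, or 0.0 when nothing has been counted yet.
inline double Ratio(double num, double den) {
  return den == 0.0 ? 0.0 : num / den;
}

// "Successful" compression ratio: only blocks stored compressed.
inline double SuccessfulRatio(const CompressionTickers& t) {
  return Ratio(static_cast<double>(t.bytes_compressed_from),
               static_cast<double>(t.bytes_compressed_to));
}

// Ratio over data on which compression was attempted (accepted + rejected).
inline double AttemptedRatio(const CompressionTickers& t) {
  const double rejected = static_cast<double>(t.bytes_compression_rejected);
  return Ratio(static_cast<double>(t.bytes_compressed_from) + rejected,
               static_cast<double>(t.bytes_compressed_to) + rejected);
}

// Ratio over data that could be eligible for compression.
// Assumption: X = rejected bytes + bypassed bytes.
inline double EligibleRatio(const CompressionTickers& t) {
  const double x = static_cast<double>(t.bytes_compression_rejected) +
                   static_cast<double>(t.bytes_compression_bypassed);
  return Ratio(static_cast<double>(t.bytes_compressed_from) + x,
               static_cast<double>(t.bytes_compressed_to) + x);
}

// Overall SST ratio (compression disabled vs. actual), with
// Y = COMPACT_WRITE_BYTES + FLUSH_WRITE_BYTES.
inline double OverallSstRatio(const CompressionTickers& t) {
  const double y = static_cast<double>(t.compact_write_bytes) +
                   static_cast<double>(t.flush_write_bytes);
  return Ratio(y - static_cast<double>(t.bytes_compressed_to) +
                   static_cast<double>(t.bytes_compressed_from),
               y);
}
```

For example, with 1000 bytes compressed to 400, 200 bytes rejected, 100 bytes bypassed, and 700 bytes written in total, the successful ratio is 2.5 and the attempted ratio is 2.0.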

Keeping _REJECTED separate from _BYPASSED helps us to understand "wasted" CPU time in compression.

## BlockBasedTableBuilder
Various small refactorings, optimizations, and name clean-ups.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11388

Test Plan:
unit tests added

* `options_settable_test.cc`: use non-deprecated idiom for configuring CompressionOptions from string. The old idiom is tested elsewhere and does not need to be updated to support the new field.

Reviewed By: ajkr

Differential Revision: D45128202

Pulled By: pdillinger

fbshipit-source-id: 5a652bf5c022b7ec340cf79018cccf0686962803
2023-04-21 21:57:40 -07:00
.circleci Revert enabling IO uring in db_stress (#11242) 2023-02-21 12:53:55 -08:00
.github/workflows
buckifier Cleanup, improve, stress test LockWAL() (#11143) 2023-01-30 22:52:30 -08:00
build_tools Remove platform009 and default to platform010 (#11333) 2023-03-30 09:56:37 -07:00
cache fix bad implementation of ShardedCache::GetOccupancyCount (#11325) 2023-04-07 18:28:19 -07:00
cmake
coverage Remove platform009 and default to platform010 (#11333) 2023-03-30 09:56:37 -07:00
db Changes and enhancements to compression stats, thresholds (#11388) 2023-04-21 21:57:40 -07:00
db_stress_tool Group rocksdb.sst.read.micros stat by IOActivity flush and compaction (#11288) 2023-04-21 09:07:18 -07:00
docs Remove docs/Gemfile.lock and update github-pages version (#11173) 2023-02-14 12:17:23 -08:00
env Ignore async_io ReadOption if FileSystem doesn't support it (#11296) 2023-03-17 14:57:09 -07:00
examples Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
file Group rocksdb.sst.read.micros stat by IOActivity flush and compaction (#11288) 2023-04-21 09:07:18 -07:00
fuzz
include/rocksdb Changes and enhancements to compression stats, thresholds (#11388) 2023-04-21 21:57:40 -07:00
java Group rocksdb.sst.read.micros stat by IOActivity flush and compaction (#11288) 2023-04-21 09:07:18 -07:00
logging Disabling some IO error assertion in EnvLogger (#11314) 2023-03-20 13:23:29 -07:00
memory Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
memtable Add `SetAllowStall()` (#11335) 2023-03-30 09:43:33 -07:00
microbench Group rocksdb.sst.read.micros stat by IOActivity flush and compaction (#11288) 2023-04-21 09:07:18 -07:00
monitoring Changes and enhancements to compression stats, thresholds (#11388) 2023-04-21 21:57:40 -07:00
options Changes and enhancements to compression stats, thresholds (#11388) 2023-04-21 21:57:40 -07:00
plugin
port Ignore async_io ReadOption if FileSystem doesn't support it (#11296) 2023-03-17 14:57:09 -07:00
table Changes and enhancements to compression stats, thresholds (#11388) 2023-04-21 21:57:40 -07:00
test_util Changes and enhancements to compression stats, thresholds (#11388) 2023-04-21 21:57:40 -07:00
third-party fix optimization-disabled test builds with platform010 (#11361) 2023-04-10 13:59:44 -07:00
tools Group rocksdb.sst.read.micros stat by IOActivity flush and compaction (#11288) 2023-04-21 09:07:18 -07:00
trace_replay Fix error maybe-uninitialized #11100 (#11101) 2023-01-19 13:59:48 -08:00
util Changes and enhancements to compression stats, thresholds (#11388) 2023-04-21 21:57:40 -07:00
utilities Group rocksdb.sst.read.micros stat by IOActivity flush and compaction (#11288) 2023-04-21 09:07:18 -07:00
.clang-format
.gitignore
.lgtm.yml
.watchmanconfig
AUTHORS
CMakeLists.txt New stat rocksdb.{cf|db}-write-stall-stats exposed in a structural way (#11300) 2023-03-18 09:51:58 -07:00
CODE_OF_CONDUCT.md
CONTRIBUTING.md
COPYING
DEFAULT_OPTIONS_HISTORY.md
DUMP_FORMAT.md
HISTORY.md Changes and enhancements to compression stats, thresholds (#11388) 2023-04-21 21:57:40 -07:00
INSTALL.md Add instructions for installing googlebenchmark (#11282) 2023-03-09 13:11:00 -08:00
LANGUAGE-BINDINGS.md
LICENSE.Apache
LICENSE.leveldb
Makefile util/ribbon_test.cc: avoid ambiguous reversed operator error in c++20 (#11371) 2023-04-12 13:24:34 -07:00
PLUGINS.md
README.md Remove deprecated integration tests from README.md (#11354) 2023-04-07 16:52:50 -07:00
TARGETS New stat rocksdb.{cf|db}-write-stall-stats exposed in a structural way (#11300) 2023-03-18 09:51:58 -07:00
USERS.md Add PaxosStore to USERS (#11357) 2023-04-07 16:55:35 -07:00
Vagrantfile
WINDOWS_PORT.md
common.mk
crash_test.mk Allow a custom DB cleanup command to be passed to db_crashtest.py (#10883) 2022-10-27 19:47:01 -07:00
issue_template.md
rocksdb.pc.in
src.mk New stat rocksdb.{cf|db}-write-stall-stats exposed in a structural way (#11300) 2023-03-18 09:51:58 -07:00
thirdparty.inc

README.md

RocksDB: A Persistent Key-Value Store for Flash and RAM Storage


RocksDB is developed and maintained by the Facebook Database Engineering Team. It is built on earlier work on LevelDB by Sanjay Ghemawat (sanjay@google.com) and Jeff Dean (jeff@google.com).

This code is a library that forms the core building block for a fast key-value server, especially suited for storing data on flash drives. It has a Log-Structured-Merge-Database (LSM) design with flexible tradeoffs between Write-Amplification-Factor (WAF), Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF). It has multi-threaded compactions, making it especially suitable for storing multiple terabytes of data in a single database.

Start with example usage here: https://github.com/facebook/rocksdb/tree/main/examples

See the GitHub wiki for more explanation.

The public interface is in include/. Callers should not include or rely on the details of any other header files in this package. Those internal APIs may be changed without warning.

Questions and discussions are welcome on the RocksDB Developers Public Facebook group and email list on Google Groups.

License

RocksDB is dual-licensed under both the GPLv2 (found in the COPYING file in the root directory) and Apache 2.0 License (found in the LICENSE.Apache file in the root directory). You may select, at your option, one of the above-listed licenses.