Peter Dillinger 003e72b201 Use size_t for filter APIs, protect against overflow (#7726)
Summary:
Deprecate CalculateNumEntry and replace with
ApproximateNumEntries (better name) using size_t instead of int and
uint32_t, to minimize confusing casts and bad overflow behavior
(possible though probably not realistic). Bloom sizes are now explicitly
capped at max size supported by implementations: just under 4GiB for
fv=5 Bloom, and just under 512MiB for fv<5 Legacy Bloom. This
hardening could help to set up for fuzzing.
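
To illustrate the overflow concern, here is a minimal sketch, not the
RocksDB implementation. All function names and the `max_bytes` parameter
are hypothetical. It contrasts a bit count computed in 32 bits, which wraps
once a filter reaches 2^29 bytes (512 MiB), with one that clamps to an
explicit cap and then computes in 64 bits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helpers for illustration only; not RocksDB functions.

// Narrow version: the bit count is computed in 32 bits, so it wraps
// modulo 2^32. For bytes == 2^29 the result is 2^32, which wraps to 0.
inline uint32_t FilterBitsNarrow(uint32_t bytes) {
  return bytes * 8u;
}

// Wide version: clamp to a caller-supplied cap first, then compute the
// bit count in 64 bits so the multiplication cannot wrap.
inline uint64_t FilterBitsCapped(uint64_t bytes, uint64_t max_bytes) {
  assert(max_bytes <= UINT64_MAX / 8);  // guarantees capped * 8 fits
  const uint64_t capped = bytes < max_bytes ? bytes : max_bytes;
  return capped * 8u;
}

// Rough number of entries that fit in `bytes` at `bits_per_key`.
// The cast to size_t assumes a 64-bit size_t or a cap small enough
// for the result to fit.
inline size_t ApproxEntriesSketch(uint64_t bytes, uint64_t max_bytes,
                                  uint32_t bits_per_key) {
  if (bits_per_key == 0) return 0;
  return static_cast<size_t>(FilterBitsCapped(bytes, max_bytes) /
                             bits_per_key);
}
```

Clamping before multiplying keeps the result well defined for any input,
which is the kind of property that makes a function easier to fuzz.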

Also, since RocksDB only uses this information as an approximation
for trying to hit certain sizes for partitioned filters, it's more important
that the function be reasonably fast than completely accurate. It's
hard enough to be 100% accurate for Ribbon (currently done by
reversing CalculateSpace) that, with optimize_filters_for_memory
added into the mix, an exact number of entries for a given number of
bytes is just not worth pursuing.

Also:
- Cleaned up filter_policy.h to remove MSVC warning handling and
potentially unsafe use of exception for "not implemented"
- Correct the number of entries limit beyond which current Ribbon
implementation falls back on Bloom instead.
- Consistently use "num_entries" rather than "num_entry"
- Remove LegacyBloomBitsBuilder::CalculateNumEntry as it's essentially
made obsolete by the general implementation
BuiltinFilterBitsBuilder::CalculateNumEntries.
- Fix filter_bench to skip some tests that don't make sense when only
one or a small number of filters has been generated.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7726

Test Plan:
expanded existing unit tests for CalculateSpace /
ApproximateNumEntries. Also manually used filter_bench to verify Legacy and
fv=5 Bloom size caps work (much too expensive for unit test). Note that
the actual bits per key is below requested due to space cap.

    $ ./filter_bench -impl=0 -bits_per_key=20 -average_keys_per_filter=256000000 -vary_key_count_ratio=0 -m_keys_total_max=256 -allow_bad_fp_rate
    ...
    Total size (MB): 511.992
    Bits/key stored: 16.777
    ...
    $ ./filter_bench -impl=2 -bits_per_key=20 -average_keys_per_filter=2000000000 -vary_key_count_ratio=0 -m_keys_total_max=2000
    ...
    Total size (MB): 4096
    Bits/key stored: 17.1799
    ...
    $
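
The "Bits/key stored" figures above follow from the size caps. As a rough
check, the sketch below divides the capped size in bits by the number of
keys. The function name is hypothetical, and the caps are approximated as
exactly 512 MiB and 4 GiB, whereas the real caps are just under those
values.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper for illustration only; not a RocksDB function.
// Returns the bits per key actually stored when the filter size is
// limited to cap_bytes.
inline double EffectiveBitsPerKey(uint64_t num_keys,
                                  double requested_bits_per_key,
                                  uint64_t cap_bytes) {
  assert(num_keys > 0);
  const double requested_bits =
      static_cast<double>(num_keys) * requested_bits_per_key;
  const double cap_bits = static_cast<double>(cap_bytes) * 8.0;
  const double actual_bits =
      requested_bits < cap_bits ? requested_bits : cap_bits;
  return actual_bits / static_cast<double>(num_keys);
}

// Legacy Bloom run: 256,000,000 keys, 20 bits/key requested, ~512 MiB cap
//   2^32 bits / 256,000,000 keys = 16.777216 bits/key
// fv=5 Bloom run: 2,000,000,000 keys, 20 bits/key requested, ~4 GiB cap
//   2^35 bits / 2,000,000,000 keys = 17.179869184 bits/key
```

Both values agree with the filter_bench output to the printed precision.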

Reviewed By: jay-zhuang

Differential Revision: D25239800

Pulled By: pdillinger

fbshipit-source-id: f94e6d065efd31e05ec630ae1a82e6400d8390c4
2020-12-11 22:18:12 -08:00
.circleci Fix pyenv-version-name not found issue (#7768) 2020-12-10 20:37:51 -08:00
.github/workflows Update clang-format-diff.py (#7609) 2020-11-04 16:09:01 -08:00
buckifier Fix use of positional args in BUCK rules (#7760) 2020-12-09 19:25:31 -08:00
build_tools RocksJava static lib dependencies should support MacOS 10.12+ (#7683) 2020-11-17 15:34:05 -08:00
cache Fix typos in comments (#7687) 2020-11-19 13:32:50 -08:00
cmake Add find_dependency() in cmake config file. (#6791) 2020-05-12 21:18:29 -07:00
coverage Find the correct gcov (#6904) 2020-06-01 16:33:05 -07:00
db Fix memory leak for ColumnFamily drop with live iterator (#7749) 2020-12-11 11:18:21 -08:00
db_stress_tool Experimental (production candidate) SST schema for Ribbon filter (#7658) 2020-11-12 20:46:14 -08:00
docs Update github-pages to v207 (#7235) 2020-08-12 09:26:24 -07:00
env Eliminate possible race between LockFile() vs UnlockFile() (#7721) 2020-12-10 09:35:11 -08:00
examples Bring the Configurable options together (#5753) 2020-09-14 17:01:01 -07:00
file Remove unused includes (#7604) 2020-10-28 23:22:27 -07:00
fuzz Update SstFileWriter fuzzer to iterate and check all key-value pairs (#7761) 2020-12-11 16:09:10 -08:00
hdfs fix build with 'USE_HDFS' on windows (#6950) 2020-06-12 16:21:50 -07:00
include/rocksdb Use size_t for filter APIs, protect against overflow (#7726) 2020-12-11 22:18:12 -08:00
java Fix a recovery corner case (#7621) 2020-11-07 22:23:27 -08:00
logging Remove unused includes (#7604) 2020-10-28 23:22:27 -07:00
memory slightly improve jemalloc allocator API header (#7592) 2020-10-28 13:47:12 -07:00
memtable Test for LoadLatestOptions (#7554) 2020-10-14 22:28:55 -07:00
monitoring Remove unused includes (#7604) 2020-10-28 23:22:27 -07:00
options Hack to load OPTIONS file for read_amp_bytes_per_bit (#7659) 2020-11-13 11:52:50 -08:00
port Warn about practically unfixable TSAN warnings in stack trace (#7723) 2020-12-01 10:17:56 -08:00
table Use size_t for filter APIs, protect against overflow (#7726) 2020-12-11 22:18:12 -08:00
test_util Add further tests to ASSERT_STATUS_CHECKED (2) (#7698) 2020-12-09 21:21:16 -08:00
third-party Fix Compilation on ppc64le using Clang 11 (#7713) 2020-12-01 11:21:44 -08:00
tools add 6.15.fb to check_format_compatible.sh (#7738) 2020-12-03 12:45:14 -08:00
trace_replay Genericize and clean up FastRange (#7436) 2020-09-28 11:35:00 -07:00
util Use size_t for filter APIs, protect against overflow (#7726) 2020-12-11 22:18:12 -08:00
utilities Add further tests to ASSERT_STATUS_CHECKED (2) (#7698) 2020-12-09 21:21:16 -08:00
.clang-format
.gitignore Fuzzing RocksDB (#7685) 2020-11-17 12:56:48 -08:00
.lgtm.yml Create lgtm.yml for LGTM.com C/C++ analysis (#4058) 2018-06-26 12:43:04 -07:00
.travis.yml Migrate away from broken macos on Travis (#7745) 2020-12-04 23:24:27 -08:00
.watchmanconfig Added .watchmanconfig file to rocksdb repo (#5593) 2019-07-19 15:00:33 -07:00
appveyor.yml Remove 2019 from appveyor (#7038) 2020-06-29 14:31:41 -07:00
AUTHORS Update RocksDB Authors File 2017-10-18 14:42:10 -07:00
CMakeLists.txt LockTree library, originally from PerconaFT (#7753) 2020-12-09 12:10:57 -08:00
CODE_OF_CONDUCT.md Adopt Contributor Covenant 2019-08-29 23:21:01 -07:00
CONTRIBUTING.md Add Code of Conduct 2017-12-05 18:42:35 -08:00
COPYING Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
DEFAULT_OPTIONS_HISTORY.md options.delayed_write_rate use the rate of rate_limiter by default. 2017-05-24 09:58:24 -07:00
defs.bzl Make testpilot recognize that these tests have coverage instrumentation 2020-03-20 11:23:23 -07:00
DUMP_FORMAT.md First version of rocksdb_dump and rocksdb_undump. 2015-06-19 16:24:36 -07:00
HISTORY.md Use size_t for filter APIs, protect against overflow (#7726) 2020-12-11 22:18:12 -08:00
INSTALL.md Update the version of the dependencies used by the RocksJava static build (#4761) 2018-12-18 20:25:43 -08:00
issue_template.md Add Google Group to Issue Template 2020-01-28 14:40:37 -08:00
LANGUAGE-BINDINGS.md Add RestoreDBFromLatestBackup to C API, add new C# package (#7092) 2020-07-08 11:56:41 -07:00
LICENSE.Apache Change RocksDB License 2017-07-15 16:11:23 -07:00
LICENSE.leveldb Add back the LevelDB license file 2017-07-16 18:42:18 -07:00
Makefile Add further tests to ASSERT_STATUS_CHECKED (2) (#7698) 2020-12-09 21:21:16 -08:00
README.md Fix the CI badge for ppc64le Jenkins (#7561) 2020-10-16 09:00:56 -07:00
ROCKSDB_LITE.md Fix some typos in comments and docs. 2018-03-08 10:27:25 -08:00
src.mk LockTree library, originally from PerconaFT (#7753) 2020-12-09 12:10:57 -08:00
TARGETS Fix use of positional args in BUCK rules (#7760) 2020-12-09 19:25:31 -08:00
thirdparty.inc Fix build jemalloc api (#5470) 2019-06-24 17:40:32 -07:00
USERS.md add ArangoDB to USERS.md, and fix typos in that file (#7675) 2020-11-16 18:29:51 -08:00
Vagrantfile Adding CentOS 7 Vagrantfile & build script 2018-02-26 15:27:17 -08:00
WINDOWS_PORT.md #5145 , rename port/dirent.h to port/port_dirent.h to avoid compile err when use port dir as header dir output (#5152) 2019-04-04 11:38:19 -07:00

RocksDB: A Persistent Key-Value Store for Flash and RAM Storage


RocksDB is developed and maintained by the Facebook Database Engineering Team. It is built on earlier work on LevelDB by Sanjay Ghemawat (sanjay@google.com) and Jeff Dean (jeff@google.com).

This code is a library that forms the core building block for a fast key-value server, especially suited for storing data on flash drives. It has a Log-Structured-Merge-Database (LSM) design with flexible tradeoffs between Write-Amplification-Factor (WAF), Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF). It has multi-threaded compactions, making it especially suitable for storing multiple terabytes of data in a single database.

Start with example usage here: https://github.com/facebook/rocksdb/tree/master/examples

See the GitHub wiki for more explanation.

The public interface is in include/. Callers should not include or rely on the details of any other header files in this package. Those internal APIs may be changed without warning.

Design discussions are conducted in https://www.facebook.com/groups/rocksdb.dev/ and https://rocksdb.slack.com/

License

RocksDB is dual-licensed under both the GPLv2 (found in the COPYING file in the root directory) and Apache 2.0 License (found in the LICENSE.Apache file in the root directory). You may select, at your option, one of the above-listed licenses.