Go to file
Andrew Ryan Chang af2a36d2c7 Record newest_key_time as a table property (#13083)
Summary:
This PR does two things:
1. Adds a new table property `newest_key_time`
2. Uses this property to improve TTL and temperature change compaction.

### Context

The current `creation_time` table property should really be named `oldest_ancestor_time`. For flush output files, this is the oldest key time in the file. For compaction output files, this is the minimum among all oldest key times in the input files.

The problem with using the oldest ancestor time for TTL compaction is that we may end up dropping files earlier than we should. What we really want is the newest (i.e. "youngest") key time. Right now we take a roundabout way to estimate this value -- we take the value of the _oldest_ key time for the _next_ (newer) SST file. This is also why the current code has checks for `index >= 1`.

Our new property `newest_key_time` is set to the file creation time during flushes, and the max over all input files for compactions.

There were some additional smaller changes that I had to make for testing purposes:
- Refactoring the mock table reader to support specifying my own table properties
- Refactoring out a test utility method `GetLevelFileMetadatas`  that would otherwise be copy/pasted in 3 places

Credit to cbi42 for the problem explanation and proposed solution

### Testing

- Added a dedicated unit test to my `newest_key_time` logic in isolation (i.e. are we populating the property on flush and compaction)
- Updated the existing unit tests (for TTL/temperate change compaction), which were comprehensive enough to break when I first made my code changes. I removed the test setup code which set the file metadata `oldest_ancestor_time`, so we know we are actually only using the new table property instead.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/13083

Reviewed By: cbi42

Differential Revision: D65298604

Pulled By: archang19

fbshipit-source-id: 898ef91b692ab33f5129a2a16b64ecadd4c32432
2024-11-01 10:08:35 -07:00
.circleci Enable io_uring in stress test (#12313) 2024-01-31 12:37:42 -08:00
.github More valgrind fixes (#12990) 2024-09-06 10:11:34 -07:00
buckifier internal_repo_rocksdb 2024-10-14 03:01:20 -07:00
build_tools internal_repo_rocksdb 2024-10-14 03:01:20 -07:00
cache Make simple BlockBasedTableOptions mutable (#10021) 2024-10-14 17:49:26 -07:00
cmake Fix zstd typo in cmake (#12309) 2024-02-22 14:39:05 -08:00
coverage internal_repo_rocksdb 2024-10-14 03:01:20 -07:00
db Record newest_key_time as a table property (#13083) 2024-11-01 10:08:35 -07:00
db_stress_tool Fix race to make BlockBasedTableOptions effectively mutable (#13082) 2024-10-25 10:24:54 -07:00
docs Java FFI blog post - Post-publication issues with images (2) (#12372) 2024-02-22 15:01:55 -08:00
env Add missing RemapFileSystem::ReopenWritableFile (#12941) 2024-09-17 13:08:25 -07:00
examples Prefer static_cast in place of most reinterpret_cast (#12308) 2024-02-07 10:44:11 -08:00
file Fix race to make BlockBasedTableOptions effectively mutable (#13082) 2024-10-25 10:24:54 -07:00
fuzz Block per key-value checksum (#11287) 2023-04-25 12:08:23 -07:00
include/rocksdb Record newest_key_time as a table property (#13083) 2024-11-01 10:08:35 -07:00
java Steps toward deprecating implicit prefix seek, related fixes (#13026) 2024-09-20 15:54:19 -07:00
logging Fix data race in AutoRollLogger (#12436) 2024-03-14 14:28:33 -07:00
memory Set optimize_filters_for_memory by default (#12377) 2024-04-30 08:33:31 -07:00
memtable Re-implement GetApproximateMemTableStats for skip lists (#13047) 2024-10-02 14:25:50 -07:00
microbench internal_repo_rocksdb (-8794174668376270091) (#12114) 2023-12-01 11:10:30 -08:00
monitoring Add ticker stats for read corruption retries (#12923) 2024-08-12 15:32:07 -07:00
options Record newest_key_time as a table property (#13083) 2024-11-01 10:08:35 -07:00
plugin Add initial CMake support to plugin (#9214) 2021-11-30 17:16:53 -08:00
port Add a temporary hook for custom yielding in long-running op (#13103) 2024-10-30 20:37:28 -07:00
table Record newest_key_time as a table property (#13083) 2024-11-01 10:08:35 -07:00
test_util Steps toward making IDENTITY file obsolete (#13019) 2024-09-19 14:05:21 -07:00
third-party fix optimization-disabled test builds with platform010 (#11361) 2023-04-10 13:59:44 -07:00
tools Record newest_key_time as a table property (#13083) 2024-11-01 10:08:35 -07:00
trace_replay Remove 'virtual' when implied by 'override' (#12319) 2024-01-31 13:14:42 -08:00
unreleased_history Record newest_key_time as a table property (#13083) 2024-11-01 10:08:35 -07:00
util Fix Compaction Stats (#13071) 2024-10-16 19:20:37 -07:00
utilities Call PrepareValue on the base iterator in BaseDeltaIterator (#13105) 2024-10-31 14:20:33 -07:00
.clang-format A script that automatically reformat affected lines 2014-01-14 12:21:24 -08:00
.gitignore add gtags files ignore (#12747) 2024-06-12 21:46:40 -07:00
.lgtm.yml Create lgtm.yml for LGTM.com C/C++ analysis (#4058) 2018-06-26 12:43:04 -07:00
.watchmanconfig Added .watchmanconfig file to rocksdb repo (#5593) 2019-07-19 15:00:33 -07:00
AUTHORS Update RocksDB Authors File 2017-10-18 14:42:10 -07:00
CMakeLists.txt Correct CMake minimum required version (#13056) 2024-10-07 14:43:20 -07:00
CODE_OF_CONDUCT.md Adopt Contributor Covenant 2019-08-29 23:21:01 -07:00
CONTRIBUTING.md Add Code of Conduct 2017-12-05 18:42:35 -08:00
COPYING Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
DEFAULT_OPTIONS_HISTORY.md Add Options::DisableExtraChecks, clarify force_consistency_checks (#9363) 2022-01-18 17:31:03 -08:00
DUMP_FORMAT.md First version of rocksdb_dump and rocksdb_undump. 2015-06-19 16:24:36 -07:00
HISTORY.md Fix a HISTORY entry for 9.8.0 (#13097) 2024-10-28 21:27:42 -07:00
INSTALL.md fix out of date macos instructions in INSTALL.md (#12393) 2024-02-28 12:38:15 -08:00
LANGUAGE-BINDINGS.md Add grocksdb in Go language bindings (#10498) 2022-08-23 15:02:10 -07:00
LICENSE.Apache Change RocksDB License 2017-07-15 16:11:23 -07:00
LICENSE.leveldb Add back the LevelDB license file 2017-07-16 18:42:18 -07:00
Makefile Update folly Github hash (#13017) 2024-09-17 17:47:10 -07:00
PLUGINS.md Add encfs plugin link (#12070) 2023-11-14 07:33:21 -08:00
README.md Remove deprecated integration tests from README.md (#11354) 2023-04-07 16:52:50 -07:00
TARGETS Remove last user of AutoHeaders.RECURSIVE_GLOB 2024-09-17 13:21:57 -07:00
USERS.md Add Qdrant to USERS.md (#12072) 2023-11-16 10:35:08 -08:00
Vagrantfile Adding CentOS 7 Vagrantfile & build script 2018-02-26 15:27:17 -08:00
WINDOWS_PORT.md Update branch name in WINDOWS_PORT.md (#8745) 2021-09-01 19:26:39 -07:00
common.mk Clean up variables for temporary directory (#9961) 2022-05-06 16:38:06 -07:00
crash_test.mk Stress/Crash Test for OptimisticTransactionDB (#11513) 2023-06-17 16:27:37 -07:00
issue_template.md Add Google Group to Issue Template 2020-01-28 14:40:37 -08:00
rocksdb.pc.in build: fix pkg-config file generation (#9953) 2022-05-30 12:46:40 -07:00
src.mk Fix folly build (#12795) 2024-06-22 15:15:02 -07:00
thirdparty.inc Fix build jemalloc api (#5470) 2019-06-24 17:40:32 -07:00

README.md

RocksDB: A Persistent Key-Value Store for Flash and RAM Storage

CircleCI Status

RocksDB is developed and maintained by Facebook Database Engineering Team. It is built on earlier work on LevelDB by Sanjay Ghemawat (sanjay@google.com) and Jeff Dean (jeff@google.com)

This code is a library that forms the core building block for a fast key-value server, especially suited for storing data on flash drives. It has a Log-Structured-Merge-Database (LSM) design with flexible tradeoffs between Write-Amplification-Factor (WAF), Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF). It has multi-threaded compactions, making it especially suitable for storing multiple terabytes of data in a single database.

Start with example usage here: https://github.com/facebook/rocksdb/tree/main/examples

See the github wiki for more explanation.

The public interface is in include/. Callers should not include or rely on the details of any other header files in this package. Those internal APIs may be changed without warning.

Questions and discussions are welcome on the RocksDB Developers Public Facebook group and email list on Google Groups.

License

RocksDB is dual-licensed under both the GPLv2 (found in the COPYING file in the root directory) and Apache 2.0 License (found in the LICENSE.Apache file in the root directory). You may select, at your option, one of the above-listed licenses.