rocksdb/tools
Andrew Ryan Chang af2a36d2c7 Record newest_key_time as a table property (#13083)
Summary:
This PR does two things:
1. Adds a new table property `newest_key_time`
2. Uses this property to improve TTL and temperature change compaction.

### Context

The current `creation_time` table property should really be named `oldest_ancestor_time`. For flush output files, this is the oldest key time in the file. For compaction output files, this is the minimum among all oldest key times in the input files.

The problem with using the oldest ancestor time for TTL compaction is that we may end up dropping files earlier than we should. What we really want is the newest (i.e. "youngest") key time. Right now we take a roundabout way to estimate this value -- we take the value of the _oldest_ key time for the _next_ (newer) SST file. This is also why the current code has checks for `index >= 1`.

Our new property `newest_key_time` is set to the file creation time during flushes, and the max over all input files for compactions.

There were some additional smaller changes that I had to make for testing purposes:
- Refactoring the mock table reader to support specifying my own table properties
- Refactoring out a test utility method `GetLevelFileMetadatas`  that would otherwise be copy/pasted in 3 places

Credit to cbi42 for the problem explanation and proposed solution

### Testing

- Added a dedicated unit test to my `newest_key_time` logic in isolation (i.e. are we populating the property on flush and compaction)
- Updated the existing unit tests (for TTL/temperate change compaction), which were comprehensive enough to break when I first made my code changes. I removed the test setup code which set the file metadata `oldest_ancestor_time`, so we know we are actually only using the new table property instead.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/13083

Reviewed By: cbi42

Differential Revision: D65298604

Pulled By: archang19

fbshipit-source-id: 898ef91b692ab33f5129a2a16b64ecadd4c32432
2024-11-01 10:08:35 -07:00
..
advisor internal_repo_rocksdb 2024-10-14 03:01:20 -07:00
block_cache_analyzer internal_repo_rocksdb 2024-10-14 03:01:20 -07:00
dump internal_repo_rocksdb (435146444452818992) (#12115) 2023-12-01 11:15:17 -08:00
CMakeLists.txt
Dockerfile
analyze_txn_stress_test.sh
auto_sanity_test.sh
backup_db.sh
benchmark.sh optimize file size statistics in benchmark script (#12363) 2024-02-21 15:45:18 -08:00
benchmark_ci.py internal_repo_rocksdb 2024-10-14 03:01:20 -07:00
benchmark_compare.sh Fix file modes (#10815) 2022-10-13 09:00:37 -07:00
benchmark_leveldb.sh
blob_dump.cc Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
check_all_python.py internal_repo_rocksdb 2024-10-14 03:01:20 -07:00
check_format_compatible.sh Start version 9.9.0 (#13093) 2024-10-25 13:47:29 -07:00
db_bench.cc Add (& fix) some simple source code checks (#8821) 2021-09-07 21:19:27 -07:00
db_bench_tool.cc Re-implement GetApproximateMemTableStats for skip lists (#13047) 2024-10-02 14:25:50 -07:00
db_bench_tool_test.cc Group SST write in flush, compaction and db open with new stats (#11910) 2023-12-29 15:29:23 -08:00
db_crashtest.py Fix race to make BlockBasedTableOptions effectively mutable (#13082) 2024-10-25 10:24:54 -07:00
db_repl_stress.cc Prefer static_cast in place of most reinterpret_cast (#12308) 2024-02-07 10:44:11 -08:00
db_sanity_test.cc Remove 'virtual' when implied by 'override' (#12319) 2024-01-31 13:14:42 -08:00
dbench_monitor
generate_random_db.sh
ingest_external_sst.sh
io_tracer_parser.cc Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
io_tracer_parser_test.cc Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
io_tracer_parser_tool.cc Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
io_tracer_parser_tool.h Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
ldb.cc Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
ldb_cmd.cc Fix a bug for surfacing write unix time (#13057) 2024-10-08 11:31:51 -07:00
ldb_cmd_impl.h Fix a bug for surfacing write unix time (#13057) 2024-10-08 11:31:51 -07:00
ldb_cmd_test.cc Remove `bottommost_temperature` (#12389) 2024-02-27 14:48:00 -08:00
ldb_test.py internal_repo_rocksdb 2024-10-14 03:01:20 -07:00
ldb_tool.cc Add LDB command and option for follower instances (#12682) 2024-05-28 23:21:32 -07:00
pflag
reduce_levels_test.cc Make option `level_compaction_dynamic_level_bytes` true by default (#11525) 2023-06-15 21:12:39 -07:00
regression_test.sh Fix regression script for async_io benchmarks (#11462) 2023-05-22 15:32:12 -07:00
restore_db.sh
rocksdb_dump_test.sh
run_blob_bench.sh add exe and script path check (#11621) 2023-07-19 12:05:24 -07:00
run_flash_bench.sh
run_leveldb.sh
sample-dump.dmp
simulated_hybrid_file_system.cc Group SST write in flush, compaction and db open with new stats (#11910) 2023-12-29 15:29:23 -08:00
simulated_hybrid_file_system.h Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
sst_dump.cc Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
sst_dump_test.cc Record newest_key_time as a table property (#13083) 2024-11-01 10:08:35 -07:00
sst_dump_tool.cc Augment sst_dump tool to verify num_entries in table property (#12322) 2024-02-01 14:35:03 -08:00
trace_analyzer.cc Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
trace_analyzer_test.cc internal_repo_rocksdb (435146444452818992) (#12115) 2023-12-01 11:15:17 -08:00
trace_analyzer_tool.cc Trace analyzer: replace number with enumeration type (#10827) 2023-12-27 10:38:53 -08:00
trace_analyzer_tool.h Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
verify_random_db.sh Fix some bugs in verify_random_db.sh (#10112) 2022-06-03 16:35:13 -07:00
write_external_sst.sh
write_stress.cc Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
write_stress_runner.py internal_repo_rocksdb 2024-10-14 03:01:20 -07:00