Commit graph

24 commits

Author SHA1 Message Date
Peter Dillinger c96d9a0fbb Allow TablePropertiesCollectorFactory to return null collector (#12129)
Summary:
As part of building another feature, I wanted this:
* Custom implementations of `TablePropertiesCollectorFactory` may now return a `nullptr` collector to decline processing a file, reducing callback overheads in such cases.
* Polished, clarified some related API comments.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12129

Test Plan: unit test added

Reviewed By: ltamasi

Differential Revision: D51966667

Pulled By: pdillinger

fbshipit-source-id: 2991c08fe6ce3a8c9f14c68f1495f5a17bca2770
2023-12-11 12:02:56 -08:00
Jay Huh ddb7df10ef Update HISTORY.md and version.h for 8.9.fb release (#12074)
Summary:
Creating cut for 8.9 release

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12074

Test Plan: CI

Reviewed By: ajkr

Differential Revision: D51435289

Pulled By: jaykorean

fbshipit-source-id: 3918a8250032839e5b71f67f26c8ba01cbc17a41
2023-11-21 18:07:19 -08:00
Yu Zhang c6c683a0ca Remove the default force behavior for EnableFileDeletion API (#12001)
Summary:
Disabling file deletion can be critical for operations like making a backup, recovery from manifest IO error (for now). Ideally as long as there is one caller requesting file deletion disabled, it should be kept disabled until all callers agree to re-enable it. So this PR removes the default forcing behavior for the `EnableFileDeletion` API, and users need to explicitly pass the argument if they insisted on doing so knowing the consequence of what can be potentially disrupted.

This PR removes the API's default argument value so it will cause breakage for all users that are relying on the default value, regardless of whether the forcing behavior is critical for them.  When fixing this breakage, it's good to check if the forcing behavior is indeed needed and potential disruption is OK.

This PR also makes unit test that do not need force behavior to do a regular enable file deletion.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12001

Reviewed By: ajkr

Differential Revision: D51214683

Pulled By: jowlyzhang

fbshipit-source-id: ca7b1ebf15c09eed00f954da2f75c00d2c6a97e4
2023-11-10 14:35:54 -08:00
Yueh-Hsuan Chiang 73d223c4e2 Add auto_tuned option to RateLimiter C API (#12058)
Summary:
#### Problem
While the RocksDB C API does have the RateLimiter API, it does not
expose the auto_tuned option.

#### Summary of Change
This PR exposes auto_tuned RateLimiter option in RocksDB C API.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12058

Test Plan: Augment the C API existing test to cover the new API.

Reviewed By: cbi42

Differential Revision: D51201933

Pulled By: ajkr

fbshipit-source-id: 5bc595a9cf9f88f50fee797b729ba96f09ed8266
2023-11-10 09:53:09 -08:00
马越 8e1adab5ce add RocksDB#clipColumnFamily to Java API (#11868)
Summary:
### main change:

- add java clipColumnFamily api in Rocksdb.java
The method signature of the new API is
 ```
 public void clipColumnFamily(final ColumnFamilyHandle columnFamilyHandle, final byte[] beginKey,
      final byte[] endKey)
```
### Test
add unit test RocksDBTest#clipColumnFamily()

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11868

Reviewed By: jaykorean

Differential Revision: D50889783

Pulled By: cbi42

fbshipit-source-id: 7f545171ad9adb9c20bdd92efae2e6bc55d5703f
2023-11-02 08:00:08 -07:00
anand76 52be8f54f2 Add APIs to query secondary cache capacity and usage for TieredCache (#12011)
Summary:
In `TieredCache`, the underlying compressed secondary cache is hidden from the user. So we need a way to query the capacity, as well as the portion of cache reservation charged to the compressed secondary cache.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12011

Test Plan: Update the unit tests

Reviewed By: akankshamahajan15

Differential Revision: D50651943

Pulled By: anand1976

fbshipit-source-id: 06d1cb5edb75a790c919bce718e2ff65f5908220
2023-10-25 16:54:50 -07:00
Hui Xiao ab15d33566 Update history, version and format testing for 8.8 (#12004)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/12004

Reviewed By: cbi42

Differential Revision: D50586984

Pulled By: hx235

fbshipit-source-id: 1480a8c2757340ebf83510557104aaa0e437b3ae
2023-10-24 12:03:07 -07:00
Andrew Kryczka 3c4cc6c2cc flip default DBOptions::fail_if_options_file_error (#11800)
Summary:
Changed `DBOptions::fail_if_options_file_error` default from `false` to
`true`. It is safer to fail an operation by default when it encounters
an error.

Also changed the API doc to list items in the conventional way for listing items in a sentence. The slashes weren't working well as one got dropped, probably because it looked like a typo.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11800

Test Plan: rely on CI

Reviewed By: jowlyzhang

Differential Revision: D49030532

Pulled By: ajkr

fbshipit-source-id: e606062aa25f9063d8c6fb0d03aebca5c2bc56d3
2023-09-29 15:15:32 -07:00
Changyu Bi 49da91ec09 Update files for version 8.8 (#11878)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/11878

Reviewed By: ajkr

Differential Revision: D49568389

Pulled By: cbi42

fbshipit-source-id: b2022735799be9b5e81e03dfb418f8b104632ecf
2023-09-23 11:02:19 -07:00
anand76 48589b961f Fix updating the capacity of a tiered cache (#11873)
Summary:
Updating the tiered cache (cache allocated using ```NewTieredCache()```) by calling ```SetCapacity()``` on it was not working properly. The initial creation would set the primary cache capacity to the combined primary and compressed secondary cache capacity. But ```SetCapacity()``` would just set the primary cache capacity, with no way to change the secondary cache capacity. Additionally, the API was confusing, since the primary and compressed secondary capacities would be specified separately during creation, but ```SetCapacity``` took the combined capacity.

With this fix, the user always specifies the total budget and compressed secondary cache ratio on creation. Subsequently, `SetCapacity` will distribute the new capacity across the two caches by the same ratio. The `NewTieredCache` API has been changed to take the total cache capacity (inclusive of both the primary and the compressed secondary cache) and the ratio of total capacity to allocate to the compressed cache. These are specified in `TieredCacheOptions`. Any capacity specified in `LRUCacheOptions`, `HyperClockCacheOptions` and `CompressedSecondaryCacheOptions` is ignored. A new API, `UpdateTieredCache` is provided to dynamically update the total capacity, ratio of compressed cache, and admission policy.

Tests:
New unit tests

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11873

Reviewed By: akankshamahajan15

Differential Revision: D49562250

Pulled By: anand1976

fbshipit-source-id: 57033bc713b68d5da6292207765a6b3dbe539ddf
2023-09-22 18:07:46 -07:00
anand76 269478ee46 Support compressed and local flash secondary cache stacking (#11812)
Summary:
This PR implements support for a three tier cache - primary block cache, compressed secondary cache, and a nvm (local flash) secondary cache. This allows more effective utilization of the nvm cache, and minimizes the number of reads from local flash by caching compressed blocks in the compressed secondary cache.

The basic design is as follows -
1. A new secondary cache implementation, ```TieredSecondaryCache```, is introduced. It keeps the compressed and nvm secondary caches and manages the movement of blocks between them and the primary block cache. To setup a three tier cache, we allocate a ```CacheWithSecondaryAdapter```, with a ```TieredSecondaryCache``` instance as the secondary cache.
2. The table reader passes both the uncompressed and compressed block to ```FullTypedCacheInterface::InsertFull```, allowing the block cache to optionally store the compressed block.
3. When there's a miss, the block object is constructed and inserted in the primary cache, and the compressed block is inserted into the nvm cache by calling ```InsertSaved```. This avoids the overhead of recompressing the block, as well as avoiding putting more memory pressure on the compressed secondary cache.
4. When there's a hit in the nvm cache, we attempt to insert the block in the compressed secondary cache and the primary cache, subject to the admission policy of those caches (i.e admit on second access). Blocks/items evicted from any tier are simply discarded.

We can easily implement additional admission policies if desired.

Todo (In a subsequent PR):
1. Add to db_bench and run benchmarks
2. Add to db_stress

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11812

Reviewed By: pdillinger

Differential Revision: D49461842

Pulled By: anand1976

fbshipit-source-id: b40ac1330ef7cd8c12efa0a3ca75128e602e3a0b
2023-09-21 20:30:53 -07:00
Hui Xiao 089070cb36 Expose more info about input files in CompactionFilter::Context (#11857)
Summary:
**Context:**
As requested, lowest level as well as a map from input file to its table properties among all input files used in table creation  (if any) are exposed in `CompactionFilter::Context`.

**Summary:**
This PR contains two commits:
(1) [Refactory](0012777f0e) to make resonating/using what is in `Compaction:: table_properties_` easier
- Separate `Compaction:: table_properties_` into `Compaction:: input_table_properties_` and `Compaction:: output_table_properties_`
- Separate the "set input table properties" logic into `Compaction:: SetInputTableProperties()`) from `Compaction:: GetInputTableProperties`
- Call `Compaction:: SetInputTableProperties()` as soon as possible, which is right after `Compaction::SetInputVersion()`. Bundle these two functions into one `Compaction::FinalizeInputInfo()` to minimize missing one or the other

(2) [Expose more info about input files:](6093e7dfba) `CompactionFilter::Context::input_start_level/input_table_properties`

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11857

Test Plan:
- Modify existing UT `
TEST_F(DBTestCompactionFilter, CompactionFilterContextManual)` to cover new logics

Reviewed By: ajkr

Differential Revision: D49402540

Pulled By: hx235

fbshipit-source-id: 469fff50fa0e5964ffa5ea8db0743f61438ea392
2023-09-20 13:34:39 -07:00
Andrew Kryczka 4196ad81e3 LZ4 set acceleration parameter (#11844)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/11844

Test Plan:
Command:
```
for level in 1 1234 32767 -1 -10 -100 -1000 -10000 -100000 -1000000; do echo -n "level=$level " && ./db_bench -benchmarks=compress -compression_type=lz4 -compression_level=$level |& awk '/^compress / {print $5, $6}' ; done
```

Output:
```
level=1 181340 ops/sec
level=1234 183197 ops/sec
level=32767 181480 ops/sec
level=-1 181053 ops/sec
level=-10 662858 ops/sec
level=-100 2611516 ops/sec
level=-1000 3043125 ops/sec
level=-10000 3001351 ops/sec
level=-100000 2861834 ops/sec
level=-1000000 2906413 ops/sec
```

Reviewed By: cbi42

Differential Revision: D49331443

Pulled By: ajkr

fbshipit-source-id: c8909708c3b2b9b83bf2bda2d3f24b8a92d4c2ea
2023-09-18 09:26:29 -07:00
Hui Xiao 05daa12332 Change compaction_readahead_size default value to 2MB (#11762)
Summary:
**Context/Summary:**
After https://github.com/facebook/rocksdb/pull/11631, we rely on `compaction_readahead_size` for how much to read ahead for compaction read under non-direct IO case. https://github.com/facebook/rocksdb/pull/11658 therefore also sanitized 0 `compaction_readahead_size` to 2MB under non-direct IO, which is consistent with the existing sanitization with direct IO.

However, this makes disabling compaction readahead impossible as well as add one more scenario to the inconsistent effects between `Options.compaction_readahead_size=0` during DB open and `SetDBOptions("compaction_readahead_size", "0")` .
- `SetDBOptions("compaction_readahead_size", "0")` will disable compaction readahead as its logic never goes through sanitization above while `Options.compaction_readahead_size=0` will go through sanitization.

Therefore we decided to do this PR.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11762

Test Plan: Modified existing UTs to cover this PR

Reviewed By: ajkr

Differential Revision: D48759560

Pulled By: hx235

fbshipit-source-id: b3f85e58bda362a6fa1dc26bd8a87aa0e171af79
2023-08-30 14:57:08 -07:00
anand76 4b53520709 Update HISTORY.md and version.h for 8.6 (#11728)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/11728

Reviewed By: jaykorean, jowlyzhang

Differential Revision: D48527100

Pulled By: anand1976

fbshipit-source-id: c48baa44e538fb6bfd3fe7f19046746d3540763f
2023-08-21 13:25:04 -07:00
Hui Xiao 09882a52d6 Prepare for deprecation of Options::access_hint_on_compaction_start (#11658)
Summary:
**Context/Summary:**
After https://github.com/facebook/rocksdb/pull/11631, file hint is not longer needed for compaction read. Therefore we can deprecate `Options::access_hint_on_compaction_start`. As this is a public API change, we should first mark the relevant APIs (including the Java's) deprecated and remove it in next major release 9.0.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11658

Test Plan: No code change

Reviewed By: ajkr

Differential Revision: D47997856

Pulled By: hx235

fbshipit-source-id: 16e015ae7728c224b1caef73143aa9915668f4ac
2023-08-03 17:23:02 -07:00
Andrew Kryczka cf95821fb6 Update for 8.5.fb branch cut (#11642)
Summary:
Updated the main branch for the 8.5.fb branch cut. Also made unreleased_history/release.sh backdate to the last commit instead of the current date in case the release manager is a laggard like myself.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11642

Reviewed By: cbi42

Differential Revision: D47783574

Pulled By: ajkr

fbshipit-source-id: 4e2a80f5ccd542dc7dd0d22dfd7e59cb136325a1
2023-08-02 12:34:11 -07:00
Andrew Kryczka 05c3b8ecac Prepare for specialized interface for row cache (#11620)
Summary:
An internal user wants to implement a key-aware row cache policy. For that, they need to know the components of the cache key, especially the user key component. With a specialized `RowCache` interface, we will be able to tell them the components so they won't have to make assumptions about our internal key schema.

This PR prepares for the specialized `RowCache` interface by updating the migration plan of https://github.com/facebook/rocksdb/issues/11450. I added a release note for the removed APIs and didn't mention the added ones for now.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11620

Reviewed By: pdillinger

Differential Revision: D47536962

Pulled By: ajkr

fbshipit-source-id: bbee0fc4ad67fc699a66b8f2b4ea4544dd003691
2023-07-18 19:12:58 -07:00
akankshamahajan 94c247bff8 Update HISTORY.md for branch cut for 8.4.fb (#11565)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/11565

Reviewed By: jowlyzhang, cbi42

Differential Revision: D47027788

Pulled By: akankshamahajan15

fbshipit-source-id: e5e8db2eb21f8aa68fe072f0e1b63b83ba7beb9f
2023-06-26 13:26:15 -07:00
akankshamahajan fbd2f563bb Add an interface to provide support for underlying FS to pass their own buffer during reads (#11324)
Summary:
1. Public API change: Replace `use_async_io`  API in file_system with `SupportedOps` API which is used by underlying FileSystem to indicate to upper layers whether the FileSystem supports different operations introduced in `enum FSSupportedOps `. Right now operations are `async_io` and whether FS will provide its own buffer during reads or not. The api is changed to extend it to various FileSystem operations in one API rather than creating a separate API for each operation.

2. Provide support for underlying FS to pass their own buffer during Reads (async and sync read) instead of using RocksDB provided `scratch` (buffer) in `FSReadRequest`. Currently only MultiRead supports it and later will be extended to other reads as well (point lookup, scan etc). More details in how to enable in file_system.h

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11324

Test Plan: Tested locally

Reviewed By: anand1976

Differential Revision: D44465322

Pulled By: akankshamahajan15

fbshipit-source-id: 9ec9e08f839b5cc815e75d5dade6cd549998d0ec
2023-06-23 11:48:49 -07:00
Alexandre Lavigne 2926e0718c Add missing parameter in C API (#11542)
Summary:
The class `NewCompactOnDeletionCollectorFactory` exposes the parameter `delete_ratio`.

The C API `rocksdb_options_add_compact_on_deletion_collector_factory` does not allow a user to pass a delete ration to be passed down the the C++ class bellow.

The class has default value for the delete ratio which makes it pass the compilation and the tests.

closes https://github.com/facebook/rocksdb/issues/11541

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11542

Reviewed By: ajkr

Differential Revision: D46770908

Pulled By: cbi42

fbshipit-source-id: 7b5162fe459896052e392e2d85a8f6c01db3b464
2023-06-20 13:18:04 -07:00
mayue.fight fa878a0107 Support to create a CF by importing multiple non-overlapping CFs (#11378)
Summary:
The original Feature Request is from [https://github.com/facebook/rocksdb/issues/11317](https://github.com/facebook/rocksdb/issues/11317).
Flink uses rocksdb as the state backend,  all DB options are the same, and the keys of each DB instance are adjacent and there is no key overlap between two db instances.
In the Flink rescaling scenario, it is necessary to quickly split the DB according to a certain key range or quickly merge multiple DBs into one.

This PR is mainly used to quickly merge multiple DBs into one.

We hope to extend the function of `CreateColumnFamilyWithImports` to support creating ColumnFamily by importing multiple ColumnFamily with no overlapping keys.

The import logic is almost the same as `CreateColumnFamilyWithImport`, but it will check whether there is key overlap between CF when importing. The import will fail if there are key overlaps.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11378

Reviewed By: ajkr

Differential Revision: D46413709

Pulled By: cbi42

fbshipit-source-id: 846d0049fad11c59cf460fa846c345b26c658dfb
2023-06-15 12:25:04 -07:00
Ignat Loskutov 7c67aee4a0 statistics.cc: fix mistype (#11509)
Summary:
Add new tickers: `rocksdb.error.handler.bg.error.count`, `rocksdb.error.handler.bg.io.error.count`, `rocksdb.error.handler.bg.retryable.io.error.count` to replace the misspelled ones: `rocksdb.error.handler.bg.errro.count`, `rocksdb.error.handler.bg.io.errro.count`, `rocksdb.error.handler.bg.retryable.io.errro.count` ('error' instead of 'errro'). Users should switch to use the new tickers before 9.0 release as the misspelled old tickers will be completely removed then.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11509

Reviewed By: ltamasi

Differential Revision: D46542809

Pulled By: jowlyzhang

fbshipit-source-id: a2a6d8354af46a060de81d40ef6f5336a80bd32e
2023-06-09 13:25:57 -07:00
Peter Dillinger 8848ec92dd Better management of unreleased HISTORY (#11481)
Summary:
See new NOTE in HISTORY.md and unreleased_history/README.txt

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11481

Test Plan: some manual testing on my CentOS 8 system

Reviewed By: jaykorean

Differential Revision: D46233342

Pulled By: pdillinger

fbshipit-source-id: daf59cf3dc907f450b469090dcc481a30a7d7c0d
2023-05-30 16:42:49 -07:00