Commit Graph

8 Commits

Author SHA1 Message Date
Yu Zhang 15053f3ab4 Logically strip timestamp during flush (#11557)
Summary:
Logically strip the user-defined timestamp when L0 files are created during flush when `AdvancedColumnFamilyOptions.persist_user_defined_timestamps` is false. Logically stripping timestamp here means replacing the original user-defined timestamp with a mininum timestamp, which for now is hard coded to be all zeros bytes.

While working on this, I caught a missing piece on the `BlockBuilder` level for this feature. The current quick path `std::min(buffer_size, last_key_size)` needs a bit tweaking to work for this feature. When user-defined timestamp is stripped during block building, on writing first entry or right after resetting, `buffer` is empty and `buffer_size` is zero as usual. However, in follow-up writes, depending on the size of the stripped user-defined timestamp, and the size of the value, what's in `buffer` can sometimes be smaller than `last_key_size`, leading `std::min(buffer_size, last_key_size)` to truncate the `last_key`. Previous test doesn't caught the bug because in those tests, the size of the stripped user-defined timestamps bytes is smaller than the length of the value. In order to avoid the conditional operation, this PR changed the original trivial `std::min` operation into an arithmetic operation. Since this is a change in a hot and performance critical path, I did the following benchmark to check no observable regression is introduced.
```TEST_TMPDIR=/dev/shm/rocksdb1 ./db_bench -benchmarks=fillseq -memtablerep=vector -allow_concurrent_memtable_write=false -num=50000000```
Compiled with DEBUG_LEVEL=0
Test vs. control runs simulaneous for better accuracy, units = ops/sec
                       PR  vs base:
Round 1: 350652 vs 349055
Round 2: 365733 vs 364308
Round 3: 355681 vs 354475

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11557

Test Plan:
New timestamp specific test added or existing tests augmented, both are parameterized with `UserDefinedTimestampTestMode`:
`UserDefinedTimestampTestMode::kNormal` -> UDT feature enabled, write / read with min timestamp
`UserDefinedTimestampTestMode::kStripUserDefinedTimestamps` -> UDT feature enabled, write / read with min timestamp, set Options.persist_user_defined_timestamps to false.

```
make all check
./db_wal_test --gtest_filter="*WithTimestamp*"
./flush_job_test --gtest_filter="*WithTimestamp*"
./repair_test --gtest_filter="*WithTimestamp*"
./block_based_table_reader_test
```

Reviewed By: pdillinger

Differential Revision: D47027664

Pulled By: jowlyzhang

fbshipit-source-id: e729193b6334dfc63aaa736d684d907a022571f5
2023-06-29 15:50:50 -07:00
Yu Zhang d1ae7f6c41 Add support to strip / pad timestamp when writing / reading a block (#11472)
Summary:
This patch adds support in `BlockBuilder` to strip user-defined timestamp from the `key` added via `Add(key, value)` and its equivalent APIs. The stripping logic is different when the key is either a user key or an internal key, so the `BlockBuilder` is created with a flag to indicate that. This patch also add support on the read path to APIs `NewIndexIterator`, `NewDataIterator` to support pad a min timestamp.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11472

Test Plan:
Three test modes are added to parameterize existing tests:
UserDefinedTimestampTestMode::kNone -> UDT feature is not enabled
UserDefinedTimestampTestMode::kNormal -> UDT feature enabled, write / read with min timestamp
UserDefinedTimestampTestMode::kStripUserDefinedTimestamps -> UDT feature enabled, write / read with min timestamp, set `persist_user_defined_timestamps` where it applies to false.
The tests read/write with min timestamp so that point read and range scan can correctly read values in all three test modes.

`block_test` are parameterized to run with above three test modes and some additional parameteriazation

```
make all check
./block_test --gtest_filter="P/BlockTest*"
./block_test --gtest_filter="P/IndexBlockTest*"
```

Reviewed By: ltamasi

Differential Revision: D46200539

Pulled By: jowlyzhang

fbshipit-source-id: 59f5d6b584639976b69c2943eba723bd47d9b3c0
2023-05-25 15:41:32 -07:00
Andrew Kryczka 0a774a102f Clarify `SstFileWriter::DeleteRange()` ordering requirements (#11390)
Summary:
As titled

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11390

Reviewed By: cbi42

Differential Revision: D45148830

Pulled By: ajkr

fbshipit-source-id: 9a8dfd040514bae3d8ed9e97a79cae7683f2749a
2023-04-20 13:02:16 -07:00
anand76 727bad78b8 Format files under table/ by clang-format (#10852)
Summary:
Run clang-format on files under the `table` directory.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10852

Reviewed By: ajkr

Differential Revision: D40650732

Pulled By: anand1976

fbshipit-source-id: 2023a958e37fd6274040c5181130284600c9e0ef
2022-10-25 11:50:38 -07:00
Peter Dillinger b234a3f569 Improve data block construction performance (#9040)
Summary:
... by bypassing tracking of last_key in BlockBuilder when
last_key is already known (for BlockBasedTableBuilder::data_block).

I tried extracting a base class of BlockBuilder without the last_key
tracking at all, but that became complicated by NewFlushBlockPolicy() in
the public API referencing BlockBuilder, which would need to be the base
class, and I don't want to replace nearly all the internal references to
BlockBuilder.

Possible follow-up:
* Investigate / consider using AddWithLastKey in more places

This improvement should stack with https://github.com/facebook/rocksdb/issues/9039

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9040

Test Plan:
TEST_TMPDIR=/dev/shm/rocksdb1 ./db_bench -benchmarks=fillseq -memtablerep=vector -allow_concurrent_memtable_write=false -num=50000000
Compiled with DEBUG_LEVEL=0
Test vs. control runs simulaneous for better accuracy, units = ops/sec

Run 1: 278929 vs. 267799 (+4.2%)
Run 2: 281836 vs. 267432 (+5.4%)
Run 3: 278279 vs. 270454 (+2.9%)

(This benchmark is chosen to have detectable signal-to-noise, not to
represent expected improvement percent on real workloads.)

Reviewed By: mrambacher

Differential Revision: D31706033

Pulled By: pdillinger

fbshipit-source-id: 8a50fe6fefdd67b6d7665ffa687bbdcf5ad0d5ec
2021-10-19 12:36:21 -07:00
Ziyue Yang 03a781a90c Add pipelined & parallel compression optimization (#6262)
Summary:
This PR adds support for pipelined & parallel compression optimization for `BlockBasedTableBuilder`. This optimization makes block building, block compression and block appending a pipeline, and uses multiple threads to accelerate block compression. Users can set `CompressionOptions::parallel_threads` greater than 1 to enable compression parallelism.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6262

Reviewed By: ajkr

Differential Revision: D20651306

fbshipit-source-id: 62125590a9c15b6d9071def9dc72589c1696a4cb
2020-04-01 16:40:18 -07:00
sdong fdf882ded2 Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433)
Summary:
When dynamically linking two binaries together, different builds of RocksDB from two sources might cause errors. To provide a tool for user to solve the problem, the RocksDB namespace is changed to a flag which can be overridden in build time.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6433

Test Plan: Build release, all and jtest. Try to build with ROCKSDB_NAMESPACE with another flag.

Differential Revision: D19977691

fbshipit-source-id: aa7f2d0972e1c31d75339ac48478f34f6cfcfb3e
2020-02-20 12:09:57 -08:00
Vijay Nadimpalli 50e470791d Organizing rocksdb/table directory by format
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/5373

Differential Revision: D15559425

Pulled By: vjnadimpalli

fbshipit-source-id: 5d6d6d615582bedd96a4b879bb25d429a6de8b55
2019-05-30 14:51:11 -07:00
Renamed from table/block_builder.h (Browse further)