Summary:
Pull Request resolved: https://github.com/facebook/rocksdb/pull/12630
The patch cleans up, improves, and brings into sync (to the extent possible without API signature changes) the sanity checks around the `GetEntity` / `MultiGetEntity` family of APIs, including the read-your-own-writes (`WriteBatchWithIndex`) and transaction layers. The checks are centralized in two main sets of entry points, namely in `DB(Impl)` and the "main" `GetEntityFromBatchAndDB` / `MultiGetEntityFromBatchAndDB` overloads in `WriteBatchWithIndex`. This eliminates the need to duplicate the checks in the transaction classes.
Reviewed By: jaykorean
Differential Revision: D57125741
fbshipit-source-id: 4dd059ef644a9b173fbba767538943397e4cc6cd
Summary:
Pull Request resolved: https://github.com/facebook/rocksdb/pull/12557
Unlike for other sequence containers, the C++ standard allows moving an `std::string` to invalidate pointers/iterators/references. In practice, this happens with short strings which are stored "inline" in the `std::string` object (small string optimization). Since `PinnableSlice` uses `std::string` as its internal buffer, and `PinnableWideColumns` in turn is implemented in terms of `PinnableSlice`, this means that the default compiler-generated move operations can invalidate the column index stored in `PinnableWideColumns::columns_`. The PR fixes this by providing custom move constructor/move assignment implementations for `PinnableWideColumns` that recreate the `columns_` index upon move.
Reviewed By: jaykorean
Differential Revision: D56275054
fbshipit-source-id: e8648c003dbcf1c39ec122ad229780c28138e730
Summary:
- Add missing null check for ColumnFamilyHandle in `GetEntity()`
- `FailIfCfHasTs()` now returns `Status::InvalidArgument()` if `column_family` is null. `MultiGetEntity()` can rely on this for cfh null check.
- Added `DeleteRange` API using Default Column Family to be consistent with other major APIs (This was also causing Java Test failure after the `FailIfCfHasTs()` change)
Pull Request resolved: https://github.com/facebook/rocksdb/pull/12057
Test Plan:
- Updated `DBWideBasicTest::GetEntityAsPinnableAttributeGroups` to include null CF case
- Updated `DBWideBasicTest::MultiCFMultiGetEntityAsPinnableAttributeGroups` to include null CF case
Reviewed By: jowlyzhang
Differential Revision: D51167445
Pulled By: jaykorean
fbshipit-source-id: 1c1e44fd7b7df4d2dc3bb2d7d251da85bad7d664
Summary:
Implementation of `GetEntity()` API that returns wide-column entities as AttributeGroups from multiple column families for a single key. Regarding the definition of Attribute groups, please see the detailed example description in PR https://github.com/facebook/rocksdb/issues/11925
Pull Request resolved: https://github.com/facebook/rocksdb/pull/11943
Test Plan:
- `DBWideBasicTest::GetEntityAsPinnableAttributeGroups` added
will enable the new API in the `db_stress` after merging
Reviewed By: ltamasi
Differential Revision: D50195794
Pulled By: jaykorean
fbshipit-source-id: 218d54841ac7e337de62e13b1233b0a99bd91af3
Summary:
Introducing the notion of AttributeGroup by adding the `MultiGetEntity()` API retrieving `PinnableAttributeGroups`.
An "attribute group" refers to a logical grouping of wide-column entities within RocksDB. These attribute groups are implemented using column families.
Users can store WideColumns in different CFs for various reasons (e.g. similar access patterns, same types, etc.). This new API `MultiGetEntity()` takes keys and `PinnableAttributeGroups` per key. `PinnableAttributeGroups` is just a list of `PinnableAttributeGroup`s in which we have `ColumnFamilyHandle*`, `Status`, and `PinnableWideColumns`.
Let's say a user stored "hot" wide columns in column family "hot_data_cf" and "cold" wide columns in column family "cold_data_cf" and all other columns in "common_cf".
Prior to this PR, if the user wants to query for two keys, "key_1" and "key_2" and but only interested in "common_cf" and "hot_data_cf" for "key_1", and "common_cf" and "cold_data_cf" for "key_2", the user would have to construct input like `keys = ["key_1", "key_1", "key_2", "key_2"]`, `column_families = ["common_cf", "hot_data_cf", "common_cf", "cold_data_cf"]` and get the flat list of `PinnableWideColumns` to find the corresponding <key,CF> combo.
With the new `MultiGetEntity()` introduced in this PR, users can now query only `["common_cf", "hot_data_cf"]` for `"key_1"`, and only `["common_cf", "cold_data_cf"]` for `"key_2"`. The user will get `PinnableAttributeGroups` for each key, and `PinnableAttributeGroups` gives a list of `PinnableAttributeGroup`s where the user can find column family and corresponding `PinnableWideColumns` and the `Status`.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/11925
Test Plan:
- `DBWideBasicTest::MultiCFMultiGetEntityAsPinnableAttributeGroups` added
will enable this new API in the `db_stress` in a separate PR
Reviewed By: ltamasi
Differential Revision: D50017414
Pulled By: jaykorean
fbshipit-source-id: 643611d1273c574bc81b94c6f5aeea24b40c4586
Summary:
Pull Request resolved: https://github.com/facebook/rocksdb/pull/11896
The patch extends the test coverage of the wide column aware merge logic by adding two new tests that perform general transformations during merge by implementing the `FullMergeV3` interface. The first one uses a merge operator that produces a wide-column entity as result in all cases (i.e. even if the base value is a plain key-value, or if there is no base value). The second one uses a merge operator that results in a plain key-value in all cases.
Reviewed By: jaykorean
Differential Revision: D49665946
fbshipit-source-id: 419b9e557c064525b659685eb8c09ae446656439
Summary:
`GetEntity` API support for ReadOnly DB and Secondary DB.
- Introduced `GetImpl()` with `GetImplOptions` in `db_impl_readonly` and refactored current `Get()` logic into `GetImpl()` so that look up logic can be reused for `GetEntity()` (Following the same pattern as `DBImpl::Get()` and `DBImpl::GetEntity()`)
- Introduced `GetImpl()` with `GetImplOptions` in `db_impl_secondary` and refactored current `GetImpl()` logic. This is to make `DBImplSecondary::Get/GetEntity` consistent with `DBImpl::Get/GetEntity` and `DBImplReadOnly::Get/GetEntity`
- `GetImpl()` in `db_impl` is now virtual. both `db_impl_readonly` and `db_impl_secondary`'s `Get()` override are no longer needed since all three dbs now have the same `Get()` which calls `GetImpl()` internally.
- `GetImpl()` in `DBImplReadOnly` and `DBImplSecondary` now pass in `columns` instead of `nullptr` in lookup functions like `memtable->get()`
- Introduced `GetEntity()` API in `DBImplReadOnly` and `DBImplSecondary` which simply calls `GetImpl()` with `columns` set in `GetImplOptions`.
- Introduced `Env::IOActivity::kGetEntity` and set read_options.io_activity to `Env::IOActivity::kGetEntity` for `GetEntity()` operations (in db_impl)
Pull Request resolved: https://github.com/facebook/rocksdb/pull/11799
Test Plan:
**Unit Tests**
- Added verification in `DBWideBasicTest::PutEntity` by Reopening DB as ReadOnly with the same setup.
- Added verification in `DBSecondaryTest::ReopenAsSecondary` by calling `PutEntity()` and `GetEntity()` on top of existing `Put()` and `Get()`
- `make -j64 check`
**Crash Tests**
- `python3 tools/db_crashtest.py blackbox --max_key=25000000 --write_buffer_size=4194304 --max_bytes_for_level_base=2097152 --target_file_size_base=2097152 --periodic_compaction_seconds=0 --use_put_entity_one_in=10 --use_get_entity=1 --duration=60 --inter
val=10`
- `python3 tools/db_crashtest.py blackbox --simple --max_key=25000000 --write_buffer_size=4194304 --max_bytes_for_level_base=2097152 --target_file_size_base=2097152 --periodic_compaction_seconds=0 --use_put_entity_one_in=10 --use_get_entity=1 `
- `python3 tools/db_crashtest.py blackbox --cf_consistency --max_key=25000000 --write_buffer_size=4194304 --max_bytes_for_level_base=2097152 --target_file_size_base=2097152 --periodic_compaction_seconds=0 --use_put_entity_one_in=10 --use_get_entity=1 --duration=60 --inter
val=10`
Reviewed By: ltamasi
Differential Revision: D49037040
Pulled By: jaykorean
fbshipit-source-id: a0648253ded6e91af7953de364ed3c6bf163626b
Summary:
The new `MultiGetEntity` API can be used to get a consistent view of
a batch of keys, with the results presented as wide-column entities.
Similarly to `GetEntity` and the iterator's `columns` API, if the entry
corresponding to the key is a wide-column entity to start with, it is
returned as-is, and if it is a plain key-value, it is wrapped into an entity
with a single default column.
Implementation-wise, the new API shares the logic of the batched `MultiGet`
API (via the `MultiGetCommon` methods). Both single-CF and multi-CF
`MultiGetEntity` APIs are provided, and blobs are also supported.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/11222
Test Plan: `make check`
Reviewed By: akankshamahajan15
Differential Revision: D43256950
Pulled By: ltamasi
fbshipit-source-id: 47fb2cb7e2d0470e3580f43fdb2fe9e51f0e7005
Summary:
The patch adds compaction filter support for wide-column entities by introducing
a new `CompactionFilter` API called `FilterV3`. This API is called for regular
key-values, merge operands, and wide-column entities as well. It is passed the
existing value/operand or wide-column structure and it can update the value or
columns or keep/delete/etc. the key-value as usual. For compatibility, the default
implementation of `FilterV3` keeps all wide-column entities and falls back to calling
`FilterV2` for plain old key-values and merge operands.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/11196
Test Plan: `make check`
Reviewed By: akankshamahajan15
Differential Revision: D43094147
Pulled By: ltamasi
fbshipit-source-id: 75acabe9a35254f7f404ba6173ee9c2774382ebd
Summary:
The patch extends the compaction logic to handle `Merge`s in conjunction with wide-column entities. As usual, the merge operation is applied to the anonymous default column, and any other columns are unaffected.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/10946
Test Plan: `make check`
Reviewed By: riversand963
Differential Revision: D41233722
Pulled By: ltamasi
fbshipit-source-id: dfd9b1362222f01bafcecb139eb48480eb279fed
Summary:
The patch adds `Merge` support for wide-column entities in `DBIter`. As before, the `Merge` operation is applied to the default column of the entity; any other columns are unchanged. As a small cleanup, the PR also changes the signature of `DBIter::Merge` to simply return a boolean instead of the `Merge` operation's `Status` since the actual `Status` is already stored in a member variable.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/10941
Test Plan: `make check`
Reviewed By: riversand963
Differential Revision: D41195471
Pulled By: ltamasi
fbshipit-source-id: 362cf555897296e252c3de5ddfbd569ef34f85ef
Summary:
The patch adds `Merge` support for wide-column entities to the point lookup
APIs, i.e. `Get`, `MultiGet`, `GetEntity`, and `GetMergeOperands`. (I plan to
update the iterator and compaction logic in separate PRs.) In terms of semantics,
the `Merge` operation is applied to the default (anonymous) column; any other
columns in the entity are unaffected.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/10916
Test Plan: `make check`
Reviewed By: riversand963
Differential Revision: D40962311
Pulled By: ltamasi
fbshipit-source-id: 244bc9d172be1af2f204796b2f89104e4d2fa373
Summary:
The PR fixes the handling of `Merge`s in `GetEntity`. Note that `Merge` is not yet
supported for wide-column entities written using `PutEntity`; this change is
about returning correct (i.e. consistent with `Get`) results in cases like when the
base value is a plain old key-value written using `Put` or when there is no real base
value because we hit either a tombstone or the beginning of history.
Implementation-wise, the patch introduces a new wrapper around the existing
`MergeHelper::TimedFullMerge` that can store the merge result in either a string
(for the purposes of `Get`) or a `PinnableWideColumns` instance (for `GetEntity`).
Pull Request resolved: https://github.com/facebook/rocksdb/pull/10894
Test Plan: `make check`
Reviewed By: riversand963
Differential Revision: D40782708
Pulled By: ltamasi
fbshipit-source-id: 3d700d56b2ef81f02ba1e2d93f6481bf13abcc90
Summary:
The patch extends the iterator API with a new `columns` method which
can be used to retrieve all wide columns for the current key. Similarly to
the `Get` and `GetEntity` APIs, the classic `value` API returns the value
of the default (anonymous) column for wide-column entities, and `columns`
returns an entity with a single default column for plain old key-values.
(The goal here is to maintain the invariant that `value()` is the same as
the value of the default column in `columns()`.) The patch also involves a
smaller refactoring: historically, `value()` was implemented using a bunch
of conditions, that is, the `Slice` to be returned was decided based on the
direction of the iteration, whether a merge had been done etc. when the
method was called; with the patch, the value to be exposed is stored in a
member `Slice value_` when the iterator lands on a new key, and `value()`
simply returns this `Slice`.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/10670
Test Plan: Ran `make check` and a simple blackbox crash test.
Reviewed By: riversand963
Differential Revision: D39475551
Pulled By: ltamasi
fbshipit-source-id: 29e7a6ed9ef340841aab36803b832b7c8f668b0b
Summary:
The patch adds a new API `GetEntity` that can be used to perform
wide-column point lookups. It also extends the `Get` code path and
the `MemTable` / `MemTableList` and `Version` / `GetContext` logic
accordingly so that wide-column entities can be served from both
memtables and SSTs. If the result of a lookup is a wide-column entity
(`kTypeWideColumnEntity`), it is passed to the application in deserialized
form; if it is a plain old key-value (`kTypeValue`), it is presented as a
wide-column entity with a single default (anonymous) column.
(In contrast, regular `Get` returns plain old key-values as-is, and
returns the value of the default column for wide-column entities, see
https://github.com/facebook/rocksdb/issues/10483 .)
The result of `GetEntity` is a self-contained `PinnableWideColumns` object.
`PinnableWideColumns` contains a `PinnableSlice`, which either stores the
underlying data in its own buffer or holds on to a cache handle. It also contains
a `WideColumns` instance, which indexes the contents of the `PinnableSlice`,
so applications can access the values of columns efficiently.
There are several pieces of functionality which are currently not supported
for wide-column entities: there is currently no `MultiGetEntity` or wide-column
iterator; also, `Merge` and `GetMergeOperands` are not supported, and there
is no `GetEntity` implementation for read-only and secondary instances.
We plan to implement these in future PRs.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/10540
Test Plan: `make check`
Reviewed By: akankshamahajan15
Differential Revision: D38847474
Pulled By: ltamasi
fbshipit-source-id: 42311a34ccdfe88b3775e847a5e2a5296e002b5b
Summary:
The patch adds support for wide-column entities to the existing query
APIs (`Get`, `MultiGet`, and iterator). Namely, when during a query a
wide-column entity is encountered, we will return the value of the default
(anonymous) column as the result. Later, we plan to add wide-column
specific query APIs which will enable retrieving entire wide-column entities
or a subset of their columns.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/10483
Test Plan: `make check`
Reviewed By: riversand963
Differential Revision: D38441881
Pulled By: ltamasi
fbshipit-source-id: 6444e79a31aff2470e866698e3a97985bc2b3543
Summary:
The patch builds on https://github.com/facebook/rocksdb/pull/9915 and adds
a new API called `PutEntity` that can be used to write a wide-column entity
to the database. The new API is added to both `DB` and `WriteBatch`. Note
that currently there is no way to retrieve these entities; more precisely, all
read APIs (`Get`, `MultiGet`, and iterator) return `NotSupported` when they
encounter a wide-column entity that is required to answer a query. Read-side
support (as well as other missing functionality like `Merge`, compaction filter,
and timestamp support) will be added in later PRs.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/10242
Test Plan: `make check`
Reviewed By: riversand963
Differential Revision: D37369748
Pulled By: ltamasi
fbshipit-source-id: 7f5e412359ed7a400fd80b897dae5599dbcd685d