rocksdb/db/coalescing_iterator.h
Jay Huh d34712e0ac MultiCfIterator - AttributeGroupIter Impl & CoalescingIter Optimization (#12534)
Summary:
Continuing from the previous MultiCfIterator implementations (https://github.com/facebook/rocksdb/issues/12422, https://github.com/facebook/rocksdb/issues/12480, #12465), this PR completes the `AttributeGroupIterator` by implementing `AttributeGroupIteratorImpl::AddToAttributeGroups()`. While implementing the `AttributeGroupIterator`, we had to make some changes in `MultiCfIteratorImpl` and found an opportunity to improve `Coalesce()` in `CoalescingIterator`.

This PR also lifts the `UNDER CONSTRUCTION - DO NOT USE` comment, replacing it with `EXPERIMENTAL`.

Here are some implementation details:
- `IteratorAttributeGroups` is introduced to avoid having to copy all `WideColumn` objects during iteration.
- `PopulateIterator()` no longer advances non-top iterators that have the same key as the top iterator in the heap.
- As a consequence, `AdvanceIterator()` has to advance the non-top iterators when they have the same key as the top iterator in the heap.
- Instead of populating one by one, `PopulateIterator()` now collects all items with the same key and calls `populate_func(items)` at once.
- This allows an optimization in `Coalesce()`: instead of K-1 rounds of 2-way merges, it now does a single K-way merge.
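
The grouping step described in the list above can be sketched roughly as follows. This is a simplified illustration, not the code in `MultiCfIteratorImpl`: `ChildCursor`, `CursorGreater`, and `ForEachKeyGroup` are hypothetical names, each child is modeled as a vector of strictly increasing keys, `std::priority_queue` stands in for RocksDB's heap, and populating and advancing are folded into one loop for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one child iterator: a sorted run of keys plus a
// cursor. `order` is the position of the child in the input list.
struct ChildCursor {
  const std::vector<std::string>* keys;
  std::size_t pos;
  int order;
  const std::string& key() const { return (*keys)[pos]; }
};

// Makes std::priority_queue behave as a min-heap on (key, order).
struct CursorGreater {
  bool operator()(const ChildCursor& a, const ChildCursor& b) const {
    int c = a.key().compare(b.key());
    return c == 0 ? a.order > b.order : c > 0;
  }
};

// Visits each distinct key once. All children currently positioned on that
// key are collected first, then handed to `populate` in a single call.
// Assumes keys are strictly increasing within each child.
inline void ForEachKeyGroup(
    const std::vector<std::vector<std::string>>& children,
    const std::function<void(const std::string&, const std::vector<int>&)>&
        populate) {
  std::priority_queue<ChildCursor, std::vector<ChildCursor>, CursorGreater>
      heap;
  for (std::size_t i = 0; i < children.size(); ++i) {
    if (!children[i].empty()) {
      heap.push(ChildCursor{&children[i], 0, static_cast<int>(i)});
    }
  }
  while (!heap.empty()) {
    const std::string key = heap.top().key();
    std::vector<ChildCursor> same_key;
    // Collect every cursor that sits on the current smallest key.
    while (!heap.empty() && heap.top().key() == key) {
      same_key.push_back(heap.top());
      heap.pop();
    }
    std::vector<int> orders;
    for (const ChildCursor& c : same_key) orders.push_back(c.order);
    populate(key, orders);  // one call per key, with all items at once
    // Advance every cursor that contributed to this key.
    for (ChildCursor c : same_key) {
      if (++c.pos < c.keys->size()) heap.push(c);
    }
  }
}
```

With children `{"a","b"}`, `{"b","c"}`, and `{"a","c"}`, the callback runs three times, once per distinct key, and receives the orders of the children that hold that key.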

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12534

Test Plan:
Uncommented the assertions in `verifyAttributeGroupIterator()`

```
./multi_cf_iterator_test
```

Reviewed By: ltamasi

Differential Revision: D56089019

Pulled By: jaykorean

fbshipit-source-id: 6b0b4247e221f69b40b147d41492008cc9b15054
2024-04-16 08:45:38 -07:00

// Copyright (c) Meta Platforms, Inc. and affiliates.
// This source code is licensed under both the GPLv2 (found in the
// COPYING file in the root directory) and Apache 2.0 License
// (found in the LICENSE.Apache file in the root directory).
#pragma once

#include "db/multi_cf_iterator_impl.h"

namespace ROCKSDB_NAMESPACE {

// EXPERIMENTAL
class CoalescingIterator : public Iterator {
 public:
  CoalescingIterator(const Comparator* comparator,
                     const std::vector<ColumnFamilyHandle*>& column_families,
                     const std::vector<Iterator*>& child_iterators)
      : impl_(
            comparator, column_families, child_iterators, [this]() { Reset(); },
            [this](const autovector<MultiCfIteratorInfo>& items) {
              Coalesce(items);
            }) {}
  ~CoalescingIterator() override {}

  // No copy allowed
  CoalescingIterator(const CoalescingIterator&) = delete;
  CoalescingIterator& operator=(const CoalescingIterator&) = delete;

  bool Valid() const override { return impl_.Valid(); }
  void SeekToFirst() override { impl_.SeekToFirst(); }
  void SeekToLast() override { impl_.SeekToLast(); }
  void Seek(const Slice& target) override { impl_.Seek(target); }
  void SeekForPrev(const Slice& target) override { impl_.SeekForPrev(target); }
  void Next() override { impl_.Next(); }
  void Prev() override { impl_.Prev(); }
  Slice key() const override { return impl_.key(); }
  Status status() const override { return impl_.status(); }

  Slice value() const override {
    assert(Valid());
    return value_;
  }
  const WideColumns& columns() const override {
    assert(Valid());
    return wide_columns_;
  }

  void Reset() {
    value_.clear();
    wide_columns_.clear();
  }

 private:
  MultiCfIteratorImpl impl_;
  Slice value_;
  WideColumns wide_columns_;

  // A column paired with an integer `order` used to break ties between
  // columns that share a name.
  struct WideColumnWithOrder {
    const WideColumn* column;
    int order;
  };

  // Returns true when `a` sorts after `b`: by column name first, then by
  // `order` for equal names. Used with BinaryHeap to form MinHeap below.
  class WideColumnWithOrderComparator {
   public:
    explicit WideColumnWithOrderComparator() {}
    bool operator()(const WideColumnWithOrder& a,
                    const WideColumnWithOrder& b) const {
      int c = a.column->name().compare(b.column->name());
      return c == 0 ? a.order - b.order > 0 : c > 0;
    }
  };

  using MinHeap =
      BinaryHeap<WideColumnWithOrder, WideColumnWithOrderComparator>;

  void Coalesce(const autovector<MultiCfIteratorInfo>& items);
};
} // namespace ROCKSDB_NAMESPACE
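
To illustrate how a comparator of this shape supports a single K-way merge of wide columns, here is a rough standalone sketch. It is not the body of `CoalescingIterator::Coalesce()`, which lives in the `.cc` file and is not shown here. `Column`, `ColumnWithOrder`, `ColumnGreater`, and `CoalesceColumns` are hypothetical names, `std::priority_queue` stands in for `BinaryHeap`, and the rule that the entry with the highest `order` wins for a duplicated column name is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical wide column: (name, value).
using Column = std::pair<std::string, std::string>;

struct ColumnWithOrder {
  const Column* column;
  int order;  // index of the group the column came from
};

// Same shape as the comparator in the header: name first, then order.
// With std::priority_queue this puts the smallest (name, order) on top.
struct ColumnGreater {
  bool operator()(const ColumnWithOrder& a, const ColumnWithOrder& b) const {
    int c = a.column->first.compare(b.column->first);
    return c == 0 ? a.order > b.order : c > 0;
  }
};

// Merges the columns of all groups in one pass over a single heap.
// Assumption: for a duplicated name, the column from the later group wins.
inline std::vector<Column> CoalesceColumns(
    const std::vector<std::vector<Column>>& groups) {
  std::priority_queue<ColumnWithOrder, std::vector<ColumnWithOrder>,
                      ColumnGreater>
      heap;
  for (std::size_t i = 0; i < groups.size(); ++i) {
    for (const Column& col : groups[i]) {
      heap.push(ColumnWithOrder{&col, static_cast<int>(i)});
    }
  }
  std::vector<Column> out;
  while (!heap.empty()) {
    ColumnWithOrder cur = heap.top();
    heap.pop();
    if (!out.empty() && out.back().first == cur.column->first) {
      // Same name as the previous entry: it pops in increasing order, so
      // overwriting keeps the column from the latest group.
      out.back() = *cur.column;
    } else {
      out.push_back(*cur.column);
    }
  }
  return out;
}
```

Because equal names pop in increasing `order`, the duplicate handling only has to compare against the last emitted column, which keeps the merge to one pass regardless of how many groups contribute.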