rocksdb/table/merging_iterator.h
Changyu Bi 749b849a34 Fix memtable-only iterator regression (#10705)
Summary:
when there is a single memtable without range tombstones and no SST files in the database, DBIter should wrap memtable iterator directly. Currently we create a merging iterator on top of the memtable iterator, and have DBIter wrap around it. This causes iterator regression and this PR fixes this issue.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10705

Test Plan:
- `make check`
- Performance:
  - Set up: `./db_bench -benchmarks=filluniquerandom -write_buffer_size=$((1 << 30)) -num=10000`
  - Benchmark: `./db_bench -benchmarks=seekrandom -use_existing_db=true -avoid_flush_during_recovery=true -write_buffer_size=$((1 << 30)) -num=10000 -threads=16 -duration=60 -seek_nexts=$seek_nexts`
```
seek_nexts    main op/sec    https://github.com/facebook/rocksdb/issues/10705      RocksDB v7.6
0             5746568        5749033     5786180
30            2411690        3006466     2837699
1000          102556         128902      124667
```

Reviewed By: ajkr

Differential Revision: D39644221

Pulled By: cbi42

fbshipit-source-id: 8063ff611ba31b0e5670041da3927c8c54b2097d
2022-09-21 09:49:31 -07:00

92 lines
3.8 KiB
C++

// Copyright (c) 2011-present, Facebook, Inc. All rights reserved.
// This source code is licensed under both the GPLv2 (found in the
// COPYING file in the root directory) and Apache 2.0 License
// (found in the LICENSE.Apache file in the root directory).
//
// Copyright (c) 2011 The LevelDB Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file. See the AUTHORS file for names of contributors.
#pragma once
#include "db/range_del_aggregator.h"
#include "rocksdb/slice.h"
#include "rocksdb/types.h"
namespace ROCKSDB_NAMESPACE {
class Arena;
class ArenaWrappedDBIter;
class InternalKeyComparator;
template <class TValue>
class InternalIteratorBase;
using InternalIterator = InternalIteratorBase<Slice>;
// Return an iterator that provided the union of the data in
// children[0,n-1]. Takes ownership of the child iterators and
// will delete them when the result iterator is deleted.
//
// The result does no duplicate suppression. I.e., if a particular
// key is present in K child iterators, it will be yielded K times.
//
// REQUIRES: n >= 0
extern InternalIterator* NewMergingIterator(
const InternalKeyComparator* comparator, InternalIterator** children, int n,
Arena* arena = nullptr, bool prefix_seek_mode = false);
class MergingIterator;
// A builder class to build a merging iterator by adding iterators one by one.
// User should call only one of AddIterator() or AddPointAndTombstoneIterator()
// exclusively for the same builder.
class MergeIteratorBuilder {
public:
// comparator: the comparator used in merging comparator
// arena: where the merging iterator needs to be allocated from.
explicit MergeIteratorBuilder(const InternalKeyComparator* comparator,
Arena* arena, bool prefix_seek_mode = false);
~MergeIteratorBuilder();
// Add iter to the merging iterator.
void AddIterator(InternalIterator* iter);
// Add a point key iterator and a range tombstone iterator.
// `tombstone_iter_ptr` should and only be set by LevelIterator.
// *tombstone_iter_ptr will be set to where the merging iterator stores
// `tombstone_iter` when MergeIteratorBuilder::Finish() is called. This is
// used by LevelIterator to update range tombstone iters when switching to a
// different SST file. If a single point iterator with a nullptr range
// tombstone iterator is provided, and the point iterator is not a level
// iterator, then this builder will return the point iterator directly,
// instead of creating a merging iterator on top of it. Internally, if all
// point iterators are not LevelIterator, then range tombstone iterator is
// only added to the merging iter if there is a non-null `tombstone_iter`.
void AddPointAndTombstoneIterator(
InternalIterator* point_iter, TruncatedRangeDelIterator* tombstone_iter,
TruncatedRangeDelIterator*** tombstone_iter_ptr = nullptr);
// Get arena used to build the merging iterator. It is called one a child
// iterator needs to be allocated.
Arena* GetArena() { return arena; }
// Return the result merging iterator.
// If db_iter is not nullptr, then db_iter->SetMemtableRangetombstoneIter()
// will be called with pointer to where the merging iterator
// stores the memtable range tombstone iterator.
// This is used for DB iterator to refresh memtable range tombstones.
InternalIterator* Finish(ArenaWrappedDBIter* db_iter = nullptr);
private:
MergingIterator* merge_iter;
InternalIterator* first_iter;
bool use_merging_iter;
Arena* arena;
// Used to set LevelIterator.range_tombstone_iter_.
// See AddRangeTombstoneIterator() implementation for more detail.
std::vector<std::pair<size_t, TruncatedRangeDelIterator***>>
range_del_iter_ptrs_;
};
} // namespace ROCKSDB_NAMESPACE