Timestamp and TTL Wrapper for rocksdb
Summary:
When opened with DBTimestamp::Open call, timestamps are prepended to and stripped from the value during subsequent Put and Get calls respectively. The Timestamp is used to discard values in Get and custom compaction filter which have exceeded their TTL which is specified during Open.
Have made a temporary change to Makefile to let us test with the temporary file TestTime.cc. Have also changed the private members of db_impl.h to protected to let them be inherited by the new class DBTimestamp
Test Plan: make db_timestamp; TestTime.cc (will not check it in) shows how to use the APIs currently, but I will write unit tests shortly
Reviewers: dhruba, vamsi, haobo, sheki, heyongqiang, vkrest
Reviewed By: vamsi
CC: zshao, xjin, vkrest, MarkCallaghan
Differential Revision: https://reviews.facebook.net/D10311
2013-04-15 20:33:13 +00:00
// Copyright (c) 2011 The LevelDB Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file. See the AUTHORS file for names of contributors.

#ifndef LEVELDB_UTILITIES_TTL_DB_TTL_H_
#define LEVELDB_UTILITIES_TTL_DB_TTL_H_

#include "leveldb/db.h"
#include "leveldb/env.h"
#include "leveldb/compaction_filter.h"
#include "leveldb/merge_operator.h"
#include "utilities/utility_db.h"
#include "db/db_impl.h"

namespace leveldb {

class DBWithTTL : public StackableDB {
 public:
  DBWithTTL(const int32_t ttl,
            const Options& options,
            const std::string& dbname,
            Status& st,
            bool read_only);

  virtual ~DBWithTTL();

  virtual Status Put(const WriteOptions& o,
                     const Slice& key,
                     const Slice& val);

  virtual Status Get(const ReadOptions& options,
                     const Slice& key,
                     std::string* value);

  virtual std::vector<Status> MultiGet(const ReadOptions& options,
                                       const std::vector<Slice>& keys,
                                       std::vector<std::string>* values);

  virtual bool KeyMayExist(ReadOptions& options,
                           const Slice& key,
                           std::string* value,
                           bool* value_found = nullptr);

  virtual Status Delete(const WriteOptions& wopts, const Slice& key);

  virtual Status Merge(const WriteOptions& options,
                       const Slice& key,
                       const Slice& value);

  virtual Status Write(const WriteOptions& opts, WriteBatch* updates);

  virtual Iterator* NewIterator(const ReadOptions& opts);

  virtual const Snapshot* GetSnapshot();

  virtual void ReleaseSnapshot(const Snapshot* snapshot);

  virtual bool GetProperty(const Slice& property, std::string* value);

  virtual void GetApproximateSizes(const Range* r, int n, uint64_t* sizes);

  virtual void CompactRange(const Slice* begin, const Slice* end,
                            bool reduce_level = false);

  virtual int NumberLevels();

  virtual int MaxMemCompactionLevel();

  virtual int Level0StopWriteTrigger();

  virtual Status Flush(const FlushOptions& fopts);

  virtual Status DisableFileDeletions();

  virtual Status EnableFileDeletions();

  virtual Status GetLiveFiles(std::vector<std::string>& vec, uint64_t* mfs);

  virtual SequenceNumber GetLatestSequenceNumber();

  virtual Status GetUpdatesSince(SequenceNumber seq_number,
                                 unique_ptr<TransactionLogIterator>* iter);

  // Simulate a db crash, no elegant closing of database.
  void TEST_Destroy_DBWithTtl();

  virtual DB* GetRawDB() {
    return db_;
  }

  static bool IsStale(const Slice& value, int32_t ttl);

  static Status AppendTS(const Slice& val, std::string& val_with_ts);

  static Status SanityCheckTimestamp(const Slice& str);

  static Status StripTS(std::string* str);

  static Status GetCurrentTime(int32_t& curtime);

  static const uint32_t kTSLength = sizeof(int32_t);  // size of timestamp

  static const int32_t kMinTimestamp = 1368146402;  // 05/09/2013:5:40PM GMT-7

  static const int32_t kMaxTimestamp = 2147483647;  // 01/18/2038:7:14PM GMT-8

 private:
  DB* db_;
  int32_t ttl_;
  unique_ptr<MergeOperator> ttl_merge_op_;
  unique_ptr<CompactionFilter> ttl_comp_filter_;
};
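
The static helpers declared in DBWithTTL (AppendTS, StripTS, IsStale) suggest that each stored value carries a fixed-width timestamp suffix of kTSLength bytes. The following is a minimal standalone sketch of that idea, not the RocksDB implementation. The names AppendTimestamp, ReadTimestamp, StripTimestamp and IsStaleSketch are hypothetical, the little-endian byte order is an assumption, and so is the rule that a non-positive TTL means "never expires".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for DBWithTTL::kTSLength.
static const uint32_t kTSLen = sizeof(int32_t);

// Append a fixed-width timestamp (assumed little-endian) after the payload.
inline std::string AppendTimestamp(const std::string& val, int32_t ts) {
  std::string out = val;
  const uint32_t u = static_cast<uint32_t>(ts);
  for (uint32_t i = 0; i < kTSLen; ++i) {
    out.push_back(static_cast<char>((u >> (8 * i)) & 0xff));
  }
  return out;
}

// Decode the timestamp held in the last kTSLen bytes.
// The caller must ensure val_with_ts.size() >= kTSLen.
inline int32_t ReadTimestamp(const std::string& val_with_ts) {
  const size_t off = val_with_ts.size() - kTSLen;
  uint32_t u = 0;
  for (uint32_t i = 0; i < kTSLen; ++i) {
    const unsigned char byte =
        static_cast<unsigned char>(val_with_ts[off + i]);
    u |= static_cast<uint32_t>(byte) << (8 * i);
  }
  return static_cast<int32_t>(u);
}

// Remove the timestamp suffix in place; false if the value is too short.
inline bool StripTimestamp(std::string* val) {
  if (val->size() < kTSLen) {
    return false;
  }
  val->erase(val->size() - kTSLen);
  return true;
}

// A value is treated as stale once (timestamp + ttl) lies before `now`.
// Assumptions: ttl <= 0 disables expiry; malformed values are not stale.
inline bool IsStaleSketch(const std::string& val_with_ts, int32_t ttl,
                          int32_t now) {
  if (ttl <= 0 || val_with_ts.size() < kTSLen) {
    return false;
  }
  const int64_t expiry =
      static_cast<int64_t>(ReadTimestamp(val_with_ts)) + ttl;
  return expiry < now;
}
```

Passing the current time in as a parameter, rather than reading a clock inside the helper, keeps the staleness check deterministic and easy to exercise in a test.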

class TtlIterator : public Iterator {

 public:
  explicit TtlIterator(Iterator* iter)
    : iter_(iter) {
    assert(iter_);
  }

  ~TtlIterator() {
    delete iter_;
  }

  bool Valid() const {
    return iter_->Valid();
  }

  void SeekToFirst() {
    iter_->SeekToFirst();
  }

  void SeekToLast() {
    iter_->SeekToLast();
  }

  void Seek(const Slice& target) {
    iter_->Seek(target);
  }

  void Next() {
    iter_->Next();
  }

  void Prev() {
    iter_->Prev();
  }

  Slice key() const {
    return iter_->key();
  }

  int32_t timestamp() const {
    return DecodeFixed32(
      iter_->value().data() + iter_->value().size() - DBWithTTL::kTSLength);
  }

  Slice value() const {
    // TODO: handle timestamp corruption like in general iterator semantics
    assert(DBWithTTL::SanityCheckTimestamp(iter_->value()).ok());
    Slice trimmed_value = iter_->value();
    trimmed_value.size_ -= DBWithTTL::kTSLength;
    return trimmed_value;
  }

  Status status() const {
    return iter_->status();
  }

 private:
  Iterator* iter_;
};
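
TtlIterator::value() hides the timestamp by shrinking the view over the stored bytes rather than copying them. The sketch below illustrates that trimming step in isolation. ByteView is a hypothetical stand-in for Slice, and returning an empty view for a too-short input is an assumption made here; the class above asserts instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical non-owning view over bytes, standing in for Slice.
struct ByteView {
  const char* data;
  size_t size;
  std::string ToString() const { return std::string(data, size); }
};

// Hypothetical stand-in for DBWithTTL::kTSLength.
static const size_t kTimestampLen = sizeof(int32_t);

// Return a view of the user payload only. No bytes are copied; the view
// simply ends kTimestampLen bytes earlier than the stored value does.
// Assumption: a value shorter than the suffix yields an empty view.
inline ByteView TrimTimestamp(ByteView stored) {
  if (stored.size < kTimestampLen) {
    return ByteView{stored.data, 0};
  }
  return ByteView{stored.data, stored.size - kTimestampLen};
}
```

Because the returned view aliases the underlying buffer, it stays valid only as long as that buffer does, which mirrors the usual lifetime rule for iterator values.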

class TtlCompactionFilter : public CompactionFilter {

 public:
  TtlCompactionFilter(int32_t ttl, const CompactionFilter* comp_filter)
    : ttl_(ttl),
      user_comp_filter_(comp_filter) {
    // Unlike the merge operator, compaction filter is necessary for TTL, hence
    // this would be called even if user doesn't specify any compaction-filter
  }

  virtual bool Filter(int level,
                      const Slice& key,
                      const Slice& old_val,
                      std::string* new_val,
                      bool* value_changed) const override {
    if (DBWithTTL::IsStale(old_val, ttl_)) {
      return true;
    }
    if (user_comp_filter_ == nullptr) {
      return false;
    }
    assert(old_val.size() >= DBWithTTL::kTSLength);
    Slice old_val_without_ts(old_val.data(),
                             old_val.size() - DBWithTTL::kTSLength);
    if (user_comp_filter_->Filter(level, key, old_val_without_ts, new_val,
                                  value_changed)) {
      return true;
    }
    if (*value_changed) {
      new_val->append(old_val.data() + old_val.size() - DBWithTTL::kTSLength,
                      DBWithTTL::kTSLength);
    }
    return false;
  }

  virtual const char* Name() const override {
    return "Delete By TTL";
  }

 private:
  int32_t ttl_;
  const CompactionFilter* user_comp_filter_;
};
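
The Filter method above chains three decisions: drop stale entries, otherwise defer to the optional user filter on the payload without its timestamp, and re-attach the original timestamp if the user filter rewrote the value. A standalone sketch of that control flow follows. UserFilter and FilterWithTtl are hypothetical names, the staleness decision is passed in as a flag, and keeping a malformed (too short) value is an assumption; the class above asserts on it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>

// Hypothetical stand-in for DBWithTTL::kTSLength.
static const size_t kTsBytes = sizeof(int32_t);

// Hypothetical user filter: returns true to drop the entry, and may set
// *changed and write a replacement payload into *new_val.
typedef std::function<bool(const std::string& key, const std::string& val,
                           std::string* new_val, bool* changed)>
    UserFilter;

// Returns true if the entry should be dropped during compaction.
inline bool FilterWithTtl(bool stale, const UserFilter& user_filter,
                          const std::string& key, const std::string& old_val,
                          std::string* new_val, bool* changed) {
  if (stale) {
    return true;  // expired entries are always dropped
  }
  if (!user_filter) {
    return false;  // no user filter configured: keep the entry
  }
  if (old_val.size() < kTsBytes) {
    return false;  // assumption: keep values too short to hold a timestamp
  }
  // The user filter sees only the payload, never the timestamp suffix.
  const std::string payload = old_val.substr(0, old_val.size() - kTsBytes);
  if (user_filter(key, payload, new_val, changed)) {
    return true;
  }
  if (*changed) {
    // Carry the original timestamp over to the rewritten value.
    new_val->append(old_val, old_val.size() - kTsBytes, kTsBytes);
  }
  return false;
}
```

Note that a rewritten value keeps its original timestamp in this flow, so a user filter that modifies a value does not extend its lifetime.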

class TtlMergeOperator : public MergeOperator {

 public:
  explicit TtlMergeOperator(const MergeOperator* merge_op)
    : user_merge_op_(merge_op) {
    assert(merge_op);
  }

[RocksDB] [MergeOperator] The new Merge Interface! Uses merge sequences.
Summary:
Here are the major changes to the Merge Interface. It has been expanded
to handle cases where the MergeOperator is not associative. It does so by stacking
up merge operations while scanning through the key history (i.e.: during Get() or
Compaction), until a valid Put/Delete/end-of-history is encountered; it then
applies all of the merge operations in the correct sequence starting with the
base/sentinel value.
I have also introduced an "AssociativeMerge" function which allows the user to
take advantage of associative merge operations (such as in the case of counters).
The implementation will always attempt to merge the operations/operands themselves
together when they are encountered, and will resort to the "stacking" method if
and only if the "associative-merge" fails.
This implementation is conjectured to allow MergeOperator to handle the general
case, while still providing the user with the ability to take advantage of certain
efficiencies in their own merge-operator / data-structure.
NOTE: This is a preliminary diff. This must still go through a lot of review,
revision, and testing. Feedback welcome!
Test Plan:
-This is a preliminary diff. I have only just begun testing/debugging it.
-I will be testing this with the existing MergeOperator use-cases and unit-tests
(counters, string-append, and redis-lists)
-I will be "desk-checking" and walking through the code with the help of gdb.
-I will find a way of stress-testing the new interface / implementation using
db_bench, db_test, merge_test, and/or db_stress.
-I will ensure that my tests cover all cases: Get-Memtable,
Get-Immutable-Memtable, Get-from-Disk, Iterator-Range-Scan, Flush-Memtable-to-L0,
Compaction-L0-L1, Compaction-Ln-L(n+1), Put/Delete found, Put/Delete not-found,
end-of-history, end-of-file, etc.
-A lot of feedback from the reviewers.
Reviewers: haobo, dhruba, zshao, emayanke
Reviewed By: haobo
CC: leveldb
Differential Revision: https://reviews.facebook.net/D11499
2013-08-06 03:14:32 +00:00
  virtual bool Merge(const Slice& key,
                     const Slice* existing_value,
                     const std::deque<std::string>& operands,
                     std::string* new_value,
                     Logger* logger) const override {
    const uint32_t ts_len = DBWithTTL::kTSLength;
    if (existing_value && existing_value->size() < ts_len) {
      Log(logger, "Error: Could not remove timestamp from existing value.");
      return false;
      // TODO: Change Merge semantics and add a counter here
    }

    // Extract time-stamp from each operand to be passed to user_merge_op_
    std::deque<std::string> operands_without_ts;
    for (auto it = operands.begin(); it != operands.end(); ++it) {
      if (it->size() < ts_len) {
        Log(logger, "Error: Could not remove timestamp from operand value.");
        return false;
      }
      operands_without_ts.push_back(it->substr(0, it->size() - ts_len));
    }

    // Apply the user merge operator (store result in *new_value)
    bool merged;
    if (existing_value) {
      Slice existing_value_without_ts(existing_value->data(),
                                      existing_value->size() - ts_len);
      merged = user_merge_op_->Merge(key, &existing_value_without_ts,
                                     operands_without_ts, new_value, logger);
    } else {
      merged = user_merge_op_->Merge(key, nullptr, operands_without_ts,
                                     new_value, logger);
    }

    // Propagate a failure of the user merge operator instead of
    // time-stamping a possibly invalid *new_value
    if (!merged) {
      return false;
    }

    // Augment the *new_value with the ttl time-stamp
    int32_t curtime;
    if (!DBWithTTL::GetCurrentTime(curtime).ok()) {
      Log(logger, "Error: Could not get current time to be attached internally "
                  "to the new value.");
      return false;
    } else {
|
|
|
char ts_string[ts_len];
|
|
|
|
EncodeFixed32(ts_string, curtime);
|
|
|
|
new_value->append(ts_string, ts_len);
|
|
|
|
return true;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
virtual bool PartialMerge(const Slice& key,
|
|
|
|
const Slice& left_operand,
|
|
|
|
const Slice& right_operand,
|
|
|
|
std::string* new_value,
|
|
|
|
Logger* logger) const override {
|
2013-08-06 18:02:19 +00:00
|
|
|
const uint32_t ts_len = DBWithTTL::kTSLength;
|
|
|
|
|
|
|
|
if (left_operand.size() < ts_len || right_operand.size() < ts_len) {
|
|
|
|
Log(logger, "Error: Could not remove timestamp from value.");
|
|
|
|
return false;
|
|
|
|
//TODO: Change Merge semantics and add a counter here
|
2013-07-22 23:49:55 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
// Apply the user partial-merge operator (store result in *new_value)
|
|
|
|
assert(new_value);
|
|
|
|
Slice left_without_ts(left_operand.data(), left_operand.size() - ts_len);
|
|
|
|
Slice right_without_ts(right_operand.data(), right_operand.size() - ts_len);
|
|
|
|
if (!user_merge_op_->PartialMerge(key, left_without_ts, right_without_ts,
|
|
|
|
new_value, logger)) {
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
|
|
|
// Augment the *new_value with the ttl time-stamp
|
2013-07-22 23:49:55 +00:00
|
|
|
int32_t curtime;
|
|
|
|
if (!DBWithTTL::GetCurrentTime(curtime).ok()) {
|
|
|
|
Log(logger, "Error: Could not get current time to be attached internally "
|
|
|
|
"to the new value.");
|
|
|
|
return false;
|
2013-07-22 23:49:55 +00:00
|
|
|
} else {
|
|
|
|
char ts_string[ts_len];
|
|
|
|
EncodeFixed32(ts_string, curtime);
|
|
|
|
new_value->append(ts_string, ts_len);
|
|
|
|
return true;
|
2013-07-22 23:49:55 +00:00
|
|
|
}
|
|
|
|
|
2013-07-22 23:49:55 +00:00
|
|
|
}
|
|
|
|
|
2013-08-03 07:40:22 +00:00
|
|
|
virtual const char* Name() const override {
|
2013-07-22 23:49:55 +00:00
|
|
|
return "Merge By TTL";
|
|
|
|
}
|
|
|
|
|
|
|
|
private:
|
|
|
|
const MergeOperator* user_merge_op_;
|
|
|
|
};
|
|
|
|
|
|
|
|
}
|
|
|
|
#endif // LEVELDB_UTILITIES_TTL_DB_TTL_H_
|
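The merge wrapper above strips a fixed-length timestamp from the end of each operand, runs the user operator on the bare payloads, and then appends a fresh timestamp to the result. The sketch below illustrates only that suffix handling, outside of the database. `kTsLen`, `AppendTimestamp` and `StripTimestamp` are hypothetical names invented for this illustration and are not part of the header; the 4-byte little-endian layout is assumed to match what `EncodeFixed32` writes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for DBWithTTL::kTSLength: a 4-byte timestamp suffix.
constexpr std::size_t kTsLen = sizeof(int32_t);

// Append `ts` to `*value` as 4 little-endian bytes (assumed fixed32 layout).
void AppendTimestamp(std::string* value, int32_t ts) {
  assert(value != nullptr);
  const uint32_t u = static_cast<uint32_t>(ts);
  for (std::size_t i = 0; i < kTsLen; ++i) {
    value->push_back(static_cast<char>((u >> (8 * i)) & 0xff));
  }
}

// Split `stored` into the user payload and the trailing timestamp.
// Returns false when `stored` is too short to hold a timestamp, which is
// the same condition the wrapper reports as
// "Could not remove timestamp from value."
bool StripTimestamp(const std::string& stored, std::string* payload,
                    int32_t* ts) {
  assert(payload != nullptr && ts != nullptr);
  if (stored.size() < kTsLen) {
    return false;
  }
  const std::size_t n = stored.size() - kTsLen;
  uint32_t u = 0;
  for (std::size_t i = 0; i < kTsLen; ++i) {
    const unsigned char byte = static_cast<unsigned char>(stored[n + i]);
    u |= static_cast<uint32_t>(byte) << (8 * i);
  }
  payload->assign(stored, 0, n);
  *ts = static_cast<int32_t>(u);
  return true;
}
```

A value passed through `AppendTimestamp` and then `StripTimestamp` comes back as the original payload plus the timestamp that was attached, while an input shorter than `kTsLen` bytes is rejected.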