Timestamp and TTL Wrapper for rocksdb
Summary:
When opened with DBTimestamp::Open call, timestamps are prepended to and stripped from the value during subsequent Put and Get calls respectively. The Timestamp is used to discard values in Get and custom compaction filter which have exceeded their TTL which is specified during Open.
Have made a temporary change to Makefile to let us test with the temporary file TestTime.cc. Have also changed the private members of db_impl.h to protected to let them be inherited by the new class DBTimestamp
Test Plan: make db_timestamp; TestTime.cc (will not be checked in) shows how to use the APIs currently, but I will write unit tests shortly
Reviewers: dhruba, vamsi, haobo, sheki, heyongqiang, vkrest
Reviewed By: vamsi
CC: zshao, xjin, vkrest, MarkCallaghan
Differential Revision: https://reviews.facebook.net/D10311
2013-04-15 20:33:13 +00:00
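The round trip described in the summary can be sketched as below. This is a minimal standalone illustration, not the RocksDB implementation: the names `AppendTimestamp`, `StripTimestamp`, `ReadTimestamp`, and `kSketchTSLength` are hypothetical, and the little-endian 4-byte encoding is an assumption. The summary says the timestamp is prepended, but `TtlIterator` in this header reads it from the end of the value, so the sketch uses a suffix.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical constant for this sketch; mirrors sizeof(int32_t).
constexpr std::size_t kSketchTSLength = sizeof(int32_t);

// Append a 4-byte timestamp (assumed little-endian) to the user value.
inline std::string AppendTimestamp(const std::string& value, int32_t now) {
  std::string out = value;
  const uint32_t u = static_cast<uint32_t>(now);
  for (std::size_t i = 0; i < kSketchTSLength; ++i) {
    out.push_back(static_cast<char>((u >> (8 * i)) & 0xff));
  }
  return out;
}

// Decode the timestamp stored in the last kSketchTSLength bytes.
// Precondition: value_with_ts.size() >= kSketchTSLength.
inline int32_t ReadTimestamp(const std::string& value_with_ts) {
  assert(value_with_ts.size() >= kSketchTSLength);
  const std::size_t base = value_with_ts.size() - kSketchTSLength;
  uint32_t u = 0;
  for (std::size_t i = 0; i < kSketchTSLength; ++i) {
    const unsigned char byte =
        static_cast<unsigned char>(value_with_ts[base + i]);
    u |= static_cast<uint32_t>(byte) << (8 * i);
  }
  return static_cast<int32_t>(u);
}

// Remove the timestamp suffix in place; returns false if the value is too
// short to contain one.
inline bool StripTimestamp(std::string* value_with_ts) {
  if (value_with_ts->size() < kSketchTSLength) return false;
  value_with_ts->erase(value_with_ts->size() - kSketchTSLength);
  return true;
}
```

A write path would call `AppendTimestamp` before storing, and a read path would call `StripTimestamp` before returning the value to the caller.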
// Copyright (c) 2011 The LevelDB Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file. See the AUTHORS file for names of contributors.

#pragma once

#ifndef ROCKSDB_LITE
#include <deque>
#include <string>
#include <vector>

#include "rocksdb/db.h"
#include "rocksdb/env.h"
#include "rocksdb/compaction_filter.h"
#include "rocksdb/merge_operator.h"
#include "rocksdb/utilities/utility_db.h"
#include "rocksdb/utilities/db_ttl.h"
#include "db/db_impl.h"

namespace rocksdb {

class DBWithTTLImpl : public DBWithTTL {
 public:
  static void SanitizeOptions(int32_t ttl, ColumnFamilyOptions* options,
                              Env* env);

  explicit DBWithTTLImpl(DB* db);

  virtual ~DBWithTTLImpl();

  Status CreateColumnFamilyWithTtl(const ColumnFamilyOptions& options,
                                   const std::string& column_family_name,
                                   ColumnFamilyHandle** handle,
                                   int ttl) override;

  Status CreateColumnFamily(const ColumnFamilyOptions& options,
                            const std::string& column_family_name,
                            ColumnFamilyHandle** handle) override;

[RocksDB] [Column Family] Interface proposal
Summary:
<This diff is for Column Family branch>
Sharing some of the work I've done so far. This diff compiles and passes the tests.
The biggest change is in options.h - I broke down Options into two parts - DBOptions and ColumnFamilyOptions. DBOptions is DB-specific (env, create_if_missing, block_cache, etc.) and ColumnFamilyOptions is column family-specific (all compaction options, compression options, etc.). Note that this does not break backwards compatibility at all.
Further, I created DBWithColumnFamily which inherits DB interface and adds new functions with column family support. Clients can transparently switch to DBWithColumnFamily and it will not break their backwards compatibility.
There are a few methods worth checking out: ListColumnFamilies(), MultiNewIterator(), MultiGet() and GetSnapshot(). [GetSnapshot() returns the snapshot across all column families for now - I think that's what we agreed on]
Finally, I made small changes to WriteBatch so we are able to atomically insert data across column families.
Please provide feedback.
Test Plan: make check works, the code is backward compatible
Reviewers: dhruba, haobo, sdong, kailiu, emayanke
CC: leveldb
Differential Revision: https://reviews.facebook.net/D14445
2013-12-03 19:14:09 +00:00
  using StackableDB::Put;
  virtual Status Put(const WriteOptions& options,
                     ColumnFamilyHandle* column_family, const Slice& key,
                     const Slice& val) override;

  using StackableDB::Get;
  virtual Status Get(const ReadOptions& options,
                     ColumnFamilyHandle* column_family, const Slice& key,
                     std::string* value) override;

  using StackableDB::MultiGet;
  virtual std::vector<Status> MultiGet(
      const ReadOptions& options,
      const std::vector<ColumnFamilyHandle*>& column_family,
      const std::vector<Slice>& keys,
      std::vector<std::string>* values) override;

  using StackableDB::KeyMayExist;
  virtual bool KeyMayExist(const ReadOptions& options,
                           ColumnFamilyHandle* column_family, const Slice& key,
                           std::string* value,
                           bool* value_found = nullptr) override;

  using StackableDB::Merge;
  virtual Status Merge(const WriteOptions& options,
                       ColumnFamilyHandle* column_family, const Slice& key,
                       const Slice& value) override;

  virtual Status Write(const WriteOptions& opts, WriteBatch* updates) override;

  using StackableDB::NewIterator;
  virtual Iterator* NewIterator(const ReadOptions& opts,
                                ColumnFamilyHandle* column_family) override;

  virtual DB* GetBaseDB() { return db_; }

  static bool IsStale(const Slice& value, int32_t ttl, Env* env);

  static Status AppendTS(const Slice& val, std::string* val_with_ts, Env* env);

  static Status SanityCheckTimestamp(const Slice& str);

  static Status StripTS(std::string* str);

  static const uint32_t kTSLength = sizeof(int32_t);  // size of timestamp

  static const int32_t kMinTimestamp = 1368146402;  // 05/09/2013:5:40PM GMT-8

  static const int32_t kMaxTimestamp = 2147483647;  // 01/18/2038:7:14PM GMT-8
};
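
The header only declares `IsStale` and `SanityCheckTimestamp`, so the following is a hedged sketch of what such checks might look like, not the actual definitions. The names `IsStaleSketch` and `TimestampLooksSane` are hypothetical, and treating a non-positive TTL as "never expires" is an assumption of this sketch. The lower bound reuses the `kMinTimestamp` value shown above.

```cpp
#include <cassert>
#include <cstdint>

// Same numeric value as kMinTimestamp in the class above.
constexpr int32_t kSketchMinTimestamp = 1368146402;

// Returns true if a value written at write_time has outlived ttl seconds.
// Assumption: ttl <= 0 means the value never expires.
inline bool IsStaleSketch(int32_t write_time, int32_t ttl, int64_t now) {
  if (ttl <= 0) return false;
  // Widen before adding so write_time + ttl cannot overflow int32_t.
  return static_cast<int64_t>(write_time) + ttl < now;
}

// A decoded timestamp older than the minimum is treated as corrupt.
// Every int32_t is already <= 2147483647, so no upper-bound check is needed.
inline bool TimestampLooksSane(int32_t ts) {
  return ts >= kSketchMinTimestamp;
}
```

Passing the current time as a parameter, rather than reading a clock inside the function, keeps the check easy to test; the real declarations take an `Env*`, presumably for the same reason.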

class TtlIterator : public Iterator {
 public:
  explicit TtlIterator(Iterator* iter) : iter_(iter) { assert(iter_); }

  ~TtlIterator() { delete iter_; }

  bool Valid() const { return iter_->Valid(); }

  void SeekToFirst() { iter_->SeekToFirst(); }

  void SeekToLast() { iter_->SeekToLast(); }

  void Seek(const Slice& target) { iter_->Seek(target); }

  void Next() { iter_->Next(); }

  void Prev() { iter_->Prev(); }

  Slice key() const { return iter_->key(); }

  int32_t timestamp() const {
    return DecodeFixed32(iter_->value().data() + iter_->value().size() -
                         DBWithTTLImpl::kTSLength);
  }

  Slice value() const {
    // TODO: handle timestamp corruption like in general iterator semantics
    assert(DBWithTTLImpl::SanityCheckTimestamp(iter_->value()).ok());
    Slice trimmed_value = iter_->value();
    trimmed_value.size_ -= DBWithTTLImpl::kTSLength;
    return trimmed_value;
  }

  Status status() const { return iter_->status(); }

 private:
  Iterator* iter_;
};
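
`TtlIterator::value()` hides the timestamp by shrinking a view rather than copying bytes. A minimal sketch of that idea follows, using a hypothetical `ByteView` struct in place of `Slice` so that it compiles on its own; `TrimTimestamp` is likewise an illustrative name, not part of this header.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a non-owning (pointer, length) view.
struct ByteView {
  const char* data;
  std::size_t size;
};

// Return a view over the same buffer with the trailing ts_len bytes hidden.
// No bytes are copied; the result aliases the caller's storage.
inline ByteView TrimTimestamp(ByteView stored, std::size_t ts_len) {
  assert(stored.size >= ts_len);
  return ByteView{stored.data, stored.size - ts_len};
}
```

Because the trimmed view aliases the underlying buffer, it is only valid for as long as that buffer is, which for an iterator typically means until the iterator is next advanced.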

class TtlCompactionFilter : public CompactionFilter {
 public:
  TtlCompactionFilter(
      int32_t ttl, Env* env, const CompactionFilter* user_comp_filter,
      std::unique_ptr<const CompactionFilter> user_comp_filter_from_factory =
          nullptr)
      : ttl_(ttl),
        env_(env),
        user_comp_filter_(user_comp_filter),
        user_comp_filter_from_factory_(
            std::move(user_comp_filter_from_factory)) {
    // Unlike the merge operator, compaction filter is necessary for TTL, hence
    // this would be called even if user doesn't specify any compaction-filter
    if (!user_comp_filter_) {
      user_comp_filter_ = user_comp_filter_from_factory_.get();
    }
  }

  virtual bool Filter(int level, const Slice& key, const Slice& old_val,
                      std::string* new_val, bool* value_changed) const
      override {
    if (DBWithTTLImpl::IsStale(old_val, ttl_, env_)) {
      return true;
    }
    if (user_comp_filter_ == nullptr) {
      return false;
    }
    assert(old_val.size() >= DBWithTTLImpl::kTSLength);
    Slice old_val_without_ts(old_val.data(),
                             old_val.size() - DBWithTTLImpl::kTSLength);
    if (user_comp_filter_->Filter(level, key, old_val_without_ts, new_val,
                                  value_changed)) {
      return true;
    }
    if (*value_changed) {
      new_val->append(
          old_val.data() + old_val.size() - DBWithTTLImpl::kTSLength,
          DBWithTTLImpl::kTSLength);
    }
    return false;
  }

2014-04-29 03:34:20 +00:00
|
|
|
virtual const char* Name() const override { return "Delete By TTL"; }
|
2013-08-03 07:40:22 +00:00
|
|
|
|
|
|
|
private:
|
|
|
|
int32_t ttl_;
|
2014-05-02 14:13:51 +00:00
|
|
|
Env* env_;
|
2013-08-03 07:40:22 +00:00
|
|
|
const CompactionFilter* user_comp_filter_;
|
2013-08-13 17:56:20 +00:00
|
|
|
std::unique_ptr<const CompactionFilter> user_comp_filter_from_factory_;
|
|
|
|
};
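The filter above relies on every stored value ending in a fixed-length timestamp suffix that is stripped before the user's filter sees the value and re-appended afterwards. The following is a minimal standalone sketch of that layout, not the library's code. All `Sketch*` names are hypothetical. The 4-byte little-endian encoding is an assumption about `EncodeFixed32`, and the staleness rule (a non-positive ttl never expires, otherwise stale once timestamp + ttl < now) is an assumption about `DBWithTTLImpl::IsStale`, which is not shown in this section.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for DBWithTTLImpl::kTSLength (assumed 4-byte suffix).
constexpr std::size_t kSketchTSLength = sizeof(int32_t);

// Appends `ts` to `value` as 4 little-endian bytes (assumed encoding).
inline void SketchAppendTimestamp(std::string* value, int32_t ts) {
  const uint32_t u = static_cast<uint32_t>(ts);
  for (std::size_t i = 0; i < kSketchTSLength; ++i) {
    value->push_back(static_cast<char>((u >> (8 * i)) & 0xff));
  }
}

// Decodes the trailing timestamp; the value must hold at least the suffix.
inline int32_t SketchReadTimestamp(const std::string& value) {
  assert(value.size() >= kSketchTSLength);
  const std::size_t base = value.size() - kSketchTSLength;
  uint32_t u = 0;
  for (std::size_t i = 0; i < kSketchTSLength; ++i) {
    u |= static_cast<uint32_t>(static_cast<unsigned char>(value[base + i]))
         << (8 * i);
  }
  return static_cast<int32_t>(u);
}

// Returns the user-visible bytes, i.e. the value without its suffix.
inline std::string SketchStripTimestamp(const std::string& value) {
  assert(value.size() >= kSketchTSLength);
  return value.substr(0, value.size() - kSketchTSLength);
}

// Assumed rule: ttl <= 0 disables expiry; otherwise stale once ts + ttl < now.
inline bool SketchIsStale(const std::string& value, int32_t ttl, int64_t now) {
  if (ttl <= 0) {
    return false;
  }
  return static_cast<int64_t>(SketchReadTimestamp(value)) + ttl < now;
}
```

Under these assumptions, a wrapping filter would call the equivalent of `SketchIsStale` first, hand the stripped value to the user's filter, and append the original suffix again only if the user's filter rewrote the value.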
|
|
|
|
|
|
|
|
class TtlCompactionFilterFactory : public CompactionFilterFactory {
|
2014-04-29 03:34:20 +00:00
|
|
|
public:
|
|
|
|
TtlCompactionFilterFactory(
|
2014-05-02 14:13:51 +00:00
|
|
|
int32_t ttl, Env* env,
|
|
|
|
std::shared_ptr<CompactionFilterFactory> comp_filter_factory)
|
|
|
|
: ttl_(ttl), env_(env), user_comp_filter_factory_(comp_filter_factory) {}
|
2014-04-29 03:34:20 +00:00
|
|
|
|
|
|
|
virtual std::unique_ptr<CompactionFilter> CreateCompactionFilter(
|
|
|
|
const CompactionFilter::Context& context) override {
|
|
|
|
return std::unique_ptr<TtlCompactionFilter>(new TtlCompactionFilter(
|
2014-05-02 14:13:51 +00:00
|
|
|
ttl_, env_, nullptr,
|
2014-04-29 03:34:20 +00:00
|
|
|
user_comp_filter_factory_->CreateCompactionFilter(context)));
|
|
|
|
}
|
2013-08-13 17:56:20 +00:00
|
|
|
|
2014-04-29 03:34:20 +00:00
|
|
|
virtual const char* Name() const override {
|
|
|
|
return "TtlCompactionFilterFactory";
|
|
|
|
}
|
2013-08-13 17:56:20 +00:00
|
|
|
|
2014-04-29 03:34:20 +00:00
|
|
|
private:
|
|
|
|
int32_t ttl_;
|
2014-05-02 14:13:51 +00:00
|
|
|
Env* env_;
|
2014-04-29 03:34:20 +00:00
|
|
|
std::shared_ptr<CompactionFilterFactory> user_comp_filter_factory_;
|
2013-08-03 07:40:22 +00:00
|
|
|
};
|
|
|
|
|
2013-07-22 23:49:55 +00:00
|
|
|
class TtlMergeOperator : public MergeOperator {
|
|
|
|
|
|
|
|
public:
|
2014-09-26 17:35:20 +00:00
|
|
|
explicit TtlMergeOperator(const std::shared_ptr<MergeOperator>& merge_op,
|
2014-05-02 14:13:51 +00:00
|
|
|
Env* env)
|
|
|
|
: user_merge_op_(merge_op), env_(env) {
|
2013-07-22 23:49:55 +00:00
|
|
|
assert(merge_op);
|
2014-05-02 14:13:51 +00:00
|
|
|
assert(env);
|
2013-07-22 23:49:55 +00:00
|
|
|
}
|
|
|
|
|
2014-04-29 03:34:20 +00:00
|
|
|
virtual bool FullMerge(const Slice& key, const Slice* existing_value,
|
2013-08-19 18:42:47 +00:00
|
|
|
const std::deque<std::string>& operands,
|
2014-04-29 03:34:20 +00:00
|
|
|
std::string* new_value, Logger* logger) const
|
|
|
|
override {
|
|
|
|
const uint32_t ts_len = DBWithTTLImpl::kTSLength;
|
[RocksDB] [MergeOperator] The new Merge Interface! Uses merge sequences.
Summary:
Here are the major changes to the Merge Interface. It has been expanded
to handle cases where the MergeOperator is not associative. It does so by stacking
up merge operations while scanning through the key history (i.e.: during Get() or
Compaction), until a valid Put/Delete/end-of-history is encountered; it then
applies all of the merge operations in the correct sequence starting with the
base/sentinel value.
I have also introduced an "AssociativeMerge" function which allows the user to
take advantage of associative merge operations (such as in the case of counters).
The implementation will always attempt to merge the operations/operands themselves
together when they are encountered, and will resort to the "stacking" method if
and only if the "associative-merge" fails.
This implementation is conjectured to allow MergeOperator to handle the general
case, while still providing the user with the ability to take advantage of certain
efficiencies in their own merge-operator / data-structure.
NOTE: This is a preliminary diff. This must still go through a lot of review,
revision, and testing. Feedback welcome!
Test Plan:
-This is a preliminary diff. I have only just begun testing/debugging it.
-I will be testing this with the existing MergeOperator use-cases and unit-tests
(counters, string-append, and redis-lists)
-I will be "desk-checking" and walking through the code with the help of gdb.
-I will find a way of stress-testing the new interface / implementation using
db_bench, db_test, merge_test, and/or db_stress.
-I will ensure that my tests cover all cases: Get-Memtable,
Get-Immutable-Memtable, Get-from-Disk, Iterator-Range-Scan, Flush-Memtable-to-L0,
Compaction-L0-L1, Compaction-Ln-L(n+1), Put/Delete found, Put/Delete not-found,
end-of-history, end-of-file, etc.
-A lot of feedback from the reviewers.
Reviewers: haobo, dhruba, zshao, emayanke
Reviewed By: haobo
CC: leveldb
Differential Revision: https://reviews.facebook.net/D11499
2013-08-06 03:14:32 +00:00
|
|
|
if (existing_value && existing_value->size() < ts_len) {
|
2014-10-30 00:57:00 +00:00
|
|
|
Log(InfoLogLevel::ERROR_LEVEL, logger,
|
|
|
|
"Error: Could not remove timestamp from existing value.");
|
2013-08-06 03:14:32 +00:00
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
|
|
|
// Strip the timestamp from each operand before passing it to user_merge_op_
|
|
|
|
std::deque<std::string> operands_without_ts;
|
2014-04-29 03:34:20 +00:00
|
|
|
for (const auto& operand : operands) {
|
2013-08-09 06:07:36 +00:00
|
|
|
if (operand.size() < ts_len) {
|
2014-10-30 00:57:00 +00:00
|
|
|
Log(InfoLogLevel::ERROR_LEVEL, logger,
|
|
|
|
"Error: Could not remove timestamp from operand value.");
|
2013-08-06 03:14:32 +00:00
|
|
|
return false;
|
|
|
|
}
|
2013-08-09 06:07:36 +00:00
|
|
|
operands_without_ts.push_back(operand.substr(0, operand.size() - ts_len));
|
2013-07-22 23:49:55 +00:00
|
|
|
}
|
2013-08-06 03:14:32 +00:00
|
|
|
|
|
|
|
// Apply the user merge operator (store result in *new_value)
|
2013-08-19 18:42:47 +00:00
|
|
|
bool good = true;
|
2013-07-22 23:49:55 +00:00
|
|
|
if (existing_value) {
|
|
|
|
Slice existing_value_without_ts(existing_value->data(),
|
|
|
|
existing_value->size() - ts_len);
|
2013-08-19 18:42:47 +00:00
|
|
|
good = user_merge_op_->FullMerge(key, &existing_value_without_ts,
|
|
|
|
operands_without_ts, new_value, logger);
|
2013-08-06 03:14:32 +00:00
|
|
|
} else {
|
2013-08-19 18:42:47 +00:00
|
|
|
good = user_merge_op_->FullMerge(key, nullptr, operands_without_ts,
|
|
|
|
new_value, logger);
|
|
|
|
}
|
|
|
|
|
|
|
|
// Return false if the user merge operator returned false
|
|
|
|
if (!good) {
|
|
|
|
return false;
|
2013-08-06 03:14:32 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
// Augment the *new_value with the ttl time-stamp
|
2013-11-19 04:56:21 +00:00
|
|
|
int64_t curtime;
|
2014-05-02 14:13:51 +00:00
|
|
|
if (!env_->GetCurrentTime(&curtime).ok()) {
|
2014-10-30 00:57:00 +00:00
|
|
|
Log(InfoLogLevel::ERROR_LEVEL, logger,
|
2014-04-29 03:34:20 +00:00
|
|
|
"Error: Could not get current time to be attached internally "
|
|
|
|
"to the new value.");
|
2013-08-06 03:14:32 +00:00
|
|
|
return false;
|
2013-07-22 23:49:55 +00:00
|
|
|
} else {
|
2013-08-06 03:14:32 +00:00
|
|
|
char ts_string[ts_len];
|
2013-11-19 04:56:21 +00:00
|
|
|
EncodeFixed32(ts_string, static_cast<int32_t>(curtime));
|
2013-08-06 03:14:32 +00:00
|
|
|
new_value->append(ts_string, ts_len);
|
|
|
|
return true;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2014-03-25 00:57:13 +00:00
|
|
|
virtual bool PartialMergeMulti(const Slice& key,
|
|
|
|
const std::deque<Slice>& operand_list,
|
|
|
|
std::string* new_value, Logger* logger) const
|
|
|
|
override {
|
2014-04-29 03:34:20 +00:00
|
|
|
const uint32_t ts_len = DBWithTTLImpl::kTSLength;
|
2014-03-25 00:57:13 +00:00
|
|
|
std::deque<Slice> operands_without_ts;
|
2013-08-06 03:14:32 +00:00
|
|
|
|
2014-03-25 00:57:13 +00:00
|
|
|
for (const auto& operand : operand_list) {
|
|
|
|
if (operand.size() < ts_len) {
|
2014-10-30 00:57:00 +00:00
|
|
|
Log(InfoLogLevel::ERROR_LEVEL, logger,
|
|
|
|
"Error: Could not remove timestamp from value.");
|
2014-03-25 00:57:13 +00:00
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
|
|
|
operands_without_ts.push_back(
|
|
|
|
Slice(operand.data(), operand.size() - ts_len));
|
2013-07-22 23:49:55 +00:00
|
|
|
}
|
2013-08-06 03:14:32 +00:00
|
|
|
|
|
|
|
// Apply the user partial-merge operator (store result in *new_value)
|
|
|
|
assert(new_value);
|
2014-03-25 00:57:13 +00:00
|
|
|
if (!user_merge_op_->PartialMergeMulti(key, operands_without_ts, new_value,
|
|
|
|
logger)) {
|
2013-08-19 18:42:47 +00:00
|
|
|
return false;
|
|
|
|
}
|
2013-08-06 03:14:32 +00:00
|
|
|
|
|
|
|
// Augment the *new_value with the ttl time-stamp
|
2013-11-19 04:56:21 +00:00
|
|
|
int64_t curtime;
|
2014-05-02 14:13:51 +00:00
|
|
|
if (!env_->GetCurrentTime(&curtime).ok()) {
|
2014-10-30 00:57:00 +00:00
|
|
|
Log(InfoLogLevel::ERROR_LEVEL, logger,
|
2014-04-29 03:34:20 +00:00
|
|
|
"Error: Could not get current time to be attached internally "
|
|
|
|
"to the new value.");
|
2013-08-06 03:14:32 +00:00
|
|
|
return false;
|
2013-07-22 23:49:55 +00:00
|
|
|
} else {
|
|
|
|
char ts_string[ts_len];
|
2013-11-19 04:56:21 +00:00
|
|
|
EncodeFixed32(ts_string, (int32_t)curtime);
|
2013-07-22 23:49:55 +00:00
|
|
|
new_value->append(ts_string, ts_len);
|
2013-08-06 03:14:32 +00:00
|
|
|
return true;
|
2013-07-22 23:49:55 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2014-04-29 03:34:20 +00:00
|
|
|
virtual const char* Name() const override { return "Merge By TTL"; }
|
2013-07-22 23:49:55 +00:00
|
|
|
|
|
|
|
private:
|
2013-08-20 20:35:28 +00:00
|
|
|
std::shared_ptr<MergeOperator> user_merge_op_;
|
2014-05-02 14:13:51 +00:00
|
|
|
Env* env_;
|
2013-07-22 23:49:55 +00:00
|
|
|
};
|
2013-04-15 20:33:13 +00:00
|
|
|
}
|
2014-04-15 20:39:26 +00:00
|
|
|
#endif // ROCKSDB_LITE
|
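
The timestamp step in the merge wrapper above (encode the current time as a fixed 32-bit value and append it to the new value) can be sketched in isolation. This is a minimal sketch under stated assumptions, not the library's code: `EncodeFixed32LE`, `DecodeFixed32LE`, `AppendTimestamp`, `SplitTimestamp`, and `kTsLen` are hypothetical names for this illustration, and little-endian byte order is assumed for the encoding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Length of the timestamp suffix, mirroring the 32-bit value in the code above.
constexpr std::size_t kTsLen = sizeof(int32_t);

// Hypothetical stand-in for EncodeFixed32: writes v into buf (at least
// 4 bytes) in little-endian byte order.
inline void EncodeFixed32LE(char* buf, uint32_t v) {
  buf[0] = static_cast<char>(v & 0xff);
  buf[1] = static_cast<char>((v >> 8) & 0xff);
  buf[2] = static_cast<char>((v >> 16) & 0xff);
  buf[3] = static_cast<char>((v >> 24) & 0xff);
}

// Inverse of EncodeFixed32LE.
inline uint32_t DecodeFixed32LE(const char* buf) {
  return static_cast<uint32_t>(static_cast<unsigned char>(buf[0])) |
         (static_cast<uint32_t>(static_cast<unsigned char>(buf[1])) << 8) |
         (static_cast<uint32_t>(static_cast<unsigned char>(buf[2])) << 16) |
         (static_cast<uint32_t>(static_cast<unsigned char>(buf[3])) << 24);
}

// Appends the time (seconds, truncated to 32 bits) to the end of *value.
inline void AppendTimestamp(std::string* value, int64_t now_seconds) {
  assert(value != nullptr);
  char ts[kTsLen];
  EncodeFixed32LE(ts, static_cast<uint32_t>(now_seconds));
  value->append(ts, kTsLen);
}

// Splits a stored value into the user value and its trailing timestamp.
// Returns false if the stored value is too short to carry a timestamp.
inline bool SplitTimestamp(const std::string& stored, std::string* user_value,
                           uint32_t* ts) {
  assert(user_value != nullptr && ts != nullptr);
  if (stored.size() < kTsLen) return false;
  *user_value = stored.substr(0, stored.size() - kTsLen);
  *ts = DecodeFixed32LE(stored.data() + stored.size() - kTsLen);
  return true;
}
```

Because the suffix has a fixed length, a reader can recover the user value without any delimiter, and a value shorter than the suffix can be rejected as malformed.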