Summary:
Initial implementation of Pessimistic Transactions. This diff contains the api changes discussed in D38913. This diff is pretty large, so let me know if people would prefer to meet up to discuss it.
MyRocks folks: please take a look at the API in include/rocksdb/utilities/transaction[_db].h and let me know if you have any issues.
Also, you'll notice a couple of TODOs in the implementation of RollbackToSavePoint(). After chatting with Siying, I'm going to send out a separate diff for an alternate implementation of this feature that implements the rollback inside of WriteBatch/WriteBatchWithIndex. We can then decide which route is preferable.
Next, I'm planning on doing some perf testing and then integrating this diff into MongoRocks for further testing.
Test Plan: Unit tests, db_bench parallel testing.
Reviewers: igor, rven, sdong, yhchiang, yoshinorim
Reviewed By: sdong
Subscribers: hermanlee4, maykov, spetrunia, leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D40869
Summary: The list of info log files of a db can be obtained using the new function.
Test Plan: New test in db_test.cc passed.
Reviewers: yhchiang, IslamAbdelRahman, sdong
Reviewed By: sdong
Subscribers: IslamAbdelRahman, leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D41715
Summary:
This diff adds CompactOnDeletionCollector in utilities/table_properties_collectors,
which applies a sliding window to a sst file and mark this file as need-compaction
when it observe enough deletion entries within the consecutive keys covered by
the sliding window.
Test Plan: compact_on_deletion_collector_test
Reviewers: igor, anthony, IslamAbdelRahman, kradhakrishnan, yoshinorim, sdong
Reviewed By: sdong
Subscribers: maykov, dhruba
Differential Revision: https://reviews.facebook.net/D41175
Summary:
Support RollbackToSavePoint() in WriteBatch and WriteBatchWithIndex. Support for partial transaction rollback is needed for MyRocks.
An alternate implementation of Transaction::RollbackToSavePoint() exists in D40869. However, the other implementation is messier because it is implemented outside of WriteBatch. This implementation is much cleaner and also exposes a potentially useful feature to WriteBatch.
Test Plan: Added unit tests
Reviewers: IslamAbdelRahman, kradhakrishnan, maykov, yoshinorim, hermanlee4, spetrunia, sdong, yhchiang
Reviewed By: yhchiang
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D42723
Summary: Directly using TMPDIR can cause problems when running tests using parallel option. Fix them.
Test Plan: Run all tests in parallel
Reviewers: kradhakrishnan, yhchiang, IslamAbdelRahman, anthony
Reviewed By: anthony
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D42807
Summary: Add new CheckFileExists method. Considered changing the FileExists api but didn't want to break anyone's builds.
Test Plan: unit tests
Reviewers: yhchiang, igor, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D42003
Summary:
Skipping these tests in ROCKSDB_LITE since they are not supported
json_document_test
wal_manager_test
ttl_test
sst_dump_test
deletefile_test
compact_files_test
prefix_test
checkpoint_test
Test Plan:
json_document_test
wal_manager_test
ttl_test
sst_dump_test
deletefile_test
compact_files_test
prefix_test
checkpoint_test
Reviewers: igor, sdong, yhchiang, kradhakrishnan, anthony
Reviewed By: anthony
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D42573
Summary: We want to keep Env a think layer for better portability. Less platform dependent codes should be moved out of Env. In this patch, I create a wrapper of file readers and writers, and put rate limiting, write buffering, as well as most perf context instrumentation and random kill out of Env. It will make it easier to maintain multiple Env in the future.
Test Plan: Run all existing unit tests.
Reviewers: anthony, kradhakrishnan, IslamAbdelRahman, yhchiang, igor
Reviewed By: igor
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D42321
Summary: DocumentDB is not supported in ROCKSDB_LITE
Test Plan: document_db_test
Reviewers: yhchiang
Reviewed By: yhchiang
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D42177
Summary:
When we first started, max_background_flushes was 0 by default and compaction thread was executing flushes (since there was no flush thread). Then, we switched the default max_background_flushes to 1. However, we still support the case where there is no flush thread and flushes are done in compaction. This is making our code a bit more complicated. By not supporting this use-case we can make our code simpler.
We have a special case that when you set max_background_flushes to 0, we
schedule the flush to execute on the compaction thread.
Test Plan: make check (there might be some unit tests that depend on this behavior)
Reviewers: IslamAbdelRahman, yhchiang, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D41931
Summary:
MergeUntil was not reporting a success when merging an operand with
a Value/Deletion despite the comments in MergeHelper and CompactionJob
indicating otherwise. This lead to operands being written to the compaction
output unnecessarily:
M1 M2 M3 P M4 M5 --> (P+M1+M2+M3) M2 M3 M4 M5 (before the diff)
M1 M2 M3 P M4 M5 --> (P+M1+M2+M3) M4 M5 (after the diff)
In addition, the code handling Values/Deletion was basically identical.
This patch unifies the code. Finally, this patch also adds testing for
merge_helper.
Test Plan: make && make check
Reviewers: sdong, rven, yhchiang, tnovak, igor
Reviewed By: igor
Subscribers: tnovak, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D42351
- Remove make file defines from public headers and use _WIN32 because it is compiler defined
- use __GNUC__ and __clang__ to guard non-portable attributes
- add #include "port/port.h" to some new .cc files.
- minor changes in CMakeLists to reflect recent changes
Summary: Moved convenience.h out of utilities to remove a dependency on utilities in db.
Test Plan: unit tests. Also compiled a link to the old location to verify the _Pragma works.
Reviewers: sdong, yhchiang, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D42201
Summary:
Couple of changes here:
* NewBackupEngine() and NewReadOnlyBackupEngine() are now removed. They were deprecated since RocksDB 3.8. Changing these to new functions should be pretty straight-forward. As a followup, I'll fix all fbcode callsights
* Instead of initializing backup engine in the constructor, we initialize it in a separate function now. That way, we can catch all errors and return appropriate status code.
* We catch all errors during initializations and return them to the client properly.
* Added new tests to backupable_db_test, to make sure that we can't open BackupEngine when there are Env errors.
* Transitioned backupable_db_test to use BackupEngine rather than BackupableDB. From the two available APIs, judging by the current use-cases, it looks like BackupEngine API won. It's much more flexible since it doesn't require StackableDB.
Test Plan: Added a new unit test to backupable_db_test
Reviewers: yhchiang, sdong, AaronFeldman
Reviewed By: AaronFeldman
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D41925
Summary: Seems like the cleanest way to resolve this is to move CompactedDBImpl into db/. CompactedDBImpl should probably live in the same place as DBImplReadonly since Opening the latter could end up instantiating the former. Both DBImplReadonly and CompactedDBImpl inherit from DBImpl access protected members of DBImpl( and the latter access friendly private methods).
Test Plan: unit tests
Reviewers: sdong, yhchiang, kradhakrishnan, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D42027
Summary: Block spatial_db_test in ROCKSDB_LITE as SpatialDB is not supported
Test Plan: spatial_db_test
Reviewers: yhchiang, igor, sdong
Reviewed By: sdong
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D41991
Summary: Adding a main for optimistic_transaction_test that report that it was skipped when using ROCKSDB_LITE
Test Plan: optimistic_transaction_test
Reviewers: yhchiang, sdong, igor, anthony
Reviewed By: anthony
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D41979
Summary: This helps Windows port to format their changes, as discussed. Might have formatted some other codes too becasue last 10 commits include more.
Test Plan: Build it.
Reviewers: anthony, IslamAbdelRahman, kradhakrishnan, yhchiang, igor
Reviewed By: igor
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D41961
Summary: see title
Test Plan: run 'make unity'
Reviewers: igor
Reviewed By: igor
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D41079
Summary:
Add a new field: BackupableDBOptions.max_background_copies.
CreateNewBackup() and RestoreDBFromBackup() will use this number of threads to perform copies.
If there is a backup rate limit, then max_background_copies must be 1.
Update backupable_db_test.cc to test multi-threaded backup and restore.
Update backupable_db_test.cc to test backups when the backup environment is not the same as the database environment.
Test Plan:
Run ./backupable_db_test
Run valgrind ./backupable_db_test
Run with TSAN and ASAN
Reviewers: yhchiang, rven, anthony, sdong, igor
Reviewed By: igor
Subscribers: yhchiang, anthony, sdong, leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D40725
invalid suffix on literal
no return statement in function returning non-void CuckooStep::operator=
extra qualification ‘rocksdb::spatial::Variant::
dereferencing type-punned pointer will break strict-aliasing rules
Summary: Make RocksDb build and run on Windows to be functionally
complete and performant. All existing test cases run with no
regressions. Performance numbers are in the pull-request.
Test plan: make all of the existing unit tests pass, obtain perf numbers.
Co-authored-by: Praveen Rao praveensinghrao@outlook.com
Co-authored-by: Sherlock Huang baihan.huang@gmail.com
Co-authored-by: Alex Zinoviev alexander.zinoviev@me.com
Co-authored-by: Dmitri Smirnov dmitrism@microsoft.com
Summary: Make stringappend_test runnable in ROCKSDB_LITE
Test Plan: stringappend_test
Reviewers: sdong, rven, anthony, kradhakrishnan, IslamAbdelRahman, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D40593
Summary: Fixed a valgrind issue in checkpoint_test
Test Plan: valgrind on checkpoint_test
Reviewers: igor, anthony
Reviewed By: anthony
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D40455
Summary:
When there are multiple column families, the flush in
GetLiveFiles is not atomic, so that there are entries in the wal files
which are needed to get a consisten RocksDB. We now add the log files to
the checkpoint.
Test Plan:
CheckpointCF - This test forces more data to be written to
the other column families after the flush of the first column family but
before the second.
Reviewers: igor, yhchiang, IslamAbdelRahman, anthony, kradhakrishnan, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D40323
Summary:
Block geodb_test in ROCKSDB_LITE as geodb is not supported
in ROCKSDB_LITE
Test Plan: geodb_test
Reviewers: sdong, rven, anthony, kradhakrishnan, IslamAbdelRahman, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D40335
Summary:
Currently RocksDB silently ignores this issue and doesn't compress the data. Based on discussion, we agree that this is pretty bad because it can cause confusion for our users.
This patch fails DB::Open() if we don't support the compression that is specified in the options.
Test Plan: make check with LZ4 not present. If Snappy is not present all tests will just fail because Snappy is our default library. We should make Snappy the requirement, since without it our default DB::Open() fails.
Reviewers: sdong, MarkCallaghan, rven, yhchiang
Reviewed By: yhchiang
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D39687
Summary:
This diff update DB::CompactRange to use RangeCompactionOptions instead of using multiple parameters
Old CompactRange is still available but deprecated
Test Plan:
make all check
make rocksdbjava
USE_CLANG=1 make all
OPT=-DROCKSDB_LITE make release
Reviewers: sdong, yhchiang, igor
Reviewed By: igor
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D40209
Summary:
[This is the resubmit of D39813. Tests were failing, so I reverted the diff. I found the bug and I'm now resubmitting]
If we don't do this, any calls to Entry() after WBWI mutation will result in undefined behavior. We need to re-fetch the offset from the skip list and regenerate the new pointer (because string's base pointer can change while mutating).
Test Plan: COMPILE_WITH_ASAN=1 make write_batch_with_index_test && ./write_batch_with_index_test
Reviewers: sdong
Reviewed By: sdong
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D39897
Summary: If we don't do this, any calls to Entry() after WBWI mutation will result in undefined behavior. We need to re-fetch the offset from the skip list and regenerate the new pointer (because string's base pointer can change while mutating).
Test Plan: COMPILE_WITH_ASAN=1 make write_batch_with_index_test && ./write_batch_with_index_test
Reviewers: sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D39813
Summary: Some test and benchmark codes don't build for CYGWIN. Fix it.
Test Plan: Build "make all" with TARGET_OS=Cygwin on cygwin and make sure it passes.
Reviewers: rven, yhchiang, anthony, igor, kradhakrishnan
Reviewed By: igor, kradhakrishnan
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D39711
Summary: We should support use-cases that mutate WBWI while they're iterating it. This diff adds a unit test to check this behavior.
Test Plan: this is a test
Reviewers: sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D39501
Summary:
Replacing the default value for compaction_filter_factory and compaction_filter_factory_v2 to be nullptr instead of DefaultCompactionFilterFactory / DefaultCompactionFilterFactoryV2
The reason for this is to be able to determine easily if we have compaction filter factory or not without depending on RTTI
Test Plan: make check
Reviewers: yoshinorim, ott, igor, sdong
Reviewed By: sdong
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D39693
Summary: Change from one std::to_string() to ToString() for Cygwin build
Test Plan: Build it under cygwin
Reviewers: rven, anthony, IslamAbdelRahman, igor, kradhakrishnan
Reviewed By: igor, kradhakrishnan
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D39657
Summary: In DB::CompactRange(), change parameter "reduce_level" to "change_level". Users can compact all data to the last level if needed. By doing it, users can migrate the DB to options.level_compaction_dynamic_level_bytes=true.
Test Plan: Add a unit test for it.
Reviewers: yhchiang, anthony, kradhakrishnan, igor, rven
Reviewed By: rven
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D39099
Summary: Optimistic transactions supporting begin/commit/rollback semantics. Currently relies on checking the memtable to determine if there are any collisions at commit time. Not yet implemented would be a way of enuring the memtable has some minimum amount of history so that we won't fail to commit when the memtable is empty. You should probably start with transaction.h to get an overview of what is currently supported.
Test Plan: Added a new test, but still need to look into stress testing.
Reviewers: yhchiang, igor, rven, sdong
Reviewed By: sdong
Subscribers: adamretter, MarkCallaghan, leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D33435
Summary:
Added a couple functions to WriteBatchWithIndex to make it easier to query the value of a key including reading pending writes from a batch. (This is needed for transactions).
I created write_batch_with_index_internal.h to use to store an internal-only helper function since there wasn't a good place in the existing class hierarchy to store this function (and it didn't seem right to stick this function inside WriteBatchInternal::Rep).
Since I needed to access the WriteBatchEntryComparator, I moved some helper classes from write_batch_with_index.cc into write_batch_with_index_internal.h/.cc. WriteBatchIndexEntry, ReadableWriteBatch, and WriteBatchEntryComparator are all unchanged (just moved to a different file(s)).
Test Plan: Added new unit tests.
Reviewers: rven, yhchiang, sdong, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D38037
Summary:
In D28521 we removed GarbageCollect() from BackupEngine's constructor. The reason was that opening BackupEngine on HDFS was very slow and in most cases we didn't have any garbage. We allowed the user to call GarbageCollect() when it detects some garbage files in his backup directory.
Unfortunately, this left us vulnerable to an interesting issue. Let's say we started a backup and copied files {1, 3} but the backup failed. On another host, we restore DB from backup and generate {1, 3, 5}. Since {1, 3} is already there, we will not overwrite. However, these files might be from a different database so their contents might be different. See internal task t6781803 for more info.
Now, when we're copying files and we discover a file already there, we check:
1. if the file is not referenced from any backups, we overwrite the file.
2. if the file is referenced from other backups AND the checksums don't match, we fail the backup. This will only happen if user is using a single backup directory for backing up two different databases.
3. if the file is referenced from other backups AND the checksums match, it's all good. We skip the copy and go copy the next file.
Test Plan: Added new test to backupable_db_test. The test fails before this patch.
Reviewers: sdong, rven, yhchiang
Reviewed By: yhchiang
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D37599
Summary:
Make it build for CYGWIN.
Need to define "-std=gnu++11" instead of "-std=c++11" and use some replacement functions.
Test Plan: Build it and run some unit tests in CYGWIN
Reviewers: yhchiang, rven, anthony, kradhakrishnan, igor
Reviewed By: igor
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D37605
Summary:
The usage I'm fixing here caused trouble on Fedora 21 when
compiling with the current gcc version 4.9.2 20150212 (Red Hat 4.9.2-6) (GCC):
db/write_controller_test.cc: In member function ‘virtual void rocksdb::WriteControllerTest_SanityTest_Test::TestBody()’:
db/write_controller_test.cc:23:165: error: converting ‘false’ to pointer type for argument 1 of ‘char testing::internal::IsNullLiteralHelper(testing::internal::Secret*)’ [-Werror=conversion-null]
ASSERT_EQ(false, controller.IsStopped());
^
This change was induced mechanically via:
git grep -l -E 'ASSERT_EQ\(false'|xargs perl -pi -e 's/ASSERT_EQ\(false, /ASSERT_FALSE(/'
git grep -l -E 'ASSERT_EQ\(true'|xargs perl -pi -e 's/ASSERT_EQ\(true, /ASSERT_TRUE(/'
Except for the three in utilities/backupable/backupable_db_test.cc for which
I ended up reformatting (joining lines) in the result.
As for why this problem is exhibited with that version of gcc, and none
of the others I've used (from 4.8.1 through gcc-5.0.0 and newer), I suspect
it's a bug in F21's gcc that has been fixed in gcc-5.0.0.
Test Plan:
"make" now succeed on Fedora 21
Reviewers: ljin, rven, igor.sugak, yhchiang, sdong, igor
Reviewed By: igor
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D37329
Summary:
Fix the following compilation error in flashcache.cc on Mac
Undefined symbols for architecture x86_64:
"rocksdb::NewFlashcacheAwareEnv(rocksdb::Env*, int)", referenced from:
rocksdb::Benchmark::Open(rocksdb::Options*) in db_bench.o
Test Plan: make db_bench
Reviewers: sdong, igor, rven
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D36657
Summary:
There are some cases when flachcache file descriptor was
already allocated (i.e. fb-MySQL). Then NewFlashcacheAwareEnv returns an
error at open() because fd was already assigned. This diff adds another
function to instantiate FlashcacheAwareEnv, with pre-allocated fd cachedev_fd.
Test Plan: Tested with MyRocks using this function, then worked
Reviewers: sdong, igor
Reviewed By: igor
Subscribers: dhruba, MarkCallaghan, rven
Differential Revision: https://reviews.facebook.net/D36447
Summary:
This diff fixes a crash found when an empty database is opened in readonly mode.
We now check the number of levels before we open the DB as a compacted DB.
Test Plan: DBTest.EmptyCompactedDB
Reviewers: igor, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D36327
Summary: Most of the approach is copied from WebSQL's MySQL branch. It's nice that we can do this without touching core RocksDB code.
Test Plan: Compiles and runs. Didn't test flashback code, as I don't have flashback device and most if it is c/p
Reviewers: MarkCallaghan, sdong
Reviewed By: sdong
Subscribers: rven, lgalanis, kradhakrishnan, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D35391
Summary:
We have addded new stats and perf_context for measuring the merge and filter operation time consumption.
We have bounded all the merge operations within the GUARD statment and collected the total time for these operations in the DB.
Test Plan: WIP
Reviewers: rven, yhchiang, kradhakrishnan, igor, sdong
Reviewed By: sdong
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D34377
Summary: WriteBatch and WriteBatchWithIndex now both inherit from a common abstract base class. This makes it easier to write code that is agnostic toward the implementation of the particular write batch. In particular, I plan on utilizing this abstraction to allow transactions to support using either implementation of a write batch.
Test Plan: modified existing WriteBatchWithIndex tests to test new functions. Running all tests.
Reviewers: igor, rven, yhchiang, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D34017
Summary:
Our existing test notation is very similar to what is used in gtest. It makes it easy to adopt what is different.
In this diff I modify existing [[ https://code.google.com/p/googletest/wiki/Primer#Test_Fixtures:_Using_the_Same_Data_Configuration_for_Multiple_Te | test fixture ]] classes to inherit from `testing::Test`. Also for unit tests that use fixture class, `TEST` is replaced with `TEST_F` as required in gtest.
There are several custom `main` functions in our existing tests. To make this transition easier, I modify all `main` functions to fallow gtest notation. But eventually we can remove them and use implementation of `main` that gtest provides.
```lang=bash
% cat ~/transform
#!/bin/sh
files=$(git ls-files '*test\.cc')
for file in $files
do
if grep -q "rocksdb::test::RunAllTests()" $file
then
if grep -Eq '^class \w+Test {' $file
then
perl -pi -e 's/^(class \w+Test) {/${1}: public testing::Test {/g' $file
perl -pi -e 's/^(TEST)/${1}_F/g' $file
fi
perl -pi -e 's/(int main.*\{)/${1}::testing::InitGoogleTest(&argc, argv);/g' $file
perl -pi -e 's/rocksdb::test::RunAllTests/RUN_ALL_TESTS/g' $file
fi
done
% sh ~/transform
% make format
```
Second iteration of this diff contains only scripted changes.
Third iteration contains manual changes to fix last errors and make it compilable.
Test Plan:
Build and notice no errors.
```lang=bash
% USE_CLANG=1 make check -j55
```
Tests are still testing.
Reviewers: meyering, sdong, rven, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D35157
Summary: Some suggestions for cleanup from Igor.
Test Plan: Regression tests.
Reviewers: igor
Reviewed By: igor
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D35169
Summary:
gtest does not use exceptions to fail a unit test by design, and `ASSERT*`s are implemented using `return`. As a consequence we cannot use `ASSERT*` in a function that does not return `void` value ([[ https://code.google.com/p/googletest/wiki/AdvancedGuide#Assertion_Placement | 1]]), and have to fix our existing code. This diff does this in a generic way, with no manual changes.
In order to detect all existing `ASSERT*` that are used in functions that doesn't return void value, I change the code to generate compile errors for such cases.
In `util/testharness.h` I defined `EXPECT*` assertions, the same way as `ASSERT*`, and redefined `ASSERT*` to return `void`. Then executed:
```lang=bash
% USE_CLANG=1 make all -j55 -k 2> build.log
% perl -naF: -e 'print "-- -number=".$F[1]." ".$F[0]."\n" if /: error:/' \
build.log | xargs -L 1 perl -spi -e 's/ASSERT/EXPECT/g if $. == $number'
% make format
```
After that I reverted back change to `ASSERT*` in `util/testharness.h`. But preserved introduced `EXPECT*`, which is the same as `ASSERT*`. This will be deleted once switched to gtest.
This diff is independent and contains manual changes only in `util/testharness.h`.
Test Plan:
Make sure all tests are passing.
```lang=bash
% USE_CLANG=1 make check
```
Reviewers: igor, lgalanis, sdong, yufei.zhu, rven, meyering
Reviewed By: meyering
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D33333
Summary: These changes are necessary to make tests look more generic, and avoid feature conflicts with gtest.
Test Plan:
Make sure no build errors, and all test are passing.
```
% make check
```
Reviewers: igor, meyering
Reviewed By: meyering
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D35145
Summary:
Provide an API which enables users to infor Rocksdb that it is
shutting down.
Test Plan: db_test
Reviewers: sdong, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D34617
Summary: I was pretty sure I compiled this before landing, sorry :/
Test Plan: compiles
Reviewers: sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D34173
Summary:
This diff fixes a bug introduced by D28521. Read-only backup engine can delete a backup that is later than the latest -- we never check the condition.
I also added a bunch of logging that will help with debugging cases like this in the future.
See more discussion at t6218248.
Test Plan: Added a unit test that was failing before the change. Also, see new LOG file contents: https://phabricator.fb.com/P19738984
Reviewers: benj, sanketh, sumeet, yhchiang, rven, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D33897
Summary: as title, we have unused variables. this is a short-term solution
Test Plan: compiles
Reviewers: IslamAbdelRahman, sdong, rven
Reviewed By: rven
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D34125
Summary: mac compile is fixed in fbson, so it can be returned back from 7ce1b2c
Test Plan:
make all check
make valgrind_check
Reviewers: golovachalexander, igor
Reviewed By: igor
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D33855
Summary:
When using latest clang (3.6 or 3.7/trunck) rocksdb is failing with many errors. Almost all of them are missing override errors. This diff adds missing override keyword. No manual changes.
Prerequisites: bear and clang 3.5 build with extra tools
```lang=bash
% USE_CLANG=1 bear make all # generate a compilation database http://clang.llvm.org/docs/JSONCompilationDatabase.html
% clang-modernize -p . -include . -add-override
% make format
```
Test Plan:
Make sure all tests are passing.
```lang=bash
% #Use default fb code clang.
% make check
```
Verify less error and no missing override errors.
```lang=bash
% # Have trunk clang present in path.
% ROCKSDB_NO_FBCODE=1 CC=clang CXX=clang++ make
```
Reviewers: igor, kradhakrishnan, rven, meyering, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D34077
Summary: Replaced rapidjson with fbson
Test Plan:
make all check
make valgrind_check
Reviewers: golovachalexander, igor
Reviewed By: igor
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D32733
Summary: I'm moving mongo to a single column family, so I need DeltaBase iterator with default column family.
Test Plan: Added unit test
Reviewers: sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D32589
Summary: It looks like ASAN with gcc 4.9 works better than 4.8.1. It detected this possibility of heap buffer overflow. This was in our codebase for a year :)
Test Plan: COMPILE_WITH_ASAN=1 make backupable_db && ./backupable_db
Reviewers: yhchiang, rven, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D32085
Summary:
While running cross-functional tests for weak iterators, I
encountered a bug in GeoDB. GeoDB reads a key from the database and
tries to use it after doing a Seek. Fixing it by storing the key locally
so that it is still visible after the Seek.
Test Plan: Run geodb_test
Reviewers: sdong, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D31599
Summary: We keep checksum functions in util/, there is no reason for compression to be in port/
Test Plan: compiles
Reviewers: sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D31281
* Use emplace when possible.
* Make FileInfo shared among all BackupMeta, instead of storing filenames.
* Make checksum_value in FileInfo constant.
* Reserve space beforehand if container size is known.
* Make FileInfo and BackupMeta non-copyable and non-assignable to prevent future logic errors.
It is very dangerous to copy BackupMeta without careful handling refcounts of FileInfo.
* Remove a copy of BackupMeta when detected corrupt backup.
* Use strtoul() and strtoull() instead of sscanf().
glibc's sscanf() will do a implicit strlen().
* Move implicit construction of Slice("crc32 ") out of loop.
Summary:
GetThreadList() feature depends on the thread creation and destruction, which is currently handled under Env.
This patch moves GetThreadList() feature under Env to better manage the dependency of GetThreadList() feature
on thread creation and destruction.
Renamed ThreadStatusImpl to ThreadStatusUpdater. Add ThreadStatusUtil, which is a static class contains
utility functions for ThreadStatusUpdater.
Test Plan: run db_test, thread_list_test and db_bench and verify the life cycle of Env and ThreadStatusUpdater is properly managed.
Reviewers: igor, sdong
Reviewed By: sdong
Subscribers: ljin, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D30057
Summary: This is a feature request from rocksdb's user. I didn't even realize we don't support multigets on TTL DB :)
Test Plan: added a unit test
Reviewers: yhchiang, rven, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D30561
Summary: stringSplit is not how we name our functions. Also, we had two StringSplit's in the codebase
Test Plan: make check
Reviewers: yhchiang, dhruba
Reviewed By: dhruba
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D29361
Summary: Created a CompatibleOptions object that can be used as a LevelDB Options object and then converted to a RocksDB Options object using the ConvertOptions() method.
Test Plan: Unit test included in diff.
Reviewers: ljin
Reviewed By: ljin
Subscribers: sdong, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D28893
Summary:
Add GetThreadList API, which allows developer to track the
status of each process. Currently, calling GetThreadList will
only get the list of background threads in RocksDB with their
thread-id and thread-type (priority) set. Will add more support
on this in the later diffs.
ThreadStatus currently has the following properties:
// An unique ID for the thread.
const uint64_t thread_id;
// The type of the thread, it could be ROCKSDB_HIGH_PRIORITY,
// ROCKSDB_LOW_PRIORITY, and USER_THREAD
const ThreadType thread_type;
// The name of the DB instance where the thread is currently
// involved with. It would be set to empty string if the thread
// does not involve in any DB operation.
const std::string db_name;
// The name of the column family where the thread is currently
// It would be set to empty string if the thread does not involve
// in any column family.
const std::string cf_name;
// The event that the current thread is involved.
// It would be set to empty string if the information about event
// is not currently available.
Test Plan:
./thread_list_test
export ROCKSDB_TESTS=GetThreadList
./db_test
Reviewers: rven, igor, sdong, ljin
Reviewed By: ljin
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D25047
Summary:
CorruptionTest for backupable_db_test did not call
GarbageCollect() after deleting a corrupt backup,
which sometimes lead to test failures as the newly created backup
would reuse the same backup ID and files and fail the consistency
check.
Moved around some of the test logic to ensure that GarbageCollect()
is called at the right time.
Test Plan:
Run backupable_db_test eight times and make sure
it passes repeatedly. Also run make check to make sure other
tests don't fail.
Reviewers: igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D28863
Summary: This will speed up our import times
Test Plan: Added simple unit test just to get code coverage
Reviewers: sdong, ljin, yhchiang, rven, mohaps
Reviewed By: mohaps
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D28869
Summary:
Improve the backup engine by not deleting the corrupted
backup when it is detected; instead leaving it to the client
to delete the corrupted backup.
Also add a BackupEngine::Open() call.
Test Plan:
Add check to CorruptionTest inside backupable_db_test
to check that the corrupt backups are not deleted. The previous
version of the code failed this test as backups were deleted,
but after the changes in this commit, this test passes.
Run make check to ensure that no other tests fail.
Reviewers: sdong, benj, sanketh, sumeet, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D28521
Summary:
We need to turn on -Wshorten-64-to-32 for mobile. See D1671432 (internal phabricator) for details.
This diff turns on the warning flag and fixes all the errors. There were also some interesting errors that I might call bugs, especially in plain table. Going forward, I think it makes sense to have this flag turned on and be very very careful when converting 64-bit to 32-bit variables.
Test Plan: compiles
Reviewers: ljin, rven, yhchiang, sdong
Reviewed By: yhchiang
Subscribers: bobbaldwin, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D28689
Summary: It turns out that -Wshadow has different rules for gcc than clang. Previous commit fixed clang. This commits fixes the rest of the warnings for gcc.
Test Plan: compiles
Reviewers: ljin, yhchiang, rven, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D28131
Summary: Enforce the accessier naming convention in functions in version_set.h
Test Plan: make all check
Reviewers: ljin, yhchiang, rven, igor
Reviewed By: igor
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D28143
Summary: Move header file so it can be referenced externally.
Test Plan: Rebuild.
Reviewers: ljin
Reviewed By: ljin
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D28095
Summary:
...and fix all the errors :)
Jim suggested turning on -Wshadow because it helped him fix number of critical bugs in fbcode. I think it's a good idea to be -Wshadow clean.
Test Plan: compiles
Reviewers: yhchiang, rven, sdong, ljin
Reviewed By: ljin
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D27711
Summary:
Rename Version::Builder to VersionBuilder and expose its definition to a header.
Make VerisonBuilder not reference Version or ColumnFamilyData, only working with VersionStorageInfo.
Add version_builder_test which has a simple test.
Test Plan: make all check
Reviewers: rven, yhchiang, igor, ljin
Reviewed By: igor
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D27969
Summary: Apply InfoLogLevel to the logs in utilities/ttl/db_ttl_impl.h
Test Plan: make
Reviewers: ljin, sdong, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D27885
Summary:
Apply InfoLogLevel to the logs and add missing copy-right information
to utilities/merge_operators/uint64add.cc.
Test Plan: make
Reviewers: ljin, sdong, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D27897
Summary:
Make compaction picker easier to test.
The basic idea is to separate a minimum subcomponent of Version to VersionStorageInfo, which just responsible to LSM tree. A stub VersionStorageInfo can then be easily created and passed into compaction picker so that we can check the outputs.
It now passes most tests. Still two things need to be done:
(1) deal with the FIFO compaction's file size.
(2) write an example test to make sure the interface can do the job.
Add a compaction_picker_test to make sure compaction picker codes can be easily unit tested.
Test Plan:
Pass all unit tests and compaction_picker_test
Reviewers: yhchiang, rven, igor, ljin
Reviewed By: ljin
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D27639
Summary:
We have several different types of data structures for file information.
FileLevel is kinda of confusing since it only contains file range and
fd. Rename it to LevelFilesBrief to make it clear.
Unfriend CompactedDBImpl as a by product
Test Plan:
make release / make all
will run full test with all stacked diffs
Reviewers: sdong, yhchiang, rven, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D27585
Summary: WriteBatchWithIndex's iterator's SeekToFirst() and SeekToLast() use offset=0 to indicate it is smaller than all the keys, which is wrong. offset=0 will decode a key "" (the header decodes like that). It could be larger than other keys in non-default comparators. Fix it by using a special flag of offset to indicate searching to the beginning of the CF.
Test Plan: Add a unit test that used to fail. Also, add some more tests to related cases, though they don't fail for now.
Reviewers: igor
Reviewed By: igor
Subscribers: rven, yhchiang, ljin, leveldb
Differential Revision: https://reviews.facebook.net/D24873
Summary: Add an iterator that combines base_iterator of type Iterator* with delta iterator of type WBWIIterator*.
Test Plan: nothing yet. work in progress
Reviewers: ljin, igor
Reviewed By: igor
Subscribers: rven, yhchiang, leveldb
Differential Revision: https://reviews.facebook.net/D24741
Summary: Support SeekToFirst(), SeekToLast() and Prev() in WBWIIterator, returned by WriteBatchWithIndex::NewIterator().
Test Plan: Write unit test cases to cover the case.
Reviewers: ljin, igor
Reviewed By: igor
Subscribers: rven, yhchiang, leveldb
Differential Revision: https://reviews.facebook.net/D24765
Summary: With a new option, when accepting a new key, WriteBatchWithIndex will find an existing index of the same key, and replace the content of it.
Test Plan: Add a unit test case.
Reviewers: ljin, yhchiang, rven, igor
Reviewed By: igor
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D24753
Summary: Allow accepting Options as a string of key/value pairs
Test Plan: unit test
Reviewers: yhchiang, sdong, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D24597
Check for lhs and not twice for rhs.
Fix for:
[utilities/document/document_db.cc:36] ->
[utilities/document/document_db.cc:36]: (style) Same expression on both
sides of '&&'.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Use empty() since it should be prefered as it has, following
the standard, a constant time complexity regardless of the
containter type. The same is not guaranteed for size().
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Fix for:
[utilities/ttl/db_ttl_impl.h:209]: (performance) Function parameter
'merge_op' should be passed by reference.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Summary:
info_log from supplied Options can be nullptr. Using the one from
db_impl. Also call flush after that since no more loggging will happen
and LOG can contain partial output
Test Plan: verified with db_bench
Reviewers: igor, yhchiang, sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D24183
Summary:
Intead of passing callback function pointer and its arg on Table::Get()
interface, passing GetContext. This makes the interface cleaner and
possible better perf. Also adding a fast pass for SaveValue()
Test Plan: make all check
Reviewers: igor, yhchiang, sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D24057
Summary:
Add the MultiGet API to allow prefetching.
With file size of 1.5G, I configured it to have 0.9 hash ratio that can
fill With 115M keys and result in 2 hash functions, the lookup QPS is
~4.9M/s vs. 3M/s for Get().
It is tricky to set the parameters right. Since files size is determined
by power-of-two factor, that means # of keys is fixed in each file. With
big file size (thus smaller # of files), we will have more chance to
waste lot of space in the last file - lower space utilization as a
result. Using smaller file size can improve the situation, but that
harms lookup speed.
Test Plan: db_bench
Reviewers: yhchiang, sdong, igor
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D23673
Summary:
Add a CompactedDBImpl that will enabled when calling OpenForReadOnly()
and the DB only has one level (>0) of files. As a performan comparison,
CuckooTable performs 2.1M/s with CompactedDBImpl vs. 1.78M/s with
ReadOnlyDBImpl.
Test Plan: db_bench
Reviewers: yhchiang, igor, sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D23553
Summary:
Previously, one single column family is given to WriteBatchWithIndex to index keys for all column families. An extra map from column family ID to comparator is maintained which can override the default comparator given in the constructor. A WriteBatchWithIndex::SetComparatorForCF() is added for user to add comparators per column family.
Also move more codes into anonymous namespace.
Test Plan: Add a unit test
Reviewers: ljin, igor
Reviewed By: igor
Subscribers: dhruba, leveldb, yhchiang
Differential Revision: https://reviews.facebook.net/D23355
Summary:
Fixed a signed-unsigned comparison in spatial_db.cc
utilities/spatialdb/spatial_db.cc:542:38: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare]
cc1plus: all warnings being treated as errors
make: *** [utilities/spatialdb/spatial_db.o] Error 1
Test Plan:
make spatial_db_test
./spatial_db_test
Reviewers: ljin, sdong, reddragon, igor
Reviewed By: reddragon
Subscribers: reddragon, leveldb
Differential Revision: https://reviews.facebook.net/D23565
Summary:
Added the following statistics to BackupableDB:
1. Number of successful and failed backups in class BackupStatistics
2. Time taken to do a backup
3. Number of files in a backup
1 is implemented in the BackupStatistics class
2 and 3 are added in the BackupMeta and BackupInfo class
Test Plan:
1 can be tested using BackupStatistics::ToString(),
2 and 3 can be tested in the BackupInfo class
Reviewers: sdong, igor2, ljin, igor
Reviewed By: igor
Differential Revision: https://reviews.facebook.net/D22785
1, const qualifiers on return types make no sense and will trigger a compile warning: warning: type qualifiers ignored on function return type [-Wignored-qualifiers]
2, class HistogramImpl has virtual functions and thus should have a virtual destructor
3, with some toolchain, the macro __STDC_FORMAT_MACROS is predefined and thus should be checked before define
Change-Id: I69747a03bfae88671bfbb2637c80d17600159c99
Signed-off-by: liuhuahang <liuhuahang@zerus.co>
Summary:
Two things:
1. Use hash-based index for data column family
2. Use Get() instead of Iterator Seek() when DB is opened read-only
Test Plan: added read-only test in unit test
Reviewers: yinwang
Reviewed By: yinwang
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D22323
Summary:
I will move compression related options in a separate diff since this
diff is already pretty lengthy.
I guess I will also need to change JNI accordingly :(
Test Plan: make all check
Reviewers: yhchiang, igor, sdong
Reviewed By: igor
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D21915
Summary: We need to start compression at level 1, while OptimizeForLevelComapaction() only sets up rocksdb to start compressing at level 2. I also adjusted some other things.
Test Plan: compiles
Reviewers: yinwang
Reviewed By: yinwang
Differential Revision: https://reviews.facebook.net/D22203
Summary:
Add WriteBatchWithIndex so that a user can query data out of a WriteBatch, to support MongoDB's read-its-own-write.
WriteBatchWithIndex uses a skiplist to store the binary index. The index stores the offset of the entry in the write batch. When searching for a key, the key for the entry is read by read the entry from the write batch from the offset.
Define a new iterator class for querying data out of WriteBatchWithIndex. A user can create an iterator of the write batch for one column family, seek to a key and keep calling Next() to see next entries.
I will add more unit tests if people are OK about this API.
Test Plan:
make all check
Add unit tests.
Reviewers: yhchiang, igor, MarkCallaghan, ljin
Reviewed By: ljin
Subscribers: dhruba, leveldb, xjin
Differential Revision: https://reviews.facebook.net/D21381
* Script for building the unity.cc file via Makefile
* Unity executable Makefile target for testing builds
* Source code changes to fix compilation of unity build
Summary:
Since we have enough memory to hold all primary keys loaded from spatial index, it is better if we first load all of them (store them in unordered_set for deduplication) and then query on primary key column family one by one.
We need to dedup all IDs, so we'll end up storing all of them in memory even with the current approach.
Test Plan: ./spatial_db_test is happy
Reviewers: yinwang
Reviewed By: yinwang
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D20949
Summary: This will help debugging
Test Plan: ran, observed output
Reviewers: yinwang
Reviewed By: yinwang
Differential Revision: https://reviews.facebook.net/D20937
Summary: I changed SpatialDB API so that we only specify list of indexes when we create the database. That way, whoever is querying the DB doesn't need to know the full list of indexes and their options.
Test Plan: spatial_db_test
Reviewers: yinwang
Reviewed By: yinwang
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D20571
Summary:
This diff is adding spatial index support to RocksDB.
When creating the DB user specifies a list of spatial indexes. Spatial indexes can cover different areas and have different resolution (i.e. number of tiles). This is useful for supporting different zoom levels.
Each element inserted into SpatialDB has:
* a bounding box, which determines how will the element be indexed
* string blob, which will usually be WKB representation of the polygon (http://en.wikipedia.org/wiki/Well-known_text)
* feature set, which is a map of key-value pairs, where value can be int, double, bool, null or a string. FeatureSet will be a set of tags associated with geo elements (for example, 'road': 'highway' and similar)
* a list of indexes to insert the element in. For example, small river element will be inserted in index for high zoom level, while country border will be inserted in all indexes (including the index for low zoom level).
Each query is executed on single spatial index. Query guarantees that it will return all elements intersecting the specified bounding box, but it might also return some extra non-intersecting elements.
Test Plan: Added bunch of unit tests in spatial_db_test
Reviewers: dhruba, yinwang
Reviewed By: yinwang
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D20361
Summary:
All public headers need to be under `include/rocksdb` directory. Otherwise, clients include our header files like this:
#include <rocksdb/db.h>
#include <utilities/backupable_db.h> // still our public header!
Also, internally, we include:
#include "utilities/backupable/backupable_db.h" // internal header
#include "utilities/backupable_db.h" // public header
which is confusing.
This way, when we install rocksdb as a system library, we can just copy `include/rocksdb` directory to system's header files. We can't really copy `utilities` directory to system's header files.
Test Plan: compiles
Reviewers: dhruba, ljin, yhchiang, sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D20409
Summary:
This is a rough sketch of our new document API. Would like to get some thoughts and comments about the high-level architecture and API.
I didn't optimize for performance at all. Leaving some low-hanging fruit so that we can be happy when we fix them! :)
Currently, bunch of features are not supported at all. Indexes can be only specified when creating database. There is no query planner whatsoever. This will all be added in due time.
Test Plan: Added a simple unit test
Reviewers: haobo, yhchiang, dhruba, sdong, ljin
Reviewed By: ljin
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D18747
Summary:
A generic rate limiter that can be shared by threads and rocksdb
instances. Will use this to smooth out write traffic generated by
compaction and flush. This will help us get better p99 behavior on flash
storage.
Test Plan:
unit test output
==== Test RateLimiterTest.Rate
request size [1 - 1023], limit 10 KB/sec, actual rate: 10.374969 KB/sec, elapsed 2002265
request size [1 - 2047], limit 20 KB/sec, actual rate: 20.771242 KB/sec, elapsed 2002139
request size [1 - 4095], limit 40 KB/sec, actual rate: 41.285299 KB/sec, elapsed 2202424
request size [1 - 8191], limit 80 KB/sec, actual rate: 81.371605 KB/sec, elapsed 2402558
request size [1 - 16383], limit 160 KB/sec, actual rate: 162.541268 KB/sec, elapsed 3303500
Reviewers: yhchiang, igor, sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D19359
Summary:
After evaluating options for JSON storage, I decided to implement our own. The reason is that we'll be able to optimize it better and we get to reduce unnecessary dependencies (which is what we'd get with folly).
I also plan to write a serializer/deserializer for JSONDocument with our own binary format similar to BSON. That way we'll store binary JSON format in RocksDB instead of the plain-text JSON. This means less storage and faster deserialization.
There are still some inefficiencies left here. I plan to optimize them after we develop a functioning DocumentDB. That way we can move and iterate faster.
Test Plan: added a unit test
Reviewers: dhruba, haobo, sdong, ljin, yhchiang
Reviewed By: haobo
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D18831
Summary: added a new option to BackupEngine: if share_files_with_checksum is set to true, sst files are stored in shared_checksum/ and are identified by the triple (file name, checksum, file size) instead of just the file name. This option is targeted at distributed databases that want to backup their primary replica.
Test Plan: unit tests and tested backup and restore on a distributed rocksdb
Reviewers: igor
Reviewed By: igor
Differential Revision: https://reviews.facebook.net/D18393
Summary:
Our valgrind tests are failing because ttl_test is kind of flakey. This diff should fix valgrind issue and make ttl_test less flakey and much faster.
Instead of relying on Env::Default() for getting current time, I expose `Env*` to all TTL functions that are interested in time. That way, I can insert a custom test Env which is then used to provide exactly the times we need. That way, we don't need to sleep anymore -- we control the time.
Test Plan: ttl_test in normal and valgrind run
Reviewers: dhruba, haobo, sdong, yhchiang
Reviewed By: haobo
CC: leveldb
Differential Revision: https://reviews.facebook.net/D18399
Summary:
This enables user to add a TTL column family to normal DB.
Next step should be to expand StackableDB and create StackableColumnFamily, such that users can for example add geo-spatial column families to normal DB.
Test Plan: added a test
Reviewers: dhruba, haobo, ljin
Reviewed By: haobo
CC: leveldb
Differential Revision: https://reviews.facebook.net/D18201
Summary: Read-only BackupEngine can connect to the same backup directory that is already running BackupEngine. That enables some interesting use-cases (i.e. restoring replica from primary's backup directory)
Test Plan: added a unit test
Reviewers: dhruba, haobo, ljin
Reviewed By: ljin
CC: leveldb
Differential Revision: https://reviews.facebook.net/D18297
Summary:
This will enable people using TTL DB to do so with multiple column families. They can also specify different TTLs for each one.
TODO: Implement CreateColumnFamily() in TTL world.
Test Plan: Added a very simple sanity test.
Reviewers: dhruba, haobo, ljin, sdong, yhchiang
Reviewed By: haobo
CC: leveldb, alberts
Differential Revision: https://reviews.facebook.net/D17859
Summary:
Introducing RocksDBLite! Removes all the non-essential features and reduces the binary size. This effort should help our adoption on mobile.
Binary size when compiling for IOS (`TARGET_OS=IOS m static_lib`) is down to 9MB from 15MB (without stripping)
Test Plan: compiles :)
Reviewers: dhruba, haobo, ljin, sdong, yhchiang
Reviewed By: yhchiang
CC: leveldb
Differential Revision: https://reviews.facebook.net/D17835
Summary: Compiling for iOS has by default turned on -Wmissing-prototypes, which causes rocksdb to fail compiling. This diff turns on -Wmissing-prototypes in our compile options and cleans up all functions with missing prototypes.
Test Plan: compiles
Reviewers: dhruba, haobo, ljin, sdong
Reviewed By: ljin
CC: leveldb
Differential Revision: https://reviews.facebook.net/D17649
Summary: Fix some signed and unsigned comparisons to make some other build script happy.
Test Plan: Build and run those changed tests
Reviewers: ljin, igor, haobo
Reviewed By: igor
CC: yhchiang, dhruba, kailiu, leveldb
Differential Revision: https://reviews.facebook.net/D17463
Summary: The previous change D15087 changed existing compaction filter, which makes the commonly used class not backward compatible. Revert the older interface. Use a new interface for V2 instead.
Test Plan: make all check
Reviewers: haobo, yhchiang, igor
CC: danguo, dhruba, ljin, igor, leveldb
Differential Revision: https://reviews.facebook.net/D17223
Summary:
This patch stores gps locations in rocksdb.
Each object is uniquely identified by an id. Each object has
a gps (latitude, longitude) associated with it. The geodb
supports looking up an object either by its gps location
or by its id. There is a method to retrieve all objects
within a circular radius centered at a specified gps location.
Test Plan: Simple unit-test attached.
Reviewers: leveldb, haobo
Reviewed By: haobo
CC: leveldb, tecbot, haobo
Differential Revision: https://reviews.facebook.net/D15567
Summary:
This diff adds a new CompactionFilterV2 API that roll up the
decisions of kv pairs during compactions. These kv pairs must share the
same key prefix. They are buffered inside the db.
typedef std::vector<Slice> SliceVector;
virtual std::vector<bool> Filter(int level,
const SliceVector& keys,
const SliceVector& existing_values,
std::vector<std::string>* new_values,
std::vector<bool>* values_changed
) const = 0;
Application can override the Filter() function to operate
on the buffered kv pairs. More details in the inline documentation.
Test Plan:
make check. Added unit tests to make sure Keep, Delete,
Change all works.
Reviewers: haobo
CCs: leveldb
Differential Revision: https://reviews.facebook.net/D15087
Summary:
* PartialMerge api now takes a list of operands instead of two operands.
* Add min_pertial_merge_operands to Options, indicating the minimum
number of operands to trigger partial merge.
* This diff is based on Schalk's previous diff (D14601), but it also
includes necessary changes such as updating the pure C api for
partial merge.
Test Plan:
* make check all
* develop tests for cases where partial merge takes more than two
operands.
TODOs (from Schalk):
* Add test with min_partial_merge_operands > 2.
* Perform benchmarks to measure the performance improvements (can probably
use results of task #2837810.)
* Add description of problem to doc/index.html.
* Change wiki pages to reflect the interface changes.
Reviewers: haobo, igor, vamsi
Reviewed By: haobo
CC: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D16815
Summary: Might be useful if client doesn't want to effect running system during backup too much.
Test Plan: added a test case
Reviewers: dhruba, haobo, ljin
Reviewed By: haobo
CC: leveldb
Differential Revision: https://reviews.facebook.net/D17091
Summary:
Added an option to BackupableDB implementation that allows users to persist in-memory databases. When the restore happens with keep_log_files = true, it will
*) Not delete existing log files in wal_dir
*) Move log files from archive directory to wal_dir, so that DB can replay them if necessary
Test Plan: Added an unit test
Reviewers: dhruba, ljin
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D16941
Summary:
I had this diff for a while to test column families implementation. Last night, I ran it sucessfully for 10 hours with the command:
time ./db_stress --threads=30 --ops_per_thread=200000000 --max_key=5000 --column_families=20 --clear_column_family_one_in=3000000 --verify_before_write=1 --reopen=50 --max_background_compactions=10 --max_background_flushes=10 --db=/tmp/db_stress
It is ready to be committed :)
Test Plan: Ran it for 10 hours
Reviewers: dhruba, haobo
CC: leveldb
Differential Revision: https://reviews.facebook.net/D16797