Summary: I was pretty sure I compiled this before landing, sorry :/
Test Plan: compiles
Reviewers: sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D34173
Summary:
This diff fixes a bug introduced by D28521. Read-only backup engine can delete a backup that is later than the latest -- we never check the condition.
I also added a bunch of logging that will help with debugging cases like this in the future.
See more discussion at t6218248.
Test Plan: Added a unit test that was failing before the change. Also, see new LOG file contents: https://phabricator.fb.com/P19738984
Reviewers: benj, sanketh, sumeet, yhchiang, rven, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D33897
Summary:
When using latest clang (3.6 or 3.7/trunck) rocksdb is failing with many errors. Almost all of them are missing override errors. This diff adds missing override keyword. No manual changes.
Prerequisites: bear and clang 3.5 build with extra tools
```lang=bash
% USE_CLANG=1 bear make all # generate a compilation database http://clang.llvm.org/docs/JSONCompilationDatabase.html
% clang-modernize -p . -include . -add-override
% make format
```
Test Plan:
Make sure all tests are passing.
```lang=bash
% #Use default fb code clang.
% make check
```
Verify less error and no missing override errors.
```lang=bash
% # Have trunk clang present in path.
% ROCKSDB_NO_FBCODE=1 CC=clang CXX=clang++ make
```
Reviewers: igor, kradhakrishnan, rven, meyering, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D34077
Summary: It looks like ASAN with gcc 4.9 works better than 4.8.1. It detected this possibility of heap buffer overflow. This was in our codebase for a year :)
Test Plan: COMPILE_WITH_ASAN=1 make backupable_db && ./backupable_db
Reviewers: yhchiang, rven, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D32085
* Use emplace when possible.
* Make FileInfo shared among all BackupMeta, instead of storing filenames.
* Make checksum_value in FileInfo constant.
* Reserve space beforehand if container size is known.
* Make FileInfo and BackupMeta non-copyable and non-assignable to prevent future logic errors.
It is very dangerous to copy BackupMeta without careful handling refcounts of FileInfo.
* Remove a copy of BackupMeta when detected corrupt backup.
* Use strtoul() and strtoull() instead of sscanf().
glibc's sscanf() will do a implicit strlen().
* Move implicit construction of Slice("crc32 ") out of loop.
Summary:
CorruptionTest for backupable_db_test did not call
GarbageCollect() after deleting a corrupt backup,
which sometimes lead to test failures as the newly created backup
would reuse the same backup ID and files and fail the consistency
check.
Moved around some of the test logic to ensure that GarbageCollect()
is called at the right time.
Test Plan:
Run backupable_db_test eight times and make sure
it passes repeatedly. Also run make check to make sure other
tests don't fail.
Reviewers: igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D28863
Summary:
Improve the backup engine by not deleting the corrupted
backup when it is detected; instead leaving it to the client
to delete the corrupted backup.
Also add a BackupEngine::Open() call.
Test Plan:
Add check to CorruptionTest inside backupable_db_test
to check that the corrupt backups are not deleted. The previous
version of the code failed this test as backups were deleted,
but after the changes in this commit, this test passes.
Run make check to ensure that no other tests fail.
Reviewers: sdong, benj, sanketh, sumeet, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D28521
Summary:
We need to turn on -Wshorten-64-to-32 for mobile. See D1671432 (internal phabricator) for details.
This diff turns on the warning flag and fixes all the errors. There were also some interesting errors that I might call bugs, especially in plain table. Going forward, I think it makes sense to have this flag turned on and be very very careful when converting 64-bit to 32-bit variables.
Test Plan: compiles
Reviewers: ljin, rven, yhchiang, sdong
Reviewed By: yhchiang
Subscribers: bobbaldwin, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D28689
Summary:
Added the following statistics to BackupableDB:
1. Number of successful and failed backups in class BackupStatistics
2. Time taken to do a backup
3. Number of files in a backup
1 is implemented in the BackupStatistics class
2 and 3 are added in the BackupMeta and BackupInfo class
Test Plan:
1 can be tested using BackupStatistics::ToString(),
2 and 3 can be tested in the BackupInfo class
Reviewers: sdong, igor2, ljin, igor
Reviewed By: igor
Differential Revision: https://reviews.facebook.net/D22785
Summary:
All public headers need to be under `include/rocksdb` directory. Otherwise, clients include our header files like this:
#include <rocksdb/db.h>
#include <utilities/backupable_db.h> // still our public header!
Also, internally, we include:
#include "utilities/backupable/backupable_db.h" // internal header
#include "utilities/backupable_db.h" // public header
which is confusing.
This way, when we install rocksdb as a system library, we can just copy `include/rocksdb` directory to system's header files. We can't really copy `utilities` directory to system's header files.
Test Plan: compiles
Reviewers: dhruba, ljin, yhchiang, sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D20409
Summary:
A generic rate limiter that can be shared by threads and rocksdb
instances. Will use this to smooth out write traffic generated by
compaction and flush. This will help us get better p99 behavior on flash
storage.
Test Plan:
unit test output
==== Test RateLimiterTest.Rate
request size [1 - 1023], limit 10 KB/sec, actual rate: 10.374969 KB/sec, elapsed 2002265
request size [1 - 2047], limit 20 KB/sec, actual rate: 20.771242 KB/sec, elapsed 2002139
request size [1 - 4095], limit 40 KB/sec, actual rate: 41.285299 KB/sec, elapsed 2202424
request size [1 - 8191], limit 80 KB/sec, actual rate: 81.371605 KB/sec, elapsed 2402558
request size [1 - 16383], limit 160 KB/sec, actual rate: 162.541268 KB/sec, elapsed 3303500
Reviewers: yhchiang, igor, sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D19359
Summary: added a new option to BackupEngine: if share_files_with_checksum is set to true, sst files are stored in shared_checksum/ and are identified by the triple (file name, checksum, file size) instead of just the file name. This option is targeted at distributed databases that want to backup their primary replica.
Test Plan: unit tests and tested backup and restore on a distributed rocksdb
Reviewers: igor
Reviewed By: igor
Differential Revision: https://reviews.facebook.net/D18393
Summary: Read-only BackupEngine can connect to the same backup directory that is already running BackupEngine. That enables some interesting use-cases (i.e. restoring replica from primary's backup directory)
Test Plan: added a unit test
Reviewers: dhruba, haobo, ljin
Reviewed By: ljin
CC: leveldb
Differential Revision: https://reviews.facebook.net/D18297
Summary:
Introducing RocksDBLite! Removes all the non-essential features and reduces the binary size. This effort should help our adoption on mobile.
Binary size when compiling for IOS (`TARGET_OS=IOS m static_lib`) is down to 9MB from 15MB (without stripping)
Test Plan: compiles :)
Reviewers: dhruba, haobo, ljin, sdong, yhchiang
Reviewed By: yhchiang
CC: leveldb
Differential Revision: https://reviews.facebook.net/D17835
Summary: Might be useful if client doesn't want to effect running system during backup too much.
Test Plan: added a test case
Reviewers: dhruba, haobo, ljin
Reviewed By: haobo
CC: leveldb
Differential Revision: https://reviews.facebook.net/D17091
Summary:
Added an option to BackupableDB implementation that allows users to persist in-memory databases. When the restore happens with keep_log_files = true, it will
*) Not delete existing log files in wal_dir
*) Move log files from archive directory to wal_dir, so that DB can replay them if necessary
Test Plan: Added an unit test
Reviewers: dhruba, ljin
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D16941
Summary: We should dump options in backupable DB log, just like with to with RocksDB. This will aid debugging.
Test Plan: checked the log
Reviewers: ljin
Reviewed By: ljin
CC: leveldb
Differential Revision: https://reviews.facebook.net/D16719
Summary:
(1) Report corruption if backup meta file has tailing data that was not read. This should fix: https://github.com/facebook/rocksdb/issues/81 (also, @sdong reported similar issue)
(2) Don't use OS buffer when copying file to backup directory. We don't need the file in cache since we won't be reading it twice
(3) Don't delete newer backups when somebody tries to backup the diverged DB (restore from older backup, add new data, try to backup). Rather, just fail the new backup.
Test Plan: backupable_db_test
Reviewers: ljin, dhruba, sdong
Reviewed By: ljin
CC: leveldb, sdong
Differential Revision: https://reviews.facebook.net/D16287
Summary: Even if user flushes the memtables before getting live files, we still can't guarantee that new data didn't come in (to already-flushed memtables). If we want backups to provide consistent view of the database, we still need to get WAL files.
Test Plan: backupable_db_test
Reviewers: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D16299
Summary:
The change to the public behavior:
* When opening a DB or creating new column family client gets a ColumnFamilyHandle.
* As long as column family handle is alive, client can do whatever he wants with it, even drop it
* Dropped column family can still be read from (using the column family handle)
* Added a new call CloseColumnFamily(). Client has to close all column families that he has opened before deleting the DB
* As soon as column family is closed, any calls to DB using that column family handle will fail (also any outstanding calls)
Internally:
* Ref-counting ColumnFamilyData
* New thread-safety for ColumnFamilySet
* Dropped column families are now completely dropped and their memory cleaned-up
Test Plan: added some tests to column_family_test
Reviewers: dhruba, haobo, kailiu, sdong
CC: leveldb
Differential Revision: https://reviews.facebook.net/D16101
Summary: Clean up IOErrors so that it only indicates errors talking to device.
Test Plan: make all check
Reviewers: igor, haobo, dhruba, emayanke
Reviewed By: igor
CC: leveldb
Differential Revision: https://reviews.facebook.net/D15831
Summary:
There are three SanitizeOption-s now : one for DBOptions, one for ColumnFamilyOptions and one for Options (which just calls the other two)
I have also reshuffled some options -- table_cache options and info_log should live in DBOptions, for example.
Test Plan: make check doesn't complain
Reviewers: dhruba, haobo, kailiu, sdong
CC: leveldb
Differential Revision: https://reviews.facebook.net/D15873
Summary:
if it happens (randomly) to corrupt shared file in the test, then the
checksum will be inconsistent between meta files from different backup.
BackupEngine will then detect this issue and fail. But in reality, this
does not happen since the checksum is checked on every backup. So here,
only corrupt checksum of private file to let BackupEngine to construct
properly (but fail during restore).
Test Plan: run test with valgrind
Reviewers: igor
Reviewed By: igor
CC: leveldb
Differential Revision: https://reviews.facebook.net/D15531
Summary: I think it looks nicer. In RocksDB we have both styles, but I think that static method is the more common version.
Test Plan: backupable_db_test
Reviewers: ljin, benj, swk
Reviewed By: ljin
CC: leveldb
Differential Revision: https://reviews.facebook.net/D15519
Summary:
Lots of clients have problems with using StackableDB interface. It's nice to have BackupableDB as a layer on top of DB, but not necessary.
This diff exports BackupEngine, which can be used to create backups without forcing clients to use StackableDB interface.
Test Plan: backupable_db_test
Reviewers: dhruba, ljin, swk
Reviewed By: ljin
CC: leveldb, benj
Differential Revision: https://reviews.facebook.net/D15477
Summary: Keep checksum of each backuped file in meta file. When it restores these files, compute their checksum on the fly and compare against what is in the meta file. Fail the restore process if checksum mismatch.
Test Plan: unit test
Reviewers: haobo, igor, sdong, kailiu
Reviewed By: igor
CC: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D15381
Summary: By removing some includes form options.h and reply on forward declaration, we can more easily reason the dependencies.
Test Plan: make all check
Reviewers: kailiu, haobo, igor, dhruba
Reviewed By: kailiu
CC: leveldb
Differential Revision: https://reviews.facebook.net/D15411
Summary:
This diff introduces some features that were requested by two internal customers:
* Ability for backups not to share table files, because we can't guarantee that equal filename means equal content accross replicas
* Ability for two threads to call EnableFileDeletions() and DisableFileDeletions()
* Ability to stop backup from another thread and not slow down the DB close
* Copy the files to the temporary folder first and then atomically rename
Test Plan: Added some tests to backupable_db_test
Reviewers: dhruba, sanketh, muthu, sdong, haobo
Reviewed By: haobo
CC: leveldb, sanketh, muthu
Differential Revision: https://reviews.facebook.net/D14769
Summary:
We don't want two threads to clash if they concurrently call DisableFileDeletions() and EnableFileDeletions(). I'm adding a counter that will enable file deletions only after all DisableFileDeletions() calls have been negated with EnableFileDeletions().
However, we also don't want to break the old behavior, so I added a parameter force to EnableFileDeletions(). If force is true, we will still enable file deletions after every call to EnableFileDeletions(), which is what is happening now.
Test Plan: make check
Reviewers: dhruba, haobo, sanketh
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D14781
Summary:
<This diff is for Column Family branch>
Sharing some of the work I've done so far. This diff compiles and passes the tests.
The biggest change is in options.h - I broke down Options into two parts - DBOptions and ColumnFamilyOptions. DBOptions is DB-specific (env, create_if_missing, block_cache, etc.) and ColumnFamilyOptions is column family-specific (all compaction options, compresion options, etc.). Note that this does not break backwards compatibility at all.
Further, I created DBWithColumnFamily which inherits DB interface and adds new functions with column family support. Clients can transparently switch to DBWithColumnFamily and it will not break their backwards compatibility.
There are few methods worth checking out: ListColumnFamilies(), MultiNewIterator(), MultiGet() and GetSnapshot(). [GetSnapshot() returns the snapshot across all column families for now - I think that's what we agreed on]
Finally, I made small changes to WriteBatch so we are able to atomically insert data across column families.
Please provide feedback.
Test Plan: make check works, the code is backward compatible
Reviewers: dhruba, haobo, sdong, kailiu, emayanke
CC: leveldb
Differential Revision: https://reviews.facebook.net/D14445
Summary:
I realized that manifest will get deleted by PurgeObsoleteFiles in DBImpl, but it is sill cleaner to delete
files before we restore the backup
Test Plan: backupable_db_test
Reviewers: dhruba
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D14619
Summary: We now delete backups with newer sequence number, so the clients don't have to handle confusing situations when they restore from backup.
Test Plan: added a unit test
Reviewers: dhruba
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D14547
Summary: Valgrind complained about BackupableDB. This fixes valgrind errors. Also, I cleaned up some code.
Test Plan: valgrind does not complain anymore
Reviewers: dhruba
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D14529
Summary:
In this diff I present you BackupableDB v1. You can easily use it to backup your DB and it will do incremental snapshots for you.
Let's first describe how you would use BackupableDB. It's inheriting StackableDB interface so you can easily construct it with your DB object -- it will add a method RollTheSnapshot() to the DB object. When you call RollTheSnapshot(), current snapshot of the DB will be stored in the backup dir. To restore, you can just call RestoreDBFromBackup() on a BackupableDB (which is a static method) and it will restore all files from the backup dir. In the next version, it will even support automatic backuping every X minutes.
There are multiple things you can configure:
1. backup_env and db_env can be different, which is awesome because then you can easily backup to HDFS or wherever you feel like.
2. sync - if true, it *guarantees* backup consistency on machine reboot
3. number of snapshots to keep - this will keep last N snapshots around if you want, for some reason, be able to restore from an earlier snapshot. All the backuping is done in incremental fashion - if we already have 00010.sst, we will not copy it again. *IMPORTANT* -- This is based on assumption that 00010.sst never changes - two files named 00010.sst from the same DB will always be exactly the same. Is this true? I always copy manifest, current and log files.
4. You can decide if you want to flush the memtables before you backup, or you're fine with backing up the log files -- either way, you get a complete and consistent view of the database at a time of backup.
5. More things you can find in BackupableDBOptions
Here is the directory structure I use:
backup_dir/CURRENT_SNAPSHOT - just 4 bytes holding the latest snapshot
0, 1, 2, ... - files containing serialized version of each snapshot - containing a list of files
files/*.sst - sst files shared between snapshots - if one snapshot references 00010.sst and another one needs to backup it from the DB, it will just reference the same file
files/ 0/, 1/, 2/, ... - snapshot directories containing private snapshot files - current, manifest and log files
All the files are ref counted and deleted immediatelly when they get out of scope.
Some other stuff in this diff:
1. Added GetEnv() method to the DB. Discussed with @haobo and we agreed that it seems right thing to do.
2. Fixed StackableDB interface. The way it was set up before, I was not able to implement BackupableDB.
Test Plan:
I have a unittest, but please don't look at this yet. I just hacked it up to help me with debugging. I will write a lot of good tests and update the diff.
Also, `make asan_check`
Reviewers: dhruba, haobo, emayanke
Reviewed By: dhruba
CC: leveldb, haobo
Differential Revision: https://reviews.facebook.net/D14295