Commit graph

502 commits

Author SHA1 Message Date
Sameet Agarwal e7c434c364 Add columnfamily option optimize_filters_for_hits to optimize for key hits only
Summary:
    Summary:
    Added a new option to ColumnFamllyOptions  - optimize_filters_for_hits. This option can be used in the case where most
    accesses to the store are key hits and we dont need to optimize performance for key misses.
    This is useful when you have a very large database and most of your lookups succeed.  The option allows the store to
     not store and use filters in the last level (the largest level which contains data). These filters can take a large amount of
     space for large databases (in memory and on-disk). For the last level, these filters are only useful for key misses and not
     for key hits. If we are not optimizing for key misses, we can choose to not store these filters for that level.

    This option is only provided for BlockBasedTable. We skip the filters when we are compacting

Test Plan:
1. Modified db_test toalso run tests with an additonal option (skip_filters_on_last_level)
 2. Added another unit test to db_test which specifically tests that filters are being skipped

Reviewers: rven, igor, sdong

Reviewed By: sdong

Subscribers: lgalanis, yoshinorim, MarkCallaghan, rven, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D33717
2015-02-26 16:25:56 -08:00
Igor Sugak 62247ffa3b rocksdb: Add missing override
Summary:
When using latest clang (3.6 or 3.7/trunck) rocksdb is failing with many errors. Almost all of them are missing override errors. This diff adds missing override keyword. No manual changes.

Prerequisites: bear and clang 3.5 build with extra tools

```lang=bash
% USE_CLANG=1 bear make all # generate a compilation database http://clang.llvm.org/docs/JSONCompilationDatabase.html
% clang-modernize -p . -include . -add-override
% make format
```

Test Plan:
Make sure all tests are passing.
```lang=bash
% #Use default fb code clang.
% make check
```
Verify less error and no missing override errors.
```lang=bash
% # Have trunk clang present in path.
% ROCKSDB_NO_FBCODE=1 CC=clang CXX=clang++ make
```

Reviewers: igor, kradhakrishnan, rven, meyering, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D34077
2015-02-26 11:28:41 -08:00
Jinfu Leng 96d989f70d catch config errors with L0 file count triggers
Test Plan: Run "make clean && make all check"

Reviewers: rven, igor, yhchiang, kradhakrishnan, MarkCallaghan, sdong

Reviewed By: sdong

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D33627
2015-02-23 16:08:27 -08:00
sdong d45a6a4002 Add rocksdb.num-live-versions: number of live versions
Summary: Add a DB property about live versions. It can be helpful to figure out whether there are files not live but not yet deleted, in some use cases.

Test Plan: make all check

Reviewers: rven, yhchiang, igor

Reviewed By: igor

Subscribers: yoshinorim, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D33327
2015-02-19 13:10:37 -08:00
Venkatesh Radhakrishnan 7d817268b9 Managed iterator
Summary:
This is a diff for managed iterator. A managed iterator
is a wrapper around an iterator which saves the options for that
iterator as well as the current key/value so that the underlying iterator
and its associated memory can be released when it is aged out
automatically or on the request of the user. Will provide the automatic release as a follow-up diff.

Test Plan: Managed* tests in db_test and XF tests for managed iterator

Reviewers: igor, yhchiang, anthony, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D31401
2015-02-18 11:49:31 -08:00
Yueh-Hsuan Chiang 12753130ec Remove ThreadStatusMultiCompaction test
Summary:
Remove ThreadStatusMultiCompaction test as it's currently written
in a way that depends on some randomness, while the flush / compaction
status of a single thread is also covered in ThreadStatusFlush
and ThreadStatusSingleCompaction tests.

Test Plan: ./db_test

Reviewers: igor, sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D33537
2015-02-17 12:04:56 -08:00
Yueh-Hsuan Chiang e60bc99fe0 Allow GetThreadList to reflect flush activity.
Summary: Allow GetThreadList to reflect flush activity.

Test Plan:
Developed ThreadStatusFlush test and updated ThreadStatusMultiCompaction test.

./db_test  ./thread_list_test

Reviewers: sdong, rven, igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D32871
2015-02-17 10:13:52 -08:00
Igor Canadi e7ea51a8e7 Introduce job_id for flush and compaction
Summary:
It would be good to assing background job their IDs. Two benefits:
1) makes LOGs more readable
2) I might use it in my EventLogger, which will try to make our LOG easier to read/query/visualize

Test Plan: ran rocksdb, read the LOG

Reviewers: sdong, rven, yhchiang

Reviewed By: yhchiang

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D31617
2015-02-12 09:54:48 -08:00
sdong 5f00af4570 DBTest.DestroyDBMetaDatabase: create DB directories if not exists
Summary: DBTest.DestroyDBMetaDatabase occasionally fails on my dev host, for file not existing. Always create directories to avoid that.

Test Plan: Run the test

Reviewers: rven, yhchiang, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D33321
2015-02-11 16:16:50 -08:00
sdong 68af7811ea Remember whole key/prefix filtering on/off in SST file
Summary: Remember whole key or prefix filtering on/off in SST files. If user opens the DB with a different setting that cannot be satisfied while reading the SST file, ignore the bloom filter.

Test Plan: Add a unit test for it

Reviewers: yhchiang, igor, rven

Reviewed By: rven

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D32889
2015-02-11 11:20:04 -08:00
Igor Canadi 863009b5a5 Fix deleting obsolete files #2
Summary: For description of the bug, see comment in db_test. The fix is pretty straight forward.

Test Plan: added unit test. eventually we need better testing of FOF/POF process.

Reviewers: yhchiang, rven, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D33081
2015-02-09 17:38:32 -08:00
Grace Law 1851f977c2 Added RocksDB stats GET_HIT_L0 and GET_HIT_L1
Summary:
  - In statistics.h , added tickers.
  - In version_set.cc,
  -- Added a getter method for hit_file_level_ in the class FilePicker
  -- Added a line in the Get() method in case of a found, increment the corresponding counters based on the level of the file respectively.

Corresponding task: https://our.intern.facebook.com/intern/tasks/?s=506100481&t=5952818
Personal fork: 0c3f2e3600

Test Plan:
In terminal,
```
make -j32 db_test
ROCKSDB_TESTS=L0L1L2AndUpHitCounter ./db_test
```

Or to use debugger,
```
make -j32 db_test
export ROCKSDB_TESTS=L0L1L2AndUpHitCounter
gdb db_test
```

Reviewers: rven, sdong

Reviewed By: sdong

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D32205
2015-02-09 14:53:58 -08:00
Igor Canadi 2a979822b6 Fix deleting obsolete files
Summary:
This diff basically reverts D30249 and also adds a unit test that was failing before this patch.

I have no idea how I didn't catch this terrible bug when writing a diff, sorry about that :(

I think we should redesign our system of keeping track of and deleting files. This is already a second bug in this critical piece of code. I'll think of few ideas.

BTW this diff is also a regression when running lots of column families. I plan to revisit this separately.

Test Plan: added a unit test

Reviewers: yhchiang, rven, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D33045
2015-02-06 08:44:30 -08:00
Igor Canadi 6f10130354 Fix DestroyDB
Summary:
When DestroyDB() finds a wal file in the DB directory, it assumes it is actually in WAL directory. This can lead to confusion, since it reports IO error when it tries to delete wal file from DB directory. For example: https://ci-builds.fb.com/job/rocksdb_clang_build/296/console

This change will fix our unit tests.

Test Plan: unit tests work

Reviewers: yhchiang, rven, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D32907
2015-02-05 20:09:42 -08:00
Igor Canadi 7de4e99a8e Revert "Fix wal_dir not getting cleaned"
This reverts commit f36d394aed.
2015-02-05 11:44:17 -08:00
Yueh-Hsuan Chiang 181191a1e4 Add a counter for collecting the wait time on db mutex.
Summary:
Add a counter for collecting the wait time on db mutex.
Also add MutexWrapper and CondVarWrapper for measuring wait time.

Test Plan:
./db_test
export ROCKSDB_TESTS=MutexWaitStats
./db_test

verify stats output using db_bench
make clean
make release
./db_bench --statistics=1 --benchmarks=fillseq,readwhilewriting --num=10000 --threads=10

Sample output:
    rocksdb.db.mutex.wait.micros COUNT : 7546866

Reviewers: MarkCallaghan, rven, sdong, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D32787
2015-02-04 21:39:45 -08:00
Igor Canadi f36d394aed Fix wal_dir not getting cleaned 2015-02-04 18:57:22 -08:00
sdong 53ae09c398 db_test: fix a data race in SpecialEnv
Summary: db_test's test class SpecialEnv has a thread unsafe variable rnd_ but it can be accessed by multiple threads. It is complained by TSAN. Protect it by a mutex.

Test Plan: Run the test

Reviewers: yhchiang, rven, igor

Reviewed By: igor

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D32895
2015-02-04 18:32:53 -08:00
sdong e63140d52b Get() to use prefix bloom filter when filter is not block based
Summary:
Get() now doesn't make use of bloom filter if it is prefix based. Add the check.
Didn't touch block based bloom filter. I can't fully reason whether it is correct to do that. But it's straight-forward to for full bloom filter.

Test Plan:
make all check
Add a test case in DBTest

Reviewers: rven, yhchiang, igor

Reviewed By: igor

Subscribers: MarkCallaghan, leveldb, dhruba, yoshinorim

Differential Revision: https://reviews.facebook.net/D31941
2015-02-04 15:15:41 -08:00
Venkatesh Radhakrishnan dad98dd4ae Changes for supporting cross functional tests for inplace_update
Summary:
This diff containes the changes to the code and db_test
for supporting cross functional tests for inplace_update

Test Plan: Run XF with inplace_test and also without

Reviewers: igor, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D32367
2015-02-03 12:19:56 -08:00
Venkatesh Radhakrishnan 0b8dec7172 Cross functional test infrastructure for RocksDB.
Summary:
This Diff provides the implementation of the cross functional
test infrastructure. This provides the ability to test a single feature
with every existing regression test in order to identify issues with
interoperability between features.

Test Plan:
Reference implementation of inplace update support cross
functional test. Able to find interoperability issues with inplace
support and ran all of db_test. Will add separate diff for those changes.

Reviewers: igor, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D32247
2015-02-02 14:49:22 -08:00
sdong 5917de0bae CappedFixTransform: return fixed length prefix, or full key if key is shorter than the fixed length
Summary: Add CappedFixTransform, which is the same as fixed length prefix extractor, except that when slice is shorter than the fixed length, it will use the full key.

Test Plan:
Add a test case for
db_test
options_test
and a new test

Reviewers: yhchiang, rven, igor

Reviewed By: igor

Subscribers: MarkCallaghan, leveldb, dhruba, yoshinorim

Differential Revision: https://reviews.facebook.net/D31887
2015-01-30 16:04:30 -08:00
Igor Canadi 6c6037f60c Expose Snapshot's SequenceNumber
Summary:
Requested here: https://www.facebook.com/groups/rocksdb.dev/permalink/705524519546065/

It might also help with mongo. I don't see a reason why we shouldn't expose this info.

Test Plan: make check

Reviewers: sdong, yhchiang, rven

Reviewed By: rven

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D32547
2015-01-29 18:22:43 -08:00
sdong d07fec3bdc make DBTest.SharedWriteBuffer to pass MockEnv
Summary: DBTest.SharedWriteBuffer uses an Options that doesn't pass CurrentOptions(), so that it doesn't use MockEnv. However, DBTest's constructor uses MockEnv to call DestoryDB() to clean up, causing uncleaned state before it runs.

Test Plan: Run the test modified to make sure they pass default Env and SharedWriteBuffer now passes MockEnv.

Reviewers: rven, yhchiang, igor

Reviewed By: igor

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D32475
2015-01-28 16:19:27 -08:00
Igor Canadi 4bdf38b16e Disable FlushSchedule when running TSAN
Summary:
There's a bug in TSAN (or libstdc++?) with std::shared_ptr<> for some reason. In db_test, only FlushSchedule is affected.

See more: https://groups.google.com/forum/#!topic/thread-sanitizer/vz_s-t226Vg

With this change and all other @sdong's and mine diffs, our db_test should be TSAN-clean. I'll move to other tests.

Test Plan: no more flush schedule when running TSAN

Reviewers: yhchiang, rven, sdong

Reviewed By: sdong

Subscribers: sdong, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D32469
2015-01-28 15:31:48 -08:00
Igor Canadi 560ed402bd [minor] fprintf to stderr instead of stdout in test 2015-01-27 21:00:33 -08:00
Yueh-Hsuan Chiang 46a7048dcd Reduce false alarm in ThreadStatusMultipleCompaction test 2015-01-22 15:45:02 -08:00
sdong 4e48753b73 Sync manifest file when initializing it
Summary: Now we don't sync manifest file when initializing it, so DB cannot be safely reopened before the first mem table flush. Fix it by syncing it. This fixes fault_injection_test.

Test Plan: make all check

Reviewers: rven, yhchiang, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D32001
2015-01-22 14:32:03 -08:00
Igor Canadi 9ab5adfc59 New BlockBasedTable version -- better compressed block format
Summary:
This diff adds BlockBasedTable format_version = 2. New format version brings better compressed block format for these compressions:
1) Zlib -- encode decompressed size in compressed block header
2) BZip2 -- encode decompressed size in compressed block header
3) LZ4 and LZ4HC -- instead of doing memcpy of size_t encode size as varint32. memcpy is very bad because the DB is not portable accross big/little endian machines or even platforms where size_t might be 8 or 4 bytes.

It does not affect format for snappy.

If you write a new database with format_version = 2, it will not be readable by RocksDB versions before 3.10. DB::Open() will return corruption in that case.

Test Plan:
Added a new test in db_test.
I will also run db_bench and verify VSIZE when block_cache == 1GB

Reviewers: yhchiang, rven, MarkCallaghan, dhruba, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D31461
2015-01-14 16:24:24 -08:00
Yueh-Hsuan Chiang 2159484dd6 Remove two unnecessary blank lines in db/db_test.cc 2015-01-13 01:40:11 -08:00
Yueh-Hsuan Chiang d2c018fd5b Make ThreadStatusMultipleCompaction more robust. 2015-01-13 01:02:10 -08:00
Yueh-Hsuan Chiang bf9aa4dfcd Improve GetThreadStatus to avoid false alarm in some case. 2015-01-13 00:38:09 -08:00
Yueh-Hsuan Chiang c91cdd59c1 Allow GetThreadList() to indicate a thread is doing Compaction.
Summary: Allow GetThreadList() to indicate a thread is doing Compaction.

Test Plan:
export ROCKSDB_TESTS=ThreadStatus
./db_test

Reviewers: ljin, igor, sdong

Reviewed By: sdong

Subscribers: leveldb, dhruba, jonahcohen, rven

Differential Revision: https://reviews.facebook.net/D30105
2015-01-13 00:04:08 -08:00
Igor Canadi 15d2abbec3 Fix build issues 2015-01-09 13:04:06 -08:00
Yueh-Hsuan Chiang bf287b76e0 Add structures for exposing thread events and operations.
Summary:
Add structures for exposing events and operations.  Event describes
high-level action about a thread such as doing compaciton or
doing flush, while an operation describes lower-level action
of a thread such as reading / writing a SST table, waiting for
mutex.  Events and operations are designed to be independent.
One thread would typically involve in one event and one operation.

Code instrument will be in a separate diff.

Test Plan:
Add unit-tests in thread_list_test
make dbg -j32
./thread_list_test
export ROCKSDB_TESTS=ThreadList
./db_test

Reviewers: ljin, igor, sdong

Reviewed By: sdong

Subscribers: rven, jonahcohen, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29781
2014-12-30 10:39:13 -08:00
Chris BeHanna d232cb156b Fix the build with -DNDEBUG.
Dike out the body of VerifyCompactionResult.  With assert() compiled out, the
loop index variable in the inner loop was unused, breaking the build when
-Werror is enabled.
2014-12-22 17:06:18 -06:00
Yueh-Hsuan Chiang 45bab305f9 Move GetThreadList() feature under Env.
Summary:
GetThreadList() feature depends on the thread creation and destruction, which is currently handled under Env.
This patch moves GetThreadList() feature under Env to better manage the dependency of GetThreadList() feature
on thread creation and destruction.

Renamed ThreadStatusImpl to ThreadStatusUpdater.  Add ThreadStatusUtil, which is a static class contains
utility functions for ThreadStatusUpdater.

Test Plan: run db_test, thread_list_test and db_bench and verify the life cycle of Env and ThreadStatusUpdater is properly managed.

Reviewers: igor, sdong

Reviewed By: sdong

Subscribers: ljin, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D30057
2014-12-22 12:20:17 -08:00
Igor Canadi 0acc738810 Speed up FindObsoleteFiles()
Summary:
There are two versions of FindObsoleteFiles():
* full scan, which is executed every 6 hours (and it's terribly slow)
* no full scan, which is executed every time a background process finishes and iterator is deleted

This diff is optimizing the second case (no full scan). Here's what we do before the diff:
* Get the list of obsolete files (files with ref==0). Some files in obsolete_files set might actually be live.
* Get the list of live files to avoid deleting files that are live.
* Delete files that are in obsolete_files and not in live_files.

After this diff:
* The only files with ref==0 that are still live are files that have been part of move compaction. Don't include moved files in obsolete_files.
* Get the list of obsolete files (which exclude moved files).
* No need to get the list of live files, since all files in obsolete_files need to be deleted.

I'll post the benchmark results, but you can get the feel of it here: https://reviews.facebook.net/D30123

This depends on D30123.

P.S. We should do full scan only in failure scenarios, not every 6 hours. I'll do this in a follow-up diff.

Test Plan:
One new unit test. Made sure that unit test fails if we don't have a `if (!f->moved)` safeguard in ~Version.

make check

Big number of compactions and flushes:

  ./db_stress --threads=30 --ops_per_thread=20000000 --max_key=10000 --column_families=20 --clear_column_family_one_in=10000000 --verify_before_write=0  --reopen=15 --max_background_compactions=10 --max_background_flushes=10 --db=/fast-rocksdb-tmp/db_stress --prefixpercent=0 --iterpercent=0 --writepercent=75 --db_write_buffer_size=2000000

Reviewers: yhchiang, rven, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D30249
2014-12-22 12:04:45 +01:00
Venkatesh Radhakrishnan 153f4f0719 RocksDB: Allow Level-Style Compaction to Place Files in Different Paths
Summary:
Allow Level-style compaction to place files in different paths
This diff provides the code for task 4854591. We now support level-compaction
to place files in different paths by specifying  them in db_paths  along with
the minimum level for files to store in that path.

Test Plan: ManualLevelCompactionOutputPathId in db_test.cc

Reviewers: yhchiang, MarkCallaghan, dhruba, yoshinorim, sdong

Reviewed By: sdong

Subscribers: yoshinorim, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29799
2014-12-15 21:48:16 -08:00
sdong d7a486668c Improve scalability of DB::GetSnapshot()
Summary: Now DB::GetSnapshot() doesn't scale to more column families, as it needs to go through all the column families to find whether snapshot is supported. This patch optimizes it.

Test Plan:
Add unit tests to cover negative cases.
make all check

Reviewers: yhchiang, rven, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D30093
2014-12-11 13:27:57 -08:00
sdong 1f04066cab Add DBProperty to return number of snapshots and time for oldest snapshot
Summary:
Add a counter in SnapshotList to show number of snapshots. Also a unix timestamp in every snapshot.
Add two DB Properties to return number of snapshots and timestamp of the oldest one.

Test Plan: Add unit test checking

Reviewers: yhchiang, rven, igor

Reviewed By: igor

Subscribers: leveldb, dhruba, MarkCallaghan

Differential Revision: https://reviews.facebook.net/D29919
2014-12-05 17:07:49 -08:00
Jonah Cohen a14b7873ee Enforce write buffer memory limit across column families
Summary:
Introduces a new class for managing write buffer memory across column
families.  We supplement ColumnFamilyOptions::write_buffer_size with
ColumnFamilyOptions::write_buffer, a shared pointer to a WriteBuffer
instance that enforces memory limits before flushing out to disk.

Test Plan: Added SharedWriteBuffer unit test to db_test.cc

Reviewers: sdong, rven, ljin, igor

Reviewed By: igor

Subscribers: tnovak, yhchiang, dhruba, xjin, MarkCallaghan, yoshinorim

Differential Revision: https://reviews.facebook.net/D22581
2014-12-02 12:09:20 -08:00
Yueh-Hsuan Chiang 13de000f07 Add rocksdb::ToString() to address cases where std::to_string is not available.
Summary:
In some environment such as android, the c++ library does not have
std::to_string.  This path adds rocksdb::ToString(), which wraps std::to_string
when std::to_string is not available, and implements std::to_string
in the other case.

Test Plan:
make dbg -j32
./db_test
make clean
make dbg OPT=-DOS_ANDROID -j32
./db_test

Reviewers: ljin, sdong, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29181
2014-11-24 20:44:49 -08:00
Venkatesh Radhakrishnan 3257221499 Fixes valgrind error in GetSnapshotLink. Free checkpoint now.
Summary: Free checkpoint after its directory is removed.

Test Plan: Run valgrind with GetSnapshotLink.

Reviewers: igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29493
2014-11-24 10:20:50 -08:00
Yueh-Hsuan Chiang 569853ed10 Fix leak when create_missing_column_families=true on ThreadStatus
Summary:
An entry of ConstantColumnFamilyInfo is created when:
1. DB::Open
2. CreateColumnFamily.

However, there are cases that DB::Open could also call CreateColumnFamily
when create_missing_column_families=true.  As a result, it will create
duplicate ConstantColumnFamilyInfo and one of them would be leaked.

Test Plan: ./deletefile_test

Reviewers: igor, sdong, ljin

Reviewed By: ljin

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29307
2014-11-22 00:04:41 -08:00
Yueh-Hsuan Chiang 4b63fcbff3 Add enable_thread_tracking to DBOptions
Summary:
Add enable_thread_tracking to DBOptions to allow
tracking thread status related to the DB.  Default is off.

Test Plan:
export ROCKSDB_TESTS=ThreadList
./db_test

Reviewers: ljin, sdong, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29289
2014-11-20 21:13:18 -08:00
Venkatesh Radhakrishnan 004f416b77 Moved checkpoint to utilities
Summary:
Moved checkpoint to utilities.
Addressed comments by Igor, Siying, Dhruba

Test Plan: db_test/SnapshotLink

Reviewers: dhruba, igor, sdong

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29079
2014-11-20 15:54:47 -08:00
Yueh-Hsuan Chiang d0c5f28a5c Introduce GetThreadList API
Summary:
Add GetThreadList API, which allows developer to track the
status of each process.  Currently, calling GetThreadList will
only get the list of background threads in RocksDB with their
thread-id and thread-type (priority) set.  Will add more support
on this in the later diffs.

ThreadStatus currently has the following properties:

  // An unique ID for the thread.
  const uint64_t thread_id;

  // The type of the thread, it could be ROCKSDB_HIGH_PRIORITY,
  // ROCKSDB_LOW_PRIORITY, and USER_THREAD
  const ThreadType thread_type;

  // The name of the DB instance where the thread is currently
  // involved with.  It would be set to empty string if the thread
  // does not involve in any DB operation.
  const std::string db_name;

  // The name of the column family where the thread is currently
  // It would be set to empty string if the thread does not involve
  // in any column family.
  const std::string cf_name;

  // The event that the current thread is involved.
  // It would be set to empty string if the information about event
  // is not currently available.

Test Plan:
./thread_list_test
export ROCKSDB_TESTS=GetThreadList
./db_test

Reviewers: rven, igor, sdong, ljin

Reviewed By: ljin

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D25047
2014-11-20 10:49:32 -08:00
Igor Canadi 5c04acda08 Explicitly clean JobContext
Summary: This way we can gurantee that old MemTables get destructed before DBImpl gets destructed, which might be useful if we want to make them depend on state from DBImpl.

Test Plan: make check with asserts in JobContext's destructor

Reviewers: ljin, sdong, yhchiang, rven, jonahcohen

Reviewed By: jonahcohen

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28959
2014-11-14 15:43:10 -08:00
Venkatesh Radhakrishnan 6c1b040cc9 Provide openable snapshots
Summary: Store links to live files in directory on same disk

Test Plan:
Take snapshot and open it. Added a test GetSnapshotLink in
db_test.

Reviewers: sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28713
2014-11-14 11:38:26 -08:00
sdong f822129b32 Add a unit test for behavior when merge operator and compaction filter co-exist.
Summary: Add a unit test in db_test to verify the behavior when both of merge operator and compaction filter apply to a key when merging.

Test Plan: Run the new test

Reviewers: ljin, yhchiang, rven, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28455
2014-11-13 15:21:12 -08:00
Igor Canadi 767777c2bd Turn on -Wshorten-64-to-32 and fix all the errors
Summary:
We need to turn on -Wshorten-64-to-32 for mobile. See D1671432 (internal phabricator) for details.

This diff turns on the warning flag and fixes all the errors. There were also some interesting errors that I might call bugs, especially in plain table. Going forward, I think it makes sense to have this flag turned on and be very very careful when converting 64-bit to 32-bit variables.

Test Plan: compiles

Reviewers: ljin, rven, yhchiang, sdong

Reviewed By: yhchiang

Subscribers: bobbaldwin, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28689
2014-11-11 16:47:22 -05:00
Yueh-Hsuan Chiang 8447861896 Fixed -WShadow errors in db/db_test.cc and include/rocksdb/metadata.h
Summary:
Fixed -WShadow errors in db/db_test.cc and include/rocksdb/metadata.h

Test Plan:
make
2014-11-07 14:57:51 -08:00
Yueh-Hsuan Chiang 28c82ff1b3 CompactFiles, EventListener and GetDatabaseMetaData
Summary:
This diff adds three sets of APIs to RocksDB.

= GetColumnFamilyMetaData =
* This APIs allow users to obtain the current state of a RocksDB instance on one column family.
* See GetColumnFamilyMetaData in include/rocksdb/db.h

= EventListener =
* A virtual class that allows users to implement a set of
  call-back functions which will be called when specific
  events of a RocksDB instance happens.
* To register EventListener, simply insert an EventListener to ColumnFamilyOptions::listeners

= CompactFiles =
* CompactFiles API inputs a set of file numbers and an output level, and RocksDB
  will try to compact those files into the specified level.

= Example =
* Example code can be found in example/compact_files_example.cc, which implements
  a simple external compactor using EventListener, GetColumnFamilyMetaData, and
  CompactFiles API.

Test Plan:
listener_test
compactor_test
example/compact_files_example
export ROCKSDB_TESTS=CompactFiles
db_test
export ROCKSDB_TESTS=MetaData
db_test

Reviewers: ljin, igor, rven, sdong

Reviewed By: sdong

Subscribers: MarkCallaghan, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D24705
2014-11-07 14:45:18 -08:00
Igor Canadi a0f887c9e4 Fix compile 2014-11-07 12:07:43 -08:00
Igor Canadi 53af5d877d Redesign pending_outputs_
Summary:
Here's a prototype of redesigning pending_outputs_. This way, we don't have to expose pending_outputs_ to other classes (CompactionJob, FlushJob, MemtableList). DBImpl takes care of it.

Still have to write some comments, but should be good enough to start the discussion.

Test Plan: make check, will also run stress test

Reviewers: ljin, sdong, rven, yhchiang

Reviewed By: yhchiang

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28353
2014-11-07 11:50:34 -08:00
Jonah Cohen ec101cd49a Correctly test both compaction styles in CompactionDeletionTriggerReopen
Summary:
CompactionDeletionTriggerReopen wasn't actually testing universal
compaction.

Test Plan: db_test

Reviewers: sdong, igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D28443
2014-11-06 17:28:49 -08:00
Yueh-Hsuan Chiang 8d87467bb0 Make PartialCompactionFailure Test more robust again.
Summary:
Make PartialCompactionFailure Test more robust again by
blocking background compaction until we simulate the
file creation error.

Test Plan:
export ROCKSDB_TESTS=PartialCompactionFailure
./db_test

Reviewers: sdong, igor, ljin

Reviewed By: ljin

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28431
2014-11-06 17:07:52 -08:00
Lei Jin 64d302d304 make DropWritesFlush deterministic
Summary:
TEST_WaitForFlush should wait until it sees error when parameter is set
to true so we don't need to loop and timeout

Test Plan: ROCKSDB_TESTS=DropWritesFlush ./db_test

Reviewers: sdong, igor

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28419
2014-11-06 16:07:07 -08:00
Yueh-Hsuan Chiang e526b71402 Make PartialCompactionFailure Test more robust.
Summary: Make PartialCompactionFailure Test more robust.

Test Plan:
export ROCKSDB_TESTS=PartialCompactionFailure
./db_test

Reviewers: ljin, sdong, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28425
2014-11-06 13:53:02 -08:00
Igor Canadi 9f20395cd6 Turn -Wshadow back on
Summary: It turns out that -Wshadow has different rules for gcc than clang. Previous commit fixed clang. This commits fixes the rest of the warnings for gcc.

Test Plan: compiles

Reviewers: ljin, yhchiang, rven, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28131
2014-11-06 11:14:28 -08:00
sdong 367a3f9cb4 Improve DBTest.GroupCommitTest: artificially slowdown log writing to trigger group commit
Summary: In order to avoid random failure of DBTest.GroupCommitTest, artificially sleep 100 microseconds in each log writing.

Test Plan: Run the test in a machine where valgrind version of the test always fails multiple times and see it always succeed.

Reviewers: igor, yhchiang, rven, ljin

Reviewed By: ljin

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D28401
2014-11-06 10:48:06 -08:00
sdong ac95ae1b5d Make sure WAL is synced for DB::Write() if write batch is empty
Summary: This patch makes it a contract that if an empty write batch is passed to DB::Write() and WriteOptions.sync = true, fsync is called to WAL.

Test Plan: A new unit test

Reviewers: ljin, rven, yhchiang, igor

Reviewed By: igor

Subscribers: dhruba, MarkCallaghan, leveldb

Differential Revision: https://reviews.facebook.net/D28365
2014-11-06 09:48:19 -08:00
Lei Jin fd24ae9d05 SetOptions() to return status and also add it to StackableDB
Summary: as title

Test Plan: ./db_test

Reviewers: sdong, yhchiang, rven, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28269
2014-11-04 16:23:05 -08:00
sdong da5daa061c Replace some ASSERT_TRUE() asserts in DBTest.DynamicMemtableOptions and DynamicCompactionOptions with more specific ones
Summary: Replace some ASSERT_TRUE() to ASSERT_GT() and ASSERT_LT() so that in case the assert is triggered, the value is printed out.

Test Plan: Run the two tests

Reviewers: ljin, rven, yhchiang, igor

Reviewed By: igor

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D28293
2014-11-04 11:45:03 -08:00
sdong b0cda4a116 DBTest.DynamicMemtableOptions to use single background compaction
Summary:
Now DBTest.DynamicMemtableOptions sets background compaction to be 4, without actually increasing thread pool size (even before the feature of automatic increasing it). To make sure the behavior stays the same after the automatic thread pool increasing, set it back to 1.

Hopefully it can fix the occasional failure of the test.

Test Plan: Run the test

Reviewers: igor, ljin

Reviewed By: ljin

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D28281
2014-11-04 11:14:12 -08:00
sdong ac6afaf9ef Enforce naming convention of getters in version_set.h
Summary: Enforce the accessier naming convention in functions in version_set.h

Test Plan: make all check

Reviewers: ljin, yhchiang, rven, igor

Reviewed By: igor

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D28143
2014-11-04 09:59:05 -08:00
sdong 09899f0b51 DB::Open() to automatically increase thread pool size if it is smaller than max number of parallel compactions or flushes
Summary:
With the patch, thread pool size will be automatically increased if DB's options ask for more parallelism of compactions or flushes.

Too many users have been confused by the API. Change it to make it harder for users to make mistakes

Test Plan: Add two unit tests to cover the function.

Reviewers: yhchiang, rven, igor, MarkCallaghan, ljin

Reviewed By: ljin

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D27555
2014-11-03 17:22:34 -08:00
Lei Jin 0e526eb9d7 introduce TestMemEnv and use it in db_test
Summary:
TestMemEnv simulates all Env APIs using in-memory data structures.
We can use it to speed up db_test run, which is now reduced ~7mins when it is
enabled.
We can also add features to simulate power/disk failures in the next
step
TestMemEnv is derived from helper/mem_env
mem_env can not be used for rocksdb since some of its APIs do not give
the same results as env_posix. And its file read/write is not thread safe

Test Plan:
make all -j32
./db_test
./env_mem_test

Reviewers: sdong, yhchiang, rven, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28035
2014-10-31 15:08:10 -07:00
Igor Canadi 9f7fc3ac45 Turn on -Wshadow
Summary:
...and fix all the errors :)

Jim suggested turning on -Wshadow because it helped him fix number of critical bugs in fbcode. I think it's a good idea to be -Wshadow clean.

Test Plan: compiles

Reviewers: yhchiang, rven, sdong, ljin

Reviewed By: ljin

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D27711
2014-10-31 11:59:54 -07:00
Igor Canadi 635905481d WalManager
Summary: Decoupling code that deals with archived log files outside of DBImpl. That will make this code easier to reason about and test. It will also make the code easier to improve, because an improver doesn't have to understand DBImpl code in entirety.

Test Plan: added test

Reviewers: ljin, yhchiang, rven, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D27873
2014-10-29 17:43:37 -07:00
Lei Jin 065766b8d2 DynamicCompactionOptions: relax the check bound a little
Summary:
Increase the level size so that impact of a single file is smaller.
Also relax the bound

Test Plan: ran locally

Reviewers: sdong, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D27939
2014-10-29 17:02:21 -07:00
Lei Jin 2d4fe048f4 remove dead code
Summary:
as title

Test Plan:
make db_test
2014-10-29 13:38:06 -07:00
Lei Jin 76d54530d3 minor - remove default value for ChangeFilterOptions() and ChangeCompactionOptions()
Summary:
So now all open() in db_test should get options from callsite. And
destroy() always uses the last used options saved on open()
I will start to integrate env_mem in the next diff

Test Plan: make all check -j32

Reviewers: sdong, yhchiang, rven, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D27819
2014-10-29 13:36:18 -07:00
Lei Jin e130e88bc6 DBTest: options clean up - part 4
Summary: as title

Test Plan: as part 1

Reviewers: sdong, yhchiang, rven, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D27789
2014-10-29 12:00:42 -07:00
Lei Jin 34f3c5a20f DBTest: options clean up - part 3
Summary: as title

Test Plan: same as part 1

Reviewers: sdong, yhchiang, rven, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D27705
2014-10-29 12:00:01 -07:00
Lei Jin cdc7230e4c DBTest: options clean up - part 2
Summary: as title

Test Plan: same as part 1

Reviewers: sdong, yhchiang, rven, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D27693
2014-10-29 11:59:18 -07:00
Lei Jin 5a921b8950 DBTest: options clean up - part 1
Summary:
DBTest has several functions (Reopen(), TryReopen(), ChangeOptins(), etc
that takes a pointer to options), depending on if it is nullptr, it uses
different options underneath. This makes it really hard to track what
options is used in different test case. We should just kill the default
value and make it being passed into explicitly. It is going to be very
hairy. I will start with simple ones.

Test Plan:
make db_test
stacked diffs, will run test with full stack

Reviewers: sdong, yhchiang, rven, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D27687
2014-10-29 11:58:09 -07:00
Yueh-Hsuan Chiang fb3f8ffe5e Improve the robustness of PartialCompactionFailure test again.
Summary:
Improve the robustness of PartialCompactionFailure test again.

Test Plan:
./db_test
2014-10-28 15:35:10 -07:00
Yueh-Hsuan Chiang 60fa7d1365 Improve the robustnesss of PartialCompactionFailure test.
Summary:
Improve the robustness of PartialCompactionFailure test.

Test Plan:
./db_test
2014-10-28 15:17:50 -07:00
Yueh-Hsuan Chiang 3772a3d09d Fix the bug where compaction does not fail when RocksDB can't create a new file.
Summary:
This diff has two fixes.

1. Fix the bug where compaction does not fail when RocksDB can't create a new file.
2. When NewWritableFiles() fails in OpenCompactionOutputFiles(), previously such fail-to-created file will be still be included as a compaction output.  This patch also fixes this bug.
3. Allow VersionEdit::EncodeTo() to return Status and add basic check.

Test Plan:
./version_edit_test
export ROCKSDB_TESTS=FileCreationRandomFailure
./db_test

Reviewers: ljin, sdong, nkg-, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D25581
2014-10-28 14:27:26 -07:00
Igor Canadi 48842ab316 Deprecate AtomicPointer
Summary: RocksDB already depends on C++11, so we might as well all the goodness that C++11 provides. This means that we don't need AtomicPointer anymore. The less things in port/, the easier it will be to port to other platforms.

Test Plan: make check + careful visual review verifying that NoBarried got memory_order_relaxed, while Acquire/Release methods got memory_order_acquire and memory_order_release

Reviewers: rven, yhchiang, ljin, sdong

Reviewed By: ljin

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D27543
2014-10-27 14:50:21 -07:00
Lei Jin 122f98e0b9 dynamic max_mem_compact_level
Summary: as title

Test Plan: unit test

Reviewers: sdong, yhchiang, rven, igor

Reviewed By: igor

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D25347
2014-10-23 15:37:14 -07:00
Lei Jin 1fee591e74 comments for DynamicCompactionOptions test
Summary: as title

Test Plan: n/a

Reviewers: yhchiang, sdong, igor

Reviewed By: igor

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D24963
2014-10-23 15:35:10 -07:00
Lei Jin 574028679b dynamic max_sequential_skip_in_iterations
Summary:
This is not a critical options. Making it dynamic so that we can remove
more reference to cfd->options()

Test Plan: unit test

Reviewers: yhchiang, sdong, igor

Reviewed By: igor

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D24957
2014-10-23 15:34:21 -07:00
Igor Canadi 4b1786e959 Fix SIGSEGV when declaring Arena after ScopedArenaIterator 2014-10-23 12:03:19 -07:00
Igor Canadi 90f156402c Fix CompactBetweenSnapshots 2014-10-23 11:18:33 -07:00
Lei Jin ff8f74c204 remove checking lower bound of level size
Summary:
as title

Test Plan:
make db_test
2014-10-17 21:18:51 -07:00
Lei Jin 2dd9bfe3a8 Sanitize block-based table index type and check prefix_extractor
Summary:
Respond to issue reported
https://www.facebook.com/groups/rocksdb.dev/permalink/651090261656158/
Change the Sanitize signature to take both DBOptions and CFOptions

Test Plan: unit test

Reviewers: sdong, yhchiang, igor

Reviewed By: igor

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D25041
2014-10-17 21:18:36 -07:00
Yueh-Hsuan Chiang 5db9e76644 Fix Mac compile error: C++11 forbids default arguments for lambda expressions
Summary:
Fix the following Mac compile error.
db/db_test.cc:8686:52: error: C++11 forbids default arguments for lambda expressions [-Werror,-Wlambda-extensions]
  auto gen_l0_kb = [this](int start, int size, int stride = 1) {
                                                   ^        ~
Test Plan:
db_test
2014-10-17 14:47:26 -07:00
Lei Jin f4363fb81c Fix DynamicMemtableOptions test
Summary: as title

Test Plan: make release

Reviewers: igor

Reviewed By: igor

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D25029
2014-10-17 10:09:45 -07:00
Lei Jin 274dc81c92 fix build failure
Summary: missed default value during merge

Test Plan: ./db_test

Reviewers: igor, sdong, yhchiang

Reviewed By: yhchiang

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D24975
2014-10-16 17:33:09 -07:00
Lei Jin 4d5708aa56 dynamic soft_rate_limit and hard_rate_limit
Summary: as title

Test Plan:
unit test
I am only able to build the test case for hard_rate_limit.
soft_rate_limit is essentially the same thing as hard_rate_limit

Reviewers: igor, sdong, yhchiang

Reviewed By: yhchiang

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D24759
2014-10-16 17:21:31 -07:00
Lei Jin 065a67c4f0 dynamic disable_auto_compactions
Summary: Add more tests as well

Test Plan: unit test

Reviewers: igor, sdong, yhchiang

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D24747
2014-10-16 17:14:17 -07:00
Lei Jin dc50a1a593 make max_write_buffer_number dynamic
Summary: as title

Test Plan: unit test

Reviewers: sdong, yhchiang, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D24729
2014-10-16 16:57:59 -07:00
Yueh-Hsuan Chiang 5a76186340 Fixed compile error on Mac: default arguments for lambda expressions
Summary:
Fixed the following compile error on Mac.

db/db_test.cc:8618:52: error: C++11 forbids default arguments for lambda expressions [-Werror,-Wlambda-extensions]
  auto gen_l0_kb = [this](int start, int size, int stride = 1) {
                                                   ^        ~
1 error generated.

Test Plan:
db_test
2014-10-10 14:10:16 -07:00
Lei Jin 25f6a852e4 add db_test for changing memtable size
Summary:
The test only covers changing write_buffer_size. Other changable
parameters such bloom bits/probes are not obvious how to test.
Suggestions are welcome

Test Plan: db_test

Reviewers: sdong, yhchiang, igor

Reviewed By: igor

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D24429
2014-10-07 10:40:45 -07:00
Igor Canadi d6987216c9 Merge pull request #327 from dalgaaf/wip-da-SCA-20141001
Fix some issues from SCA
2014-10-02 10:59:52 -07:00
Yueh-Hsuan Chiang 89833e5a85 Fixed signed-unsigned comparison warning in db_test.cc
Summary:
Fixed signed-unsigned comparison warning in db_test.cc

db/db_test.cc:8606:3: note: in instantiation of function template specialization 'rocksdb::test::Tester::IsEq<int, unsigned long>' requested here
  ASSERT_EQ(2, metadata.size());
    ^

Test Plan:
make db_test
2014-10-02 01:05:59 -07:00
Lei Jin 5ec53f3edf make compaction related options changeable
Summary:
make compaction related options changeable. Most of changes are tedious,
following the same convention: grabs MutableCFOptions at the beginning
of compaction under mutex, then pass it throughout the job and register
it in SuperVersion at the end.

Test Plan: make all check

Reviewers: igor, yhchiang, sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D23349
2014-10-01 16:19:16 -07:00