Commit graph

1861 commits

Author SHA1 Message Date
Ari Ekmekji 40c64434d4 Parallelize L0-L1 Compaction: Restructure Compaction Job
Summary:
As of now compactions involving files from Level 0 and Level 1 are single
threaded because the files in L0, although sorted, are not range partitioned like
the other levels. This means that during L0-L1 compaction each file from L1
needs to be merged with potentially all the files from L0.

This attempt to parallelize the L0-L1 compaction assigns a thread and a
corresponding iterator to each L1 file that then considers only the key range
found in that L1 file and only the L0 files that have those keys (and only the
specific portion of those L0 files in which those keys are found). In this way
the overlap is minimized and potentially eliminated between different iterators
focusing on the same files.

The first step is to restructure the compaction logic to break L0-L1 compactions
into multiple, smaller, sequential compactions. Eventually each of these smaller
jobs will be run simultaneously. Areas to pay extra attention to are

  # Correct aggregation of compaction job statistics across multiple threads
  # Proper opening/closing of output files (make sure each thread's is unique)
  # Keys that span multiple L1 files
  # Skewed distributions of keys within L0 files

Test Plan: Make and run db_test (newer version has separate compaction tests) and compaction_job_stats_test

Reviewers: igor, noetzli, anthony, sdong, yhchiang

Reviewed By: yhchiang

Subscribers: MarkCallaghan, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D42699
2015-08-03 11:32:14 -07:00
Andres Notzli 193dc977e7 Fixing dead code in table_properties_collector_test
Summary:
There was a bug in table_properties_collector_test that this patch
is fixing: `!backward_mode && !test_int_tbl_prop_collector` in
TestCustomizedTablePropertiesCollector was never true, so the code
in the if-block never got executed. The reason is that the
CustomizedTablePropertiesCollector test was skipping tests with
`!backward_mode_ && !encode_as_internal`. The reason for skipping
the tests is unknown.

Test Plan: make table_properties_collector_test && ./table_properties_collector_test

Reviewers: rven, igor, yhchiang, anthony, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D43281
2015-07-30 16:59:03 -07:00
agiardullo 8161bdb5a0 WriteBatch Save Points
Summary:
Support RollbackToSavePoint() in WriteBatch and WriteBatchWithIndex.  Support for partial transaction rollback is needed for MyRocks.

An alternate implementation of Transaction::RollbackToSavePoint() exists in D40869.  However, the other implementation is messier because it is implemented outside of WriteBatch.  This implementation is much cleaner and also exposes a potentially useful feature to WriteBatch.

Test Plan: Added unit tests

Reviewers: IslamAbdelRahman, kradhakrishnan, maykov, yoshinorim, hermanlee4, spetrunia, sdong, yhchiang

Reviewed By: yhchiang

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D42723
2015-07-29 16:54:23 -07:00
Andres Notzli d06c82e477 Further cleanup of CompactionJob and MergeHelper
Summary:
Simplified logic in CompactionJob and removed unused parameter in
MergeHelper.

Test Plan: make && make check

Reviewers: rven, igor, sdong, yhchiang

Reviewed By: sdong

Subscribers: aekmekji, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D42687
2015-07-28 19:21:55 -07:00
Andres Notzli e95c59cd2f Count number of corrupt keys during compaction
Summary:
For task #7771355, we would like to log the number of corrupt keys
during a compaction. This patch implements and tests the count
as part of CompactionJobStats.

Test Plan: make && make check

Reviewers: rven, igor, yhchiang, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D42921
2015-07-28 16:41:40 -07:00
Poornima Chozhiyath Raman 1bdfcef7bf Fix when output level is 0 of universal compaction with trivial move
Summary: Fix for universal compaction with trivial move, when the ouput level is 0. The tests where failing. Fixed by allowing normal compaction when output level is 0.

Test Plan: modified test cases run successfully.

Reviewers: sdong, yhchiang, IslamAbdelRahman

Reviewed By: IslamAbdelRahman

Subscribers: anthony, kradhakrishnan, leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D42933
2015-07-27 14:25:57 -07:00
sdong 82f148ef97 Fix test DBCompactionTest.PartialCompactionFailure undeterministic failure
Summary: DBCompactionTest.PartialCompactionFailure has a risk that one flush job writes out two mem tables into one file, so that the total files flushed are less than expected. Fix it by writing for flush to finish after every write.

Test Plan: Run the test

Reviewers: IslamAbdelRahman, kradhakrishnan, yhchiang, anthony

Reviewed By: anthony

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D42831
2015-07-22 13:46:56 -07:00
Mike Kolupaev 4922af6f8d fixed DBTest.GetPropertiesOfAllTablesTest and DBTest.GetUserDefinedTablaProperties flakiness
Summary: These tests used to fail if a compaction happened between flushing tables and enumerating them to get properties.

Test Plan: this reports occasional failures without this diff and no failures with it: `for i in {1..10000}; do echo $i; done | parallel --gnu -j100 'TEST_TMPDIR=`TMPDIR=/dev/shm/rockstemp mktemp -d -t` ./db_test --gtest_filter=DBTest.GetUserDefinedTablaProperties >&/dev/null || echo {} failed'`

Reviewers: sdong

Reviewed By: sdong

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D42861
2015-07-22 12:37:49 -07:00
Mike Kolupaev fe09a6dae3 [wal changes 2/3] write with sync=true syncs previous unsynced wals to prevent illegal data loss
Summary:
I'll just copy internal task summary here:

"
This sequence will cause data loss in the middle after an sync write:

non-sync write key 1
flush triggered, not yet scheduled
sync write key 2
system crash

After rebooting, users might see key 2 but not key 1, which violates the API of sync write.

This can be reproduced using unit test FaultInjectionTest::DISABLED_WriteOptionSyncTest.

One way to fix it is for a sync write, if there is outstanding unsynced log files, we need to syc them too.
"

This diff should be considered together with the next diff D40905; in isolation this fix probably could be a little simpler.

Test Plan: `make check`; added a test for that (DBTest.SyncingPreviousLogs) before noticing FaultInjectionTest.WriteOptionSyncTest (keeping both since mine asserts a bit more); both tests fail without this diff; for D40905 stacked on top of this diff, ran tests with ASAN, TSAN and valgrind

Reviewers: rven, yhchiang, IslamAbdelRahman, anthony, kradhakrishnan, igor, sdong

Reviewed By: sdong

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D40899
2015-07-22 03:28:08 -07:00
Andres Notzli 06aebca592 Report live data size estimate
Summary:
Fixes T6548822. Added a new function for estimating the size of the live data
as proposed in the task. The value can be accessed through the property
rocksdb.estimate-live-data-size.

Test Plan:
There are two unit tests in version_set_test and a simple test in db_test.
make version_set_test && ./version_set_test;
make db_test && ./db_test gtest_filter=GetProperty

Reviewers: rven, igor, yhchiang, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D41493
2015-07-21 21:33:20 -07:00
sdong 02b635fa38 Fix undeterministic failure of DBTest.GetPropertiesOfAllTablesTest
Summary: DBTest.GetPropertiesOfAllTablesTest generates four files and expects four files there, but a L0->L1 comapction can trigger to compact to one single file. Fix it by raising level 0 number of file compaction trigger

Test Plan: Run it many times and see it never fails.

Reviewers: kradhakrishnan, IslamAbdelRahman, yhchiang, anthony

Reviewed By: anthony

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D42789
2015-07-21 17:13:23 -07:00
Yueh-Hsuan Chiang 7219088cda Move general compaction tests from db_test.cc to db_compaction_test.cc
Summary: Move general compaction tests from db_test.cc to db_compaction_test.cc

Test Plan:
db_test
db_compaction_test

Reviewers: igor, sdong, IslamAbdelRahman, anthony

Reviewed By: anthony

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D42651
2015-07-21 03:05:57 -07:00
agiardullo 064294081b Improved FileExists API
Summary: Add new CheckFileExists method.  Considered changing the FileExists api but didn't want to break anyone's builds.

Test Plan: unit tests

Reviewers: yhchiang, igor, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D42003
2015-07-20 17:20:40 -07:00
Yueh-Hsuan Chiang 443c6646bc Move remaining universal compaction tests from db_test.cc to db_universal_compaction_test.cc
Summary: Move remaining universal compaction tests from db_test.cc to db_universal_compaction_test.cc

Test Plan:
db_test
db_universal_compaction_test

Reviewers: igor, anthony, IslamAbdelRahman, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D42645
2015-07-20 16:07:54 -07:00
Yueh-Hsuan Chiang 7462286d33 Move in-place-update related tests from db_test.cc to db_inplace_update_test.cc
Summary: Move in-place-update related tests from db_test.cc to db_inplace_update_test.cc

Test Plan:
db_test
db_inplace_update_test

Reviewers: igor, anthony, IslamAbdelRahman, sdong

Reviewed By: sdong

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D42657
2015-07-20 16:05:28 -07:00
Yueh-Hsuan Chiang 331954ab83 Fixed DBTestUniversalManualCompactionOutputPathId test
Summary:
Fixed DBTestUniversalManualCompactionOutputPathId test
by changing the expected number of files when setting up
the test as flushes no-longer preempt compactions
in patch https://reviews.facebook.net/D41931.

Also, include db_universal_copaction_test in make all check.

Test Plan: db_universal_copaction_test

Reviewers: igor, sdong, IslamAbdelRahman, anthony

Reviewed By: anthony

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D42639
2015-07-20 13:53:41 -07:00
krad a75f23eb87 Relax assertions in unit DropWrites to be more permissible
Summary: This unit test is blocking our release since it fails under certain
compiler versions. The failure is due to a race in the unit test and not the
core functionality.

Test Plan: Run locally

Reviewers: sdong

CC: leveldb

Task ID: #7760955

Blame Rev:
2015-07-20 12:37:54 -07:00
sdong ee80432ff8 db_bench add an option of --universal_allow_trivial_move
Summary: Now we allow trivial move in universal compaction. Add a parameter in db_bench

Test Plan: Run db_bench with this option on and off and make sure the option is switched correctly.

Reviewers: yhchiang, igor, kradhakrishnan, anthony

Reviewed By: anthony

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D41427
2015-07-20 11:56:12 -07:00
sdong 58b4209e05 Trigger non-trivial compaction in fault_injection_test
Summary: Now the major test cases of fault_injection_test only insert keys in sorted order so compactions will be trivial move. Add a new mode to insert in non-sequential order to trigger non-trivial compactions.

Test Plan: Run the test

Reviewers: kradhakrishnan, anthony, IslamAbdelRahman, yhchiang, igor

Reviewed By: igor

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D42435
2015-07-20 11:51:57 -07:00
Islam AbdelRahman aa8ac6445b Skip unsupported tests in ROCKSDB_LITE
Summary:
Skipping these tests in ROCKSDB_LITE since they are not supported
json_document_test
wal_manager_test
ttl_test
sst_dump_test
deletefile_test
compact_files_test
prefix_test
checkpoint_test

Test Plan:
json_document_test
wal_manager_test
ttl_test
sst_dump_test
deletefile_test
compact_files_test
prefix_test
checkpoint_test

Reviewers: igor, sdong, yhchiang, kradhakrishnan, anthony

Reviewed By: anthony

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D42573
2015-07-20 11:24:54 -07:00
Islam AbdelRahman c06d1d8393 Make merge_test runnable in ROCKSDB_LITE
Summary: Make merge_test runnable in ROCKSDB_LITE

Test Plan: merge_test

Reviewers: igor, sdong, yhchiang, kradhakrishnan, anthony

Reviewed By: anthony

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D42579
2015-07-20 11:17:52 -07:00
Islam AbdelRahman 0d1d9aeebe Block plain_table_db_test in ROCKSDB_LITE
Summary: Block plain_table_db_test in ROCKSDB_LITE since plain table is not supported in ROCKSDB_LITE

Test Plan: plain_table_db_test

Reviewers: igor, sdong, yhchiang

Reviewed By: yhchiang

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D42159
2015-07-20 11:12:02 -07:00
Islam AbdelRahman f0fe9126f2 Fix compile for write_callback_test in ROCKSDB_LITE
Summary: Add main for write_callback_test when compiled under ROCKSDB_LITE

Test Plan: write_callback_test

Reviewers: igor, sdong, yhchiang

Reviewed By: yhchiang

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D42147
2015-07-20 10:54:15 -07:00
Islam AbdelRahman cf6a7bebc8 Block cuckoo table tests in ROCKSDB_LITE
Summary: Cuckoo table is not supported in ROCKSDB_LITE, blocking it's tests

Test Plan:
cuckoo_table_builder_test
cuckoo_table_db_test
cuckoo_table_reader_test

Reviewers: sdong, igor, yhchiang

Reviewed By: yhchiang

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D42141
2015-07-20 10:50:46 -07:00
Islam AbdelRahman 20922c4a5a Make compaction_picker_test runnable in ROCKSDB_LITE
Summary: Remove universal and fifo compaction tests from ROCKSDB_LITE since they are not supported

Test Plan: compaction_picker_test

Reviewers: sdong, igor, yhchiang

Reviewed By: yhchiang

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D42129
2015-07-20 10:46:09 -07:00
sdong 6e9fbeb27c Move rate_limiter, write buffering, most perf context instrumentation and most random kill out of Env
Summary: We want to keep Env a think layer for better portability. Less platform dependent codes should be moved out of Env. In this patch, I create a wrapper of file readers and writers, and put rate limiting, write buffering, as well as most perf context instrumentation and random kill out of Env. It will make it easier to maintain multiple Env in the future.

Test Plan: Run all existing unit tests.

Reviewers: anthony, kradhakrishnan, IslamAbdelRahman, yhchiang, igor

Reviewed By: igor

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D42321
2015-07-17 16:58:18 -07:00
Igor Canadi 35ca59364c Don't let flushes preempt compactions
Summary:
When we first started, max_background_flushes was 0 by default and compaction thread was executing flushes (since there was no flush thread). Then, we switched the default max_background_flushes to 1. However, we still support the case where there is no flush thread and flushes are done in compaction. This is making our code a bit more complicated. By not supporting this use-case we can make our code simpler.

We have a special case that when you set max_background_flushes to 0, we
schedule the flush to execute on the compaction thread.

Test Plan: make check (there might be some unit tests that depend on this behavior)

Reviewers: IslamAbdelRahman, yhchiang, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D41931
2015-07-17 12:02:52 -07:00
Ari Ekmekji 74c755c552 Added JSON manifest dump option to ldb command
Summary:
Added a new flag --json to the ldb manifest_dump command
that prints out the version edits as JSON objects for easier
reading and parsing of information.

Test Plan:
**Sample usage: **
```
./ldb manifest_dump --json --path=path/to/manifest/file
```

**Sample output:**
```
{"EditNumber": 0, "Comparator": "leveldb.BytewiseComparator", "ColumnFamily": 0}
{"EditNumber": 1, "LogNumber": 0, "ColumnFamily": 0}
{"EditNumber": 2, "LogNumber": 4, "PrevLogNumber": 0, "NextFileNumber": 7, "LastSeq": 35356, "AddedFiles": [{"Level": 0, "FileNumber": 5, "FileSize": 1949284, "SmallestIKey": "'", "LargestIKey": "'"}], "ColumnFamily": 0}
...
{"EditNumber": 13, "PrevLogNumber": 0, "NextFileNumber": 36, "LastSeq": 290994, "DeletedFiles": [{"Level": 0, "FileNumber": 17}, {"Level": 0, "FileNumber": 20}, {"Level": 0, "FileNumber": 22}, {"Level": 0, "FileNumber": 24}, {"Level": 1, "FileNumber": 13}, {"Level": 1, "FileNumber": 14}, {"Level": 1, "FileNumber": 15}, {"Level": 1, "FileNumber": 18}], "AddedFiles": [{"Level": 1, "FileNumber": 25, "FileSize": 2114340, "SmallestIKey": "'", "LargestIKey": "'"}, {"Level": 1, "FileNumber": 26, "FileSize": 2115213, "SmallestIKey": "'", "LargestIKey": "'"}, {"Level": 1, "FileNumber": 27, "FileSize": 2114807, "SmallestIKey": "'", "LargestIKey": "'"}, {"Level": 1, "FileNumber": 30, "FileSize": 2115271, "SmallestIKey": "'", "LargestIKey": "'"}, {"Level": 1, "FileNumber": 31, "FileSize": 2115165, "SmallestIKey": "'", "LargestIKey": "'"}, {"Level": 1, "FileNumber": 32, "FileSize": 2114683, "SmallestIKey": "'", "LargestIKey": "'"}, {"Level": 1, "FileNumber": 35, "FileSize": 1757512, "SmallestIKey": "'", "LargestIKey": "'"}], "ColumnFamily": 0}
...
```

Reviewers: sdong, anthony, yhchiang, igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D41727
2015-07-17 10:07:40 -07:00
Igor Canadi a96fcd09b7 Deprecate CompactionFilterV2
Summary: It has been around for a while and it looks like it never found any uses in the wild. It's also complicating our compaction_job code quite a bit. We're deprecating it in 3.13, but will put it back in 3.14 if we actually find users that need this feature.

Test Plan: make check

Reviewers: noetzli, yhchiang, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D42405
2015-07-17 18:59:11 +02:00
Andres Notzli 1d20fa9d0f Fixed and simplified merge_helper
Summary:
MergeUntil was not reporting a success when merging an operand with
a Value/Deletion despite the comments in MergeHelper and CompactionJob
indicating otherwise. This lead to operands being written to the compaction
output unnecessarily:

M1 M2 M3 P M4 M5 --> (P+M1+M2+M3) M2 M3 M4 M5 (before the diff)
M1 M2 M3 P M4 M5 --> (P+M1+M2+M3) M4 M5 (after the diff)

In addition, the code handling Values/Deletion was basically identical.
This patch unifies the code. Finally, this patch also adds testing for
merge_helper.

Test Plan: make && make check

Reviewers: sdong, rven, yhchiang, tnovak, igor

Reviewed By: igor

Subscribers: tnovak, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D42351
2015-07-17 09:27:24 -07:00
sdong ac2b9367fe Fix a typo in variable
Summary: Typo seqeuntial -> sequential in db/fault_injection_test.cc

Test Plan: Build it

Reviewers: ott, igor, anthony, kradhakrishnan, IslamAbdelRahman

Reviewed By: IslamAbdelRahman

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D42417
2015-07-16 12:17:24 -07:00
sdong 6c0c8dee7b Fix data loss after DB recovery by not allowing flush/compaction to be scheduled until DB opened
Summary:
Previous run may leave some SST files with higher file numbers than manifest indicates.
Compaction or flush may start to run while DB::Open() is still going on. SST file garbage collection may happen interleaving with compaction or flush, and overwrite files generated by compaction of flushes after they are generated. This might cause data loss. This possibility of interleaving is recently introduced.
Fix it by not allowing compaction or flush to be scheduled before DB::Open() finishes.

Test Plan: Add a unit test. This verification will have a chance to fail without the fix but doesn't fix without the fix.

Reviewers: kradhakrishnan, anthony, yhchiang, IslamAbdelRahman, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D42399
2015-07-16 10:57:41 -07:00
Andres Notzli e4af3bfb27 Test for compaction of corrupted keys
Summary:
Fixes T7697334. Adds a simple test to check whether CompactionJob deals
with corrupted keys correctly. Right now, we preserve corrupted keys.
Note: depending on the type of corruption and options like comparators,
CompactionJob fails. This test just checks whether corrupted keys that
do not fail CompactionJob are preserved.

Test Plan:
`make compaction_job_test && ./compaction_job_test` -> Tests pass.
Add `input->Next(); continue;` in CompactionJob::ProcessKeyValueCompaction()
inside then-branch of `!ParseInternalKey(key, &ikey)` -> Tests fail.

Reviewers: sdong, rven, yhchiang, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D42237
2015-07-16 09:18:35 -07:00
agiardullo 81d072623c move convenience.h out of utilities
Summary: Moved convenience.h out of utilities to remove a dependency on utilities in db.

Test Plan: unit tests.  Also compiled a link to the old location to verify the _Pragma works.

Reviewers: sdong, yhchiang, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D42201
2015-07-15 14:51:51 -07:00
Poornima Chozhiyath Raman beb19ad0dd Fixing delete files in Trivial move of universal compaction
Summary:
Trvial move in universal compaction was failing when trying to move files from levels other than 0.
This was because the DeleteFile while trivially moving, was only deleting files of level 0 which caused duplication of same file in different levels.
This is fixed by passing the right level as argument in the call of DeleteFile while doing trivial move.

Test Plan: ./db_test ran successfully with the new test cases.

Reviewers: sdong

Reviewed By: sdong

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D42135
2015-07-15 12:28:22 -07:00
krad c613960690 Build fix.
Summary: gcc-4.9-glibc-2.20 complains about uninitialized variable.

db/compaction_picker.cc: In member function 'bool
rocksdb::CompactionPicker::IsInputNonOverlapping(rocksdb::Compaction*)':
db/compaction_picker.cc:1174:17: error:
'prev.rocksdb::{anonymous}::InputFileInfo::f' may be used uninitialized in this
function [-Werror=maybe-uninitialized]
   InputFileInfo prev, curr, next;

Test Plan: pmake on local environment

Reviewers: sdong igor

CC: leveldb@

Task ID: #

Blame Rev:
2015-07-15 11:25:10 -07:00
Aaron Feldman 2c8de0ecae Update --help message in db_bench.
Summary:
Remove --help entry for readhot.

Update read_random_exp_range flag description: The distribution is num *
exp(-r), not num * exp(r).

Test Plan: Run ./db_bench --help

Reviewers: sdong, igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D42303
2015-07-15 10:21:09 -07:00
Andres Notzli 6b2d44b2ff Refactoring of writing key/value pairs
Summary:
Before, writing key/value pairs out to files was done inside
ProcessKeyValueCompaction(). To make ProcessKeyValueCompaction()
more understandable, this patch moves the writing part to a separate
function. This is intended to be a stepping stone for additional
changes.

Test Plan: make && make check

Reviewers: sdong, rven, yhchiang, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D42243
2015-07-15 09:55:45 -07:00
Igor Canadi 9a6a0bd8c9 Style fix in compaction_job.cc
Summary: I didn't know this can work :)

Test Plan: none

Reviewers: sdong, yhchiang

Reviewed By: yhchiang

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D42081
2015-07-15 09:36:47 +02:00
Andres Notzli ddad40e930 Fixed nullptr deref and added assert
Summary:
Fixes two minor issues in CompactionJob.
CompactionJob::Run() dereferences log_buffer_ without a check, so
this patch adds an assert in the constructor where log_buffer_
is assigned. compaction_job_stats_ can be null but
ProcessKeyValueCompaction was dereferencing it without a check.

Test Plan: make && make check

Reviewers: sdong, rven, yhchiang, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D42231
2015-07-14 23:12:34 -07:00
Yueh-Hsuan Chiang 801df912af Move UniversalCompaction related db-tests to db_universal_compaction_test.cc
Summary: Move UniversalCompaction related db-tests to db_universal_compaction_test.cc

Test Plan:
db_test
db_universal_compaction_test

Reviewers: igor, sdong, IslamAbdelRahman, anthony

Reviewed By: anthony

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D42225
2015-07-14 18:24:45 -07:00
Yueh-Hsuan Chiang 3ca6b2541e Move TailingIterator tests from db_test.cc to db_test_tailing_iterator.cc
Summary: Move TailingIterator tests from db_test.cc to db_test_tailing_iterator.cc

Test Plan:
db_test
db_test_tailing_iterator

Reviewers: igor, anthony, IslamAbdelRahman, sdong

Reviewed By: sdong

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D42021
2015-07-14 16:41:08 -07:00
Yueh-Hsuan Chiang ce829c77e3 Make TransactionLogIterator related tests from db_test.cc to db_log_iter_test.cc
Summary: Make TransactionLogIterator related tests from db_test.cc to db_log_iter_test.cc

Test Plan:
db_test
db_log_iter_test

Reviewers: sdong, IslamAbdelRahman, igor, anthony

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D42045
2015-07-14 16:08:21 -07:00
Yueh-Hsuan Chiang c3f98bb89b Move CompactionFilter tests in db_test.cc to db_compaction_filter_test.cc
Summary: Move CompactionFilter tests in db_test.cc to db_compaction_filter_test.cc

Test Plan:
db_test
db_compaction_filter_test

Reviewers: igor, sdong, IslamAbdelRahman, anthony

Reviewed By: anthony

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D42207
2015-07-14 16:03:47 -07:00
agiardullo 18d5e1bf88 Remove db_impl_readonly dependency on utilities
Summary: Seems like the cleanest way to resolve this is to move CompactedDBImpl into db/.  CompactedDBImpl should probably live in the same place as DBImplReadonly since Opening the latter could end up instantiating the former.  Both DBImplReadonly and CompactedDBImpl inherit from DBImpl access protected members of DBImpl( and the latter access friendly private methods).

Test Plan: unit tests

Reviewers: sdong, yhchiang, kradhakrishnan, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D42027
2015-07-14 11:32:54 -07:00
Islam AbdelRahman 49640bd82f Allow write_batch_test to run with ROCKSDB_LITE
Summary: Remove ColumnFamiliesBatchWithIndexTest from ROCKSDB_LITE since WriteBatchWithIndex is not supported

Test Plan: write_batch_test

Reviewers: yhchiang, sdong, igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D42033
2015-07-14 10:35:35 -07:00
Igor Canadi a9c5109515 Deprecate purge_redundant_kvs_while_flush
Summary: This option is guarding the feature implemented 2 and a half years ago: D8991. The feature was enabled by default back then and has been running without issues. There is no reason why any client would turn this feature off. I found no reference in fbcode.

Test Plan: none

Reviewers: sdong, yhchiang, anthony, dhruba

Reviewed By: dhruba

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D42063
2015-07-14 13:07:02 +02:00
Igor Canadi 5aea98ddd8 Deprecate WriteOptions::timeout_hint_us
Summary:
In one of our recent meetings, we discussed deprecating features that are not being actively used. One of those features, at least within Facebook, is timeout_hint. The feature is really nicely implemented, but if nobody needs it, we should remove it from our code-base (until we get a valid use-case). Some arguments:
* Less code == better icache hit rate, smaller builds, simpler code
* The motivation for adding timeout_hint_us was to work-around RocksDB's stall issue. However, we're currently addressing the stall issue itself (see @sdong's recent work on stall write_rate), so we should never see sharp lock-ups in the future.
* Nobody is using the feature within Facebook's code-base. Googling for `timeout_hint_us` also doesn't yield any users.

Test Plan: make check

Reviewers: anthony, kradhakrishnan, sdong, yhchiang

Reviewed By: yhchiang

Subscribers: sdong, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D41937
2015-07-14 09:35:48 +02:00
Andres Notzli ae29495e4b Avoid manipulating const char* arrays
Summary:
We were manipulating `const char*` arrays in CompactionJob to
change the sequence number/types of keys. This patch changes
UpdateInternalKey() to use string methods to do the manipulation
and updates all calls accordingly.

Test Plan:
Added test case for UpdateInternalKey() in dbformat_test.
make && make check

Reviewers: sdong, rven, yhchiang, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D41985
2015-07-14 00:21:41 -07:00
Andres Notzli ab137af4ba Partial cleanup of CompactionJob
Summary:
Logging, dealing with key prefix batches and updating stats
moved from CompactionJob::Run() into separate functions.

Test Plan: make all && make check

Reviewers: sdong, rven, yhchiang, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D41919
2015-07-14 00:09:20 -07:00
Yueh-Hsuan Chiang b10cf4e2e0 Move DynamicLevel related db-tests to db_dynamic_level_test.cc
Summary: Move DynamicLevel related db-tests to db_dynamic_level_test.cc

Test Plan:
db_dynamic_level_test
db_test

Reviewers: igor, anthony, IslamAbdelRahman, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D42039
2015-07-13 19:00:30 -07:00
Islam AbdelRahman e290f5d3ce Block reduce_levels_test in ROCKSDB_LITE
Summary: Block reduce_levels_test in ROCKSDB_LITE as LDBCommand is not supported

Test Plan:
make reduce_levels_test -j64
OPT=-DROCKSDB_LITE make reduce_levels_test -j64
make check -j64

Reviewers: sdong, igor, yhchiang

Reviewed By: yhchiang

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D41967
2015-07-13 18:45:55 -07:00
Yueh-Hsuan Chiang 49f42ad032 Move global static functions in db_test_util to DBTestBase
Summary:
Move global static functions in db_test_util to DBTestBase.
This is to prevent unused function warning when decoupling
db_test.cc into multiple files.

Test Plan: db_test

Reviewers: igor, sdong, anthony, IslamAbdelRahman, kradhakrishnan

Reviewed By: kradhakrishnan

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D42009
2015-07-13 17:41:41 -07:00
Yueh-Hsuan Chiang 625467a08a Move reusable part of db_test.cc to util/db_test_util.h
Summary:
Move reusable part of db_test.cc to util/db_test_util.h.
This makes it more possible to partition db_test.cc into
multiple smaller test files.

Also, fixed many old lint errors in db_test.

Test Plan: db_test

Reviewers: igor, anthony, IslamAbdelRahman, sdong, kradhakrishnan

Reviewed By: sdong, kradhakrishnan

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D41973
2015-07-13 16:53:38 -07:00
Ari Ekmekji 8bca83e5dd Add tombstone information in CompactionJobStats
Summary:
Added new statistics in CompactionJobStats to keep track of
deletion entries and the expiration of those entries. Updated these
fields in compaction_job.cc as compaction took place and wrote a new
test in compaction_job_stats_test.cc to verify accuracy.

Test Plan:
Wrote new test DeletionStatsTest in
compaction_job_stats_test.cc to verify

Reviewers: sdong, igor, yhchiang

Reviewed By: yhchiang

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D41355
2015-07-13 15:51:38 -07:00
sdong f9728640f3 "make format" against last 10 commits
Summary: This helps Windows port to format their changes, as discussed. Might have formatted some other codes too becasue last 10 commits include more.

Test Plan: Build it.

Reviewers: anthony, IslamAbdelRahman, kradhakrishnan, yhchiang, igor

Reviewed By: igor

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D41961
2015-07-13 13:50:18 -07:00
sdong 76d3cd3286 Fix public API dependency on internal codes and dependency on MAX_INT32
Summary:
Public API depends on port/port.h which is wrong. Fix it.
Also with gcc 4.8.1 build was broken as MAX_INT32 was not recognized. Fix it by using ::max in linux.

Test Plan: Build it and try to build an external project on top of it.

Reviewers: anthony, yhchiang, kradhakrishnan, igor

Reviewed By: igor

Subscribers: yoshinorim, leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D41745
2015-07-11 10:32:11 -07:00
sdong 5fd11853cb Print Fast CRC32 support information in DB LOG
Summary: Print whether fast CRC32 is supported in DB info LOG

Test Plan: Run db_bench and see it prints out correctly.

Reviewers: yhchiang, anthony, kradhakrishnan, igor

Reviewed By: igor

Subscribers: MarkCallaghan, yoshinorim, leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D41733
2015-07-10 17:59:36 -07:00
Siying Dong e41cbd9c2f Merge pull request #646 from yuslepukhin/ms_win_port
Windows Port from Microsoft
2015-07-10 15:53:39 -07:00
sdong 041b6f95a2 perf_context: report time spent on reading index and bloom blocks
Summary: Add a perf context counter to help users figure out time spent on reading indexes and bloom filter blocks.

Test Plan: Will write a unit test

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D41433
2015-07-10 14:45:42 -07:00
Dmitri Smirnov c903ccc4c2 Merge from github/master 2015-07-09 18:01:08 -07:00
krad 7189e90c22 Fix a noisy unit test.
Summary: The t/DBTest.DropWrites test still fails under certain gcc version in release unit test.

I unfortunately cannot repro the failure (since the compilers have mapped library which I am not able to map to correctly). I am suspecting the clock skew.

Test Plan: Run make check

Reviewers:

CC: sdong igore

Task ID: #7312624

Blame Rev:
2015-07-09 13:43:52 -07:00
Aaron Feldman 1f4d565709 Add db_bench flag to set cache_index_and_filter_blocks
Summary:
The new flag --cache_index_and_filter_blocks sets
BlockBasedTableOptions.cache_index_and_filter_blocks

Test Plan: make db_bench. Working on benchmarks with the new flag.

Reviewers: igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D41481
2015-07-09 13:36:16 -07:00
Dmitri Smirnov d8586ab22b All of these are in the new code added past 3.10
1) Crash in env_win.cc that prevented db_test run to completion and some new tests
     2) Fix new corruption tests in DBTest by allowing a shared trunction of files. Note that this is generally needed ONLY for tests.
     3) Close database so WAL is closed prior to inducing corruption similar to what we did within Corruption tests.
2015-07-08 16:56:45 -07:00
Poornima Chozhiyath Raman 4bed00a44b Fix function name format according to google style
Summary: Change the naming style of getter and setters according to Google C++ style in compaction.h file

Test Plan: Compilation success

Reviewers: sdong

Reviewed By: sdong

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D41265
2015-07-08 15:21:10 -07:00
krad e2e3d84b2c Added multi WAL log testing to recovery tests.
Summary: Currently there is no test in the suite to test the case where
there are multiple WAL files and there is a corruption in one of them. We have
tests for single WAL file corruption scenarios. Added tests to mock
the scenarios for all combinations of recovery modes and corruption in
specified file locations.

Test Plan: Run make check

Reviewers: sdong igor

CC: leveldb@

Task ID: #7501229

Blame Rev:
2015-07-08 14:39:00 -07:00
Dmitri Smirnov ef4b87f1b2 Commit both PR and internal code review changes 2015-07-07 16:58:20 -07:00
agiardullo 4f56632b16 Fix occasional failure in compaction_job_test
Summary: Coverage test has been occasionally failing due to this timing check.

Test Plan: run test

Reviewers: yhchiang

Reviewed By: yhchiang

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D41367
2015-07-07 16:10:23 -07:00
Poornima Chozhiyath Raman 411c8e3d19 Build fail fix
Summary: Build fail fix. Type cast issues.

Test Plan: compiled

Reviewers: sdong, yhchiang

Reviewed By: yhchiang

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D41349
2015-07-07 15:21:17 -07:00
Poornima Chozhiyath Raman c0b23dd5b0 Enabling trivial move in universal compaction
Summary: This change enables trivial move if all the input files are non onverlapping while doing Universal Compaction.

Test Plan: ./compaction_picker_test and db_test ran successfully with the new testcases.

Reviewers: sdong

Reviewed By: sdong

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D40875
2015-07-07 14:18:55 -07:00
Yueh-Hsuan Chiang d8e3e766f9 Fixed a bug in test ThreadStatusSingleCompaction
Summary:
Fixed a bug in test ThreadStatusSingleCompaction where
SyncPoint traces are not cleared before the test begins
its second iteration.

Test Plan: db_test

Reviewers: sdong, anthony, IslamAbdelRahman, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D41337
2015-07-07 14:00:03 -07:00
Yueh-Hsuan Chiang 4ce5be4255 fixed leaking log::Writers
Summary: Fixes valgrind errors in column_family_test.

Test Plan: `make check`, `make valgrind_check`

Reviewers: igor, yhchiang

Reviewed By: yhchiang

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D41181
2015-07-07 12:10:10 -07:00
Yueh-Hsuan Chiang 685582a0b4 Revert two diffs related to DBIter::FindPrevUserKey()
Summary:
This diff reverts the following two previous diffs related to
DBIter::FindPrevUserKey(), which makes db_stress unstable.
We should bake a better fix for this.

* "Fix a comparison in DBIter::FindPrevUserKey()"
  ec70fea4c4.

* "Fixed endless loop in DBIter::FindPrevUserKey()"
  acee2b08a2.

Test Plan: db_stress

Reviewers: anthony, igor, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D41301
2015-07-07 11:36:24 -07:00
Andres Notzli 58d7ab3c68 Added tests for ExpandWhileOverlapping()
Summary:
This patch adds three test cases for ExpandWhileOverlapping()
to the compaction_picker_test test suite.
ExpandWhileOverlapping() only has an effect if the comparison
function for the internal keys allows for overlapping user
keys in different SST files on the same level. Thus, this
patch adds a comparator based on sequence numbers to
compaction_picker_test for the new test cases.

Test Plan:
- make compaction_picker_test && ./compaction_picker_test
  -> All tests pass
- Replace body of ExpandWhileOverlapping() with `return true`
  -> Compile and run ./compaction_picker_test as before
  -> New tests fail

Reviewers: sdong, yhchiang, rven, anthony, IslamAbdelRahman, kradhakrishnan, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D41277
2015-07-06 22:25:27 -07:00
Igor Canadi 155ce60daf Fix compaction_job_test
Summary:
Two issues:
* the input keys to the compaction don't include sequence number.
* sequence number is set to max(seq_num), but it should be set to max(seq_num)+1, because the condition here is strictly-larger (i.e. we will only zero-out sequence number if the DB's sequence number is strictly greater than the key's sequence number): https://github.com/facebook/rocksdb/blob/master/db/compaction_job.cc#L830

Test Plan: make compaction_job_test && ./compaction_job_test

Reviewers: sdong, lovro

Reviewed By: lovro

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D41247
2015-07-06 11:14:08 -07:00
Dmitri Smirnov d2f0912bd3 Merge the latest changes from github/master 2015-07-02 17:23:41 -07:00
Mike Kolupaev 218487d8dc [wal changes 1/3] fixed unbounded wal growth in some workloads
Summary:
This fixes the following scenario we've hit:
 - we reached max_total_wal_size, created a new wal and scheduled flushing all memtables corresponding to the old one,
 - before the last of these flushes started its column family was dropped; the last background flush call was a no-op; no one removed the old wal from alive_logs_,
 - hours have passed and no flushes happened even though lots of data was written; data is written to different column families, compactions are disabled; old column families are dropped before memtable grows big enough to trigger a flush; the old wal still sits in alive_logs_ preventing max_total_wal_size limit from kicking in,
 - a few more hours pass and we run out disk space because of one huge .log file.

Test Plan: `make check`; backported the new test, checked that it fails without this diff

Reviewers: igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D40893
2015-07-02 14:27:00 -07:00
Dmitri Smirnov 9dbde7277c Merge remote-tracking branch 'origin' into ms_win_port 2015-07-02 11:34:22 -07:00
Dmitri Smirnov 18285c1e2f Windows Port from Microsoft
Summary: Make RocksDb build and run on Windows to be functionally
 complete and performant. All existing test cases run with no
 regressions. Performance numbers are in the pull-request.

 Test plan: make all of the existing unit tests pass, obtain perf numbers.

 Co-authored-by: Praveen Rao praveensinghrao@outlook.com
 Co-authored-by: Sherlock Huang baihan.huang@gmail.com
 Co-authored-by: Alex Zinoviev alexander.zinoviev@me.com
 Co-authored-by: Dmitri Smirnov dmitrism@microsoft.com
2015-07-01 16:13:56 -07:00
sdong 05e2831966 Allocate LevelFileIteratorState and LevelFileNumIterator from DB iterator's arena
Summary: Try to allocate LevelFileIteratorState and LevelFileNumIterator from DB iterator's arena, instead of calling malloc and free.

Test Plan: valgrind check

Reviewers: rven, yhchiang, anthony, kradhakrishnan, igor

Reviewed By: igor

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D40929
2015-06-30 17:30:38 -07:00
krad b0f1927dbb Increasing timeout for drop writes.
Summary: We have a race in the way test works. We avoided the race by adding the
wait to the counter. I thought 1s was eternity, but that is not true in some
scenarios. Increasing the timeout to 10s and adding warnings.

Also, adding nosleep to avoid the case where the wakeup thread is waiting behind
the sleeping thread for scheduling.

Test Plan: Run make check

Reviewers: siying igorcanadi

CC: leveldb@

Task ID: #7312624

Blame Rev:
2015-06-30 11:11:56 -07:00
Tomislav Novak ec70fea4c4 Fix a comparison in DBIter::FindPrevUserKey()
Summary:
When seek target is a merge key (`kTypeMerge`), `DBIter::FindNextUserEntry()`
advances the underlying iterator _past_ the current key (`saved_key_`); see
`MergeValuesNewToOld()`. However, `FindPrevUserKey()` assumes that `iter_`
points to an entry with the same user key as `saved_key_`. As a result,
`it->Seek(key) && it->Prev()` can cause the iterator to be positioned at the
_next_, instead of the previous, entry (new test, written by @lovro, reproduces
the bug).

This diff changes `FindPrevUserKey()` to also skip keys that are _greater_ than
`saved_key_`.

Test Plan: db_test

Reviewers: igor, sdong

Reviewed By: sdong

Subscribers: leveldb, dhruba, lovro

Differential Revision: https://reviews.facebook.net/D40791
2015-06-29 17:04:03 -07:00
Yueh-Hsuan Chiang 501591c423 Make column_family_test runnable in ROCKSDB_LITE
Summary: Make column_family_test runnable in ROCKSDB_LITE.

Test Plan: column_family_test

Reviewers: sdong, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40251
2015-06-29 14:39:01 -07:00
krad 6199cba998 Fix race in unit test.
Summary: Avoid falling victim to race condition.

Test Plan: Run the unit test

Reviewers: sdong igor

CC: leveldb@

Task ID: #7312624

Blame Rev:
2015-06-29 11:40:21 -07:00
Igor Canadi 4cbc4e6f88 Call merge operators with empty values
Summary: It's not really nice to call user's API with garbage data in new_value. This diff makes sure that new_value is empty before calling the merge operator.

Test Plan: Added assert to Merge operator in merge_test

Reviewers: sdong, yhchiang

Reviewed By: yhchiang

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40773
2015-06-26 11:35:46 -07:00
Venkatesh Radhakrishnan c9cd404bcd Make flush check for shutdown
Summary:
Fixes task 7156865 where a compaction causes a hang in flush
memtable if CancelAllBackgroundWork was called prior to it.
Stack trace is in : https://phabricator.fb.com/P19848829
We end up waiting for a flush which will never happen because there are no background threads.

Test Plan: PreShutdownFlush

Reviewers: sdong, igor

Reviewed By: sdong, igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40617
2015-06-25 14:43:25 -07:00
Poornima Chozhiyath Raman 4fb09c6871 Updating SeekToLast with upper bound
Summary: #7124486: RocksDB's Iterator.SeekToLast should seek to the last key before iterate_upper_bound if presents

Test Plan: ./db_iter_test run successfully with the new testcase

Reviewers: rven, yhchiang, igor, anthony, kradhakrishnan, sdong

Reviewed By: sdong

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D40425
2015-06-25 09:44:30 -07:00
Yueh-Hsuan Chiang dec2c9f564 Make table_properties_collector_test runnable in ROCKSDB_LITE
Summary: Make table_properties_collector_test runnable in ROCKSDB_LITE

Test Plan: table_properties_collector_test

Reviewers: sdong, rven, anthony, kradhakrishnan, IslamAbdelRahman, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40581
2015-06-24 01:38:53 -07:00
Islam AbdelRahman 674b1181cf Bottommost level compaction option
Summary: Replace force_bottommost_level_compaction in CompactRangeOption with an option that allow the user to (always skip, always compact, compact if compaction filter is present) the bottommost level for level based compaction.

Test Plan: make check

Reviewers: sdong, yhchiang, igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D40527
2015-06-23 13:32:40 -07:00
Giuseppe Ottaviano 782a1590f9 Implement a table-level row cache
Summary:
Implementation of a table-level row cache.
It only caches point queries done through the `DB::Get` interface, queries done through the `Iterator` interface will completely skip the cache.

Supports snapshots and merge operations.

Test Plan: Ran `make valgrind_check commit-prereq`

Reviewers: igor, philipp, sdong

Reviewed By: sdong

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D39849
2015-06-23 10:25:45 -07:00
krad de85e4cadf Introduce WAL recovery consistency levels
Summary:
The "one size fits all" approach with WAL recovery will only introduce inconvenience for our varied clients as we go forward. The current recovery is a bit heuristic. We introduce the following levels of consistency while replaying the WAL.

1. RecoverAfterRestart (kTolerateCorruptedTailRecords)

This mocks the current recovery mode.

2. RecoverAfterCleanShutdown (kAbsoluteConsistency)

This is ideal for unit test and cases where the store is shutdown cleanly. We tolerate no corruption or incomplete writes.

3. RecoverPointInTime (kPointInTimeRecovery)

This is ideal when using devices with controller cache or file systems which can loose data on restart. We recover upto the point were is no corruption or incomplete write.

4. RecoverAfterDisaster (kSkipAnyCorruptRecord)

This is ideal mode to recover data. We tolerate corruption and incomplete writes, and we hop over those sections that we cannot make sense of salvaging as many records as possible.

Test Plan:
(1) Run added unit test to cover all levels.
(2) Run make check.

Reviewers: leveldb, sdong, igor

Subscribers: yoshinorim, dhruba

Differential Revision: https://reviews.facebook.net/D38487
2015-06-22 15:28:12 -07:00
Islam AbdelRahman 530534fceb Fix trivial move merge
Summary: Fixing bad merge

Test Plan: make -j64 check (this is not enough to verify the fix)

Reviewers: igor, sdong

Reviewed By: sdong

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D40521
2015-06-22 15:20:30 -07:00
Venkatesh Radhakrishnan 04251e1e3a Add wal files to Checkpoint for multiple column families.
Summary:
When there are multiple column families, the flush in
GetLiveFiles is not atomic, so that there are entries in the wal files
which are needed to get a consisten RocksDB. We now add the log files to
the checkpoint.

Test Plan:
CheckpointCF - This test forces more data to be written to
the other column families after the flush of the first column family but
before the second.

Reviewers: igor, yhchiang, IslamAbdelRahman, anthony, kradhakrishnan, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40323
2015-06-19 16:08:31 -07:00
Igor Canadi bf03f59c11 Disable CompressLevelCompaction() if Zlib is not supported
Summary: CompressLevelCompaction() depends on Zlib. We should skip it when zlib is not present.

Test Plan: `make check` without zlib

Reviewers: yhchiang

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40401
2015-06-18 18:46:26 -07:00
Igor Canadi 760e9a94de Fail DB::Open() when the requested compression is not available
Summary:
Currently RocksDB silently ignores this issue and doesn't compress the data. Based on discussion, we agree that this is pretty bad because it can cause confusion for our users.

This patch fails DB::Open() if we don't support the compression that is specified in the options.

Test Plan: make check with LZ4 not present. If Snappy is not present all tests will just fail because Snappy is our default library. We should make Snappy the requirement, since without it our default DB::Open() fails.

Reviewers: sdong, MarkCallaghan, rven, yhchiang

Reviewed By: yhchiang

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D39687
2015-06-18 14:55:05 -07:00
Islam AbdelRahman 4eabbdb7ec Skip bottommost level compaction if possible
Summary:
This is https://reviews.facebook.net/D39999 but after introducing an option to force compaction the bottom most level

Changes in this patch
- Introduce force_bottommost_level_compaction to CompactRangeOptions that force compacting bottommost level during compaction
- Skip bottommost level compaction if we dont have a compaction filter and force_bottommost_level_compaction options is not set

Although tests pass on my machine but I suspect that there maybe some tests that I am not aware of that  should use force_bottommost_level_compaction to pass in a deterministic way

Test Plan:
make check
adding new tests

Reviewers: igor, sdong, yhchiang

Reviewed By: yhchiang

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D40059
2015-06-18 11:03:31 -07:00
Igor Canadi 4b8bb62f0a Don't dump DBOptions for each column family
Summary: Currently we dump DBOptions for each column family options we dump. This leads to duplicate lines in our LOG file. This diff fixes that.

Test Plan: Check out the LOG

Reviewers: sdong, rven, yhchiang

Reviewed By: yhchiang

Subscribers: IslamAbdelRahman, yoshinorim, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D39729
2015-06-18 10:15:54 -07:00
Poornima Chozhiyath Raman 176f0bedce Merge branch 'master' of github.com:facebook/rocksdb
D40233: Replace %llu with format macros in ParsedInternalKey::DebugString())
2015-06-18 10:12:55 -07:00
Yueh-Hsuan Chiang bb1c74ce18 Fixed a bug of CompactionStats in multi-level universal compaction case
Summary:
Universal compaction can involves in multiple levels.  However,
the current implementation of bytes_readn and bytes_readnp1
(and some other stats with postfix `n` and `np1`) assumes compaction
can only have two levels.

This patch fixes this bug and redefines bytes_readn and bytes_readnp1:
* bytes_readnp1: the number of bytes read in the compaction output level.
* bytes_readn: the total number of bytes read minus bytes_readnp1

Test Plan: Add a test in compaction_job_stats_test

Reviewers: igor, sdong, rven, anthony, kradhakrishnan, IslamAbdelRahman

Reviewed By: IslamAbdelRahman

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40239
2015-06-17 23:40:34 -07:00
Poornima Chozhiyath Raman a66b8157df Merge branch 'master' of github.com:facebook/rocksdb
D40233: Replace %llu with format macros in ParsedInternalKey::DebugString())
2015-06-17 20:44:52 -07:00