Commit Graph

981 Commits

Author SHA1 Message Date
Venkatesh Radhakrishnan 8984e5f848 Fix race in sync point.
Summary:
The LoadDependency function does not take a lock when it runs
and it could be modifying data structures while other threads are
accessing it.

Test Plan: Run TSAN.

Reviewers: igor, sdong

Reviewed By: igor, sdong

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D34095
2015-02-26 15:11:50 -08:00
Igor Sugak 03b432d4b8 rocksdb: Fix uninitialized use error
Summary:
When using latest clang (3.6 or 3.7/trunck) rocksdb is failing with many errors. Some errors are uninitialized use errors.

```
...
  CC       db/log_test.o
util/ldb_cmd.cc:394:16: error: base class 'rocksdb::LDBCommand' is uninitialized when used here to access 'rocksdb::LDBCommand::BuildCmdLineOptions' [-Werror,-Wuninitialized]
               BuildCmdLineOptions({ARG_FROM, ARG_TO, ARG_HEX, ARG_KEY_HEX,
               ^
...
```

```lang=c++
CompactorCommand::CompactorCommand(const vector<string>& params,
      const map<string, string>& options, const vector<string>& flags) :
    LDBCommand(options, flags, false,
               BuildCmdLineOptions({ARG_FROM, ARG_TO, ARG_HEX, ARG_KEY_HEX,
                                    ARG_VALUE_HEX, ARG_TTL})),
    null_from_(true), null_to_(true) {
. . .
}
```
For the fourth parameter of the base constructor (`LDBCommand`) we call `BuildCmdLineOptions`, which is a private non-static method of `LDBCommand` base class.

This diff adds missing `static` keyword for `LDBCommand::BuildCmdLineOptions` method.

Test Plan:
Build with trunk clang and make sure all tests are passing.
```lang=bash
% # Have trunk clang present in path.
% ROCKSDB_NO_FBCODE=1 CC=clang CXX=clang++ make check
``

Reviewers: meyering, sdong, rven, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D34083
2015-02-26 14:19:51 -08:00
Igor Sugak 62247ffa3b rocksdb: Add missing override
Summary:
When using latest clang (3.6 or 3.7/trunck) rocksdb is failing with many errors. Almost all of them are missing override errors. This diff adds missing override keyword. No manual changes.

Prerequisites: bear and clang 3.5 build with extra tools

```lang=bash
% USE_CLANG=1 bear make all # generate a compilation database http://clang.llvm.org/docs/JSONCompilationDatabase.html
% clang-modernize -p . -include . -add-override
% make format
```

Test Plan:
Make sure all tests are passing.
```lang=bash
% #Use default fb code clang.
% make check
```
Verify less error and no missing override errors.
```lang=bash
% # Have trunk clang present in path.
% ROCKSDB_NO_FBCODE=1 CC=clang CXX=clang++ make
```

Reviewers: igor, kradhakrishnan, rven, meyering, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D34077
2015-02-26 11:28:41 -08:00
krad d9f4875e52 Disable pre-fetching of index and filter blocks for sst_dump_tool.
Summary:
BlockBasedTable pre-fetches the filter and index blocks on Open call.
This is an optimistic optimization targeted for runtime scenario. The
optimization is unnecessary for sst_dump_tool

- Added a provision to disable pre-fetching of index and filter blocks
  in BlockBasedTable
- Disabled pre-fetching for the sst_dump tool

Stack for reference :

#01  0x00000000005ed944 in snappy::InternalUncompress<snappy::SnappyArrayWriter> () from /home/engshare/third-party2/snappy/1.0.3/src/snappy-1.0.3/snappy.cc:148
#02  0x00000000005edeee in snappy::RawUncompress () from /home/engshare/third-party2/snappy/1.0.3/src/snappy-1.0.3/snappy.cc:947
#03  0x00000000004e0b4d in rocksdb::UncompressBlockContents () from /data/users/paultuckfield/rocksdb/./util/compression.h:69
#04  0x00000000004e145c in rocksdb::ReadBlockContents () from /data/users/paultuckfield/rocksdb/table/format.cc:334
#05  0x00000000004ca424 in rocksdb::(anonymous namespace)::ReadBlockFromFile () from /data/users/paultuckfield/rocksdb/table/block_based_table_reader.cc:70
#06  0x00000000004cccad in rocksdb::BlockBasedTable::CreateIndexReader () from /data/users/paultuckfield/rocksdb/table/block_based_table_reader.cc:173
#07  0x00000000004d17e5 in rocksdb::BlockBasedTable::Open () from /data/users/paultuckfield/rocksdb/table/block_based_table_reader.cc:553
#08  0x00000000004c8184 in rocksdb::BlockBasedTableFactory::NewTableReader () from /data/users/paultuckfield/rocksdb/table/block_based_table_factory.cc:51
#09  0x0000000000598463 in rocksdb::SstFileReader::NewTableReader () from /data/users/paultuckfield/rocksdb/util/sst_dump_tool.cc:69
#10  0x00000000005986c2 in rocksdb::SstFileReader::SstFileReader () from /data/users/paultuckfield/rocksdb/util/sst_dump_tool.cc:26
#11  0x0000000000599047 in rocksdb::SSTDumpTool::Run () from /data/users/paultuckfield/rocksdb/util/sst_dump_tool.cc:332
#12  0x0000000000409b06 in main () from /data/users/paultuckfield/rocksdb/tools/sst_dump.cc:12

Test Plan:
- Added a unit test to trigger the code.
- Also did some manual verification.
- Passed all unit tests

task #6296048

Reviewers: igor, rven, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D34041
2015-02-25 16:34:26 -08:00
krad a047409ae9 Fixed a bug in the test case
Summary:
The unit test was supposed to check that the old file and the new file contains
the header message.

Test Plan: Run the unit test.

Reviewers: sdong

Reviewed By: sdong

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D33705
2015-02-23 11:59:39 -08:00
Jim Meyering 1b4082581c mark as unused some variables with cpp-derived names
Summary: as above

Test Plan:
  Run "make EXTRA_CXXFLAGS='-W -Wextra'" and see fewer errors.

Reviewers: ljin, sdong, igor.sugak, igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D33753
2015-02-20 11:07:21 -08:00
Islam AbdelRahman 06a766de56 Adding Flush to AutoRollLogger
Summary:
During running AutoRollLoggerTest on FreeBSD we have found out that AutoRollLogger is not flushing correctly (test fails on FreeBSD)
This diff add Flush to AutoRollLogger to fix this problem

Test Plan:
[My machine] make all check ( all tests pass)
[FreeBSD VM] running AutoRollLoggerTest ( all tests pass )

Reviewers: sdong, igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D33633
2015-02-20 10:58:29 -08:00
Igor Canadi 92416fa7f2 Fix mac build 2015-02-19 19:26:38 -08:00
Igor Canadi 96ab15d306 GetOptionsFromString + fixes to block_based_table_options
Summary:
In mongo, we currently have a single column family and I'd like to support setting rocksdb::Options from string. This diff provides an option to GetOptionsFromString()

There's one more problem. Currently GetColumnFamilyOptionsFromString() overwrites block_based options. In mongo I set default values for block_cache and some other values of BlockBasedTableOptions and I don't want them reset to default with GetOptionsFromString().

Test Plan: added unit test

Reviewers: yhchiang, rven, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D33729
2015-02-19 19:11:14 -08:00
Jim Meyering a42324e370 build: do not relink every single binary just for a timestamp
Summary:
Prior to this change, "make check" would always waste a lot of
time relinking 60+ binaries. With this change, it does that
only when the generated file, util/build_version.cc, changes,
and that happens only when the date changes or when the
current git SHA changes.

This change makes some other improvements: before, there was no
rule to build a deleted util/build_version.cc. If it was somehow
removed, any attempt to link a program would fail.
There is no longer any need for the separate file,
build_tools/build_detect_version.  Its functionality is
now in the Makefile.

* Makefile (DEPFILES): Don't filter-out util/build_version.cc.
No need, and besides, removing that dependency was wrong.
(date, git_sha, gen_build_version): New helper variables.
(util/build_version.cc): New rule, to create this file
and update it only if it would contain new information.
* build_tools/build_detect_platform: Remove file.
* db/db_impl.cc: Now, print only date (not the time).
* util/build_version.h (rocksdb_build_compile_time): Remove
declaration.  No longer used.

Test Plan:
- Run "make check" twice, and note that the second time no linking is performed.
- Remove util/build_version.cc and ensure that any "make"
command regenerates it before doing anything else.
- Run this: strings librocksdb.a|grep _build_.
That prints output including the following:

  rocksdb_build_git_date:2015-02-19
  rocksdb_build_git_sha:2.8.fb-1792-g3cb6cc0

Reviewers: ljin, sdong, igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D33591
2015-02-19 13:11:10 -08:00
Venkatesh Radhakrishnan 7d817268b9 Managed iterator
Summary:
This is a diff for managed iterator. A managed iterator
is a wrapper around an iterator which saves the options for that
iterator as well as the current key/value so that the underlying iterator
and its associated memory can be released when it is aged out
automatically or on the request of the user. Will provide the automatic release as a follow-up diff.

Test Plan: Managed* tests in db_test and XF tests for managed iterator

Reviewers: igor, yhchiang, anthony, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D31401
2015-02-18 11:49:31 -08:00
Igor Sugak 3ad6b794cf rocksdb: Fix 'Division by zero' scan-build warning
Summary:
scan-build complains with division by zero warning in a test. Added an assertion to prevent this.

scan-build report: http://home.fburl.com/~sugak/latest6/report-c61be9.html#EndPath

Test Plan:
Make sure scan-build does not report 'Division by zero' and all tests are passing.
```lang=bash
% make analyze
% make check
```

Reviewers: igor, meyering

Reviewed By: meyering

Subscribers: sdong, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D33495
2015-02-17 17:54:02 -08:00
Yueh-Hsuan Chiang f0c36da6ee Add thread_status_util_debug.cc back
Summary:
Add thread_status_util_debug.cc back as InstrumentedMutex related tests
are using it to produce wait that can be reflected in the counter.

Test Plan:
./perf_context_test
export ROCKSDB_TESTS=MutexWaitStats
./db_test

Reviewers: sdong, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D33525
2015-02-17 11:29:36 -08:00
Yueh-Hsuan Chiang e60bc99fe0 Allow GetThreadList to reflect flush activity.
Summary: Allow GetThreadList to reflect flush activity.

Test Plan:
Developed ThreadStatusFlush test and updated ThreadStatusMultiCompaction test.

./db_test  ./thread_list_test

Reviewers: sdong, rven, igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D32871
2015-02-17 10:13:52 -08:00
Igor Sugak 4e4b857841 rocksdb: Fix scan-build 'Called C++ object pointer is null' and 'Dereference of null pointer' bugs
Summary:
In the existing implementation of `ASSERT*`, test termination happens in `~Tester`, which is called when instance of `Tester` goes out of scope. This is the cause of many scan-build bugs.

This diff changes `ASSERT*` to terminate the test immediately. Also added one suppression in `util/signal_test.cc`

scan-build bugs
before: http://home.fburl.com/~sugak/latest/index.html
after: http://home.fburl.com/~sugak/latest2/index.html

Test Plan:
Modify some test to fail an assertion and make sure that `ASSERT*` terminated the test.

Run `make analyze` and make sure no 'Called C++ object pointer is null' and 'Dereference of null pointer' bugs reported.

Run tests and make sure no failing tests:
```lang=bash
% make check
% USE_CLANG=1 make check
```

Reviewers: meyering, lgalanis, sdong, rven, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D33381
2015-02-13 15:10:47 -08:00
sdong 6d6305dd7d Perf Context to report DB mutex waiting time
Summary: Add counters in perf context to allow users to figure out how time spent on waiting for DB mutex

Test Plan: Add a test and run it.

Reviewers: yhchiang, rven, igor

Reviewed By: igor

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D33177
2015-02-09 17:55:12 -08:00
fyrz cfe8837e43 Switch logv with loglevel to virtual 2015-02-09 20:59:29 +01:00
Igor Canadi 218c3ecea3 Fix std::cout data race
Summary: std::cout is not thread safe. tsan complains. Eliminate it.

Test Plan: env_test with TSAN

Reviewers: yhchiang, rven, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D33087
2015-02-06 13:01:59 -08:00
Karthikeyan Radhakrishnan 8f679c2901 Merge branch 'master' of github.com:facebook/rocksdb 2015-02-06 11:02:25 -08:00
Karthikeyan Radhakrishnan da9cbce731 Add Header to logging to capture application level information
Summary:
This change adds LogHeader provision to the logger. For the rolling logger
implementation, the headers are copied over to the new log file every time
there is a log roll over.

Test Plan: Added a unit test to test the rolling log case.

Reviewers: sdong

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D32817
2015-02-06 10:37:45 -08:00
Igor Canadi 2a979822b6 Fix deleting obsolete files
Summary:
This diff basically reverts D30249 and also adds a unit test that was failing before this patch.

I have no idea how I didn't catch this terrible bug when writing a diff, sorry about that :(

I think we should redesign our system of keeping track of and deleting files. This is already a second bug in this critical piece of code. I'll think of few ideas.

BTW this diff is also a regression when running lots of column families. I plan to revisit this separately.

Test Plan: added a unit test

Reviewers: yhchiang, rven, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D33045
2015-02-06 08:44:30 -08:00
Yueh-Hsuan Chiang 8e83a9d315 Add a missing field for STATE_MUTEX_WAIT to global_state_table
Summary:
Add a missing field for STATE_MUTEX_WAIT to global_state_table.
This will fix the failure of thread_list_test.

Test Plan:
thread_list_test
2015-02-06 02:38:14 -08:00
Yueh-Hsuan Chiang 181191a1e4 Add a counter for collecting the wait time on db mutex.
Summary:
Add a counter for collecting the wait time on db mutex.
Also add MutexWrapper and CondVarWrapper for measuring wait time.

Test Plan:
./db_test
export ROCKSDB_TESTS=MutexWaitStats
./db_test

verify stats output using db_bench
make clean
make release
./db_bench --statistics=1 --benchmarks=fillseq,readwhilewriting --num=10000 --threads=10

Sample output:
    rocksdb.db.mutex.wait.micros COUNT : 7546866

Reviewers: MarkCallaghan, rven, sdong, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D32787
2015-02-04 21:39:45 -08:00
sdong fe9f691194 Fix fault_injestion_test
Summary: A bug in MockEnv causes fault_injestion_test to fail. I don't know why it doesn't fail every time but it doesn't seem to be right.

Test Plan:
Run fault_injestion_test
Also run db_test with MEM_ENV=1 until the first failure.

Reviewers: yhchiang, rven, igor

Reviewed By: igor

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D32877
2015-02-04 17:49:07 -08:00
Yueh-Hsuan Chiang 678503ebcf Add utility functions for interpreting ThreadStatus
Summary:
Add ThreadStatus::GetOperationName() and ThreadStatus::GetStateName(),
two utility functions that help interpreting ThreadStatus.

Test Plan: ./thread_list_test

Reviewers: sdong, rven, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D32793
2015-02-04 01:47:32 -08:00
Yueh-Hsuan Chiang 756e1f151e Remove unused util/thread_event_info.h
Summary:
Remove unused util/thread_event_info.h, which is replaced by
util/thread_operation.h

Test Plan:
make dbg -j32
make release -j32
2015-02-03 17:53:05 -08:00
sdong 829363b449 Options::PrepareForBulkLoad() to increase parallelism of flushes
Summary: Increasing parallelism of flushes will help bulk load throughput.

Test Plan: Compile it.

Reviewers: MarkCallaghan, yhchiang, rven, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D32685
2015-02-03 09:46:04 -08:00
Igor Canadi b04408c47b Fix unity build
Summary: I broke it with 2fd8f750ab

Test Plan: make unity

Reviewers: yhchiang, rven, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D32577
2015-02-03 00:32:11 -06:00
Yueh-Hsuan Chiang 2c2d5ab7e8 Fix compile warning in util/xfunc.h
Summary:
./util/xfunc.h:31:1: error: class 'Options' was previously declared as a struct [-Werror,-Wmismatched-tags]
class Options;
^

Test Plan:
make dbg -j32
2015-02-02 15:20:19 -08:00
Venkatesh Radhakrishnan 0b8dec7172 Cross functional test infrastructure for RocksDB.
Summary:
This Diff provides the implementation of the cross functional
test infrastructure. This provides the ability to test a single feature
with every existing regression test in order to identify issues with
interoperability between features.

Test Plan:
Reference implementation of inplace update support cross
functional test. Able to find interoperability issues with inplace
support and ran all of db_test. Will add separate diff for those changes.

Reviewers: igor, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D32247
2015-02-02 14:49:22 -08:00
Erik Garrison e6eaf938c3 remove old debugging message (#487)
It doesn't seem this is needed.
2015-02-01 20:34:24 +00:00
sdong 5917de0bae CappedFixTransform: return fixed length prefix, or full key if key is shorter than the fixed length
Summary: Add CappedFixTransform, which is the same as fixed length prefix extractor, except that when slice is shorter than the fixed length, it will use the full key.

Test Plan:
Add a test case for
db_test
options_test
and a new test

Reviewers: yhchiang, rven, igor

Reviewed By: igor

Subscribers: MarkCallaghan, leveldb, dhruba, yoshinorim

Differential Revision: https://reviews.facebook.net/D31887
2015-01-30 16:04:30 -08:00
Igor Canadi 2fd8f750ab Compile MemEnv with standard RocksDB library
Summary: This was a feature request by osquery. See task t5617758

Test Plan: compiles and memenv_test runs

Reviewers: yhchiang, rven, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D32115
2015-01-29 16:33:11 -08:00
sdong e84299c769 Fix bug recently introduced in MemFile::Lock()
Summary: This bug fails DBTest.CheckLock

Test Plan: DBTest.CheckLock now passes with MEM_ENV=1.

Reviewers: rven, yhchiang, igor

Reviewed By: igor

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D32451
2015-01-28 15:21:39 -08:00
sdong e5aab4c2b2 Fix data race in HashLinkList
Summary:
1) need to do acquire load when read the first entry in the bucket.
2) Make num_entries atomic

Test Plan: Ran DBTest.MultiThreaded with TSAN

Reviewers: yhchiang, rven, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D32361
2015-01-28 15:16:52 -08:00
sdong 10af17f3d7 fault_injection_test: add a unit test to allow parallel compactions and multiple levels
Summary: Add a new test case in fault_injection_test, which covers parallel compactions and multiple levels. Use MockEnv to run the new test case to speed it up. Improve MockEnv to avoid DestoryDB(), previously failed when deleting lock files.

Test Plan: Run ./fault_injection_test, including valgrind

Reviewers: rven, yhchiang, igor

Reviewed By: igor

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D32415
2015-01-28 14:07:25 -08:00
Igor Canadi 0c4d1053df Fix data race #5
Summary: TSAN complained that these are non-atomic reads and writes from different threads.

Test Plan: TSAN no longer complains

Reviewers: yhchiang, rven, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D32409
2015-01-28 13:42:40 -08:00
Igor Canadi e8bf2310a0 Remove blob store from the codebase
Summary: We don't have plans to work on this in the short term. If we ever resurrect the project, we can find the code in the history. No need for it to linger around

Test Plan: no test

Reviewers: yhchiang, rven, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D32349
2015-01-27 16:55:33 -08:00
sdong 4c49fedaf1 Use ustricter consistency in thread local operations
Summary:
ThreadSanitizer complains data race of super version and version's destructor with Get(). This patch will fix those warning.

The warning is likely from ColumnFamilyData::ReturnThreadLocalSuperVersion(). With relaxed consistency of CAS, reading the data of the super version can technically happen after swapping it in, enabling the background thread to clean it up.

Test Plan: make all check

Reviewers: rven, igor, yhchiang

Reviewed By: yhchiang

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D32265
2015-01-27 13:56:03 -08:00
Igor Canadi 26b50783d3 Fix assert in histogramData 2015-01-23 18:10:52 -08:00
Yueh-Hsuan Chiang 3b494a6103 Make options_test runnable on ROCKSDB_LITE
Summary:
Make options_test runnable on ROCKSDB_LITE by blocking
those tests that require non-ROCKSDB_LITE feature.

Test Plan:
make options_test OPT=-DROCKSDB_LITE -j32
./options_test
make clean
make options_test -j32
./options_test

Reviewers: sdong, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D32025
2015-01-22 16:57:16 -08:00
Yoshinori Matsunobu ceaea2b72d Adding prefix_extractor string config parameter
Summary:
This diff enables to configure prefix_extractor
string parameter as a CF option.

Test Plan: make all check, ./options_test

Reviewers: sdong, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D31653
2015-01-16 13:44:42 -08:00
Igor Canadi 9ab5adfc59 New BlockBasedTable version -- better compressed block format
Summary:
This diff adds BlockBasedTable format_version = 2. New format version brings better compressed block format for these compressions:
1) Zlib -- encode decompressed size in compressed block header
2) BZip2 -- encode decompressed size in compressed block header
3) LZ4 and LZ4HC -- instead of doing memcpy of size_t encode size as varint32. memcpy is very bad because the DB is not portable accross big/little endian machines or even platforms where size_t might be 8 or 4 bytes.

It does not affect format for snappy.

If you write a new database with format_version = 2, it will not be readable by RocksDB versions before 3.10. DB::Open() will return corruption in that case.

Test Plan:
Added a new test in db_test.
I will also run db_bench and verify VSIZE when block_cache == 1GB

Reviewers: yhchiang, rven, MarkCallaghan, dhruba, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D31461
2015-01-14 16:24:24 -08:00
Igor Canadi 02b30202cd Merge pull request #455 from Andersbakken/stdlib_fix
Build with clang 3.5 on Linux.
2015-01-13 09:08:32 -08:00
Yueh-Hsuan Chiang c91cdd59c1 Allow GetThreadList() to indicate a thread is doing Compaction.
Summary: Allow GetThreadList() to indicate a thread is doing Compaction.

Test Plan:
export ROCKSDB_TESTS=ThreadStatus
./db_test

Reviewers: ljin, igor, sdong

Reviewed By: sdong

Subscribers: leveldb, dhruba, jonahcohen, rven

Differential Revision: https://reviews.facebook.net/D30105
2015-01-13 00:04:08 -08:00
Anders Bakken a9ea65d652 Build with clang 3.5 on Linux. 2015-01-12 09:59:36 -08:00
Igor Canadi abb9b95ffe Move compression functions from port/ to util/
Summary: We keep checksum functions in util/, there is no reason for compression to be in port/

Test Plan: compiles

Reviewers: sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D31281
2015-01-09 12:57:11 -08:00
Igor Canadi 62ad0a9b19 Deprecating skip_log_error_on_recovery
Summary:
Since https://reviews.facebook.net/D16119, we ignore partial tailing writes. Because of that, we no longer need skip_log_error_on_recovery.

The documentation says "Skip log corruption error on recovery (If client is ok with losing most recent changes)", while the option actually ignores any corruption of the WAL (not only just the most recent changes). This is very dangerous and can lead to DB inconsistencies. This was originally set up to ignore partial tailing writes, which we now do automatically (after D16119). I have digged up old task t2416297 which confirms my findings.

Test Plan: There was actually no tests that verified correct behavior of skip_log_error_on_recovery.

Reviewers: yhchiang, rven, dhruba, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D30603
2015-01-05 13:35:56 -08:00
Yueh-Hsuan Chiang bf287b76e0 Add structures for exposing thread events and operations.
Summary:
Add structures for exposing events and operations.  Event describes
high-level action about a thread such as doing compaciton or
doing flush, while an operation describes lower-level action
of a thread such as reading / writing a SST table, waiting for
mutex.  Events and operations are designed to be independent.
One thread would typically involve in one event and one operation.

Code instrument will be in a separate diff.

Test Plan:
Add unit-tests in thread_list_test
make dbg -j32
./thread_list_test
export ROCKSDB_TESTS=ThreadList
./db_test

Reviewers: ljin, igor, sdong

Reviewed By: sdong

Subscribers: rven, jonahcohen, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29781
2014-12-30 10:39:13 -08:00
Manish Patil 2067058a60 Dump routine to BlockBasedTableReader (valgrind)
Summary: Fixed valgrind issue

Test Plan: valgrind check done

Reviewers: rven, sdong

Reviewed By: sdong

Subscribers: sdong, dhruba

Differential Revision: https://reviews.facebook.net/D30699
2014-12-23 18:01:29 -08:00
Manish Patil 7ea7bdf04d Dump routine to BlockBasedTableReader
Summary: Added necessary routines for dumping block based SST with block filter

Test Plan: Added "raw" mode to utility sst_dump

Reviewers: sdong, rven

Reviewed By: rven

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D29679
2014-12-23 13:24:07 -08:00
Lei Jin 5045c43944 add support for nested BlockBasedTableOptions in config string
Summary:
Add support to allow nested config for block-based table factory. The format looks like this:

"write_buffer_size=1024;block_based_table_factory={block_size=4k};max_write_buffer_num=2"

Test Plan: unit test

Reviewers: yhchiang, rven, igor, ljin, jonahcohen

Reviewed By: jonahcohen

Subscribers: jonahcohen, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29223
2014-12-22 16:34:21 -08:00
Yueh-Hsuan Chiang 45bab305f9 Move GetThreadList() feature under Env.
Summary:
GetThreadList() feature depends on the thread creation and destruction, which is currently handled under Env.
This patch moves GetThreadList() feature under Env to better manage the dependency of GetThreadList() feature
on thread creation and destruction.

Renamed ThreadStatusImpl to ThreadStatusUpdater.  Add ThreadStatusUtil, which is a static class contains
utility functions for ThreadStatusUpdater.

Test Plan: run db_test, thread_list_test and db_bench and verify the life cycle of Env and ThreadStatusUpdater is properly managed.

Reviewers: igor, sdong

Reviewed By: sdong

Subscribers: ljin, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D30057
2014-12-22 12:20:17 -08:00
Venkatesh Radhakrishnan 7198ed5a2e Handle errors during pthread calls
Summary: Release locks before calling exit.

Test Plan: Force errors in debugger and verify correctness

Reviewers: igor, yhchiang, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D30423
2014-12-17 16:25:09 -08:00
Qiao Yang cef6f84393 Added 'dump_live_files' command to ldb tool.
Summary:
Priliminary diff to solicit comments.
Given DB path, dump all SST files (key/value and properties), WAL file and manifest
files. What command options do we need to support for this command? Maybe
output_hex for keys?

Test Plan: Create additional ldb unit tests.

Reviewers: sdong, rven

Reviewed By: rven

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D29547
2014-12-12 17:50:36 -08:00
Yueh-Hsuan Chiang 74b3fb6d97 Fix Mac compile errors on util/cache_test.cc
Summary:
Fix Mac compile errors on util/cache_test.cc

Test Plan:
make dbg -j32
./cache_test
2014-12-11 14:15:13 -08:00
Alexey Maykov ee95cae9a4 Modifed the LRU cache eviction code so that it doesn't evict blocks which have exteranl references
Summary:
Currently, blocks which have more than one reference (ie referenced by something other than cache itself) are evicted from cache. This doesn't make much sense:
- blocks are still in RAM, so the RAM usage reported by the cache is incorrect
- if the same block is needed by another iterator, it will be loaded and decompressed again

This diff changes the reference counting scheme a bit. Previously, if the cache contained the block, this was accounted for in its refcount. After this change, the refcount is only used to track external references. There is a boolean flag which indicates whether or not the block is contained in the cache.
This diff also changes how LRU list is used. Previously, both hashtable and the LRU list contained all blocks. After this change, the LRU list contains blocks with the refcount==0, ie those which can be evicted from the cache.

Note that this change still allows for cache to grow beyond its capacity. This happens when all blocks are pinned (ie refcount>0). This is consistent with the current behavior. The cache's insert function never fails. I spent lots of time trying to make table_reader and other places work with the insert which might failed. It turned out to be pretty hard. It might really destabilize some customers, so finally, I decided against doing this.

table_cache_remove_scan_count_limit option will be unneeded after this change, but I will remove it in the following diff, if this one gets approved

Test Plan: Ran tests, made sure they pass

Reviewers: sdong, ljin

Differential Revision: https://reviews.facebook.net/D25503
2014-12-10 22:28:53 -08:00
Igor Canadi cb82d7b081 Fix #434
Summary: Why do we assert here? This doesn't seem like user friendly thing to do :)

Test Plan: none

Reviewers: sdong, yhchiang, rven

Reviewed By: rven

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D30027
2014-12-09 10:22:07 -08:00
Yueh-Hsuan Chiang 8f4e1c1c9a Remove the compability check on log2 OS_ANDROID as it's already blocked by ROCKSDB_LITE
Summary:
Remove the compability check on log2 OS_ANDROID as it's already blocked by ROCKSDB_LITE

Test Plan:
make OPT="-DROCKSDB_LITE -DOS_ANDROID" shared_lib -j32
make shared_lib -j32
2014-12-04 13:56:14 -08:00
Yueh-Hsuan Chiang c4a7423c1d Replace runtime_error exception by abort() in thread_local
Summary: Replace runtime_error exception by abort() in thread_local

Test Plan: make dbg -j32

Reviewers: sdong, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29853
2014-12-04 13:35:31 -08:00
Yueh-Hsuan Chiang 1a8f4821a7 Replace exception by assertion in autovector
Summary: Replace exception by assertion in autovector

Test Plan: autovector_test

Reviewers: sdong, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29847
2014-12-04 11:41:56 -08:00
Jonah Cohen a14b7873ee Enforce write buffer memory limit across column families
Summary:
Introduces a new class for managing write buffer memory across column
families.  We supplement ColumnFamilyOptions::write_buffer_size with
ColumnFamilyOptions::write_buffer, a shared pointer to a WriteBuffer
instance that enforces memory limits before flushing out to disk.

Test Plan: Added SharedWriteBuffer unit test to db_test.cc

Reviewers: sdong, rven, ljin, igor

Reviewed By: igor

Subscribers: tnovak, yhchiang, dhruba, xjin, MarkCallaghan, yoshinorim

Differential Revision: https://reviews.facebook.net/D22581
2014-12-02 12:09:20 -08:00
Yueh-Hsuan Chiang bcf9086899 Block Universal and FIFO compactions in ROCKSDB_LITE
Summary: Block Universal and FIFO compactions in ROCKSDB_LITE

Test Plan:
make shared_lib -j32
make OPT=-DROCKSDB_LITE shared_lib

Reviewers: ljin, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29589
2014-11-26 15:45:11 -08:00
Yueh-Hsuan Chiang beb74c14ca Fix travis-build error
Summary:
Fix travis-build error
2014-11-26 09:42:13 -08:00
Yueh-Hsuan Chiang a97314219e Fix compile error in ROCKSDB_LITE 2014-11-25 23:39:52 -08:00
Yueh-Hsuan Chiang 9d5019327b Replace log2 by implementing Log2 in options_builder
Summary:
log2 function is only used in options_builder, and this function
is not available under certain platform such as android.
This patch implements Log2 by log(n) / log(2).

Test Plan:
make
2014-11-25 23:28:36 -08:00
Yueh-Hsuan Chiang 13de000f07 Add rocksdb::ToString() to address cases where std::to_string is not available.
Summary:
In some environment such as android, the c++ library does not have
std::to_string.  This path adds rocksdb::ToString(), which wraps std::to_string
when std::to_string is not available, and implements std::to_string
in the other case.

Test Plan:
make dbg -j32
./db_test
make clean
make dbg OPT=-DOS_ANDROID -j32
./db_test

Reviewers: ljin, sdong, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29181
2014-11-24 20:44:49 -08:00
Igor Canadi d699d70343 Make RocksDB compile without gflags
Summary: We want to make sure people without gflags can compile RocksDB.

Test Plan: remove gflags, make all

Reviewers: sdong, rven, yhchiang, ljin

Reviewed By: ljin

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29469
2014-11-24 15:53:23 -05:00
Yueh-Hsuan Chiang 569853ed10 Fix leak when create_missing_column_families=true on ThreadStatus
Summary:
An entry of ConstantColumnFamilyInfo is created when:
1. DB::Open
2. CreateColumnFamily.

However, there are cases that DB::Open could also call CreateColumnFamily
when create_missing_column_families=true.  As a result, it will create
duplicate ConstantColumnFamilyInfo and one of them would be leaked.

Test Plan: ./deletefile_test

Reviewers: igor, sdong, ljin

Reviewed By: ljin

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29307
2014-11-22 00:04:41 -08:00
Shaohua Li 1410180167 Make arena use hugepage if possible
Summary:
arena doesn't use huge page by default. This change will make it happen
if possible. A new paramerter is added for Arena(). If it's set, Arena
will use huge page always. If huge page allocation fails, Arena
allocation will fallback to malloc().

Test Plan:
Change util/arena_test to support huge page allocation.
Run below tests:
1. normal regression test:
  make check
2. Check if huge page allocation works
  echo 50 > /proc/sys/vm/nr_hugepages
  make check

Reviewers: sdong

Reviewed By: sdong

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D28647
2014-11-21 14:11:40 -08:00
Yueh-Hsuan Chiang 9222a2d024 Fixed iOS build caused by GetThreadList feature.
Summary:
Fixed iOS build caused by GetThreadList feature.
2014-11-21 11:00:42 -08:00
Igor Canadi cd278584c9 Clean up StringSplit
Summary: stringSplit is not how we name our functions. Also, we had two StringSplit's in the codebase

Test Plan: make check

Reviewers: yhchiang, dhruba

Reviewed By: dhruba

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29361
2014-11-21 11:05:28 -05:00
Yueh-Hsuan Chiang 4b63fcbff3 Add enable_thread_tracking to DBOptions
Summary:
Add enable_thread_tracking to DBOptions to allow
tracking thread status related to the DB.  Default is off.

Test Plan:
export ROCKSDB_TESTS=ThreadList
./db_test

Reviewers: ljin, sdong, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29289
2014-11-20 21:13:18 -08:00
Bryan Rosario 9e285d4238 Added CompatibleOptions for compatibility with LevelDB Options
Summary: Created a CompatibleOptions object that can be used as a LevelDB Options object and then converted to a RocksDB Options object using the ConvertOptions() method.

Test Plan: Unit test included in diff.

Reviewers: ljin

Reviewed By: ljin

Subscribers: sdong, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28893
2014-11-20 19:24:39 -08:00
Yueh-Hsuan Chiang eecdebe65b Fixed the destruction order of static variables in ThreadStatusImpl. 2014-11-20 16:02:03 -08:00
Yueh-Hsuan Chiang beabc6879c Fixed ~ThreadStatusImpl(). 2014-11-20 15:45:56 -08:00
Yueh-Hsuan Chiang fbc42a0933 Fixed -Werror=unused-but-set-variable in thread_status_impl
Summary:
Fixed -Werror=unused-but-set-variable in thread_status_impl
2014-11-20 13:46:04 -08:00
Yueh-Hsuan Chiang a564be715d Fix asan error in thread_status_impl.cc 2014-11-20 11:47:18 -08:00
Yueh-Hsuan Chiang 7165d18869 Fix clang compile error 2014-11-20 11:00:21 -08:00
Yueh-Hsuan Chiang d0c5f28a5c Introduce GetThreadList API
Summary:
Add GetThreadList API, which allows developer to track the
status of each process.  Currently, calling GetThreadList will
only get the list of background threads in RocksDB with their
thread-id and thread-type (priority) set.  Will add more support
on this in the later diffs.

ThreadStatus currently has the following properties:

  // An unique ID for the thread.
  const uint64_t thread_id;

  // The type of the thread, it could be ROCKSDB_HIGH_PRIORITY,
  // ROCKSDB_LOW_PRIORITY, and USER_THREAD
  const ThreadType thread_type;

  // The name of the DB instance where the thread is currently
  // involved with.  It would be set to empty string if the thread
  // does not involve in any DB operation.
  const std::string db_name;

  // The name of the column family where the thread is currently
  // It would be set to empty string if the thread does not involve
  // in any column family.
  const std::string cf_name;

  // The event that the current thread is involved.
  // It would be set to empty string if the information about event
  // is not currently available.

Test Plan:
./thread_list_test
export ROCKSDB_TESTS=GetThreadList
./db_test

Reviewers: rven, igor, sdong, ljin

Reviewed By: ljin

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D25047
2014-11-20 10:49:32 -08:00
Lei Jin be005e17bb fix clang compilation
Summary:
as title
2014-11-18 20:42:32 -08:00
fyrz 9e9a83baf7 Missing header in build on CentOS
While building RocksJava the build fails on
CentOS because of the missing stdexcept header.
2014-11-18 22:21:02 +01:00
Lei Jin 8d3f8f9696 remove all remaining references to cfd->options()
Summary:
The very last reference happens in DBImpl::GetOptions()
I built with both DBImpl::GetOptions() and ColumnFamilyData::options() commented out

Test Plan: make all check

Reviewers: sdong, yhchiang, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29073
2014-11-18 10:20:10 -08:00
Jonah Cohen 517c28994d Options helper supports k, m, g, and t unit suffixes
Summary:
Add unit support in options helper so we can specify, e.g., 10m for
10 megabytes.

Test Plan: Updated options_test

Reviewers: sdong, igor, ljin

Reviewed By: ljin

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D28977
2014-11-17 13:47:51 -08:00
Igor Canadi 0ce38fe983 Fix signed/unsigned compile 2014-11-14 16:33:41 -08:00
Venkatesh Radhakrishnan 6c1b040cc9 Provide openable snapshots
Summary: Store links to live files in directory on same disk

Test Plan:
Take snapshot and open it. Added a test GetSnapshotLink in
db_test.

Reviewers: sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28713
2014-11-14 11:38:26 -08:00
Siying Dong 136b8583bf Merge pull request #395 from lalinsky/fix-env-test
Relax the block count check on deallocation in env_test
2014-11-14 10:56:41 -08:00
sdong a177742a9b Make db_stress built for ROCKSDB_LITE
Summary:
Make db_stress built for ROCKSDB_LITE.
The test doesn't pass tough. It seg fault quickly. But I took a look and it doesn't seem to be related to lite version. Likely to be a bug inside RocksDB.

Test Plan: make db_stress

Reviewers: yhchiang, rven, ljin, igor

Reviewed By: igor

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D28797
2014-11-14 10:20:51 -08:00
Lukáš Lalinský 746cfaac58 Relax the block count check on deallocation in env_test
It seems that on some FS we get more blocks than we ask for. This is
already handled when checking the allocated number of blocks, but
after the file is closed it checks for an exact number of blocks,
which fails on my machine.

I changed the test to add one full page to the size, then calculate
the expected number of blocks and check if the actual number of blocks
is less or equal to that.
2014-11-13 16:58:05 -08:00
Igor Canadi 3c92e52338 Fix include 2014-11-13 14:47:54 -05:00
Igor Canadi 25f273027b Fix iOS compile with -Wshorten-64-to-32
Summary: So iOS size_t is 32-bit, so we need to static_cast<size_t> any uint64_t :(

Test Plan: TARGET_OS=IOS make static_lib

Reviewers: dhruba, ljin, yhchiang, rven, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28743
2014-11-13 14:39:30 -05:00
Igor Canadi 1f621e6abc Fix additional -Wshorten-64-to-32 errros 2014-11-11 14:09:14 -08:00
Igor Canadi 767777c2bd Turn on -Wshorten-64-to-32 and fix all the errors
Summary:
We need to turn on -Wshorten-64-to-32 for mobile. See D1671432 (internal phabricator) for details.

This diff turns on the warning flag and fixes all the errors. There were also some interesting errors that I might call bugs, especially in plain table. Going forward, I think it makes sense to have this flag turned on and be very very careful when converting 64-bit to 32-bit variables.

Test Plan: compiles

Reviewers: ljin, rven, yhchiang, sdong

Reviewed By: yhchiang

Subscribers: bobbaldwin, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28689
2014-11-11 16:47:22 -05:00
Igor Canadi c7ee9c3ab7 Fix -Wnon-virtual-dtor errors
Summary: This breaks mongo+rocks build https://mci.10gen.com/task_log_raw/rocksdb_ubuntu1404_rocksdb_c6e8e3d868660dc66b3bbd438cdc135df6356c5a_14_11_10_21_36_10_compile_ubuntu1404_rocksdb/0?type=T

Test Plan: m check + -Wnon-virtual-dtor

Reviewers: sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28653
2014-11-10 17:39:38 -05:00
Igor Canadi bcdb9671c0 Fix build 2014-11-09 13:01:50 -05:00
liuchang e7620536cf fix make static_lib error 2014-11-08 11:35:10 -08:00
Federico Piccinini 543df158c5 Expose sst_dump functionality as library call.
Summary:
Refactor sst_dump to follow the same structure as ldb. Introduce a
SSTDump interface.

Test Plan: built sst_dump and tried it manually.

Reviewers: sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28485
2014-11-07 17:23:58 -08:00
Igor Canadi 68effa0348 Fix -Wshadow for tools
Summary: Previously I made `make check` work with -Wshadow, but there are some tools that are not compiled using `make check`.

Test Plan: make all

Reviewers: yhchiang, rven, ljin, sdong

Reviewed By: ljin, sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28497
2014-11-07 15:04:30 -08:00
Yueh-Hsuan Chiang 28c82ff1b3 CompactFiles, EventListener and GetDatabaseMetaData
Summary:
This diff adds three sets of APIs to RocksDB.

= GetColumnFamilyMetaData =
* This APIs allow users to obtain the current state of a RocksDB instance on one column family.
* See GetColumnFamilyMetaData in include/rocksdb/db.h

= EventListener =
* A virtual class that allows users to implement a set of
  call-back functions which will be called when specific
  events of a RocksDB instance happens.
* To register EventListener, simply insert an EventListener to ColumnFamilyOptions::listeners

= CompactFiles =
* CompactFiles API inputs a set of file numbers and an output level, and RocksDB
  will try to compact those files into the specified level.

= Example =
* Example code can be found in example/compact_files_example.cc, which implements
  a simple external compactor using EventListener, GetColumnFamilyMetaData, and
  CompactFiles API.

Test Plan:
listener_test
compactor_test
example/compact_files_example
export ROCKSDB_TESTS=CompactFiles
db_test
export ROCKSDB_TESTS=MetaData
db_test

Reviewers: ljin, igor, rven, sdong

Reviewed By: sdong

Subscribers: MarkCallaghan, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D24705
2014-11-07 14:45:18 -08:00
Igor Canadi 9f20395cd6 Turn -Wshadow back on
Summary: It turns out that -Wshadow has different rules for gcc than clang. Previous commit fixed clang. This commits fixes the rest of the warnings for gcc.

Test Plan: compiles

Reviewers: ljin, yhchiang, rven, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28131
2014-11-06 11:14:28 -08:00
Yueh-Hsuan Chiang e4211d10c1 Apply InfoLogLevel to the logs in util/env_hdfs.cc
Summary: Apply InfoLogLevel to the logs in util/env_hdfs.cc

Test Plan: make

Reviewers: ljin, sdong, igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D28011
2014-11-05 00:12:20 -08:00
Lei Jin fd24ae9d05 SetOptions() to return status and also add it to StackableDB
Summary: as title

Test Plan: ./db_test

Reviewers: sdong, yhchiang, rven, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28269
2014-11-04 16:23:05 -08:00
sdong 09899f0b51 DB::Open() to automatically increase thread pool size if it is smaller than max number of parallel compactions or flushes
Summary:
With the patch, thread pool size will be automatically increased if DB's options ask for more parallelism of compactions or flushes.

Too many users have been confused by the API. Change it to make it harder for users to make mistakes

Test Plan: Add two unit tests to cover the function.

Reviewers: yhchiang, rven, igor, MarkCallaghan, ljin

Reviewed By: ljin

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D27555
2014-11-03 17:22:34 -08:00
Igor Canadi 8e79ce68ce Revert "Fix lint errors and coding style of ldb related codes."
This reverts commit bc9f36fd5e.
2014-10-31 19:22:49 -07:00
Igor Canadi 45a612f990 Revert "Fix incorrect fixing of lint errors in ldb_cmd.cc"
This reverts commit 8ddddd62d0.
2014-10-31 19:22:42 -07:00
Yueh-Hsuan Chiang 8ddddd62d0 Fix incorrect fixing of lint errors in ldb_cmd.cc
Summary: Fix incorrect fixing of lint errors in ldb_cmd.cc

Test Plan: reduce_levels_test

Reviewers: igor, sdong, ljin

Reviewed By: ljin

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28137
2014-10-31 16:15:15 -07:00
Igor Canadi 46c14c6661 Fix #258. benchmarkharness -- make bm_min_usec uint 2014-10-31 15:41:56 -07:00
Lei Jin 72cb7cf201 Add fsync / corrupt simulation to env_mem
Summary: as title

Test Plan: env_mem_test

Reviewers: sdong, yhchiang, rven, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28077
2014-10-31 15:16:31 -07:00
Lei Jin 0e526eb9d7 introduce TestMemEnv and use it in db_test
Summary:
TestMemEnv simulates all Env APIs using in-memory data structures.
We can use it to speed up db_test run, which is now reduced ~7mins when it is
enabled.
We can also add features to simulate power/disk failures in the next
step
TestMemEnv is derived from helper/mem_env
mem_env can not be used for rocksdb since some of its APIs do not give
the same results as env_posix. And its file read/write is not thread safe

Test Plan:
make all -j32
./db_test
./env_mem_test

Reviewers: sdong, yhchiang, rven, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28035
2014-10-31 15:08:10 -07:00
Yueh-Hsuan Chiang bc9f36fd5e Fix lint errors and coding style of ldb related codes.
Summary: Fix lint errors and coding style of ldb related codes.

Test Plan: ./ldb

Reviewers: ljin, sdong, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28125
2014-10-31 15:01:39 -07:00
Lei Jin c76dcb44d4 fix
Summary:

Test Plan:

Reviewers:

CC:

Task ID: #

Blame Rev:
2014-10-31 13:30:08 -07:00
Lei Jin b452dede5c fix
Summary:

Test Plan:

Reviewers:

CC:

Task ID: #

Blame Rev:
2014-10-31 12:23:00 -07:00
Igor Canadi 9f7fc3ac45 Turn on -Wshadow
Summary:
...and fix all the errors :)

Jim suggested turning on -Wshadow because it helped him fix number of critical bugs in fbcode. I think it's a good idea to be -Wshadow clean.

Test Plan: compiles

Reviewers: yhchiang, rven, sdong, ljin

Reviewed By: ljin

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D27711
2014-10-31 11:59:54 -07:00
sdong 4d2ba38b65 Make VersionBuilder unit testable
Summary:
Rename Version::Builder to VersionBuilder and expose its definition to a header.
Make VerisonBuilder not reference Version or ColumnFamilyData, only working with VersionStorageInfo.
Add version_builder_test which has a simple test.

Test Plan: make all check

Reviewers: rven, yhchiang, igor, ljin

Reviewed By: igor

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D27969
2014-10-31 10:44:06 -07:00
Yueh-Hsuan Chiang 0f7f3b8605 Check InfoLogLevel earlier in Log functions.
Summary: Check InfoLogLevel earlier in Log functions.

Test Plan: auto_roll_logger_test

Reviewers: ljin, sdong, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D27993
2014-10-30 13:36:18 -07:00
Yueh-Hsuan Chiang 73605d9179 Apply InfoLogLevel to the logs in util/db_info_dumper.cc
Summary:
* Rename util/db_info_dummper.cc to util/db_info_dumper.cc
* Apply InfoLogLevel to the logs in util/db_info_dumper.cc

Test Plan: make

Reviewers: ljin, sdong, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D27915
2014-10-30 13:34:44 -07:00
Damian Lezama 41af0f56b0 Fix build break because of unsigned/signed mismatch 2014-10-30 10:36:13 -07:00
sdong 86de2007b8 Add ComparatorDBTest to test non-default comparators
Summary:
Add some helper functions to make sure DB works well for non-default comparators.
Add a test for SimpleSuffixReverseComparator.

Test Plan: Run the new test

Reviewers: ljin, rven, yhchiang, igor

Reviewed By: igor

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D27831
2014-10-29 15:42:36 -07:00
Lei Jin 17be187ff9 dummy var to suppress compiler warning/error
Summary: Revmoed this in D25641, causing compiler complain. put it back

Test Plan: make release

Reviewers: igor, yhchiang, rven, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D27891
2014-10-29 15:38:34 -07:00
Igor Canadi c2999f54bd Revert "tmp"
This reverts commit 9ab0132360.
2014-10-29 15:29:33 -07:00
sdong 76d1c28e82 Make CompactionPicker more easily tested
Summary:
Make compaction picker easier to test.
The basic idea is to separate a minimum subcomponent of Version to VersionStorageInfo, which just responsible to LSM tree. A stub VersionStorageInfo can then be easily created and passed into compaction picker so that we can check the outputs.

It now passes most tests. Still two things need to be done:
(1) deal with the FIFO compaction's file size.
(2) write an example test to make sure the interface can do the job.

Add a compaction_picker_test to make sure compaction picker codes can be easily unit tested.

Test Plan:
Pass all unit tests and compaction_picker_test

Reviewers: yhchiang, rven, igor, ljin

Reviewed By: ljin

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D27639
2014-10-29 15:16:53 -07:00
Lei Jin 9ab0132360 tmp
Summary:

Test Plan:

Reviewers:

CC:

Task ID: #

Blame Rev:
2014-10-29 13:36:47 -07:00
Lei Jin 44f0ff31c2 use fallocate(FALLOC_FL_PUNCH_HOLE) to release unused blocks at the end of file
Summary:
ftruncate does not always free preallocated unused space at the end of file.
In some cases, we pin too much disk space than it should

Test Plan: env_test

Reviewers: sdong, rven, yhchiang, igor

Reviewed By: igor

Subscribers: nkg-, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D25641
2014-10-29 12:24:49 -07:00
Lei Jin 97451f837e add an env var ROCKSDB_TESTS_FROM to control where to start from a list of tests
Summary:
Sometimes, I got a test failure. After fixing that, I want to resume
db_test from that test. ROCKSDB_TESTS_FROM is for this purpose.

Test Plan: as title

Reviewers: yhchiang, rven, igor, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D27807
2014-10-29 12:02:11 -07:00
Igor Canadi b680033e63 Include atomic in env_test 2014-10-27 15:41:05 -07:00
Igor Canadi 48842ab316 Deprecate AtomicPointer
Summary: RocksDB already depends on C++11, so we might as well all the goodness that C++11 provides. This means that we don't need AtomicPointer anymore. The less things in port/, the easier it will be to port to other platforms.

Test Plan: make check + careful visual review verifying that NoBarried got memory_order_relaxed, while Acquire/Release methods got memory_order_acquire and memory_order_release

Reviewers: rven, yhchiang, ljin, sdong

Reviewed By: ljin

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D27543
2014-10-27 14:50:21 -07:00
Lei Jin f1841985e4 dynamic inplace_update options
Summary:
Make inplace_update_support and inplace_update_num_locks dynamic.
inplace_callback becomes immutable
We are almost free of references to cfd->options() in db_impl

Test Plan: unit test

Reviewers: igor, yhchiang, rven, sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D25293
2014-10-27 12:10:13 -07:00
Igor Canadi 965d9d50b8 Fix timing 2014-10-24 11:58:15 -07:00
Igor Canadi 001ce64dc7 Use chrono for timing
Summary: Since we depend on C++11, we might as well use it for timing, instead of this platform-depended code.

Test Plan: Ran autovector_test, which reports time and confirmed that output is similar to master

Reviewers: ljin, sdong, yhchiang, rven, dhruba

Reviewed By: dhruba

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D25587
2014-10-24 10:11:57 -07:00
Lei Jin 720c1c056d fix erro during merge
Summary:
as title

Test Plan:
make release
2014-10-23 15:41:37 -07:00
Lei Jin 122f98e0b9 dynamic max_mem_compact_level
Summary: as title

Test Plan: unit test

Reviewers: sdong, yhchiang, rven, igor

Reviewed By: igor

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D25347
2014-10-23 15:37:14 -07:00
Lei Jin 574028679b dynamic max_sequential_skip_in_iterations
Summary:
This is not a critical options. Making it dynamic so that we can remove
more reference to cfd->options()

Test Plan: unit test

Reviewers: yhchiang, sdong, igor

Reviewed By: igor

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D24957
2014-10-23 15:34:21 -07:00
Yueh-Hsuan Chiang e11a5e776f Improve the comment of util/thread_local.h
Summary: Improve the comment of util/thread_local.h

Test Plan: n/a

Reviewers: ljin

Reviewed By: ljin

Differential Revision: https://reviews.facebook.net/D25449
2014-10-21 17:28:31 -07:00
Igor Canadi c12f571d31 Fix mac compile, second try 2014-10-17 12:09:45 -04:00
Igor Canadi d2e60f5cee Fix mac compile 2014-10-17 12:05:32 -04:00
Lei Jin d6c8dba727 Log MutableCFOptions in SetOptions
Summary: as title

Test Plan: make release

Reviewers: sdong, yhchiang, igor

Reviewed By: igor

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D24903
2014-10-16 17:22:28 -07:00
Lei Jin 4d5708aa56 dynamic soft_rate_limit and hard_rate_limit
Summary: as title

Test Plan:
unit test
I am only able to build the test case for hard_rate_limit.
soft_rate_limit is essentially the same thing as hard_rate_limit

Reviewers: igor, sdong, yhchiang

Reviewed By: yhchiang

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D24759
2014-10-16 17:21:31 -07:00
Lei Jin 065a67c4f0 dynamic disable_auto_compactions
Summary: Add more tests as well

Test Plan: unit test

Reviewers: igor, sdong, yhchiang

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D24747
2014-10-16 17:14:17 -07:00
Lei Jin dc50a1a593 make max_write_buffer_number dynamic
Summary: as title

Test Plan: unit test

Reviewers: sdong, yhchiang, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D24729
2014-10-16 16:57:59 -07:00
sdong 6a150c0118 ldb: support --fix_prefix_len
Summary:
ldb to support --fix_prefix_len to allow us to verify more cases.
Also fix a small issue that --bloom_bits might not be applied if --block_size is not given.

Test Plan: run ldb tool against an example DB.

Reviewers: ljin, yhchiang, rven, igor

Reviewed By: igor

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D24819
2014-10-16 13:42:34 -07:00
Yueh-Hsuan Chiang 3ead857a0d Fixed Mac compile error in util/options_test.cc
Summary:
Fixed the following error in Mac:

./util/testharness.h:93:19: error: comparison of integers of different signs: 'const unsigned long' and 'const int' [-Werror,-Wsign-compare]
  BINARY_OP(IsEq, ==)
  ~~~~~~~~~~~~~~~~^~~
./util/testharness.h:86:14: note: expanded from macro 'BINARY_OP'
    if (! (x op y)) {                                   \
             ^
util/options_test.cc:269:3: note: in instantiation of function template specialization 'rocksdb::test::Tester::IsEq<unsigned long, int>' requested here
  ASSERT_EQ(new_cf_opt.write_buffer_size, 5);
  ^

Test Plan:
options_test
2014-10-10 14:20:59 -07:00
Lei Jin cd0d581ff5 convert Options from string
Summary: Allow accepting Options as a string of key/value pairs

Test Plan: unit test

Reviewers: yhchiang, sdong, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D24597
2014-10-10 10:00:12 -07:00
Tomislav Novak 88edfd90ae SkipListRep::LookaheadIterator
Summary:
This diff introduces the `lookahead` argument to `SkipListFactory()`. This is an
optimization for the tailing use case which includes many seeks. E.g. consider
the following operations on a skip list iterator:

   Seek(x), Next(), Next(), Seek(x+2), Next(), Seek(x+3), Next(), Next(), ...

If `lookahead` is positive, `SkipListRep` will return an iterator which also
keeps track of the previously visited node. Seek() then first does a linear
search starting from that node (up to `lookahead` steps). As in the tailing
example above, this may require fewer than ~log(n) comparisons as with regular
skip list search.

Test Plan:
Added a new benchmark (`fillseekseq`) which simulates the usage pattern. It
first writes N records (with consecutive keys), then measures how much time it
takes to read them by calling `Seek()` and `Next()`.

   $ time ./db_bench -num 10000000 -benchmarks fillseekseq -prefix_size 1 \
      -key_size 8 -write_buffer_size $[1024*1024*1024] -value_size 50 \
      -seekseq_next 2 -skip_list_lookahead=0
   [...]
   DB path: [/dev/shm/rocksdbtest/dbbench]
   fillseekseq  :       0.389 micros/op 2569047 ops/sec;

   real    0m21.806s
   user    0m12.106s
   sys     0m9.672s

   $ time ./db_bench [...] -skip_list_lookahead=2
   [...]
   DB path: [/dev/shm/rocksdbtest/dbbench]
   fillseekseq  :       0.153 micros/op 6540684 ops/sec;

   real    0m19.469s
   user    0m10.192s
   sys     0m9.252s

Reviewers: ljin, sdong, igor

Reviewed By: igor

Subscribers: dhruba, leveldb, march, lovro

Differential Revision: https://reviews.facebook.net/D23997
2014-10-07 11:48:23 -07:00
Igor Canadi d6987216c9 Merge pull request #327 from dalgaaf/wip-da-SCA-20141001
Fix some issues from SCA
2014-10-02 10:59:52 -07:00
Lei Jin 5ec53f3edf make compaction related options changeable
Summary:
make compaction related options changeable. Most of changes are tedious,
following the same convention: grabs MutableCFOptions at the beginning
of compaction under mutex, then pass it throughout the job and register
it in SuperVersion at the end.

Test Plan: make all check

Reviewers: igor, yhchiang, sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D23349
2014-10-01 16:19:16 -07:00
Danny Al-Gaaf bf3bfd0448 util/cache_test.cc: use static_cast over C-Style cast
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-10-01 10:53:37 +02:00
Danny Al-Gaaf 4cc8643baf util/ldb_cmd.cc: prefer prefix ++operator for non-primitive types
Prefer prefix ++operator for non-primitive types like iterators for
performance reasons. Prefix ++/-- operators avoid creating a temporary
copy.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-09-30 23:30:32 +02:00
Danny Al-Gaaf af8c2b2d97 util/signal_test.cc: suppress intentional null pointer deref
Add comment to enabele cppcheck suppression of intentional null
pointer deref via --inline-suppr option.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-09-30 23:30:32 +02:00
Danny Al-Gaaf 8558457143 ldb_cmd_execute_result.h: perform init in initialization list
Fix for:

[util/ldb_cmd_execute_result.h:18]: (performance) Variable 'message_'
 is assigned in constructor body. Consider performing initialization
 in initialization list.
[util/ldb_cmd_execute_result.h:23]: (performance) Variable 'message_'
 is assigned in constructor body. Consider performing initialization
 in initialization list.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-09-30 23:30:32 +02:00
fyrz 8b8011a68c Changed name of ReverseBytewiseComparator based on review comment 2014-09-27 10:06:13 +02:00
fyrz 5340484266 Built-in comparator(s) in RocksJava
Extended Built-in comparators with ReverseBytewiseComparator.

Reverse key handling is under certain conditions essential. E.g. while
using timestamp versioned data.

As native-comparators were not available using JAVA-API. Both built-in comparators
were exposed via JNI to be set upon database creation time.
2014-09-26 10:35:12 +02:00
Igor Canadi 21ddcf6e4f Remove allow_thread_local
Summary: See https://reviews.facebook.net/D19365

Test Plan: compiles

Reviewers: sdong, yhchiang, ljin

Reviewed By: ljin

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D23907
2014-09-24 13:12:16 -07:00
sdong cdaf44f9ae Enlarge log size cap when printing file summary
Summary:
Now the file summary is too small for printing. Enlarge it.
To enable it, allow to pass a size to log buffer.

Test Plan:
Add a unit test.
make all check

Reviewers: ljin, yhchiang

Reviewed By: yhchiang

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D21723
2014-09-23 16:56:34 -07:00
Lei Jin 0a29ce5393 re-enable BlockBasedTable::SetupForCompaction()
Summary:
It was commented out in D22545 by accident. Keep the option in
ImmutableOptions for now. I can make it dynamic in
https://reviews.facebook.net/D23349

Test Plan: make release

Reviewers: sdong, yhchiang, igor

Reviewed By: igor

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D23865
2014-09-23 14:18:57 -07:00
Igor Canadi 53b0039954 Fix release compile 2014-09-22 15:00:03 -07:00
Lei Jin 57a32f147f change target_file_size_base to uint64_t
Summary: It contrains the file size to be 4G max with int

Test Plan:
tried to grep instance and made sure other related variables are also
uint64

Reviewers: sdong, yhchiang, igor

Reviewed By: igor

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D23697
2014-09-22 11:15:03 -07:00
liuchang0812 787cb4db29 remove cast, replace %llu with % PRIu64 2014-09-23 01:10:46 +08:00
whu_liuchang a7574d4fa1 Update logging.cc 2014-09-22 23:37:00 +08:00
whu_liuchang 7e0dcb953f Update logging.cc
fix cast style to cpp static_cast
2014-09-22 23:24:53 +08:00
liuchang0812 6a031b6a81 remove unused variable 2014-09-21 22:20:00 +08:00
Mark Callaghan 32f2532a0b Print compression_size_percent as a signed int
Summary:
compression_size_percent is an int but was printed as
an unsigned int. So the default of -1 is displayed as a big number.

Test Plan: make check

Reviewers: sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D23679
2014-09-19 13:09:25 -07:00
sdong 976caca09b Skip AllocateTest if fallocate() is not supported in the file system
Summary: To avoid false positive test failures when the file system doesn't support fallocate. In EnvTest.AllocateTest, we first make a simple fallocate call and check the error codes to rule out the possibility that it is not supported. Skip the test if the error code indicates it is not supported.

Test Plan: Run the test and make sure it passes on file systems supporting and not supporting fallocate

Reviewers: yhchiang, ljin, igor

Reviewed By: igor

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D23667
2014-09-19 10:55:59 -07:00
Igor Canadi 3c232e1647 Fix mac compile 2014-09-17 15:45:43 -07:00
Lei Jin a062e1f2c4 SetOptions() for memtable related options
Summary: as title

Test Plan:
make all check
I will think a way to set up stress test for this

Reviewers: sdong, yhchiang, igor

Reviewed By: igor

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D23055
2014-09-17 12:49:13 -07:00
Lei Jin e4eca6a1e5 Options conversion function for convenience
Summary: as title

Test Plan: options_test

Reviewers: sdong, yhchiang, igor

Reviewed By: igor

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D23283
2014-09-17 12:46:32 -07:00
Saghm Rossi f9eaaa66e6 added include for inttypes.h to fix nonworking printf statements 2014-09-17 15:15:53 -04:00
Igor Canadi faad439ac4 Fix #284
Summary: This work on my compiler, but it turns out some compilers don't implicitly add constness, see: https://github.com/facebook/rocksdb/issues/284. This diff adds constness explicitly.

Test Plan: still compiles

Reviewers: sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D23409
2014-09-16 10:30:32 -07:00
yinqiwen 28be16b1dc fix rate limiter crash #286 2014-09-15 20:43:38 +08:00
Feng Zhu 53404d9fb7 add_qps_info_in cache bench
Summary: print qps in summary

Test Plan: ./cache_bench

Reviewers: yhchiang, ljin, sdong, igor

Reviewed By: igor

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D23079
2014-09-10 09:46:56 -07:00
Igor Canadi 6bb7e3ef25 Merger test
Summary: I abandoned https://reviews.facebook.net/D18789, but I wrote a good unit test there, so let's check it in. :)

Test Plan: this is test

Reviewers: sdong, yhchiang, ljin

Reviewed By: ljin

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D22827
2014-09-08 22:24:40 -07:00
Igor Canadi 88841bd007 Explicitly cast char to signed char in Hash()
Summary:
The compilers we use treat char as signed. However, this is not guarantee of C standard and some compilers (for ARM platform for example), treat char as unsigned. Code that assumes that char is either signed or unsigned is wrong.

This change explicitly casts the char to signed version. This will not break any of our use cases on x86, which, I believe are all of them. In case somebody out there is using RocksDB on ARM AND using bloom filters, they're going to have a bad time. However, it is very unlikely that this is the case.

Test Plan: sanity test with previous commit (with new sanity test)

Reviewers: yhchiang, ljin, sdong

Reviewed By: ljin

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D22767
2014-09-08 18:57:40 -07:00
Lei Jin 52311463e9 MemTableOptions
Summary: removed reference to options in WriteBatch and DBImpl::Get()

Test Plan: make all check

Reviewers: yhchiang, igor, sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D23049
2014-09-08 18:46:52 -07:00
Lei Jin 659d2d50c3 move compaction_filter to immutable_options
Summary:
all shared_ptrs are in immutable_options now. This will also make
options assignment a little cheaper

Test Plan: make release

Reviewers: sdong, yhchiang, igor

Reviewed By: igor

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D23001
2014-09-08 15:09:25 -07:00
Lei Jin 048560a642 reduce references to cfd->options() in DBImpl
Summary:
I found it is almost impossible to get rid of this function in a single
batch. I will take a step by step approach

Test Plan: make release

Reviewers: sdong, yhchiang, igor

Reviewed By: igor

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D22995
2014-09-08 15:04:34 -07:00
Igor Canadi a2bb7c3c33 Push- instead of pull-model for managing Write stalls
Summary:
Introducing WriteController, which is a source of truth about per-DB write delays. Let's define an DB epoch as a period where there are no flushes and compactions (i.e. new epoch is started when flush or compaction finishes). Each epoch can either:
* proceed with all writes without delay
* delay all writes by fixed time
* stop all writes

The three modes are recomputed at each epoch change (flush, compaction), rather than on every write (which is currently the case).

When we have a lot of column families, our current pull behavior adds a big overhead, since we need to loop over every column family for every write. With new push model, overhead on Write code-path is minimal.

This is just the start. Next step is to also take care of stalls introduced by slow memtable flushes. The final goal is to eliminate function MakeRoomForWrite(), which currently needs to be called for every column family by every write.

Test Plan: make check for now. I'll add some unit tests later. Also, perf test.

Reviewers: dhruba, yhchiang, MarkCallaghan, sdong, ljin

Reviewed By: ljin

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D22791
2014-09-08 11:20:25 -07:00
Feng Zhu 0af157f9bf Implement full filter for block based table.
Summary:
1. Make filter_block.h a base class. Derive block_based_filter_block and full_filter_block. The previous one is the traditional filter block. The full_filter_block is newly added. It would generate a filter block that contain all the keys in SST file.

2. When querying a key, table would first check if full_filter is available. If not, it would go to the exact data block and check using block_based filter.

3. User could choose to use full_filter or tradional(block_based_filter). They would be stored in SST file with different meta index name. "filter.filter_policy" or "full_filter.filter_policy". Then, Table reader is able to know the fllter block type.

4. Some optimizations have been done for full_filter_block, thus it requires a different interface compared to the original one in filter_policy.h.

5. Actual implementation of filter bits coding/decoding is placed in util/bloom_impl.cc

Benchmark: base commit 1d23b5c470
Command:
db_bench --db=/dev/shm/rocksdb --num_levels=6 --key_size=20 --prefix_size=20 --keys_per_prefix=0 --value_size=100 --write_buffer_size=134217728 --max_write_buffer_number=2 --target_file_size_base=33554432 --max_bytes_for_level_base=1073741824 --verify_checksum=false --max_background_compactions=4 --use_plain_table=0 --memtablerep=prefix_hash --open_files=-1 --mmap_read=1 --mmap_write=0 --bloom_bits=10 --bloom_locality=1 --memtable_bloom_bits=500000 --compression_type=lz4 --num=393216000 --use_hash_search=1 --block_size=1024 --block_restart_interval=16 --use_existing_db=1 --threads=1 --benchmarks=readrandom —disable_auto_compactions=1
Read QPS increase for about 30% from 2230002 to 2991411.

Test Plan:
make all check
valgrind db_test
db_stress --use_block_based_filter = 0
./auto_sanity_test.sh

Reviewers: igor, yhchiang, ljin, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D20979
2014-09-08 10:37:05 -07:00
Nik Bougalis 9f8aa09395 Don't leak data returned by opendir 2014-09-05 20:50:29 -07:00
Nik Bougalis bfee319fb0 sizeof(int*) where sizeof(int) was intended 2014-09-05 20:50:29 -07:00
Feng Zhu 40ddc3d6c4 add cache bench
Summary: 1. A benchmark for cache

Test Plan: ./cache_bench

Reviewers: yhchiang, dhruba, sdong, igor, ljin

Reviewed By: ljin

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D22809
2014-09-05 15:55:43 -07:00
liuhuahang bb6ae0f80c fix more compile warnings
N/A

Change-Id: I5b6f9c70aea7d3f3489328834fed323d41106d9f
Signed-off-by: liuhuahang <liuhuahang@zerus.co>
2014-09-05 14:14:37 +08:00
Stanislau Hlebik 45a5e3ede0 Remove path with arena==nullptr from NewInternalIterator
Summary:
Simply code by removing code path which does not use Arena
from NewInternalIterator

Test Plan:
make all check
make valgrind_check

Reviewers: sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D22395
2014-09-04 17:40:41 -07:00
Lei Jin 5665e5e285 introduce ImmutableOptions
Summary:
As a preparation to support updating some options dynamically, I'd like
to first introduce ImmutableOptions, which is a subset of Options that
cannot be changed during the course of a DB lifetime without restart.

ColumnFamily will keep both Options and ImmutableOptions. Any component
below ColumnFamily should only take ImmutableOptions in their
constructor. Other options should be taken from APIs, which will be
allowed to adjust dynamically.

I am yet to make changes to memtable and other related classes to take
ImmutableOptions in their ctor. That can be done in a seprate diff as
this one is already pretty big.

Test Plan: make all check

Reviewers: yhchiang, igor, sdong

Reviewed By: sdong

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D22545
2014-09-04 16:18:36 -07:00
Igor Canadi 51ea889002 Fix travis builds
Summary:
Lots of travis builds are failing because on EnvPosixTest.RandomAccessUniqueID: https://travis-ci.org/facebook/rocksdb/builds/34400833

This is the result of their environment and not because of RocksDB's bug.

Also note that RocksDB works correctly even though UniqueID feature is not present in the system (as it's the case with os x)

Test Plan:
OPT=-DTRAVIS make env_test && ./env_test
Observed that offending tests are not being run

Reviewers: sdong, yhchiang, ljin

Reviewed By: ljin

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D22803
2014-09-04 10:23:45 -07:00
liuhuahang ef5b384729 fix a few compile warnings
1, const qualifiers on return types make no sense and will trigger a compile warning: warning: type qualifiers ignored on function return type [-Wignored-qualifiers]

2, class HistogramImpl has virtual functions and thus should have a virtual destructor

3, with some toolchain, the macro __STDC_FORMAT_MACROS is predefined and thus should be checked before define

Change-Id: I69747a03bfae88671bfbb2637c80d17600159c99
Signed-off-by: liuhuahang <liuhuahang@zerus.co>
2014-09-04 23:06:23 +08:00
Igor Canadi 7f19bb93c6 Merge pull request #242 from tdfischer/perf-timer-destructors
Refactor PerfStepTimer to automatically stop on destruct
2014-09-02 13:06:40 -07:00
Torrie Fischer 6614a48418 Refactor PerfStepTimer to stop on destruct
This eliminates the need to remember to call PERF_TIMER_STOP when a section has
been timed. This allows more useful design with the perf timers and enables
possible return value optimizations. Simplistic example:

class Foo {
  public:
    Foo(int v) : m_v(v);
  private:
    int m_v;
}

Foo makeFrobbedFoo(int *errno)
{
  *errno = 0;
  return Foo();
}

Foo bar(int *errno)
{
  PERF_TIMER_GUARD(some_timer);

  return makeFrobbedFoo(errno);
}

int main(int argc, char[] argv)
{
  Foo f;
  int errno;

  f = bar(&errno);

  if (errno)
    return -1;
  return 0;
}

After bar() is called, perf_context.some_timer would be incremented as if
Stop(&perf_context.some_timer) was called at the end, and the compiler is still
able to produce optimizations on the return value from makeFrobbedFoo() through
to main().
2014-09-02 12:04:22 -07:00
Lei Jin 7e9f28cb23 limit max bytes that can be read/written per pread/write syscall
Summary:
BlockBasedTable sst file size can grow to a large size when universal
compaction is used. When index block exceeds 2G, pread seems to fail and
return truncated data and causes "trucated block" error. I tried to use
```
  #define _FILE_OFFSET_BITS 64
```
But the problem still persists. Splitting a big write/read into smaller
batches seems to solve the problem.

Test Plan:
successfully compacted a case with resulting sst file at ~90G (2.1G
index block size)

Reviewers: yhchiang, igor, sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D22569
2014-08-29 21:21:49 -07:00
Igor Canadi d5bd6c772b Fix ios compile
Summary: No __thread for ios.

Test Plan: compile works for ios now

Reviewers: ljin, dhruba

Reviewed By: dhruba

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D22491
2014-08-28 12:46:05 -04:00
Igor Canadi 536e9973e3 Remove assert in vector rep
Summary: This assert makes Insert O(n^2) instead of O(n) in debug mode. Memtable insert is in the critical path. No need to assert uniqunnes of the key here, since we're adding a sequence number to it anyway.

Test Plan: none

Reviewers: sdong, ljin

Reviewed By: ljin

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D22443
2014-08-27 11:05:41 -07:00
Radheshyam Balasundaram 4142a3e783 Adding a user comparator for comparing Uint64 slices.
Summary:
- New Uint64 comparator
- Modify Reader and Builder to take custom user comparators instead of bytewise comparator
- Modify logic for choosing unused user key in builder
- Modify iterator logic in reader
- test changes

Test Plan:
cuckoo_table_{builder,reader,db}_test
make check all

Reviewers: ljin, sdong

Reviewed By: ljin

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D22377
2014-08-27 10:39:31 -07:00
Lei Jin 1755581f19 improve OptimizeForPointLookup()
Summary: also fix HISTORY.md

Test Plan: make all check

Reviewers: sdong, yhchiang, igor

Reviewed By: igor

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D22437
2014-08-26 14:15:00 -07:00
Igor Canadi d9c0785812 Fix assertion in PosixRandomAccessFile
Summary:
See https://github.com/facebook/rocksdb/issues/244#issuecomment-53372297
Also see this: https://github.com/facebook/rocksdb/blob/master/util/env_posix.cc#L1075

Test Plan: compiles

Reviewers: yhchiang, ljin, sdong

Reviewed By: ljin

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D22419
2014-08-26 15:28:36 -04:00
Lei Jin a98badff16 print table options
Summary: Add a virtual function in table factory that will print table options

Test Plan: make release

Reviewers: igor, yhchiang, sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D22149
2014-08-25 14:24:09 -07:00
Lei Jin 384400128f move block based table related options BlockBasedTableOptions
Summary:
I will move compression related options in a separate diff since this
diff is already pretty lengthy.
I guess I will also need to change JNI accordingly :(

Test Plan: make all check

Reviewers: yhchiang, igor, sdong

Reviewed By: igor

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D21915
2014-08-25 14:22:05 -07:00
Jonah Cohen e173bf9c19 Eliminate VersionSet memory leak
Summary:
ManifestDumpCommand::DoCommand was allocating a VersionSet and never
freeing it.

Test Plan: make

Reviewers: igor

Reviewed By: igor

Differential Revision: https://reviews.facebook.net/D22221
2014-08-20 13:52:03 -07:00
Igor Canadi 6929b08616 Remove BitStream* tests 2014-08-19 09:52:54 -04:00
Igor Canadi 50b790c6d4 Removing BitStream* functions
Summary: I was checking some functions in coding.h and coding.cc when I noticed these unused functions. Let's remove them.

Test Plan: compiles

Reviewers: sdong, ljin, yhchiang, dhruba

Reviewed By: dhruba

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D22077
2014-08-19 06:48:21 -07:00
Lei Jin c7037153c3 attempt to fix auto_roll_logger_test
Summary:
auto_roll_logger_test fails from time to time. I wasn't able to repro
the issue but by looking at the code, it seems like the initial ctime_
value can be set to the boundary of the second so it may still have a
chance to get rolled when interval is set to 1 second.

```
util/auto_roll_logger_test.cc:120: failed: 118 > 708
==19470== Syscall param msync(start) points to unaddressable byte(s)
==19470==    at 0x4E46CE0: __msync_nocancel (in
/usr/local/fbcode/gcc-4.8.1-glibc-2.17/lib/libpthread-2.17.so)
==19470==    by 0x584EFB: access_mem (Ginit.c:137)
==19470==    by 0x5834E3: _ULx86_64_access_reg (libunwind_i.h:162)
==19470==    by 0x585601: apply_reg_state (Gparser.c:742)
==19470==    by 0x5866BE: _ULx86_64_dwarf_find_save_locs (Gparser.c:883)
==19470==    by 0x584550: _ULx86_64_dwarf_step (Gstep.c:34)
==19470==    by 0x583653: _ULx86_64_step (Gstep.c:71)
==19470==    by 0x583FD2: _ULx86_64_tdep_trace (Gtrace.c:217)
==19470==    by 0x5831C3: backtrace (backtrace.c:69)

Test Plan: ./auto_roll_logger_test

Reviewers: sdong, yhchiang, igor

Reviewed By: igor

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D21951
2014-08-18 10:23:18 -07:00
Lei Jin 625b9ef4e0 Merge pull request #234 from bbiao/master
fix compile error under Mac OS X
2014-08-14 20:34:48 -07:00
Jonah Cohen 59a2763d5c Fix typo huage => huge
Test Plan: Inspection

Reviewers: sdong

Reviewed By: sdong

Differential Revision: https://reviews.facebook.net/D21891
2014-08-14 17:01:20 -07:00
Jonah Cohen f611935e9a Fix autovector iterator increment/decrement comments
Summary: The prefix and postfix operators were mixed up in the autovector class.

Test Plan: Inspection

Reviewers: sdong, kailiu

Reviewed By: kailiu

Differential Revision: https://reviews.facebook.net/D21873
2014-08-14 14:56:11 -07:00
ZHANG Biao 8dfe2fdd51 fix compile error under Mac OS X 2014-08-14 20:01:01 +08:00
Lei Jin 58c49466d2 Allow env_posix to lower background thread IO priority
Summary: This is a linux-specific system call.

Test Plan: ran db_bench

Reviewers: igor, yhchiang, sdong

Reviewed By: sdong

Subscribers: haobo, leveldb

Differential Revision: https://reviews.facebook.net/D21183
2014-08-13 20:49:58 -07:00
Feng Zhu 6a2be31f14 fix_valgrind_error_caused_in_db_info_dummper
Summary: 1. add default value to FileType type, thus avoid valgrind error

Test Plan: valgrind ./db_test

Reviewers: sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D21807
2014-08-13 17:53:43 -07:00
Lei Jin e91ebf1399 print compaction_filter name in Options.Dump
Summary:
Was looking at an issue. All options are the same except
compaction_filter was missed from a newer package. Our option dump does
not capture that

Test Plan: make release

Reviewers: sdong, igor, yhchiang

Reviewed By: yhchiang

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D21765
2014-08-13 15:58:31 -07:00
Feng Zhu 5e642403a9 log db path info before open
Summary: 1. write db MANIFEST, CURRENT, IDENTITY, sst files, log files to log before open

Test Plan: run db and check LOG file

Reviewers: ljin, yhchiang, igor, dhruba, sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D21459
2014-08-13 13:45:13 -07:00
Lei Jin 5d0074c471 set bytes_per_sync to 1MB if rate limiter is enabled
Summary: as title

Test Plan: make all check

Reviewers: igor, yhchiang, sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D21201
2014-08-12 16:42:18 -07:00
miguelportilla 93e6b5e9d9 Changes to support unity build:
* Script for building the unity.cc file via Makefile
* Unity executable Makefile target for testing builds
* Source code changes to fix compilation of unity build
2014-08-11 13:22:47 -04:00
Radheshyam Balasundaram 0c9d03ba10 Fixing broken Mac build
Summary: Made some small changes to fix the broken mac build

Test Plan: make check all in both linux and mac. All tests pass.

Reviewers: sdong, igor, ljin, yhchiang

Reviewed By: ljin, yhchiang

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D20895
2014-07-31 20:52:13 -07:00
sdong 473e829784 Fix ldb dump_manifest
Summary:
We now reads table properties in VersionSet::LogAndApply(), which requires options.db_paths to be set. But since ldb_cmd directly creates VersionSet without initialization db_paths, causing a seg fault. This patch fix it by initializing db_paths.

log_and_apply_bench still shows segfault, because table cache is nullptr in VersionSet created.

Test Plan: Run ldb dump_manifest which used to fail.

Reviewers: yhchiang, ljin, igor

Reviewed By: igor

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D20751
2014-07-30 10:17:48 -07:00
Lei Jin 1bd3431f7c Change StopWatch interface
Summary: So that we can avoid calling NowSecs() in MakeRoomForWrite twice

Test Plan: make all check

Reviewers: yhchiang, igor, sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D20529
2014-07-28 12:22:37 -07:00
Lei Jin f6ca226c17 make statistics forward-able
Summary:
Make StatisticsImpl being able to forward stats to provided statistics
implementation. The main purpose is to allow us to collect internal
stats in the future even when user supplies custom statistics
implementation. It avoids intrumenting 2 sets of stats collection code.
One immediate use case is tuning advisor, which needs to collect some
internal stats, users may not be interested.

Test Plan:
ran db_bench and see stats show up at the end of run
Will run make all check since some tests rely on statistics

Reviewers: yhchiang, sdong, igor

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D20145
2014-07-28 12:10:49 -07:00
Lei Jin 40fa8a4cd5 make statistics forward-able
Summary:
Make StatisticsImpl being able to forward stats to provided statistics
implementation. The main purpose is to allow us to collect internal
stats in the future even when user supplies custom statistics
implementation. It avoids intrumenting 2 sets of stats collection code.
One immediate use case is tuning advisor, which needs to collect some
internal stats, users may not be interested.

Test Plan:
ran db_bench and see stats show up at the end of run
Will run make all check since some tests rely on statistics

Reviewers: yhchiang, sdong, igor

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D20145
2014-07-28 12:05:36 -07:00
Lei Jin d650612c4c expose RateLimiter definition
Summary:
User gets undefinied error since the definition is not exposed.
Also re-enable the db test with only upper bound check

Test Plan: db_test, rate_limit_test

Reviewers: igor, yhchiang, sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D20403
2014-07-25 15:17:06 -07:00
Yueh-Hsuan Chiang 00f56dfa28 Fixed a compile error in util/options_builder.cc
Summary:
Fixed the following compile error by replacing pow by shift, as it computes
power of 2.

util/options_builder.cc:133:14: error: no member named 'pow' in namespace 'std'
        std::pow(2, std::max(0, std::min(3, level0_stop_writes_trigger -
        ~~~~~^
1 error generated.
make: *** [util/options_builder.o] Error 1

Test Plan: make success in mac and linux

Reviewers: ljin, igor, sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D20475
2014-07-23 10:31:32 -07:00
Igor Canadi 0ff183a0d9 Move include/utilities/*.h to include/rocksdb/utilities/*.h
Summary:
All public headers need to be under `include/rocksdb` directory. Otherwise, clients include our header files like this:

    #include <rocksdb/db.h>
    #include <utilities/backupable_db.h> // still our public header!

Also, internally, we include:

    #include "utilities/backupable/backupable_db.h" // internal header
    #include "utilities/backupable_db.h" // public header

which is confusing.

This way, when we install rocksdb as a system library, we can just copy `include/rocksdb` directory to system's header files. We can't really copy `utilities` directory to system's header files.

Test Plan: compiles

Reviewers: dhruba, ljin, yhchiang, sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D20409
2014-07-23 10:21:38 -04:00
sdong e6de02103a Add a utility function to guess optimized options based on constraints
Summary:
Add a function GetOptions(), where based on four parameters users give: read/write amplification threshold, memory budget for mem tables and target DB size, it picks up a compaction style and parameters for them. Background threads are not touched yet.

One limit of this algorithm: since compression rate and key/value size are hard to predict, it's hard to predict level 0 file size from write buffer size. Simply make 1:1 ratio here.

Sample results: https://reviews.facebook.net/P477

Test Plan: Will add some a unit test where some sample scenarios are given and see they pick the results that make sense

Reviewers: yhchiang, dhruba, haobo, igor, ljin

Reviewed By: ljin

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D18741
2014-07-22 15:24:21 -07:00
Stanislau Hlebik 9d70cce047 Adding option to save PlainTable index and bloom filter in SST file.
Summary:
Adding option to save PlainTable index and bloom filter in SST file.
If there is no bloom block and/or index block, PlainTableReader builds
new ones. Otherwise PlainTableReader just use these blocks.

Test Plan: make all check

Reviewers: sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D19527
2014-07-18 16:58:13 -07:00
sdong 0abaed2e08 Support multiple DB directories in universal compaction style
Summary:
This patch adds a target size parameter in options.db_paths and universal compaction will base it to determine which DB path to place a new file.
Level-style stays the same.

Test Plan: Add new unit tests

Reviewers: ljin, yhchiang

Reviewed By: yhchiang

Subscribers: MarkCallaghan, dhruba, igor, leveldb

Differential Revision: https://reviews.facebook.net/D19869
2014-07-15 12:06:28 -07:00
Igor Canadi 20c056306b Remove stats logger
Summary: Browsing through the code, looks like StatsLogger is not used at all!

Test Plan: compiles

Reviewers: ljin, sdong, yhchiang, dhruba

Reviewed By: dhruba

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D19827
2014-07-15 09:16:32 -04:00
Igor Canadi 5ff6633588 Fix mac compile 2014-07-10 13:19:47 -07:00
sdong 36de0e5359 Add a function to return current perf level
Summary: Add a function to return the perf level. It is to allow a wrapper of DB to increase the perf level and restore the original perf level after finishing the function call.

Test Plan: Add a verification in db_test

Reviewers: yhchiang, igor, ljin

Reviewed By: ljin

Subscribers: xjin, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D19551
2014-07-10 11:35:48 -07:00
Lei Jin 534357ca3a integrate rate limiter into rocksdb
Summary:
Add option and plugin rate limiter for PosixWritableFile. The rate
limiter only applies to flush and compaction. WAL and MANIFEST are
excluded from this enforcement.

Test Plan: db_test

Reviewers: igor, yhchiang, sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D19425
2014-07-08 12:31:49 -07:00
Lei Jin 5ef1ba7ff5 generic rate limiter
Summary:
A generic rate limiter that can be shared by threads and rocksdb
instances. Will use this to smooth out write traffic generated by
compaction and flush. This will help us get better p99 behavior on flash
storage.

Test Plan:
unit test output
==== Test RateLimiterTest.Rate
request size [1 - 1023], limit 10 KB/sec, actual rate: 10.374969 KB/sec, elapsed 2002265
request size [1 - 2047], limit 20 KB/sec, actual rate: 20.771242 KB/sec, elapsed 2002139
request size [1 - 4095], limit 40 KB/sec, actual rate: 41.285299 KB/sec, elapsed 2202424
request size [1 - 8191], limit 80 KB/sec, actual rate: 81.371605 KB/sec, elapsed 2402558
request size [1 - 16383], limit 160 KB/sec, actual rate: 162.541268 KB/sec, elapsed 3303500

Reviewers: yhchiang, igor, sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D19359
2014-07-08 11:41:57 -07:00
Yueh-Hsuan Chiang 90a6aca48e Finer report I/O stats about Flush and Compaction.
Summary:
This diff allows the I/O stats about Flush and Compaction to be reported
in a more accurate way.  Instead of measuring the size of a file, it
measure I/O cost in per read / write basis.

Test Plan: make all check

Reviewers: sdong, igor, ljin

Reviewed By: ljin

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D19383
2014-07-03 16:28:03 -07:00
Yueh-Hsuan Chiang d4d338de33 Add timeout_hint_us to WriteOptions and introduce Status::TimeOut.
Summary:
This diff adds timeout_hint_us to WriteOptions.  If it's non-zero, then
1) writes associated with this options MAY be aborted when it has been
  waiting for longer than the specified time.  If an abortion happens,
  associated writes will return Status::TimeOut.
2) the stall time of the associated write caused by flush or compaction
  will be limited by timeout_hint_us.

The default value of timeout_hint_us is 0 (i.e., OFF.)

The statistics of timeout writes will be recorded in WRITE_TIMEDOUT.

Test Plan:
export ROCKSDB_TESTS=WriteTimeoutAndDelayTest
make db_test
./db_test

Reviewers: igor, ljin, haobo, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D18837
2014-07-03 15:47:02 -07:00
sdong 2459f7ec4e Support Multiple DB paths (without having an interface to expose to users)
Summary:
In this patch, we allow RocksDB to support multiple DB paths internally.
No user interface is supported yet so this patch is silent to users.

Test Plan: make all check

Reviewers: igor, haobo, ljin, yhchiang

Reviewed By: yhchiang

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D18921
2014-07-02 21:14:44 -07:00
Igor Canadi d3f63f03ad Fix 32-bit errors
Summary: https://www.facebook.com/groups/rocksdb.dev/permalink/590438347721350/

Test Plan: compiles

Reviewers: sdong, ljin, yhchiang

Reviewed By: yhchiang

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D19197
2014-07-02 11:40:16 +02:00
sdong 9c332aa11a HashLinkList memtable switches a bucket to a skip list to reduce performance outliers
Summary:
In this patch, we enhance HashLinkList memtable to reduce performance outliers when a bucket contains too many entries. We switch to skip list for this case to enable binary search.

Add threshold_use_skiplist parameter to determine when a bucket needs to switch to skip list.

The new data structure is documented in comments in the codes.

Test Plan:
make all check
set threshold_use_skiplist in several tests

Reviewers: yhchiang, haobo, ljin

Reviewed By: yhchiang, ljin

Subscribers: nkg-, xjin, dhruba, yhchiang, leveldb

Differential Revision: https://reviews.facebook.net/D19299
2014-07-01 17:14:15 -07:00
Feng Zhu 5656367416 use arena to allocate memtable's bloomfilter and hashskiplist's buckets_
Summary:
    Bloomfilter and hashskiplist's buckets_ allocated by memtable's arena
    DynamicBloom: pass arena via constructor, allocate space in SetTotalBits
    HashSkipListRep: allocate space of buckets_ using arena.
       do not delete it in deconstructor because arena would take care of it.
    Several test files are changed.

Test Plan:
    make all check

Reviewers: ljin, haobo, yhchiang, sdong

Reviewed By: sdong

Subscribers: igor, dhruba

Differential Revision: https://reviews.facebook.net/D19335
2014-06-30 15:54:31 -07:00
Yueh-Hsuan Chiang 81c5d98900 Fixed a comparison between signed and unsigned integers in options.cc
Summary:
Fixed the following warning:

util/options.cc: In constructor ‘rocksdb::ColumnFamilyOptions::ColumnFamilyOptions(const rocksdb::Options&)’:
util/options.cc:157:58: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare]
   if (max_bytes_for_level_multiplier_additional.size() < num_levels) {
                                                             ^

Test Plan: make all check

Reviewers: haobo, sdong, ljin

Reviewed By: ljin

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D19293
2014-06-25 15:31:30 -06:00
sdong 19de6a7aad Remove MemTableRep::GetIterator(const Slice& slice)
Summary: It seems to me that when ever function MemTableRep::GetIterator(const Slice& slice) is used, we can use MemTableRep::GetDynamicPrefixIterator() instead. Just delete it to simplify the codes.

Test Plan: make all check

Reviewers: yhchiang, ljin

Reviewed By: ljin

Subscribers: xjin, dhruba, haobo, leveldb

Differential Revision: https://reviews.facebook.net/D19281
2014-06-25 14:09:29 -07:00
Yueh-Hsuan Chiang 55531fd089 Fixed heap-buffer-overflow issue when Options.num_levels > 7.
Summary:
Currently, when num_levels has been changed to > 7, internally
it will not resize max_bytes_for_level_multiplier_additional.
As a result, max_bytes_for_level_multiplier_additional.size() will
be smaller than num_levels, which causes heap-buffer-overflow.

Test Plan: make all check

Reviewers: haobo, sdong, ljin

Reviewed By: ljin

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D19275
2014-06-25 14:11:12 -06:00
Yueh-Hsuan Chiang 82c31792d0 Revert the default setting of InitFromCmdLineArgs().
Summary:
Revert the default setting of InitFromCmdLineArgs() as all the callers
currently provide full set of arguments.

Test Plan:
make reduce_levels_test
./reduce_levels_test

Reviewers: haobo, ljin

Reviewed By: ljin

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D19257
2014-06-23 18:51:00 -06:00
Yueh-Hsuan Chiang e5e6f55bd1 Fix compile error caused in LDB tool
Summary:
Fixed the following compile error.
    tools/reduce_levels_test.cc:89:31: error: no matching function for call to 'InitFromCmdLineArgs'
      LDBCommand* level_reducer = LDBCommand::InitFromCmdLineArgs(args);
                                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    ./util/ldb_cmd.h:56:22: note: candidate function not viable: requires 3 arguments, but 1 was provided
      static LDBCommand* InitFromCmdLineArgs(
                         ^
    ./util/ldb_cmd.h:62:22: note: candidate function not viable: requires 4 arguments, but 1 was provided
      static LDBCommand* InitFromCmdLineArgs(
                         ^
    1 error generated.

Test Plan:
make reduce_levels_test
./reduce_levels_test

Reviewers: haobo, ljin, sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D19251
2014-06-23 17:19:53 -06:00
Haobo Xu dfb31d152d [RocksDB] allow LDB tool to have customized key formatter
Summary: Currently ldb tool dump keys either in ascii format or hex format - neither is ideal if the key has a binary structure and is not readable in ascii. This diff also allows LDB tool to be customized in ways beyond DB options.

Test Plan: verify that key formatter works with some simple db with binary key.

Reviewers: sdong, ljin

Reviewed By: ljin

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D19209
2014-06-23 15:35:40 -07:00
Igor Canadi 00b26c3a83 JSONDocument
Summary:
After evaluating options for JSON storage, I decided to implement our own. The reason is that we'll be able to optimize it better and we get to reduce unnecessary dependencies (which is what we'd get with folly).

I also plan to write a serializer/deserializer for JSONDocument with our own binary format similar to BSON. That way we'll store binary JSON format in RocksDB instead of the plain-text JSON. This means less storage and faster deserialization.

There are still some inefficiencies left here. I plan to optimize them after we develop a functioning DocumentDB. That way we can move and iterate faster.

Test Plan: added a unit test

Reviewers: dhruba, haobo, sdong, ljin, yhchiang

Reviewed By: haobo

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D18831
2014-06-20 11:14:14 +02:00
Igor Canadi d4a8423334 Remove seek compaction
Summary:
As discussed in our internal group, we don't get much use of seek compaction at the moment, while it's making code more complicated and slower in some cases.

This diff removes seek compaction and (hopefully) all code that was introduced to support seek compaction.

There is one test case that relied on didIO information. I'll try to find another way to implement it.

Test Plan: make check

Reviewers: sdong, haobo, yhchiang, ljin, dhruba

Reviewed By: ljin

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D19161
2014-06-20 10:23:02 +02:00
sdong 8c265c08f1 HashLinkList to log distribution of number of entries aross buckets
Summary: Add two parameters of hash linked list to log distribution of number of entries across all buckets, and a sample row when there are too many entries in one single bucket.

Test Plan: Turn it on in plain_table_db_test and see the logs.

Reviewers: haobo, ljin

Reviewed By: ljin

Subscribers: leveldb, nkg-, dhruba, yhchiang

Differential Revision: https://reviews.facebook.net/D19095
2014-06-17 17:55:36 -07:00
Igor Canadi 4f18bfe376 Merge pull request #176 from bgrainger/mutexrw-unlock
Add separate Read/WriteUnlock methods in MutexRW.
2014-06-17 20:38:06 +02:00
Bradley Grainger 2d02ec6533 Add separate Read/WriteUnlock methods in MutexRW.
Some platforms, particularly Windows, do not have a single method that can
release both a held reader lock and a held writer lock; instead, a
separate method (ReleaseSRWLockShared or ReleaseSRWLockExclusive) must be
called in each case.

This may also be necessary to back MutexRW with a shared_mutex in C++14;
the current language proposal includes both an unlock() and a
shared_unlock() method.
2014-06-16 15:41:46 -07:00
Yueh-Hsuan Chiang 4d913cfbc3 Fix a bug causing LOG is not created when max_log_file_size is set.
Summary:
Fix a bug causing LOG is not created when max_log_file_size is set.
This bug is reported in issue #174.

Test Plan:
Add TEST(AutoRollLoggerTest, LogFileExistence).
make auto_roll_logger_test
./auto_roll_logger_test

Reviewers: haobo, sdong, ljin, igor, igor2

Reviewed By: igor2

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D19053
2014-06-16 10:27:42 -07:00
Lei Jin c83b085770 prefetch bloom filter data block for L0 files
Summary: as title

Test Plan:
db_bench
the initial result is very promising. I will post results of complete
runs

Reviewers: dhruba, haobo, sdong, igor

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D18867
2014-06-12 10:06:18 -07:00
sdong 80f409ea37 Clean PlainTableReader's variables for better data locality
Summary:
Clean PlainTableReader's data structures:
(1) inline bloom_ (in order to do this, change DynamicBloom to allow lazy initialization)
(2) remove some variables only used when initialization from the class
(3) put variables not used in normal read code paths to the end of the class and reference prefix_extractor directly
(4) make Options a reference.

Test Plan: make all check

Reviewers: haobo, ljin

Reviewed By: ljin

Subscribers: igor, yhchiang, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D18891
2014-06-09 13:53:39 -07:00
Igor Canadi a0191c9dfe Create Missing Column Families
Summary: Provide an convenience option to create column families if they are missing from the DB. Task #4460490

Test Plan: added unit test. also, stress test for some time

Reviewers: sdong, haobo, dhruba, ljin, yhchiang

Reviewed By: yhchiang

Subscribers: yhchiang, leveldb

Differential Revision: https://reviews.facebook.net/D18951
2014-06-06 18:04:56 -07:00
Ankit Gupta 08314321fa Merge branch 'master' of https://github.com/facebook/rocksdb 2014-06-05 13:17:53 +01:00
Igor Canadi 552c49f0f4 Remove upper bound for rate limiting unit test 2014-06-03 13:58:44 -07:00
sdong df9069d23f In DB::NewIterator(), try to allocate the whole iterator tree in an arena
Summary:
In this patch, try to allocate the whole iterator tree starting from DBIter from an arena
1. ArenaWrappedDBIter is created when serves as the entry point of an iterator tree, with an arena in it.
2. Add an option to create iterator from arena for following iterators: DBIter, MergingIterator, MemtableIterator, all mem table's iterators, all table reader's iterators and two level iterator.
3. MergeIteratorBuilder is created to incrementally build the tree of internal iterators. It is passed to mem table list and version set and add iterators to it.

Limitations:
(1) Only DB::NewIterator() without tailing uses the arena. Other cases, including readonly DB and compactions are still from malloc
(2) Two level iterator itself is allocated in arena, but not iterators inside it.

Test Plan: make all check

Reviewers: ljin, haobo

Reviewed By: haobo

Subscribers: leveldb, dhruba, yhchiang, igor

Differential Revision: https://reviews.facebook.net/D18513
2014-06-02 17:44:57 -07:00
sdong 462796697c dynamic_bloom: replace some divide (remainder) operations with shifts in locality mode, and other improvements
Summary:
This patch changes meaning of options.bloom_locality: 0 means disable cache line optimization and any positive number means use CACHE_LINE_SIZE as block size (the previous behavior is the block size will be CACHE_LINE_SIZE*options.bloom_locality). By doing it, the divide operations inside a block can be replaced by a shift.

Performance is improved:
https://reviews.facebook.net/P471

Also, improve the basic algorithm in two ways:
(1) make sure num of blocks is an odd number
(2) rotate bytes after every probe in locality mode. Since the divider is 2^n, unless doing it, we are never able to use all the bits.
Improvements of false positive: https://reviews.facebook.net/P459

Test Plan: make all check

Reviewers: ljin, haobo

Reviewed By: haobo

Subscribers: dhruba, yhchiang, igor, leveldb

Differential Revision: https://reviews.facebook.net/D18843
2014-06-02 17:36:38 -07:00
Ankit Gupta fa1b62cabd Merge remote-tracking branch 'upstream/master' 2014-05-29 18:33:11 -07:00