Commit graph

373 commits

Author SHA1 Message Date
sdong 68af7811ea Remember whole key/prefix filtering on/off in SST file
Summary: Remember whole key or prefix filtering on/off in SST files. If user opens the DB with a different setting that cannot be satisfied while reading the SST file, ignore the bloom filter.

Test Plan: Add a unit test for it

Reviewers: yhchiang, igor, rven

Reviewed By: rven

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D32889
2015-02-11 11:20:04 -08:00
sdong e63140d52b Get() to use prefix bloom filter when filter is not block based
Summary:
Get() currently doesn't make use of the bloom filter if it is prefix based. Add the check.
Didn't touch the block-based bloom filter, because I can't fully reason about whether it is correct to do that. But it is straightforward to do for the full bloom filter.

Test Plan:
make all check
Add a test case in DBTest

Reviewers: rven, yhchiang, igor

Reviewed By: igor

Subscribers: MarkCallaghan, leveldb, dhruba, yoshinorim

Differential Revision: https://reviews.facebook.net/D31941
2015-02-04 15:15:41 -08:00
Igor Canadi 9ab5adfc59 New BlockBasedTable version -- better compressed block format
Summary:
This diff adds BlockBasedTable format_version = 2. New format version brings better compressed block format for these compressions:
1) Zlib -- encode decompressed size in compressed block header
2) BZip2 -- encode decompressed size in compressed block header
3) LZ4 and LZ4HC -- instead of doing memcpy of size_t, encode size as varint32. memcpy is very bad because the DB is not portable across big/little endian machines or even platforms where size_t might be 8 or 4 bytes.

It does not affect format for snappy.

If you write a new database with format_version = 2, it will not be readable by RocksDB versions before 3.10. DB::Open() will return corruption in that case.
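
To illustrate why a varint32 is portable where a memcpy of size_t is not, here is a minimal sketch of base-128 varint encoding and decoding. The function names PutVarint32Sketch and GetVarint32Sketch are hypothetical and are not claimed to match the helpers used in the diff; the point is only that the byte layout does not depend on the host's endianness or on sizeof(size_t).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Append `value` to `dst` as a base-128 varint: 7 payload bits per byte,
// high bit set on every byte except the last. Uses 1 to 5 bytes.
void PutVarint32Sketch(std::string* dst, uint32_t value) {
  assert(dst != nullptr);
  while (value >= 0x80) {
    dst->push_back(static_cast<char>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  dst->push_back(static_cast<char>(value));
}

// Decode a varint32 from the front of `src` into `*value`.
// Returns the number of bytes consumed, or 0 if the input is truncated
// or has no terminating byte within the first 5 bytes.
size_t GetVarint32Sketch(const std::string& src, uint32_t* value) {
  assert(value != nullptr);
  uint32_t result = 0;
  for (size_t i = 0; i < src.size() && i < 5; ++i) {
    const uint32_t byte = static_cast<unsigned char>(src[i]);
    result |= (byte & 0x7F) << (7 * i);
    if ((byte & 0x80) == 0) {
      *value = result;
      return i + 1;
    }
  }
  return 0;
}
```

Small sizes cost a single byte, and the same bytes decode to the same value on any platform.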

Test Plan:
Added a new test in db_test.
I will also run db_bench and verify VSIZE when block_cache == 1GB

Reviewers: yhchiang, rven, MarkCallaghan, dhruba, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D31461
2015-01-14 16:24:24 -08:00
Igor Canadi 96b8240bc5 Support footer versions bigger than 1
Summary:
In this diff I add another parameter to BlockBasedTableOptions that will let users specify block based table's format. This will greatly simplify block based table's format changes in the future.

First format change that this will support is encoding decompressed size in Zlib and BZip2 blocks. This diff is blocking https://reviews.facebook.net/D31311.

Test Plan: Added unit tests. More tests to come as part of https://reviews.facebook.net/D31311.

Reviewers: dhruba, MarkCallaghan, yhchiang, rven, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D31383
2015-01-13 14:33:04 -08:00
Igor Canadi 15d2abbec3 Fix build issues 2015-01-09 13:04:06 -08:00
Igor Canadi abb9b95ffe Move compression functions from port/ to util/
Summary: We keep checksum functions in util/, there is no reason for compression to be in port/

Test Plan: compiles

Reviewers: sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D31281
2015-01-09 12:57:11 -08:00
Manish Patil 7ea7bdf04d Dump routine to BlockBasedTableReader
Summary: Added necessary routines for dumping block based SST with block filter

Test Plan: Added "raw" mode to utility sst_dump

Reviewers: sdong, rven

Reviewed By: rven

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D29679
2014-12-23 13:24:07 -08:00
Jonah Cohen a14b7873ee Enforce write buffer memory limit across column families
Summary:
Introduces a new class for managing write buffer memory across column
families.  We supplement ColumnFamilyOptions::write_buffer_size with
ColumnFamilyOptions::write_buffer, a shared pointer to a WriteBuffer
instance that enforces memory limits before flushing out to disk.
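
A minimal sketch of the idea, assuming a single shared byte budget that every column family charges against. The class and member names (SharedWriteBufferSketch, Reserve, Release, ShouldFlush) are hypothetical and are not taken from the diff.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical shared budget: all column families that hold a pointer to
// the same instance count their memtable bytes against one limit.
class SharedWriteBufferSketch {
 public:
  explicit SharedWriteBufferSketch(size_t limit_bytes)
      : limit_(limit_bytes), used_(0) {}

  // Record that `bytes` of write buffer memory were allocated.
  void Reserve(size_t bytes) {
    used_.fetch_add(bytes, std::memory_order_relaxed);
  }

  // Record that `bytes` were freed, e.g. after a flush.
  void Release(size_t bytes) {
    assert(bytes <= used_.load(std::memory_order_relaxed));
    used_.fetch_sub(bytes, std::memory_order_relaxed);
  }

  // True once the combined usage reaches the limit.
  bool ShouldFlush() const {
    return used_.load(std::memory_order_relaxed) >= limit_;
  }

  size_t used() const { return used_.load(std::memory_order_relaxed); }

 private:
  const size_t limit_;
  std::atomic<size_t> used_;
};

// Stand-in for per-column-family options holding the shared pointer.
struct ColumnFamilySketch {
  std::shared_ptr<SharedWriteBufferSketch> write_buffer;
};
```

Because the limit lives in one object, a write to any column family can trigger a flush when the total crosses the threshold.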

Test Plan: Added SharedWriteBuffer unit test to db_test.cc

Reviewers: sdong, rven, ljin, igor

Reviewed By: igor

Subscribers: tnovak, yhchiang, dhruba, xjin, MarkCallaghan, yoshinorim

Differential Revision: https://reviews.facebook.net/D22581
2014-12-02 12:09:20 -08:00
Yueh-Hsuan Chiang 7e608e2fe3 Block plain_table_index.cc in ROCKSDB_LITE
Summary: Block plain_table_index.cc in ROCKSDB_LITE

Test Plan:
make clean
make OPT=-DROCKSDB_LITE shared_lib -j32
make clean
make shared_lib -j32

Reviewers: ljin, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29535
2014-11-24 20:47:27 -08:00
Yueh-Hsuan Chiang 13de000f07 Add rocksdb::ToString() to address cases where std::to_string is not available.
Summary:
In some environments, such as Android, the C++ library does not have
std::to_string.  This patch adds rocksdb::ToString(), which wraps std::to_string
when std::to_string is available, and provides its own implementation
in the other case.
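
A minimal sketch of such a wrapper, assuming the OS_ANDROID macro from the test plan is what selects the fallback. The namespace name `sketch` and the stream-based fallback are assumptions, not the code from the diff.

```cpp
#include <sstream>
#include <string>

namespace sketch {

// Convert a numeric value to its decimal string form.
template <typename T>
std::string ToString(const T& value) {
#if defined(OS_ANDROID)
  // Fallback for C++ libraries without std::to_string.
  std::ostringstream oss;
  oss << value;
  return oss.str();
#else
  // Normal case: defer to the standard library.
  return std::to_string(value);
#endif
}

}  // namespace sketch
```

The two branches agree for integers. For floating-point values the stream fallback formats differently from std::to_string, so a real implementation would have to handle that case separately.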

Test Plan:
make dbg -j32
./db_test
make clean
make dbg OPT=-DOS_ANDROID -j32
./db_test

Reviewers: ljin, sdong, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29181
2014-11-24 20:44:49 -08:00
Bryan Rosario 9e285d4238 Added CompatibleOptions for compatibility with LevelDB Options
Summary: Created a CompatibleOptions object that can be used as a LevelDB Options object and then converted to a RocksDB Options object using the ConvertOptions() method.

Test Plan: Unit test included in diff.

Reviewers: ljin

Reviewed By: ljin

Subscribers: sdong, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28893
2014-11-20 19:24:39 -08:00
Lei Jin 8d3f8f9696 remove all remaining references to cfd->options()
Summary:
The very last reference happens in DBImpl::GetOptions()
I built with both DBImpl::GetOptions() and ColumnFamilyData::options() commented out

Test Plan: make all check

Reviewers: sdong, yhchiang, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29073
2014-11-18 10:20:10 -08:00
Igor Canadi 9be338cf9d CompactionJobTest
Summary:
This is just a simple test that passes two files through a compaction. It shows the framework so that people can continue building new compaction *unit* tests.
In the future we might want to move some Compaction* tests from DBTest here. For example, CompactBetweenSnapshot seems a good candidate.

Hopefully this test can be simpler when we mock out VersionSet.

Test Plan: this is a test

Reviewers: ljin, rven, yhchiang, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28449
2014-11-14 11:35:48 -08:00
sdong a177742a9b Make db_stress built for ROCKSDB_LITE
Summary:
Make db_stress built for ROCKSDB_LITE.
The test doesn't pass, though. It segfaults quickly. But I took a look and it doesn't seem to be related to the lite version. Likely to be a bug inside RocksDB.

Test Plan: make db_stress

Reviewers: yhchiang, rven, ljin, igor

Reviewed By: igor

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D28797
2014-11-14 10:20:51 -08:00
Igor Canadi 25f273027b Fix iOS compile with -Wshorten-64-to-32
Summary: On iOS size_t is 32-bit, so we need to static_cast<size_t> any uint64_t :(

Test Plan: TARGET_OS=IOS make static_lib

Reviewers: dhruba, ljin, yhchiang, rven, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28743
2014-11-13 14:39:30 -05:00
Igor Canadi 767777c2bd Turn on -Wshorten-64-to-32 and fix all the errors
Summary:
We need to turn on -Wshorten-64-to-32 for mobile. See D1671432 (internal phabricator) for details.

This diff turns on the warning flag and fixes all the errors. There were also some interesting errors that I might call bugs, especially in plain table. Going forward, I think it makes sense to have this flag turned on and be very very careful when converting 64-bit to 32-bit variables.
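
One way to stay careful when narrowing is to route every conversion through a helper that checks the range. The helpers below are a hypothetical sketch and are not part of the diff.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Narrow a 64-bit value to 32 bits, asserting in debug builds that no
// bits are lost. The explicit cast also silences -Wshorten-64-to-32.
inline uint32_t NarrowToUint32(uint64_t v) {
  assert(v <= std::numeric_limits<uint32_t>::max());
  return static_cast<uint32_t>(v);
}

// Variant that reports failure instead of asserting, for values that come
// from untrusted input such as file contents.
inline bool TryNarrowToUint32(uint64_t v, uint32_t* out) {
  if (v > std::numeric_limits<uint32_t>::max()) {
    return false;
  }
  *out = static_cast<uint32_t>(v);
  return true;
}
```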

Test Plan: compiles

Reviewers: ljin, rven, yhchiang, sdong

Reviewed By: yhchiang

Subscribers: bobbaldwin, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28689
2014-11-11 16:47:22 -05:00
Igor Canadi 68effa0348 Fix -Wshadow for tools
Summary: Previously I made `make check` work with -Wshadow, but there are some tools that are not compiled using `make check`.

Test Plan: make all

Reviewers: yhchiang, rven, ljin, sdong

Reviewed By: ljin, sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28497
2014-11-07 15:04:30 -08:00
Igor Canadi 9f20395cd6 Turn -Wshadow back on
Summary: It turns out that -Wshadow has different rules for gcc than clang. The previous commit fixed clang. This commit fixes the rest of the warnings for gcc.

Test Plan: compiles

Reviewers: ljin, yhchiang, rven, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28131
2014-11-06 11:14:28 -08:00
Igor Canadi 9f7fc3ac45 Turn on -Wshadow
Summary:
...and fix all the errors :)

Jim suggested turning on -Wshadow because it helped him fix a number of critical bugs in fbcode. I think it's a good idea to be -Wshadow clean.
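
As an illustration of the kind of bug -Wshadow catches (a made-up example, not taken from the diff), the first function below declares an inner variable that hides the outer one, so the result is never accumulated.

```cpp
#include <vector>

// Buggy: the inner `total` shadows the outer one, so the outer variable
// is never updated and the function always returns 0.
int SumWithShadowBug(const std::vector<int>& values) {
  int total = 0;
  for (int v : values) {
    int total = 0;  // -Wshadow warns here
    total += v;     // updates only the inner variable
  }
  return total;
}

// Fixed: only one `total` exists, so the loop accumulates into it.
int SumFixed(const std::vector<int>& values) {
  int total = 0;
  for (int v : values) {
    total += v;
  }
  return total;
}
```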

Test Plan: compiles

Reviewers: yhchiang, rven, sdong, ljin

Reviewed By: ljin

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D27711
2014-10-31 11:59:54 -07:00
Yueh-Hsuan Chiang 98849a35fa Apply InfoLogLevel to the logs in table/block_based_table_reader.cc
Summary:
Apply InfoLogLevel to the logs in table/block_based_table_reader.cc

Also, add missing checks for the returned status in BlockBasedTable::Open

Test Plan: make

Reviewers: sdong, ljin, igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D28005
2014-10-31 11:41:15 -07:00
Yueh-Hsuan Chiang 217cc217d7 Apply InfoLogLevel to the logs in table/meta_blocks.cc
Summary: Apply InfoLogLevel to the logs in table/meta_blocks.cc

Test Plan: make

Reviewers: ljin, sdong, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D27903
2014-10-29 17:55:19 -07:00
Yueh-Hsuan Chiang fd95745a59 Fix compile error in table/plain_table_index.cc
Summary:
Fix compile error in table/plain_table_index.cc

Test Plan:
make
2014-10-29 17:42:38 -07:00
Yueh-Hsuan Chiang e7ad69b9fe Apply InfoLogLevel to the logs in table/plain_table_index.cc
Summary: Apply InfoLogLevel to the logs in table/plain_table_index.cc

Test Plan: make

Reviewers: ljin, sdong, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D27909
2014-10-29 17:08:40 -07:00
Yueh-Hsuan Chiang bbd9c53457 Apply InfoLogLevel to the logs in table/block_based_table_builder.cc
Summary: Apply InfoLogLevel to the logs in table/block_based_table_builder.cc

Test Plan: make

Reviewers: igor, ljin, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D27921
2014-10-29 17:08:20 -07:00
Igor Canadi 412b7f85bb Include atomic in mock_table.h 2014-10-28 18:10:55 -07:00
Igor Canadi abac3d6476 TableMock + framework for mock classes
Summary:
This diff replaces BlockBasedTable in flush_job_test with TableMock, making it depend on fewer things and making it closer to a unit test than an integration test.

It also introduces a framework to compile mock classes -- Any file named *mock.cc will not be compiled into the build. It will only get compiled into the tests. That way we can mock out most other classes, Version, VersionSet, DBImpl, etc.

Test Plan: flush_job_test

Reviewers: ljin, rven, yhchiang, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D27681
2014-10-28 17:52:32 -07:00
Igor Canadi c1c68bce43 remove atomic_pointer.h references 2014-10-27 15:12:20 -07:00
Lei Jin f1841985e4 dynamic inplace_update options
Summary:
Make inplace_update_support and inplace_update_num_locks dynamic.
inplace_callback becomes immutable
We are almost free of references to cfd->options() in db_impl

Test Plan: unit test

Reviewers: igor, yhchiang, rven, sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D25293
2014-10-27 12:10:13 -07:00
Lei Jin 839c376bd1 fix table_test
Summary:
SaveValue expects an internal key but I previously added to table a
user key

Test Plan:
ran the test
2014-10-22 13:53:35 -07:00
Lei Jin 0fd985f427 Avoid reloading filter on Get() if cache_index_and_filter_blocks == false
Summary:
This fixes the case that filter policy is missing in SST file, but we
open the table with filter policy on and cache_index_and_filter_blocks =
false. The current behavior is that we will try to load it every time on
Get() but fail.

Test Plan: unit test

Reviewers: yhchiang, igor, rven, sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D25455
2014-10-22 11:52:35 -07:00
Lei Jin 2dd9bfe3a8 Sanitize block-based table index type and check prefix_extractor
Summary:
Respond to issue reported
https://www.facebook.com/groups/rocksdb.dev/permalink/651090261656158/
Change the Sanitize signature to take both DBOptions and CFOptions

Test Plan: unit test

Reviewers: sdong, yhchiang, igor

Reviewed By: igor

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D25041
2014-10-17 21:18:36 -07:00
Igor Canadi d6987216c9 Merge pull request #327 from dalgaaf/wip-da-SCA-20141001
Fix some issues from SCA
2014-10-02 10:59:52 -07:00
Lei Jin 5ec53f3edf make compaction related options changeable
Summary:
Make compaction-related options changeable. Most of the changes are tedious and
follow the same convention: grab MutableCFOptions at the beginning
of compaction under mutex, then pass it throughout the job and register
it in SuperVersion at the end.

Test Plan: make all check

Reviewers: igor, yhchiang, sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D23349
2014-10-01 16:19:16 -07:00
Danny Al-Gaaf 28a6e31583 table/block_based_table_builder.cc: remove unused variable
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-10-01 10:49:09 +02:00
Danny Al-Gaaf 6b6cedbb1b table/format.cc: reduce scope of some variables
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-10-01 10:49:08 +02:00
Danny Al-Gaaf 55652043c8 table/cuckoo_table_reader.cc: pass func parameter by reference
Fix for:

[table/cuckoo_table_reader.cc:196]: (performance) Function
 parameter 'target' should be passed by reference.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-10-01 10:49:08 +02:00
Danny Al-Gaaf d517c83648 in_table_factory.cc: use correct format specifier
Use %zu instead of %zd since size_t and uint32_t are unsigned.

Fix for:

[table/plain_table_factory.cc:55]: (warning) %zd in format string (no. 1)
 requires 'ssize_t' but the argument type is 'size_t {aka unsigned long}'.
[table/plain_table_factory.cc:58]: (warning) %zd in format string (no. 1)
 requires 'ssize_t' but the argument type is 'size_t {aka unsigned long}'.
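
For reference, a minimal example of printing a size_t with the matching specifier. FormatSize is a hypothetical helper, not code from the diff.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Format a size_t using %zu, the specifier for unsigned size_t values.
// %zd is for the signed counterpart and triggers the warning quoted above.
std::string FormatSize(size_t n) {
  char buf[32];
  std::snprintf(buf, sizeof(buf), "%zu", n);
  return std::string(buf);
}
```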

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-09-30 23:30:32 +02:00
Danny Al-Gaaf 063471bf76 table/table_test.cc: pass func parameter by reference
Fix for:

[table/table_test.cc:1218]: (performance) Function parameter
 'prefix' should be passed by reference.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-09-30 23:30:32 +02:00
Danny Al-Gaaf 93548ce8f4 table/cuckoo_table_reader.cc: pass func parameter by ref
Fix for:

[table/cuckoo_table_reader.cc:198]: (performance) Function
 parameter 'file_data' should be passed by reference.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-09-30 23:30:32 +02:00
Danny Al-Gaaf 8ce050b51b table/bloom_block.*: pass func parameter by reference
[table/bloom_block.h:29]: (performance) Function parameter
 'keys_hashes' should be passed by reference.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-09-30 23:30:31 +02:00
Lei Jin 2faf49d5f1 use GetContext to replace callback function pointer
Summary:
Instead of passing a callback function pointer and its arg on the Table::Get()
interface, pass a GetContext. This makes the interface cleaner and
possibly gives better perf. Also adding a fast path for SaveValue()
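
A simplified sketch of the pattern, assuming the context object owns the lookup key, the output slot, and the result state. All names ending in Sketch are hypothetical, and the real GetContext carries considerably more state than this.

```cpp
#include <string>
#include <vector>

// Hypothetical table entry: a key, a deletion marker, and a value.
struct EntrySketch {
  std::string key;
  bool is_deletion;
  std::string value;
};

// Bundles what used to be a callback pointer plus an opaque argument.
class GetContextSketch {
 public:
  enum State { kNotFound, kFound, kDeleted };

  GetContextSketch(const std::string& user_key, std::string* value)
      : user_key_(user_key), value_(value), state_(kNotFound) {}

  // Returns true if the table should keep scanning, false once the
  // lookup has been decided.
  bool SaveValue(const EntrySketch& e) {
    if (e.key != user_key_) {
      return true;
    }
    if (e.is_deletion) {
      state_ = kDeleted;
    } else {
      state_ = kFound;
      if (value_ != nullptr) {
        *value_ = e.value;
      }
    }
    return false;
  }

  State state() const { return state_; }

 private:
  std::string user_key_;
  std::string* value_;
  State state_;
};

// The table takes the context directly; no function pointer is involved.
void TableGetSketch(const std::vector<EntrySketch>& table,
                    GetContextSketch* ctx) {
  for (const EntrySketch& e : table) {
    if (!ctx->SaveValue(e)) {
      break;
    }
  }
}
```

Calling a member function on a concrete type gives the compiler a chance to inline it, which an indirect call through a function pointer usually prevents.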

Test Plan: make all check

Reviewers: igor, yhchiang, sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D24057
2014-09-29 11:09:09 -07:00
Lei Jin 2dc6f62bb9 handle kDelete type in cuckoo builder
Summary:
When I changed std::vector<std::pair<std::string, std::string>> to std::string to
store key/value pairs in the builder, I missed the handling for the kDeletion
type. As a result, value_size_ can be wrong if the first added key is a
deletion.
This is caught by ./cuckoo_table_db_test

Test Plan:
./cuckoo_table_db_test
./cuckoo_table_reader_test
./cuckoo_table_builder_test

Reviewers: sdong, yhchiang, igor

Reviewed By: igor

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D24045
2014-09-29 10:25:21 -07:00
Lei Jin d439451fab delay initialization of cuckoo table iterator
Summary:
Cuckoo table iterator creation is quite expensive since it needs to load
all the data and sort it. After compaction, RocksDB creates a new iterator
on the new file to make sure it is in a good state. That makes DB
creation quite slow. Delay the iterator's sort until seek time to
speed it up.
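
A minimal sketch of deferring the sort to the first seek, using an in-memory list of keys as a stand-in for the table file. LazySortedIteratorSketch is a hypothetical name and this is not the iterator from the diff.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

class LazySortedIteratorSketch {
 public:
  // Construction is cheap: the keys are stored but not sorted.
  explicit LazySortedIteratorSketch(std::vector<std::string> keys)
      : keys_(std::move(keys)), sorted_(false), pos_(0) {}

  // The expensive sort happens on the first Seek() only.
  void Seek(const std::string& target) {
    if (!sorted_) {
      std::sort(keys_.begin(), keys_.end());
      sorted_ = true;
    }
    pos_ = static_cast<size_t>(
        std::lower_bound(keys_.begin(), keys_.end(), target) - keys_.begin());
  }

  // Valid only after a Seek() that landed on an existing entry.
  bool Valid() const { return sorted_ && pos_ < keys_.size(); }

  const std::string& key() const {
    assert(Valid());
    return keys_[pos_];
  }

  bool initialized() const { return sorted_; }

 private:
  std::vector<std::string> keys_;
  bool sorted_;
  size_t pos_;
};
```

An iterator that is created only as a sanity check and never positioned pays nothing for the sort.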

Test Plan: db_bench

Reviewers: igor, yhchiang, sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D23775
2014-09-25 16:45:37 -07:00
Lei Jin 94997eab5e reduce memory usage of cuckoo table builder
Summary:
The builder currently buffers all key/value pairs as a vector of
pair<string, string>. That is too much due to std::string
overhead. It wasn't able to fit 1B key/values (12 bytes total) in 100 GB
of RAM. Switch to a plain string to store the key/value sequence and
use only 12 GB of RAM as a result.
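
A minimal sketch of the flat layout, assuming fixed-size keys and values so that entry i starts at a computable offset. FlatKVBufferSketch and its methods are hypothetical names, not the builder's actual members.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Stores entries back to back in one std::string instead of one
// std::string object per key and per value.
class FlatKVBufferSketch {
 public:
  FlatKVBufferSketch(size_t key_size, size_t value_size)
      : key_size_(key_size), value_size_(value_size) {
    assert(key_size_ > 0);
  }

  // Append one entry; sizes must match the fixed layout.
  void Add(const std::string& key, const std::string& value) {
    assert(key.size() == key_size_);
    assert(value.size() == value_size_);
    data_.append(key);
    data_.append(value);
  }

  size_t NumEntries() const { return data_.size() / EntrySize(); }

  // Copy out the key of entry i.
  std::string Key(size_t i) const {
    assert(i < NumEntries());
    return data_.substr(i * EntrySize(), key_size_);
  }

  // Copy out the value of entry i.
  std::string Value(size_t i) const {
    assert(i < NumEntries());
    return data_.substr(i * EntrySize() + key_size_, value_size_);
  }

 private:
  size_t EntrySize() const { return key_size_ + value_size_; }

  size_t key_size_;
  size_t value_size_;
  std::string data_;
};
```

With this layout the memory cost per entry is just the payload bytes, with no per-string object or allocation overhead.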

Test Plan: db_bench

Reviewers: igor, sdong, yhchiang

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D23763
2014-09-25 16:34:24 -07:00
Lei Jin c6275956e2 improve memory efficiency of cuckoo reader
Summary:
When creating a new iterator, instead of storing mapping from key to
bucket id for sorting, store only bucket id and read key from mmap file
based on the id. This reduces from 20 bytes per entry to only 4 bytes.
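
A minimal sketch of sorting by bucket id alone, using a std::string as a stand-in for the mmap'ed file and assuming fixed-size buckets whose key comes first. The function name and the plain bytewise comparison are assumptions; the real reader would also have to skip empty buckets and use the table's comparator.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <string>
#include <vector>

// Returns bucket ids ordered by the key stored in each bucket. Only the
// 4-byte ids are kept in memory; keys are read from `file_data` on demand.
std::vector<uint32_t> SortedBucketIdsSketch(const std::string& file_data,
                                            size_t bucket_size,
                                            size_t key_size) {
  assert(bucket_size > 0 && key_size <= bucket_size);
  const uint32_t n = static_cast<uint32_t>(file_data.size() / bucket_size);
  std::vector<uint32_t> ids(n);
  std::iota(ids.begin(), ids.end(), 0u);
  std::sort(ids.begin(), ids.end(), [&](uint32_t a, uint32_t b) {
    // Compare the two keys in place inside the file buffer.
    return file_data.compare(static_cast<size_t>(a) * bucket_size, key_size,
                             file_data,
                             static_cast<size_t>(b) * bucket_size,
                             key_size) < 0;
  });
  return ids;
}
```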

Test Plan: db_bench

Reviewers: igor, yhchiang, sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D23757
2014-09-25 16:15:23 -07:00
Lei Jin 581442d446 option to choose module when calculating CuckooTable hash
Summary:
Using modulo to calculate the hash makes lookup ~8% slower. But it has its
benefits: file size is more predictable and more space efficient.
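
The trade-off can be sketched as follows, assuming the alternative to modulo is a bitmask over a power-of-two table size. That assumption, and all function names below, are illustrative and not taken from the diff.

```cpp
#include <cassert>
#include <cstdint>

// Bitmask bucket selection: cheap, but the table size must be a power
// of two, so capacity can only grow by doubling.
inline uint64_t BucketByMask(uint64_t hash, uint64_t table_size) {
  assert(table_size > 0 && (table_size & (table_size - 1)) == 0);
  return hash & (table_size - 1);
}

// Modulo bucket selection: an integer division per lookup, but the table
// can be sized to exactly what the data needs.
inline uint64_t BucketByModulo(uint64_t hash, uint64_t table_size) {
  assert(table_size > 0);
  return hash % table_size;
}

// Smallest power of two that is >= n, i.e. the size the mask variant
// would be forced to use.
inline uint64_t NextPowerOfTwo(uint64_t n) {
  assert(n <= (1ULL << 63));
  uint64_t p = 1;
  while (p < n) {
    p <<= 1;
  }
  return p;
}
```

For example, a table that needs 6 buckets can use exactly 6 with modulo, while the mask variant has to round up to 8.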

Test Plan: db_bench

Reviewers: igor, yhchiang, sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D23691
2014-09-25 13:53:27 -07:00
Lei Jin 3c68006109 CompactedDBImpl
Summary:
Add a CompactedDBImpl that will be enabled when calling OpenForReadOnly()
and the DB only has one level (>0) of files. As a performance comparison,
CuckooTable performs 2.1M/s with CompactedDBImpl vs. 1.78M/s with
ReadOnlyDBImpl.

Test Plan: db_bench

Reviewers: yhchiang, igor, sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D23553
2014-09-25 11:14:01 -07:00
Lei Jin 0a29ce5393 re-enable BlockBasedTable::SetupForCompaction()
Summary:
It was commented out in D22545 by accident. Keep the option in
ImmutableOptions for now. I can make it dynamic in
https://reviews.facebook.net/D23349

Test Plan: make release

Reviewers: sdong, yhchiang, igor

Reviewed By: igor

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D23865
2014-09-23 14:18:57 -07:00
Igor Canadi 55af370756 Remove TODO for checking index checksums 2014-09-23 13:02:23 -07:00
Igor Canadi 3d74f09979 Fix compile 2014-09-22 15:19:20 -07:00