Summary: Cache shard bit 4 is sometimes too small and 6 is a more common value picked by users. Make that default. It shouldn't hurt much to change options.max_file_opening_threads default to be 16, which will reduce the worst case DB open time.
Test Plan: Run all existing tests.
Reviewers: IslamAbdelRahman, yhchiang, kradhakrishnan, igor, andrewkr
Reviewed By: andrewkr
Subscribers: andrewkr, MarkCallaghan, leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D55047
Summary:
Added the column family name to the properties block. This property
is omitted only if the property is unavailable, such as when RepairDB()
writes SST files.
In a next diff, I will change RepairDB to use this new property for
deciding to which column family an existing SST file belongs. If this
property is missing, it will add it to the "unknown" column family (same
as its existing behavior).
Test Plan:
New unit test:
$ ./db_table_properties_test --gtest_filter=DBTablePropertiesTest.GetColumnFamilyNameProperty
Reviewers: IslamAbdelRahman, yhchiang, sdong
Reviewed By: sdong
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D55605
Summary:
When a block based table file is opened, if prefetch_index_and_filter is true, it will prefetch the index and filter blocks, putting them into the block cache.
What this feature adds: when a L0 block based table file is opened, if pin_l0_filter_and_index_blocks_in_cache is true in the options (and prefetch_index_and_filter is true), then the filter and index blocks aren't released back to the block cache at the end of BlockBasedTableReader::Open(). Instead the table reader takes ownership of them, hence pinning them, ie. the LRU cache will never push them out. Meanwhile in the table reader, further accesses will not hit the block cache, thus avoiding lock contention.
Test Plan:
'export TEST_TMPDIR=/dev/shm/ && DISABLE_JEMALLOC=1 OPT=-g make all valgrind_check -j32' is OK.
I didn't run the Java tests, I don't have Java set up on my devserver.
Reviewers: sdong
Reviewed By: sdong
Subscribers: andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D56133
Summary:
When a block based table file is opened, if prefetch_index_and_filter is true, it will prefetch the index and filter blocks, putting them into the block cache.
What this feature adds: when a L0 block based table file is opened, if pin_l0_filter_and_index_blocks_in_cache is true in the options (and prefetch_index_and_filter is true), then the filter and index blocks aren't released back to the block cache at the end of BlockBasedTableReader::Open(). Instead the table reader takes ownership of them, hence pinning them, ie. the LRU cache will never push them out. Meanwhile in the table reader, further accesses will not hit the block cache, thus avoiding lock contention.
When the table reader is destroyed, it releases the pinned blocks (if there were any). This has to happen before the cache is destroyed, so I had to introduce a TableReader::Close(), to guarantee the order of destruction.
Test Plan:
Added two unit tests for this. Existing unit tests run fine (default is pin_l0_filter_and_index_blocks_in_cache=false).
DISABLE_JEMALLOC=1 OPT=-g make all valgrind_check -j32
Mac: OK.
Linux: with D55287 patched in it's OK.
Reviewers: sdong
Reviewed By: sdong
Subscribers: andrewkr, leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D54801
Summary:
This reverts commit 73c31377bb, which mistakenly
reverts 73c31377bb that fixes a bug when both
whole_key_filtering and prefix_extractor are set
Test Plan: revert the patch
Reviewers: anthony, IslamAbdelRahman, rven, kradhakrishnan, sdong
Reviewed By: sdong
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D52707
Summary: Add a new option to BlockBasedTableOptions that will allow us to change the restart interval for the index block
Test Plan: unit tests
Reviewers: yhchiang, anthony, andrewkr, sdong
Reviewed By: sdong
Subscribers: march, dhruba
Differential Revision: https://reviews.facebook.net/D53721
Summary:
table_test is failing because we are creating a temp InternalComparator
14:27:28 [ RUN ] BlockBasedTableTest.NoopTransformSeek
14:27:28 pure virtual method called
14:27:28 terminate called without an active exception
14:27:28 /bin/sh: line 7: 2346261 Aborted (core dumped) ./$t
Test Plan: make table_test -j64 && ./table_test --gtest_filter="BlockBasedTableTest.NoopTransformSeek"
Reviewers: igor, sdong, anthony, rven
Reviewed By: rven
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D52671
Summary:
This patch reverts commit 57605d7ef3 as it will
cause BlockBasedTableTest.NoopTransformSeek test crashes in some environment.
Test Plan: revert the patch
Reviewers: anthony, IslamAbdelRahman, rven, kradhakrishnan, sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D52623
Summary:
When both whole_key_filtering and prefix_extractor are set, RocksDB will
mistakenly encode prefix + whole key into the database instead of
simply whole key when BlockBasedTable is used. This patch fixes this bug.
Test Plan: Add a test in table_test
Reviewers: anthony, IslamAbdelRahman, rven, kradhakrishnan, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D52233
PrefixMayMatch previously seeked in the prefix index using an internal
key with a sequence number of 0. This would cause the prefix index seek
to fall off the end if the last key in the index had a user-key greater
than or equal to the key being looked for. Falling off the end of the
index in turn results in PrefixMayMatch returning false if the index is
in memory.
Summary: If block_restart_interval gets set to less than 1 an assert will be triggered in BlockBuilder::BlockBuilder(). This prevents the user from doing this by silently setting any value less than 1 to 1.
Test Plan: Added a test (in BlockBasedTableTest in table_test) that checks invalid values to make sure that they are reset to the expected values. The block_restart_interval value is checked along with block_size_deviation which also silently sets the value if it is outside a specific range.
Reviewers: yoshinorim, IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D52509
Summary:
myrocks seems to build rocksdb using
-Wmissing-field-initializers (and treats warnings as errors). This diff
adds that flag to the rocksdb build, and fixes the compilation failures
that result. I have not checked for any other differences in the build
flags for rocksdb build as part of myrocks.
Test Plan: make check
Reviewers: sdong, rven
Reviewed By: rven
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D52443
Summary:
This diff adds support for concurrent adds to the skiplist memtable
implementations. Memory allocation is made thread-safe by the addition of
a spinlock, with small per-core buffers to avoid contention. Concurrent
memtable writes are made via an additional method and don't impose a
performance overhead on the non-concurrent case, so parallelism can be
selected on a per-batch basis.
Write thread synchronization is an increasing bottleneck for higher levels
of concurrency, so this diff adds --enable_write_thread_adaptive_yield
(default off). This feature causes threads joining a write batch
group to spin for a short time (default 100 usec) using sched_yield,
rather than going to sleep on a mutex. If the timing of the yield calls
indicates that another thread has actually run during the yield then
spinning is avoided. This option improves performance for concurrent
situations even without parallel adds, although it has the potential to
increase CPU usage (and the heuristic adaptation is not yet mature).
Parallel writes are not currently compatible with
inplace updates, update callbacks, or delete filtering.
Enable it with --allow_concurrent_memtable_write (and
--enable_write_thread_adaptive_yield). Parallel memtable writes
are performance neutral when there is no actual parallelism, and in
my experiments (SSD server-class Linux and varying contention and key
sizes for fillrandom) they are always a performance win when there is
more than one thread.
Statistics are updated earlier in the write path, dropping the number
of DB mutex acquisitions from 2 to 1 for almost all cases.
This diff was motivated and inspired by Yahoo's cLSM work. It is more
conservative than cLSM: RocksDB's write batch group leader role is
preserved (along with all of the existing flush and write throttling
logic) and concurrent writers are blocked until all memtable insertions
have completed and the sequence number has been advanced, to preserve
linearizability.
My test config is "db_bench -benchmarks=fillrandom -threads=$T
-batch_size=1 -memtablerep=skip_list -value_size=100 --num=1000000/$T
-level0_slowdown_writes_trigger=9999 -level0_stop_writes_trigger=9999
-disable_auto_compactions --max_write_buffer_number=8
-max_background_flushes=8 --disable_wal --write_buffer_size=160000000
--block_size=16384 --allow_concurrent_memtable_write" on a two-socket
Xeon E5-2660 @ 2.2Ghz with lots of memory and an SSD hard drive. With 1
thread I get ~440Kops/sec. Peak performance for 1 socket (numactl
-N1) is slightly more than 1Mops/sec, at 16 threads. Peak performance
across both sockets happens at 30 threads, and is ~900Kops/sec, although
with fewer threads there is less performance loss when the system has
background work.
Test Plan:
1. concurrent stress tests for InlineSkipList and DynamicBloom
2. make clean; make check
3. make clean; DISABLE_JEMALLOC=1 make valgrind_check; valgrind db_bench
4. make clean; COMPILE_WITH_TSAN=1 make all check; db_bench
5. make clean; COMPILE_WITH_ASAN=1 make all check; db_bench
6. make clean; OPT=-DROCKSDB_LITE make check
7. verify no perf regressions when disabled
Reviewers: igor, sdong
Reviewed By: sdong
Subscribers: MarkCallaghan, IslamAbdelRahman, anthony, yhchiang, rven, sdong, guyg8, kradhakrishnan, dhruba
Differential Revision: https://reviews.facebook.net/D50589
Summary:
I added this line in my previous patch D48999 (which is incorrect)
We should not release the iterator since releasing it will evict the blocks from cache
Test Plan:
Run the test under valgrind
make check
Reviewers: rven, yhchiang, sdong
Reviewed By: sdong
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D52161
Summary:
This patch update the Iterator API to introduce new functions that allow users to keep the Slices returned by key() valid as long as the Iterator is not deleted
ReadOptions::pin_data : If true keep loaded blocks in memory as long as the iterator is not deleted
Iterator::IsKeyPinned() : If true, this mean that the Slice returned by key() is valid as long as the iterator is not deleted
Also add a new option BlockBasedTableOptions::use_delta_encoding to allow users to disable delta_encoding if needed.
Benchmark results (using https://phabricator.fb.com/P20083553)
```
// $ du -h /home/tec/local/normal.4K.Snappy/db10077
// 6.1G /home/tec/local/normal.4K.Snappy/db10077
// $ du -h /home/tec/local/zero.8K.LZ4/db10077
// 6.4G /home/tec/local/zero.8K.LZ4/db10077
// Benchmarks for shard db10077
// _build/opt/rocks/benchmark/rocks_copy_benchmark \
// --normal_db_path="/home/tec/local/normal.4K.Snappy/db10077" \
// --zero_db_path="/home/tec/local/zero.8K.LZ4/db10077"
// First run
// ============================================================================
// rocks/benchmark/RocksCopyBenchmark.cpp relative time/iter iters/s
// ============================================================================
// BM_StringCopy 1.73s 576.97m
// BM_StringPiece 103.74% 1.67s 598.55m
// ============================================================================
// Match rate : 1000000 / 1000000
// Second run
// ============================================================================
// rocks/benchmark/RocksCopyBenchmark.cpp relative time/iter iters/s
// ============================================================================
// BM_StringCopy 611.99ms 1.63
// BM_StringPiece 203.76% 300.35ms 3.33
// ============================================================================
// Match rate : 1000000 / 1000000
```
Test Plan: Unit tests
Reviewers: sdong, igor, anthony, yhchiang, rven
Reviewed By: rven
Subscribers: dhruba, lovro, adsharma
Differential Revision: https://reviews.facebook.net/D48999
Summary:
Reverting https://reviews.facebook.net/D34269 for now
after I landed it a flaky test started continuously failing, I am almost sure this patch is not related to the test but I will revert it until I figure out why it's failing
Test Plan: make check
Reviewers: kradhakrishnan
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D50385
Summary:
This diff introduce new table properties that will be written for block based tables
These properties are
- comparator name
- merge operator name
- property collectors names
Test Plan:
- Added a new unit test to verify that these tests are written/read correctly
- Running all other tests right now (wont land until all tests finish)
Reviewers: rven, kradhakrishnan, igor, sdong, anthony, yhchiang
Reviewed By: yhchiang
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D34269
Summary:
We are cleaning up dependencies.
This diff takes a first step at moving memtable files to their own
directory called memtable. In future diffs, we will move other memtable
files from db to memtable.
Test Plan: make check
Reviewers: sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D48915
Summary:
Separate a new class InternalIterator from class Iterator, when the look-up is done internally, which also means they operate on key with sequence ID and type.
This change will enable potential future optimizations but for now InternalIterator's functions are still the same as Iterator's.
At the same time, separate the cleanup function to a separate class and let both of InternalIterator and Iterator inherit from it.
Test Plan: Run all existing tests.
Reviewers: igor, yhchiang, anthony, kradhakrishnan, IslamAbdelRahman, rven
Reviewed By: rven
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D48549
Summary:
Fix tests that compile under ROCKSDB_LITE but currently failing.
table_test:
RandomizedLongDB test is using internal stats which is not supported in ROCKSDB_LITE
compaction_job_test:
Using CompactionJobStats which is not supported
perf_context_test:
KeyComparisonCount test try to open DB in ReadOnly mode which is not supported
Test Plan: run the tests under ROCKSDB_LITE
Reviewers: yhchiang, sdong, igor
Reviewed By: igor
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D48585
Summary: Pass column family ID through TablePropertiesCollectorFactory::CreateTablePropertiesCollector() so that users can identify which column family this file is for and handle it differently.
Test Plan: Add unit test scenarios in tests related to table properties collectors to verify the information passed in is correct.
Reviewers: rven, yhchiang, anthony, kradhakrishnan, igor, IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: yoshinorim, leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D48411
Summary: Add 2 new counters BLOCK_CACHE_BYTES_WRITE, BLOCK_CACHE_BYTES_READ to keep track of how many bytes were written to the cache and how many bytes that we read from cache
Test Plan: make check
Reviewers: sdong, yhchiang, igor
Reviewed By: igor
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D48195
Summary:
PlainTableReader now only allows mmap-mode. Add the support to non-mmap mode for more flexibility.
Refactor the codes to move all logic of reading data to PlainTableKeyDecoder, and consolidate the calls to Read() call and ReadVarint32() call. Implement the calls for both of mmap and non-mmap case seperately. For non-mmap mode, make copy of keys in several places when we need to move the buffer after reading the keys.
Test Plan: Add the mode of non-mmap case in plain_table_db_test. Run it in valgrind mode too.
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D47187
Summary:
Refactoring NewTableReader to accept TableReaderOptions
This will make it easier to add new options in the future, for example in this diff https://reviews.facebook.net/D46071
Test Plan: run existing tests
Reviewers: igor, yhchiang, anthony, rven, sdong
Reviewed By: sdong
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D46179
Summary:
This optimizes the case when (cache_index_and_filter_blocks=1) and bloom filter is not present in the cache.
Previously we did:
1. Read meta block from file
2. Read the filter position from the meta block
3. Read the filter
Now, we pre-load the filter position on Table::Open(), so we can skip steps (1) and (2) on bloom filter cache miss. Instead of 2 IOs, we do only 1.
Test Plan: make check
Reviewers: sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D46047
Summary:
This diff is a collection of cleanups that were initially part of D43179.
Additionally it adds a unified way of defining key-value maps that use a
Comparator for sorting (this was previously implemented in four different
places).
Test Plan: make clean check all
Reviewers: rven, anthony, yhchiang, sdong, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D45993
Summary: Add ZSTD compression type. The same way as adding LZ4.
Test Plan: run all tests. Generate files in db_bench. Make sure reads succeed. But the SST files cannot be opened in older versions. Also some other adhoc tests.
Reviewers: rven, anthony, IslamAbdelRahman, kradhakrishnan, igor
Reviewed By: igor
Subscribers: MarkCallaghan, maykov, yoshinorim, leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D45747
Summary:
While working on https://reviews.facebook.net/D43179 , I found
duplicate code in the tests. This patch removes it.
Test Plan: make clean all check
Reviewers: igor, sdong, rven, anthony, yhchiang
Reviewed By: yhchiang
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D43263
Summary: Remove plain table tests from table_test since plain table is not supported in ROCKSDB_LITE
Test Plan: table_test
Reviewers: igor, sdong, yhchiang
Reviewed By: yhchiang
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D42153
Summary: We want to keep Env a think layer for better portability. Less platform dependent codes should be moved out of Env. In this patch, I create a wrapper of file readers and writers, and put rate limiting, write buffering, as well as most perf context instrumentation and random kill out of Env. It will make it easier to maintain multiple Env in the future.
Test Plan: Run all existing unit tests.
Reviewers: anthony, kradhakrishnan, IslamAbdelRahman, yhchiang, igor
Reviewed By: igor
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D42321
Summary: Optimistic transactions supporting begin/commit/rollback semantics. Currently relies on checking the memtable to determine if there are any collisions at commit time. Not yet implemented would be a way of enuring the memtable has some minimum amount of history so that we won't fail to commit when the memtable is empty. You should probably start with transaction.h to get an overview of what is currently supported.
Test Plan: Added a new test, but still need to look into stress testing.
Reviewers: yhchiang, igor, rven, sdong
Reviewed By: sdong
Subscribers: adamretter, MarkCallaghan, leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D33435
Summary: Now we add warnings when user configures compression and the compression is not supported.
Test Plan:
Configured compression to non-supported values. Observed messages in my log:
2015/03/26-12:17:57.586341 7ffb8a496840 [WARN] Compression type chosen for level 2 is not supported: LZ4. RocksDB will not compress data on level 2.
2015/03/26-12:19:10.768045 7f36f15c5840 [WARN] Compression type chosen is not supported: LZ4. RocksDB will not compress data.
Reviewers: rven, sdong, yhchiang
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D35979
Summary:
Currently users have no idea a key is add, delete or merge from TablePropertiesCollector call back. Add a new function to add it.
Also refactor the codes so that
(1) make table property collector and internal table property collector two separate data structures with the later one now exposed
(2) table builders only receive internal table properties
Test Plan: Add cases in table_properties_collector_test to cover both of old and new ways of using TablePropertiesCollector.
Reviewers: yhchiang, igor.sugak, rven, igor
Reviewed By: rven, igor
Subscribers: meyering, yoshinorim, maykov, leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D35373
Summary:
We have addded new stats and perf_context for measuring the merge and filter operation time consumption.
We have bounded all the merge operations within the GUARD statment and collected the total time for these operations in the DB.
Test Plan: WIP
Reviewers: rven, yhchiang, kradhakrishnan, igor, sdong
Reviewed By: sdong
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D34377
Summary:
Our existing test notation is very similar to what is used in gtest. It makes it easy to adopt what is different.
In this diff I modify existing [[ https://code.google.com/p/googletest/wiki/Primer#Test_Fixtures:_Using_the_Same_Data_Configuration_for_Multiple_Te | test fixture ]] classes to inherit from `testing::Test`. Also for unit tests that use fixture class, `TEST` is replaced with `TEST_F` as required in gtest.
There are several custom `main` functions in our existing tests. To make this transition easier, I modify all `main` functions to fallow gtest notation. But eventually we can remove them and use implementation of `main` that gtest provides.
```lang=bash
% cat ~/transform
#!/bin/sh
files=$(git ls-files '*test\.cc')
for file in $files
do
if grep -q "rocksdb::test::RunAllTests()" $file
then
if grep -Eq '^class \w+Test {' $file
then
perl -pi -e 's/^(class \w+Test) {/${1}: public testing::Test {/g' $file
perl -pi -e 's/^(TEST)/${1}_F/g' $file
fi
perl -pi -e 's/(int main.*\{)/${1}::testing::InitGoogleTest(&argc, argv);/g' $file
perl -pi -e 's/rocksdb::test::RunAllTests/RUN_ALL_TESTS/g' $file
fi
done
% sh ~/transform
% make format
```
Second iteration of this diff contains only scripted changes.
Third iteration contains manual changes to fix last errors and make it compilable.
Test Plan:
Build and notice no errors.
```lang=bash
% USE_CLANG=1 make check -j55
```
Tests are still testing.
Reviewers: meyering, sdong, rven, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D35157
Summary:
gtest does not use exceptions to fail a unit test by design, and `ASSERT*`s are implemented using `return`. As a consequence we cannot use `ASSERT*` in a function that does not return `void` value ([[ https://code.google.com/p/googletest/wiki/AdvancedGuide#Assertion_Placement | 1]]), and have to fix our existing code. This diff does this in a generic way, with no manual changes.
In order to detect all existing `ASSERT*` that are used in functions that doesn't return void value, I change the code to generate compile errors for such cases.
In `util/testharness.h` I defined `EXPECT*` assertions, the same way as `ASSERT*`, and redefined `ASSERT*` to return `void`. Then executed:
```lang=bash
% USE_CLANG=1 make all -j55 -k 2> build.log
% perl -naF: -e 'print "-- -number=".$F[1]." ".$F[0]."\n" if /: error:/' \
build.log | xargs -L 1 perl -spi -e 's/ASSERT/EXPECT/g if $. == $number'
% make format
```
After that I reverted back change to `ASSERT*` in `util/testharness.h`. But preserved introduced `EXPECT*`, which is the same as `ASSERT*`. This will be deleted once switched to gtest.
This diff is independent and contains manual changes only in `util/testharness.h`.
Test Plan:
Make sure all tests are passing.
```lang=bash
% USE_CLANG=1 make check
```
Reviewers: igor, lgalanis, sdong, yufei.zhu, rven, meyering
Reviewed By: meyering
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D33333
Summary: These changes are necessary to make tests look more generic, and avoid feature conflicts with gtest.
Test Plan:
Make sure no build errors, and all test are passing.
```
% make check
```
Reviewers: igor, meyering
Reviewed By: meyering
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D35145
Summary:
Pre-fetching is a common operation performed by data stores for
disk/flash based systems as part of database startup.
This is part of task 5197184.
Test Plan: Run the newly added unit test
Reviewers: rven, igor, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D33933
Summary:
When using latest clang (3.6 or 3.7/trunck) rocksdb is failing with many errors. Almost all of them are missing override errors. This diff adds missing override keyword. No manual changes.
Prerequisites: bear and clang 3.5 build with extra tools
```lang=bash
% USE_CLANG=1 bear make all # generate a compilation database http://clang.llvm.org/docs/JSONCompilationDatabase.html
% clang-modernize -p . -include . -add-override
% make format
```
Test Plan:
Make sure all tests are passing.
```lang=bash
% #Use default fb code clang.
% make check
```
Verify less error and no missing override errors.
```lang=bash
% # Have trunk clang present in path.
% ROCKSDB_NO_FBCODE=1 CC=clang CXX=clang++ make
```
Reviewers: igor, kradhakrishnan, rven, meyering, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D34077
Summary:
Adding -W and -Wextra to CXXFLAGS provoked this failure:
table/table_test.cc:1854:56: error: missing initializer for member ‘rocksdb::TestArgs::format_version’ [-Werror=missing-field-initializers]
TestArgs args = { DB_TEST, false, 16, kNoCompression };
^
Add the missing, 5th value (format_version).
Test Plan:
Run "make EXTRA_CXXFLAGS='-W -Wextra'" and see fewer errors.
Reviewers: ljin, sdong, igor.sugak, igor
Reviewed By: igor
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D33765
Summary:
This diff adds BlockBasedTable format_version = 2. New format version brings better compressed block format for these compressions:
1) Zlib -- encode decompressed size in compressed block header
2) BZip2 -- encode decompressed size in compressed block header
3) LZ4 and LZ4HC -- instead of doing memcpy of size_t encode size as varint32. memcpy is very bad because the DB is not portable accross big/little endian machines or even platforms where size_t might be 8 or 4 bytes.
It does not affect format for snappy.
If you write a new database with format_version = 2, it will not be readable by RocksDB versions before 3.10. DB::Open() will return corruption in that case.
Test Plan:
Added a new test in db_test.
I will also run db_bench and verify VSIZE when block_cache == 1GB
Reviewers: yhchiang, rven, MarkCallaghan, dhruba, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D31461
Summary:
In this diff I add another parameter to BlockBasedTableOptions that will let users specify block based table's format. This will greatly simplify block based table's format changes in the future.
First format change that this will support is encoding decompressed size in Zlib and BZip2 blocks. This diff is blocking https://reviews.facebook.net/D31311.
Test Plan: Added a unit tests. More tests to come as part of https://reviews.facebook.net/D31311.
Reviewers: dhruba, MarkCallaghan, yhchiang, rven, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D31383
Summary:
Introduces a new class for managing write buffer memory across column
families. We supplement ColumnFamilyOptions::write_buffer_size with
ColumnFamilyOptions::write_buffer, a shared pointer to a WriteBuffer
instance that enforces memory limits before flushing out to disk.
Test Plan: Added SharedWriteBuffer unit test to db_test.cc
Reviewers: sdong, rven, ljin, igor
Reviewed By: igor
Subscribers: tnovak, yhchiang, dhruba, xjin, MarkCallaghan, yoshinorim
Differential Revision: https://reviews.facebook.net/D22581
Summary:
In some environment such as android, the c++ library does not have
std::to_string. This path adds rocksdb::ToString(), which wraps std::to_string
when std::to_string is not available, and implements std::to_string
in the other case.
Test Plan:
make dbg -j32
./db_test
make clean
make dbg OPT=-DOS_ANDROID -j32
./db_test
Reviewers: ljin, sdong, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D29181
Summary:
The very last reference happens in DBImpl::GetOptions()
I built with both DBImpl::GetOptions() and ColumnFamilyData::options() commented out
Test Plan: make all check
Reviewers: sdong, yhchiang, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D29073