Commit graph

139 commits

Author SHA1 Message Date
Igor Canadi f9d0530213 Don't care about signed/unsigned compare
Summary:
We need to stop these:
https://github.com/facebook/rocksdb/pull/99
https://github.com/facebook/rocksdb/pull/83

Test Plan: no

Reviewers: dhruba, haobo, sdong, ljin, yhchiang

Reviewed By: ljin

CC: leveldb

Differential Revision: https://reviews.facebook.net/D16905
2014-03-17 09:41:41 -07:00
Igor Canadi 9c8ad62691 DB Sanity Test
Summary:
@kailiu mentioned on meeting yesterday that we sometimes have trouble opening DB created by old version with the new version. This will be very important to test for column families, since I'm changing disk format for the MANIFEST.

I added a tool that can help us test that. Usage:
./db_sanity_test <path> create
will create a bunch of DBs under <path>
<change RocksDB version>
./db_sanity_test <path> verify
will verify consistency of DBs created under <path>

Test Plan: ran the db_sanity_test

Reviewers: kailiu, dhruba, haobo

Reviewed By: kailiu

CC: leveldb, kailiu, xjin

Differential Revision: https://reviews.facebook.net/D16605
2014-03-06 11:36:39 -08:00
kailiu 906f3dca72 Add a hash-index component for block
Summary:
this is the key component extracted from diff: https://reviews.facebook.net/D14271
I separate it to a dedicated patch to make the review easier.

Test Plan: added a unit test and passed it.

Reviewers: haobo, sdong, dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D16245
2014-03-03 21:11:49 -08:00
kailiu 6b9da48a03 Get rid of all optimization flags in debug mode 2014-03-03 21:06:24 -08:00
Kai Liu 6ba1084f24 Fix some compilation bugs in different platforms
Summary:

detect some problems when testing my 3rd party release tool.
2014-02-27 22:20:17 -08:00
Lei Jin b2795b799e thread local pointer storage
Summary:
This is not a generic thread local implementation in the sense that it
only takes pointer. But it does support multiple instances per thread
and lets user plugin function to perform cleanup when thread exits or an
instance gets destroyed.

Test Plan: unit test for now

Reviewers: haobo, igor, sdong, dhruba

Reviewed By: igor

CC: leveldb, kailiu

Differential Revision: https://reviews.facebook.net/D16131
2014-02-25 17:47:37 -08:00
sdong 01c27be5fb A simple benchmark to measure WAL append latency
Summary: A simple benchmark that simulates WAL append. It can be used to test different platform/file system's performance on WAL.

Test Plan: run it.

Reviewers: haobo, kailiu

Reviewed By: haobo

CC: igor, dhruba, i.am.jin.lei, yhchiang, leveldb, nkg-

Differential Revision: https://reviews.facebook.net/D16239
2014-02-24 14:39:32 -08:00
sdong b2d29675c8 Add a test in prefix_test to verify correctness of results
Summary:
Add a test to verify HashLinkList and HashSkipList (mainly for the former one) returns the correct results when inserting the same bucket in the different orders.

Some other changes:
(1) add the test to test list
(2) fix compile error
(3) add header

Test Plan: ./prefix_test

Reviewers: haobo, kailiu

Reviewed By: haobo

CC: igor, yhchiang, i.am.jin.lei, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D16143
2014-02-19 17:00:34 -08:00
sdong b5140a0361 Fix table_reader_bench and add it to "make"
Summary: Fix table_reader_bench after some interface changes. Add it to make to avoid future breaking

Test Plan: make table_reader_bench and run it with different options.

Reviewers: kailiu, haobo

Reviewed By: haobo

CC: igor, leveldb

Differential Revision: https://reviews.facebook.net/D16107
2014-02-12 18:31:02 -08:00
Kai Liu d4b789fdee Add LIBRARY back to make dbg 2014-02-10 20:15:09 -08:00
kailiu 1d08140e81 Compile -O2 by default and add make dbg
Summary: To speed up the compilation while allowing us to compile in debug mode.

Test Plan:
make: see -O2 enabled
make dbg: didn't see -O2

Reviewers: igor

Reviewed By: igor

CC: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D15969
2014-02-06 16:11:35 -08:00
Kai Liu fd0ffbc7ca Disable the html-based coverage report by default 2014-02-06 12:58:13 -08:00
Igor Canadi ac92420fc5 Merge branch 'master' into performance
Conflicts:
	db/db_impl.h
2014-01-30 10:09:23 -08:00
Kai Liu b7db241118 LIBNAME in Makefile is not really configurable
Summary:
In new third-party release tool, `LIBNAME=<customized_library> make`
will not really change the LIBNAME.

However it's very odd that the same approach works with old third-party
release tools. I checked previous rocksdb version and both librocksdb.a
and librocksdb_debug.a were correctly generated and copied to the
right place.

Test Plan:
`LIBNAME=hello make -j32` generates hello.a
`make -j32` generates librocksdb.a

Reviewers: igor, sdong, haobo, dhruba

Reviewed By: igor

CC: leveldb

Differential Revision: https://reviews.facebook.net/D15555
2014-01-29 11:35:05 -08:00
kailiu a5e220f5ef Merge branch 'master' into performance
Conflicts:
	Makefile
	db/db_impl.cc
	db/db_test.cc
	db/memtable_list.cc
	db/memtable_list.h
	table/block_based_table_reader.cc
	table/table_test.cc
	util/cache.cc
	util/coding.cc
2014-01-28 10:35:55 -08:00
kailiu f131d4c280 Add a make target for shared library
Summary:
Previous we made `make release` also compile shared library. However it takes a long time to complete.

To make our development process more efficient. I added a new make target shared_lib.

User can of course run `make <library_name>` for direct compilation. However the <library_name> changed under certain condition. Thus we need `make shared_lib` to get rid of the memorization from users' side.

Test Plan: make shared_lib

Reviewers: igor, sdong, haobo, dhruba

Reviewed By: igor

CC: leveldb

Differential Revision: https://reviews.facebook.net/D15309
2014-01-24 11:56:01 -08:00
Kai Liu 23576d773f Remove the extra line in "make release"
Summary:

that line was introduced during merge.
2014-01-16 17:01:34 -08:00
kailiu 1304d8c8ce Merge branch 'master' into performance
Conflicts:
	Makefile
	db/db_impl.cc
	db/db_impl.h
	db/db_test.cc
	db/memtable.cc
	db/memtable.h
	db/version_edit.h
	db/version_set.cc
	include/rocksdb/options.h
	util/hash_skiplist_rep.cc
	util/options.cc
2014-01-15 23:12:31 -08:00
kailiu 481c77e526 Move the compilation of the shared libraries to "make release"
Compiling the shared libraries took a long time. Thus to speed up the development speed, it still makes sense to be separated from regular compilation.
2014-01-14 13:54:33 -08:00
Kai Liu d702d8073e A script that automatically reformat affected lines
Summary:
Added a script that reformat only the affected lines in a given diff.

I planned to make that file as pre-commit hook but looks it's a little bit more difficult than I thought. Since I don't want to spend too much time on this task right now, I eventually added a "make command" to achieve this with a few additional key strokes.

Also make the clang-format solely inherited from Google's style -- there are still debates on some of the style issues, but we can address them later once we reach a consensus.

Test Plan: Did some ugly format change and ran "make format", all affected lines are formatted as expected.

Reviewers: igor, sdong, haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D15147
2014-01-14 12:21:24 -08:00
kailiu ac2fe72832 Compile dynamic library by default
Summary:
Per request, some users need to use dynamic rocksdb library instead of static one.

However currently the dynamic libraries have to be manually compiled by default, which is inconvenient. I made dymamic libraries to be compiled by default.

Test Plan: make clean; make; make clean;

Reviewers: haobo, sdong, dhruba, igor

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D15117
2014-01-14 00:28:10 -08:00
kailiu e72aa37cc5 Merge branch 'master' into performance
Conflicts:
	db/table_cache.cc
2014-01-02 16:34:59 -08:00
Igor Canadi 345fb94d26 moving autovector_test after db_test 2014-01-02 03:30:29 -08:00
kailiu f1cec73a76 Merge branch 'master' into performance
Conflicts:
	db/db_impl.cc
	db/db_test.cc
	db/memtable.cc
	db/version_set.cc
	include/rocksdb/statistics.h
2013-12-27 12:23:17 -08:00
kailiu c01676e46d Implement autovector
Summary:
A vector that leverages pre-allocated stack-based array to achieve better
performance for array with small amount of items.

Test Plan:
Added tests for both correctness and performance

Here is the performance benchmark between vector and autovector

Please note that in the test "Creation and Insertion Test", the test case were designed with the motivation described below:

* no element inserted: internal array of std::vector may not really get
  initialize.
* one element inserted: internal array of std::vector must have
  initialized.
* kSize elements inserted. This shows the most time we'll spend if we
  keep everything in stack.
* 2 * kSize elements inserted. The internal vector of
  autovector must have been initialized.

Note: kSize is the capacity of autovector

  =====================================================
  Creation and Insertion Test
  =====================================================
  created 100000 vectors:
  	each was inserted with 0 elements
  	total time elapsed: 128000 (ns)
  created 100000 autovectors:
  	each was inserted with 0 elements
  	total time elapsed: 3641000 (ns)
  created 100000 VectorWithReserveSizes:
  	each was inserted with 0 elements
  	total time elapsed: 9896000 (ns)
  -----------------------------------
  created 100000 vectors:
  	each was inserted with 1 elements
  	total time elapsed: 11089000 (ns)
  created 100000 autovectors:
  	each was inserted with 1 elements
  	total time elapsed: 5008000 (ns)
  created 100000 VectorWithReserveSizes:
  	each was inserted with 1 elements
  	total time elapsed: 24271000 (ns)
  -----------------------------------
  created 100000 vectors:
  	each was inserted with 4 elements
  	total time elapsed: 39369000 (ns)
  created 100000 autovectors:
  	each was inserted with 4 elements
  	total time elapsed: 10121000 (ns)
  created 100000 VectorWithReserveSizes:
  	each was inserted with 4 elements
  	total time elapsed: 28473000 (ns)
  -----------------------------------
  created 100000 vectors:
  	each was inserted with 8 elements
  	total time elapsed: 75013000 (ns)
  created 100000 autovectors:
  	each was inserted with 8 elements
  	total time elapsed: 18237000 (ns)
  created 100000 VectorWithReserveSizes:
  	each was inserted with 8 elements
  	total time elapsed: 42464000 (ns)
  -----------------------------------
  created 100000 vectors:
  	each was inserted with 16 elements
  	total time elapsed: 102319000 (ns)
  created 100000 autovectors:
  	each was inserted with 16 elements
  	total time elapsed: 76724000 (ns)
  created 100000 VectorWithReserveSizes:
  	each was inserted with 16 elements
  	total time elapsed: 68285000 (ns)
  -----------------------------------
  =====================================================
  Sequence Access Test
  =====================================================
  performed 100000 sequence access against vector
  	size: 4
  	total time elapsed: 198000 (ns)
  performed 100000 sequence access against autovector
  	size: 4
  	total time elapsed: 306000 (ns)
  -----------------------------------
  performed 100000 sequence access against vector
  	size: 8
  	total time elapsed: 565000 (ns)
  performed 100000 sequence access against autovector
  	size: 8
  	total time elapsed: 512000 (ns)
  -----------------------------------
  performed 100000 sequence access against vector
  	size: 16
  	total time elapsed: 1076000 (ns)
  performed 100000 sequence access against autovector
  	size: 16
  	total time elapsed: 1070000 (ns)
  -----------------------------------

Reviewers: dhruba, haobo, sdong, chip

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14655
2013-12-26 15:03:47 -08:00
Igor Canadi e914b6490d Reorder tests
Summary:
db_test should be the first to execute because it finds the most bugs.

Also, when third parties report issues, we don't want ldb error message, we prefer to have db_test error message. For example, see thread: https://github.com/facebook/rocksdb/issues/25

Test Plan: make check

Reviewers: dhruba, haobo, kailiu

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14715
2013-12-18 13:37:06 -08:00
Haobo Xu 3c02c363b3 [RocksDB] [Performance Branch] Added dynamic bloom, to be used for memable non-existing key filtering
Summary: as title

Test Plan: dynamic_bloom_test

Reviewers: dhruba, sdong, kailiu

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14385
2013-12-11 00:15:14 -08:00
Igor Canadi 0e2c966f58 Merge pull request #29 from sepeth/fix-shared-lib-build
Build shared lib and put linker flags to the end
2013-12-10 09:19:12 -08:00
Doğan Çeçen f6012ab826 Fix shared lib build 2013-12-10 09:35:22 +02:00
Igor Canadi fb9fce4fc3 [RocksDB] BackupableDB
Summary:
In this diff I present you BackupableDB v1. You can easily use it to backup your DB and it will do incremental snapshots for you.
Let's first describe how you would use BackupableDB. It's inheriting StackableDB interface so you can easily construct it with your DB object -- it will add a method RollTheSnapshot() to the DB object. When you call RollTheSnapshot(), current snapshot of the DB will be stored in the backup dir. To restore, you can just call RestoreDBFromBackup() on a BackupableDB (which is a static method) and it will restore all files from the backup dir. In the next version, it will even support automatic backuping every X minutes.

There are multiple things you can configure:
1. backup_env and db_env can be different, which is awesome because then you can easily backup to HDFS or wherever you feel like.
2. sync - if true, it *guarantees* backup consistency on machine reboot
3. number of snapshots to keep - this will keep last N snapshots around if you want, for some reason, be able to restore from an earlier snapshot. All the backuping is done in incremental fashion - if we already have 00010.sst, we will not copy it again. *IMPORTANT* -- This is based on assumption that 00010.sst never changes - two files named 00010.sst from the same DB will always be exactly the same. Is this true? I always copy manifest, current and log files.
4. You can decide if you want to flush the memtables before you backup, or you're fine with backing up the log files -- either way, you get a complete and consistent view of the database at a time of backup.
5. More things you can find in BackupableDBOptions

Here is the directory structure I use:

   backup_dir/CURRENT_SNAPSHOT - just 4 bytes holding the latest snapshot
               0, 1, 2, ... - files containing serialized version of each snapshot - containing a list of files
               files/*.sst - sst files shared between snapshots - if one snapshot references 00010.sst and another one needs to backup it from the DB, it will just reference the same file
               files/ 0/, 1/, 2/, ... - snapshot directories containing private snapshot files - current, manifest and log files

All the files are ref counted and deleted immediatelly when they get out of scope.

Some other stuff in this diff:
1. Added GetEnv() method to the DB. Discussed with @haobo and we agreed that it seems right thing to do.
2. Fixed StackableDB interface. The way it was set up before, I was not able to implement BackupableDB.

Test Plan:
I have a unittest, but please don't look at this yet. I just hacked it up to help me with debugging. I will write a lot of good tests and update the diff.

Also, `make asan_check`

Reviewers: dhruba, haobo, emayanke

Reviewed By: dhruba

CC: leveldb, haobo

Differential Revision: https://reviews.facebook.net/D14295
2013-12-09 14:06:52 -08:00
Kai Liu 8c424456fc Make the default compilation debug-friendly 2013-12-02 20:05:16 -08:00
Siying Dong b59d4d5a50 A Simple Plain Table
Summary:
A Simple plain table format. No block structure. When creating the table reader, scanning the full table to create indexes.

Test Plan:Add unit test

Reviewers:haobo,dhruba,kailiu

CC:

Task ID: #

Blame Rev:
2013-11-20 18:44:22 -08:00
Igor Canadi 8906ab59f7 Split asan_check into asan_check and asan_crash_test 2013-11-19 16:44:40 -08:00
Igor Canadi 92d905026b make asan_check
Summary: Add asan_check rule to Makefile. After we add this, we will create Jenkins run that will check for asan errors!

Test Plan: make asan_check

Reviewers: dhruba, kailiu, haobo

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14205
2013-11-19 16:33:24 -08:00
kailiu 1415f8820d Improve the "table stats"
Summary:
The primary motivation of the changes is to make it easier to figure out the inside of the tables.

* rename "table stats" to "table properties" since now we have more than "integers" to store in the property block.
* Add filter block size to the basic table properties.
* Whenever a table is built, we'll log the table properties (the sample output is in Test Plan).
* Make an api to expose deleted keys.

Test Plan:
Passed all existing test. and the sample output of table stats:

    ==================================================================
        Basic Properties
    ------------------------------------------------------------------
                  # data blocks: 1
                      # entries: 1

                   raw key size: 9
           raw average key size: 9
                 raw value size: 9
         raw average value size: 0

                data block size: 25
               index block size: 27
              filter block size: 18
         (estimated) table size: 70

                  filter policy: rocksdb.BuiltinBloomFilter
    ==================================================================
        User collected properties: InternalKeyPropertiesCollector
    ------------------------------------------------------------------
                    kDeletedKeys: 1
    ==================================================================

Reviewers: dhruba, haobo

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14187
2013-11-19 16:29:42 -08:00
Igor Canadi f611aba559 Move the compiler back to 4.8.1 + more small fixes
Summary:
1. Moved the compiler back to 4.8.1 and uses Centos 5.2 binaries if OS is Centos 5.2.

2. Fixes this issue: https://github.com/facebook/rocksdb/issues/7

3. We use lot of c++11 features, so we can't pretend we can compile without them. Makes it a first class dependency.

4. Fix blob_store_test, which failes on Ubuntu with "too many files opened" error

5. Removed dependency on port/port_chromium.h, which does not even exist on our system

Test Plan: make clean; make check

Reviewers: dhruba, kailiu

Reviewed By: kailiu

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14145
2013-11-18 11:40:16 -08:00
kailiu 97d8e573a6 make util/env_posix.cc work under mac
Summary: This diff invoves some more complicated issues in the posix environment.

Test Plan: works under mac os. will need to verify dev box.

Reviewers: dhruba

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14061
2013-11-16 23:44:39 -08:00
Kai Liu aed9f1fa5e The updated sed still doesn't work in mac, revert. 2013-11-12 21:40:25 -08:00
Kai Liu f3b3316a07 Fix a sed command issue that cannot generated *.d files
Summary:

The original sed command is not recognized by mac's sed, which generates ".d-e" extension instead of ".d"

Test Plan:

make clean && make -j32
2013-11-12 21:26:19 -08:00
kailiu 21587760b9 Fixing the warning messages captured under mac os # Consider using git commit -m 'One line title' && arc diff. # You will save time by running lint and unit in the background.
Summary: The work to make sure mac os compiles rocksdb is not completed yet. But at least we can start cleaning some warnings captured only by g++ from mac os..

Test Plan: ran make in mac os

Reviewers: dhruba

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14049
2013-11-12 20:05:28 -08:00
Haobo Xu fe4a449472 [RocksDB] prefixhash memtable test
Summary: as title, half baked test for prefixhash memtable. Also contains deadlock test option

Test Plan: run it

Reviewers: igor, dhruba

Reviewed By: igor

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13887
2013-11-05 23:20:10 -08:00
Siying Dong 7caadf2e52 A very simple benchmark to measure Table implemenation's Get() And Iterator performance
Summary: It is a very simple benchmark to measure a Table implementation's Get() and iterator performance if all the data is in memory.

Test Plan: N/A

Reviewers: dhruba, haobo, kailiu

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13743
2013-10-31 13:38:54 -07:00
Siying Dong d4eec30ed0 Make "Table" pluggable
Summary: This patch makes Table and TableBuilder a abstract class and make all the implementation of the current table into BlockedBasedTable and BlockedBasedTable Builder.

Test Plan: Make db_test.cc to work with block based table. Add a new test simple_table_db_test.cc where a different simple table format is implemented.

Reviewers: dhruba, haobo, kailiu, emayanke, vamsi

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13521
2013-10-28 17:54:09 -07:00
Kai Liu 994575c134 Support user-defined table stats collector
Summary:
1. Added a new option that support user-defined table stats collection.
2. Added a deleted key stats collector in `utilities`

Test Plan:
Added a unit test for newly added code.
Also ran make check to make sure other tests are not broken.

Reviewers: dhruba, haobo

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13491
2013-10-28 15:45:14 -07:00
Igor Canadi 7e2c1ba173 BlobStore Benchmark
Summary:
Finally, arc diff works again! This has been sitting in my repo for a while.

I would like some comments on my BlobStore benchmark. We don't have to check this in.

Also, I don't do any fsync in the BlobStore, so this is all extremely fast. I'm not sure what durability guarantees we need from the BlobStore.

Test Plan: Nope

Reviewers: dhruba, haobo, kailiu, emayanke

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13527
2013-10-23 17:31:12 -07:00
Kai Liu b37fda842d Improve the comment for the shared library in Make file 2013-10-22 22:40:16 -07:00
Igor Canadi fc4616d898 External Value Store
Summary:
Developing a capability for storing values on external backing file(s).

This is just a highly unoptimized first pass - supports:
1) Allocating some portion of external file to be used to store value
2) Freeing the range, enabling it to be reused by other values

As next steps, I plan to:
1) Create some kind of stress testing. Once I can measure stuff, I can focus on optimizing.
2) Optimize locking.
3) Optimize freelist data structure. Currently we have O(n) for both freeing and allocation.
4) Figure out how to do recovery.

Test Plan: Created a unit test.

Reviewers: dhruba, haobo, kailiu

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13389
2013-10-16 17:33:49 -07:00
Dhruba Borthakur 0a9f873f4b Removed scribe, thrift and java modules.
Summary: Removed scribe, thrift and java modules.

Test Plan:
make release
make check

Reviewers: emayanke

Reviewed By: emayanke

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13293
2013-10-04 15:36:00 -07:00
Kai Liu 4c6dc7a9ae Fix the gcov/lcov related issues
Summary:

Jenkin reports errors that:

* Linking error on some machines. The error message shows it cannot find some gcov related symbols.
* lcov error due to the version issues.

Test Plan:

run make in different platforms

Reviewers:

CC:

Task ID: #

Blame Rev:
2013-08-22 17:01:06 -07:00
Simha Venkataramaiah 60bf2b7d4a Add APIs to query SST file metadata and to delete specific SST files
Summary: An api to query the level, key ranges, size etc for each SST file and an api to delete a specific file from the db and all associated state in the bookkeeping datastructures.

Notes: Editing the manifest version does not release the obsolete files right away. However deleting the file directly will mess up the iterator. We may need a more aggressive/timely file deletion api.

I have used std::unique_ptr - will switch to boost:: since this is external. thoughts?

Unit test is fragile right now as it expects the compaction at certain levels.

Test Plan: unittest

Reviewers: dhruba, vamsi, emayanke

CC: zshao, leveldb, haobo

Task ID: #

Blame Rev:
2013-08-22 15:27:19 -07:00