Commit Graph

510 Commits

Author SHA1 Message Date
Igor Canadi 26f5dd9a5a TablePropertiesCollectorFactory
Summary:
This diff addresses task #4296714 and rethinks how users provide us with TablePropertiesCollectors as part of Options.

Here's description of task #4296714:
       I'm debugging #4295529 and noticed that our count of user properties kDeletedKeys is wrong. We're sharing one single InternalKeyPropertiesCollector with all Table Builders. In LOG Files, we're outputting number of kDeletedKeys as connected with a single table, while it's actually the total count of deleted keys since creation of the DB.

       For example, this table has 3155 entries and 1391828 deleted keys.

The problem with current approach that we call methods on a single TablePropertiesCollector for all the tables we create. Even worse, we could do it from multiple threads at the same time and TablePropertiesCollector has no way of knowing which table we're calling it for.

Good part: Looks like nobody inside Facebook is using Options::table_properties_collectors. This means we should be able to painfully change the API.

In this change, I introduce TablePropertiesCollectorFactory. For every table we create, we call `CreateTablePropertiesCollector`, which creates a TablePropertiesCollector for a single table. We then use it sequentially from a single thread, which means it doesn't have to be thread-safe.

Test Plan:
Added a test in table_properties_collector_test that fails on master (build two tables, assert that kDeletedKeys count is correct for the second one).
Also, all other tests

Reviewers: sdong, dhruba, haobo, kailiu

Reviewed By: kailiu

CC: leveldb

Differential Revision: https://reviews.facebook.net/D18579
2014-05-13 12:30:55 -07:00
Igor Canadi d08073aafc Merge pull request #141 from dallasmarlow/master
only use MAP_HUGETLB when the kernel supports it
2014-05-11 15:44:32 -07:00
Dallas Marlow 557fbc9b3b arena spacing 2014-05-11 10:22:28 -04:00
Igor Canadi 038a477b53 Make it easier to start using RocksDB
Summary:
This diff is addressing multiple things with a single goal -- to make RocksDB easier to use:
* Add some functions to Options that make RocksDB easier to tune.
* Add example code for both simple RocksDB and RocksDB with Column Families.
* Rewrite our README.md

Regarding Options, I took a stab at something we talked about for a long time:
* https://www.facebook.com/groups/rocksdb.dev/permalink/563169950448190/

I added functions:
* IncreaseParallelism() -- easy, increases the thread pool and max_background_compactions
* OptimizeLevelStyleCompaction(memtable_memory_budget) -- the easiest way to optimize rocksdb for less stalls with level style compaction. This is very likely not ideal configuration. Feel free to suggest improvements. I used some of Mark's suggestions from here: https://github.com/facebook/rocksdb/issues/54
* OptimizeUniversalStyleCompaction(memtable_memory_budget) -- optimize for universal compaction.

Test Plan: compiled rocksdb. ran examples.

Reviewers: dhruba, MarkCallaghan, haobo, sdong, yhchiang

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D18621
2014-05-10 10:49:33 -07:00
Dallas Marlow 030db3d17e testing 2014-05-09 18:58:39 -04:00
Tyler Neely deb89401fd have proprocessor choose correct mmap args 2014-05-09 14:49:25 -04:00
Igor Canadi fec4269966 Fix more gflag namespace issues 2014-05-09 08:41:02 -07:00
Igor Canadi af7453a673 autovector::resize
Summary: Resize the autovector!

Test Plan: test

Reviewers: sdong

Reviewed By: sdong

CC: leveldb

Differential Revision: https://reviews.facebook.net/D18543
2014-05-08 13:50:49 -07:00
sdong 1c6a027d23 HashLinkedList::Iterator: remove an ununsed class variable
Summary: This variable is not used. Remove it.

Test Plan: build.

Reviewers: haobo, igor, yhchiang

Reviewed By: haobo

CC: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D18525
2014-05-08 13:28:52 -07:00
dallas marlow f41cde3105 remove anon mmap allocation flag `MAP_HUGETLB` 2014-05-08 11:45:44 -04:00
sdong 3a171dcb51 Pass logger to memtable rep and TLB page allocation error logged to info logs
Summary:
TLB page allocation errors are now logged to info logs, instead of stderr.
In order to do that, mem table rep's factory functions take a info logger now.

Test Plan: make all check

Reviewers: haobo, igor, yhchiang

Reviewed By: yhchiang

CC: leveldb, yhchiang, dhruba

Differential Revision: https://reviews.facebook.net/D18471
2014-05-05 16:43:37 -07:00
Igor Canadi d2569fea47 log_and_apply_bench on a new benchmark framework
Summary:
db_test includes Benchmark for LogAndApply. This diff removes it from db_test and puts it into a separate log_and_apply bench. I just wanted to play around with our new benchmark framework and figure out how it works.

I would also like to show you how great it is! I believe right set of microbenchmarks can speed up our productivity a lot and help catch early regressions.

Test Plan: no

Reviewers: dhruba, haobo, sdong, ljin, yhchiang

Reviewed By: yhchiang

CC: leveldb

Differential Revision: https://reviews.facebook.net/D18261
2014-05-05 11:11:48 -07:00
Igor Canadi 6785a52b1b Temporary remove perror() calls before we can log from inside of arena 2014-05-05 07:13:48 -07:00
sdong 4a7c747064 Revert "Revert "Allow allocating dynamic bloom, plain table indexes and hash linked list from huge page TLB""
And make the default 0 for hash linked list memtable

This reverts commit d69dc64be7.
2014-05-04 13:56:29 -07:00
Igor Canadi d69dc64be7 Revert "Allow allocating dynamic bloom, plain table indexes and hash linked list from huge page TLB"
This reverts commit 7dafa3a1d7.
2014-05-04 08:37:09 -07:00
Igor Canadi 4ecfbcf865 ApplyToAllCacheEntries
Summary: Added a method that executes a callback on every cache entry.

Test Plan: added a unit test

Reviewers: haobo

Reviewed By: haobo

CC: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D18441
2014-05-02 16:24:04 -04:00
Igor Canadi 0afc8bc29a xxHash
Summary:
Originally: https://github.com/facebook/rocksdb/pull/87/files

I'm taking over to apply some finishing touches

Test Plan: will add tests

Reviewers: dhruba, haobo, sdong, yhchiang, ljin

Reviewed By: yhchiang

CC: leveldb

Differential Revision: https://reviews.facebook.net/D18315
2014-05-01 14:09:32 -04:00
Igor Canadi df70047669 Flush stale column families
Summary:
Added a new option `max_total_wal_size`. Once the total WAL size goes over that, we make an attempt to flush all column families that still have data in the earliest WAL file.

By default, I calculate `max_total_wal_size` dynamically, that should be good-enough for non-advanced customers.

Test Plan: Added a test

Reviewers: dhruba, haobo, sdong, ljin, yhchiang

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D18345
2014-04-30 14:33:40 -04:00
sdong 7dafa3a1d7 Allow allocating dynamic bloom, plain table indexes and hash linked list from huge page TLB
Summary: Add an option to allocate a piece of memory from huge page TLB. Add options to trigger it in dynamic bloom, plain table indexes andhash linked list hash table.

Test Plan: make all check

Reviewers: haobo, ljin

Reviewed By: haobo

CC: nkg-, dhruba, leveldb, igor, yhchiang

Differential Revision: https://reviews.facebook.net/D18357
2014-04-30 11:02:26 -07:00
Igor Canadi 66f88c43a5 Some fixes as preparation for release 2014-04-30 09:03:24 -07:00
Yueh-Hsuan Chiang 9d9d2965cb Add a new mem-table representation based on cuckoo hash.
Summary:
= Major Changes =
* Add a new mem-table representation, HashCuckooRep, which is based cuckoo hash.
  Cuckoo hash uses multiple hash functions.  This allows each key to have multiple
  possible locations in the mem-table.

  - Put: When insert a key, it will try to find whether one of its possible
    locations is vacant and store the key.  If none of its possible
    locations are available, then it will kick out a victim key and
    store at that location.  The kicked-out victim key will then be
    stored at a vacant space of its possible locations or kick-out
    another victim.  In this diff, the kick-out path (known as
    cuckoo-path) is found using BFS, which guarantees to be the shortest.

 - Get: Simply tries all possible locations of a key --- this guarantees
   worst-case constant time complexity.

 - Time complexity: O(1) for Get, and average O(1) for Put if the
   fullness of the mem-table is below 80%.

 - Default using two hash functions, the number of hash functions used
   by the cuckoo-hash may dynamically increase if it fails to find a
   short-enough kick-out path.

 - Currently, HashCuckooRep does not support iteration and snapshots,
   as our current main purpose of this is to optimize point access.

= Minor Changes =
* Add IsSnapshotSupported() to DB to indicate whether the current DB
  supports snapshots.  If it returns false, then DB::GetSnapshot() will
  always return nullptr.

Test Plan:
Run existing tests.  Will develop a test specifically for cuckoo hash in
the next diff.

Reviewers: sdong, haobo

Reviewed By: sdong

CC: leveldb, dhruba, igor

Differential Revision: https://reviews.facebook.net/D16155
2014-04-29 17:13:46 -07:00
Igor Canadi 38693d99c4 Fix more signed/unsigned comparsions 2014-04-29 12:40:18 -07:00
Igor Canadi d0b006ab13 Fix unsigned and signed integer compare 2014-04-29 12:29:22 -07:00
Igor Canadi f868dcbbed Support for adding TTL-ed column family
Summary:
This enables user to add a TTL column family to normal DB.

Next step should be to expand StackableDB and create StackableColumnFamily, such that users can for example add geo-spatial column families to normal DB.

Test Plan: added a test

Reviewers: dhruba, haobo, ljin

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D18201
2014-04-28 20:34:20 -07:00
Igor Canadi 72ff275e3c Fix TransactionLogIterator EOF caching
Summary:
When TransactionLogIterator comes to EOF, it calls UnmarkEOF and continues reading. However, if glibc cached the EOF status of the file, it will get EOF again, even though the new data might have been written to it.

This has been causing errors in Mac OS.

Test Plan: test passes, was failing before

Reviewers: dhruba, haobo, sdong

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D18381
2014-04-28 23:30:27 -04:00
Lei Jin 3995e801ab kill ReadOptions.prefix and .prefix_seek
Summary:
also add an override option total_order_iteration if you want to use full
iterator with prefix_extractor

Test Plan: make all check

Reviewers: igor, haobo, sdong, yhchiang

Reviewed By: haobo

CC: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D17805
2014-04-25 12:21:34 -07:00
Igor Canadi 46b3076c91 Better stack trace in MAC
Summary:
Now this gives us the real deal stack trace:

    Assertion failed: (false), function GetProperty, file db/db_impl.cc, line 4072.
    Received signal 6 (Abort trap: 6)
    #0   0x7fff57ce39b9
    #1   abort (in libsystem_c.dylib) + 125
    #2   basename (in libsystem_c.dylib) + 0
    #3   rocksdb::DBImpl::GetProperty(rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >*) (in db_test) (db_impl.cc:4072)
    #4   rocksdb::_Test_Empty::_Run() (in db_test) (testharness.h:68)
    #5   rocksdb::_Test_Empty::_RunIt() (in db_test) (db_test.cc:1005)
    #6   rocksdb::test::RunAllTests() (in db_test) (testharness.cc:60)
    #7   main (in db_test) (db_test.cc:6697)
    #8   start (in libdyld.dylib) + 1

Test Plan: added artificial assert, saw great stack trace

Reviewers: haobo, dhruba, ljin

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D18309
2014-04-25 09:50:51 -04:00
Igor Canadi fc3127e8de Install stack trace handlers in unit tests
Summary: Sometimes, our tests fail because of normal `assert` call. It would be helpful to see stack trace in that case, too.

Test Plan: Added `assert(false)` and verified it prints out stack trace

Reviewers: haobo, dhruba, sdong, ljin, yhchiang

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D18291
2014-04-24 13:52:20 -04:00
Igor Canadi f9f8965e96 Print out stack trace in mac, too
Summary: While debugging Mac-only issue with ThreadLocalPtr, this was very useful. Let's print out stack trace in MAC OS, too.

Test Plan: Verified that somewhat useful stack trace was generated on mac. Will run PrintStack() on linux, too.

Reviewers: ljin, haobo

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D18189
2014-04-23 09:11:35 -04:00
Lei Jin 547bb6a626 simplify ThreadLocalPtr a little bit
Summary:
make singleton a static member instead of dynamic object. This should
also avoid the race on unique_ptr

Test Plan: make all check

Reviewers: igor, haobo, sdong

Reviewed By: igor

CC: leveldb

Differential Revision: https://reviews.facebook.net/D18177
2014-04-22 21:13:34 -07:00
Igor Canadi 1068d2fa60 Revert "Better port::Mutex::AssertHeld() and AssertNotHeld()"
This reverts commit ddafceb6c2.
2014-04-22 18:38:10 -07:00
Igor Canadi ddafceb6c2 Better port::Mutex::AssertHeld() and AssertNotHeld()
Summary:
Using ThreadLocalPtr as a flag to determine if a mutex is locked or not enables us to implement AssertNotHeld(). It also makes AssertHeld() actually correct.

I had to remove port::Mutex as a dependency for util/thread_local.h, but that's fine since we can just use std::mutex :)

Test Plan: make check

Reviewers: ljin, dhruba, haobo, sdong, yhchiang

Reviewed By: ljin

CC: leveldb

Differential Revision: https://reviews.facebook.net/D18171
2014-04-22 17:26:21 -07:00
Igor Canadi c2da9e5997 Flush before Fsync()/Sync()
Summary: Calling Fsync()/Sync() on a file should give the guarantee that whatever you written to the file is now persisted. This is currently not the case, since we might have some data left in application cache as we do Fsync()/Sync(). For example, BuildTable() calls Fsync() without the flush, assuming all sst data is now persisted, but it's actually not. This may result in big inconsistencies.

Test Plan: no test

Reviewers: sdong, dhruba, haobo, ljin, yhchiang

Reviewed By: sdong

CC: leveldb

Differential Revision: https://reviews.facebook.net/D18159
2014-04-21 17:45:04 -07:00
Igor Canadi ba16c1f410 Move benchmark timing to Env::NowNanos() 2014-04-21 17:43:51 -07:00
Igor Canadi 8dc34364d2 Rename "benchmark" back to "bench".
Also, make `benchharness.cc` not compiled into rocksdb library.
2014-04-21 13:12:15 -07:00
Igor Canadi 05c168658e Relax env_test::AllocateTest 2014-04-21 12:56:32 -07:00
Pratyush Seth ff1b5df4c6 Added benchmark functionality on the lines of folly/Benchmark.h
Summary: Added benchmark functionality on the lines of folly/Benchmark.h

Test Plan: Added unit tests

Reviewers: igor, haobo, sdong, ljin, yhchiang, dhruba

Reviewed By: igor

CC: leveldb

Differential Revision: https://reviews.facebook.net/D17973
2014-04-21 12:29:55 -07:00
Igor Canadi c7076a7a05 Fix Allocate test
Summary: For some reason, on a subset of our continuous build machines, preallocation is allocating 8 block more than it should be. Let's relax the test a little bit -- now we require the test to allocate *at least* the number of blocks as we told them to.

Test Plan: no

Reviewers: ljin, haobo, sdong

Reviewed By: ljin

CC: leveldb

Differential Revision: https://reviews.facebook.net/D18141
2014-04-21 12:12:02 -07:00
Igor Canadi 86ae8203e6 Fix ifdef NDEBUG 2014-04-17 14:29:28 -07:00
Igor Canadi 62551b1c4e Don't compile sync_point if NDEBUG
Summary:
We don't really need sync_point.o if we're compiling with NDEBUG.

This diff depends on D17823

Test Plan: compiles

Reviewers: haobo, ljin, sdong

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D17829
2014-04-17 10:49:58 -07:00
Igor Canadi 588bca2020 RocksDBLite
Summary:
Introducing RocksDBLite! Removes all the non-essential features and reduces the binary size. This effort should help our adoption on mobile.

Binary size when compiling for IOS (`TARGET_OS=IOS m static_lib`) is down to 9MB from 15MB (without stripping)

Test Plan: compiles :)

Reviewers: dhruba, haobo, ljin, sdong, yhchiang

Reviewed By: yhchiang

CC: leveldb

Differential Revision: https://reviews.facebook.net/D17835
2014-04-15 13:39:26 -07:00
Igor Canadi 23c8f89b57 Revert "Don't compile ldb tool into static library"
This reverts commit e296577ef6.
2014-04-15 11:29:02 -07:00
Igor Canadi e296577ef6 Don't compile ldb tool into static library
Summary:
This is first step of my effort to reduce size of librocksdb.a for use in mobile.

ldb object files are huge and are ment to be used as a command line tool. I moved them to `tools/` directory and include them only when compiling `ldb`

This diff reduced librocksdb.a from 42MB to 39MB on my mac (not stripped).

Test Plan: ran ldb

Reviewers: dhruba, haobo, sdong, ljin, yhchiang

Reviewed By: yhchiang

CC: leveldb

Differential Revision: https://reviews.facebook.net/D17823
2014-04-15 10:52:39 -07:00
Igor Canadi 9433e359d1 Dont build version if compiling for IOS 2014-04-11 10:54:47 -07:00
Igor Canadi 6c0fc5d3ba Fix some InfoLogLevels 2014-04-10 16:49:49 -07:00
Igor Canadi ddef6841b3 Renamed InfoLogLevel::DEBUG to InfoLogLevel::DEBUG_LEVEL
Summary: XCode for some reason injects `#define DEBUG 1` into our code, which makes compile fail because we use `DEBUG` keyword for other stuff. This diff fixes the issue by renaming `DEBUG` to `DEBUG_LEVEL`.

Test Plan: compiles

Reviewers: dhruba, haobo, sdong, yhchiang, ljin

Reviewed By: yhchiang

CC: leveldb

Differential Revision: https://reviews.facebook.net/D17709
2014-04-10 15:27:42 -07:00
Igor Canadi 4daea66343 Turn on -Wmissing-prototypes
Summary: Compiling for iOS has by default turned on -Wmissing-prototypes, which causes rocksdb to fail compiling. This diff turns on -Wmissing-prototypes in our compile options and cleans up all functions with missing prototypes.

Test Plan: compiles

Reviewers: dhruba, haobo, ljin, sdong

Reviewed By: ljin

CC: leveldb

Differential Revision: https://reviews.facebook.net/D17649
2014-04-09 21:17:14 -07:00
Lei Jin 4824014e3b speed up db_bench filluniquerandom mode
Summary:
filluniquerandom is painfully slow due to the naive bitmap check to find
out if a key has been seen before. Majority of time is spent on searching
the last few keys. Split a giant BitSet to smaller ones so that we can
quickly check if a BitSet is full and thus can skip quickly.

It used to take over one hour to filluniquerandom for 100M keys, now it
takes about 10 mins.

Test Plan:
unit test
also verified correctness in db_bench and make sure all keys are
generated

Reviewers: igor, haobo, yhchiang

Reviewed By: igor

CC: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D17607
2014-04-09 11:25:21 -07:00
Igor Canadi b947fdc89d Column family support for DB::OpenForReadOnly()
Summary: When opening DB in read-only mode, client can choose to only specify a subset of column families ("default" column family can't be omitted, though)

Test Plan: added a unit test in column_family_test

Reviewers: haobo, sdong, ljin, dhruba

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D17565
2014-04-09 09:56:17 -07:00
Lei Jin 92c1eb0291 macros for perf_context
Summary: This will allow us to disable them completely for iOS or for better performance

Test Plan: will run make all check

Reviewers: igor, haobo, dhruba

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D17511
2014-04-08 10:58:07 -07:00