Commit graph

3656 commits

Author SHA1 Message Date
Tomislav Novak ec70fea4c4 Fix a comparison in DBIter::FindPrevUserKey()
Summary:
When seek target is a merge key (`kTypeMerge`), `DBIter::FindNextUserEntry()`
advances the underlying iterator _past_ the current key (`saved_key_`); see
`MergeValuesNewToOld()`. However, `FindPrevUserKey()` assumes that `iter_`
points to an entry with the same user key as `saved_key_`. As a result,
`it->Seek(key) && it->Prev()` can cause the iterator to be positioned at the
_next_, instead of the previous, entry (new test, written by @lovro, reproduces
the bug).

This diff changes `FindPrevUserKey()` to also skip keys that are _greater_ than
`saved_key_`.

Test Plan: db_test

Reviewers: igor, sdong

Reviewed By: sdong

Subscribers: leveldb, dhruba, lovro

Differential Revision: https://reviews.facebook.net/D40791
2015-06-29 17:04:03 -07:00
Yueh-Hsuan Chiang 501591c423 Make column_family_test runnable in ROCKSDB_LITE
Summary: Make column_family_test runnable in ROCKSDB_LITE.

Test Plan: column_family_test

Reviewers: sdong, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40251
2015-06-29 14:39:01 -07:00
krad 91cb82f34e Merge branch 'master' of github.com:facebook/rocksdb 2015-06-29 11:47:24 -07:00
Igor Canadi 09f5a4b486 set -e in fb_compile_mongo.sh
Summary: Based on @anthony's feedback, we want to fail early if our static linking fails.

Test Plan: none

Reviewers: anthony

Reviewed By: anthony

Subscribers: dhruba, anthony, leveldb

Differential Revision: https://reviews.facebook.net/D40839
2015-06-29 11:43:25 -07:00
krad 6199cba998 Fix race in unit test.
Summary: Avoid falling victim to race condition.

Test Plan: Run the unit test

Reviewers: sdong igor

CC: leveldb@

Task ID: #7312624

Blame Rev:
2015-06-29 11:40:21 -07:00
Igor Canadi 0a019d74a0 Use malloc_usable_size() for accounting block cache size
Summary:
Currently, when we insert something into block cache, we say that the block cache capacity decreased by the size of the block. However, size of the block might be less than the actual memory used by this object. For example, 4.5KB block will actually use 8KB of memory. So even if we configure block cache to 10GB, our actually memory usage of block cache will be 20GB!

This problem showed up a lot in testing and just recently also showed up in MongoRocks production where we were using 30GB more memory than expected.

This diff will fix the problem. Instead of counting the block size, we will count memory used by the block. That way, a block cache configured to be 10GB will actually use only 10GB of memory.

I'm using non-portable function and I couldn't find info on portability on Google. However, it seems to work on Linux, which will cover majority of our use-cases.

Test Plan:
1. fill up mongo instance with 80GB of data
2. restart mongo with block cache size configured to 10GB
3. do a table scan in mongo
4. memory usage before the diff: 12GB. memory usage after the diff: 10.5GB

Reviewers: sdong, MarkCallaghan, rven, yhchiang

Reviewed By: yhchiang

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40635
2015-06-26 11:48:09 -07:00
Igor Canadi 4cbc4e6f88 Call merge operators with empty values
Summary: It's not really nice to call user's API with garbage data in new_value. This diff makes sure that new_value is empty before calling the merge operator.

Test Plan: Added assert to Merge operator in merge_test

Reviewers: sdong, yhchiang

Reviewed By: yhchiang

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40773
2015-06-26 11:35:46 -07:00
Igor Canadi 619167ee66 Fix mac compile
Summary: as title

Test Plan: make check

Reviewers: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40785
2015-06-26 10:29:24 -07:00
Igor Canadi 472e64d39e Improve fb_compile_mongo.sh
Summary: If we create a new temp directory for each build, scons will recompile everything because we have different parameters. Instead, let's set up a constant path to our static lib. That way we won't have to recompile.

Test Plan: Run fb_compile_mongo.sh twice -- second time it didn't recompile everything

Reviewers: MarkCallaghan, anthony

Reviewed By: anthony

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40707
2015-06-26 10:24:08 -07:00
Venkatesh Radhakrishnan c9cd404bcd Make flush check for shutdown
Summary:
Fixes task 7156865 where a compaction causes a hang in flush
memtable if CancelAllBackgroundWork was called prior to it.
Stack trace is in : https://phabricator.fb.com/P19848829
We end up waiting for a flush which will never happen because there are no background threads.

Test Plan: PreShutdownFlush

Reviewers: sdong, igor

Reviewed By: sdong, igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40617
2015-06-25 14:43:25 -07:00
Poornima Chozhiyath Raman 4fb09c6871 Updating SeekToLast with upper bound
Summary: #7124486: RocksDB's Iterator.SeekToLast should seek to the last key before iterate_upper_bound if presents

Test Plan: ./db_iter_test run successfully with the new testcase

Reviewers: rven, yhchiang, igor, anthony, kradhakrishnan, sdong

Reviewed By: sdong

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D40425
2015-06-25 09:44:30 -07:00
Igor Canadi dadc429767 Reproducible MongoRocks compile with FB toolchain
Summary:
Added a script that will compile MongoRocks with the same flags as RocksDB binary. On FB infra, we can now do:

  cd ~/rocksdb; make static_lib
  cd ~/mongo; ~/rocksdb/build_tools/fb_compile_mongo.sh

No need to upgrade the g++ on the devbox (like Aaron and I did) or maintain a separate script to compile (like Mark did)

fb_compile_mongo.sh gets the settings from fbcode_config.sh, so it also makes it easier to upgrade the environment one day.

Test Plan: Compiled mongod with new script. Also, ldd output looks good: https://phabricator.fb.com/P19891602

Reviewers: AaronFeldman, MarkCallaghan, anthony

Reviewed By: anthony

Subscribers: anthony, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40659
2015-06-24 15:09:55 -07:00
Yueh-Hsuan Chiang 62a8fd154a Make stringappend_test runnable in ROCKSDB_LITE
Summary: Make stringappend_test runnable in ROCKSDB_LITE

Test Plan: stringappend_test

Reviewers: sdong, rven, anthony, kradhakrishnan, IslamAbdelRahman, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40593
2015-06-24 15:01:43 -07:00
Yueh-Hsuan Chiang 48da7a9cad Improve the comment for BYTES_READ in statistics.
Summary:
BYTES_READ only count the number of logical bytes read from
the DB::Get() function.  It neither includes all logical bytes read
nor indicates IO read bytes.

This patch improves the comment for BYTES_READ.

Test Plan: Only change comment.

Reviewers: sdong, rven, anthony, kradhakrishnan, IslamAbdelRahman, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40599
2015-06-24 15:00:51 -07:00
Yueh-Hsuan Chiang 72cab88959 Block redis_test in ROCKSDB_LITE
Summary: Block redis_test in ROCKSDB_LITE as utilities not supported in ROCKSDB_LITE.

Test Plan: redis_test

Reviewers: sdong, igor, rven, anthony, kradhakrishnan, IslamAbdelRahman

Reviewed By: IslamAbdelRahman

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40587
2015-06-24 01:44:21 -07:00
Yueh-Hsuan Chiang dec2c9f564 Make table_properties_collector_test runnable in ROCKSDB_LITE
Summary: Make table_properties_collector_test runnable in ROCKSDB_LITE

Test Plan: table_properties_collector_test

Reviewers: sdong, rven, anthony, kradhakrishnan, IslamAbdelRahman, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40581
2015-06-24 01:38:53 -07:00
Yueh-Hsuan Chiang 0b1ffe2e1d Remove -Wl,--no-as-needed flag when making shared_lib in OSX and IOS
Summary:
Remove -Wl,--no-as-needed flag when making shared_lib in OSX and IOS as
those environment doe not have compile option --no-as-needed

  ld: unknown option: --no-as-needed
  clang: error: linker command failed with exit code 1 (use -v to see invocation)

Test Plan: make shared_lib

Reviewers: meyering, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40353
2015-06-23 16:32:59 -07:00
Islam AbdelRahman 674b1181cf Bottommost level compaction option
Summary: Replace force_bottommost_level_compaction in CompactRangeOption with an option that allow the user to (always skip, always compact, compact if compaction filter is present) the bottommost level for level based compaction.

Test Plan: make check

Reviewers: sdong, yhchiang, igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D40527
2015-06-23 13:32:40 -07:00
Giuseppe Ottaviano 782a1590f9 Implement a table-level row cache
Summary:
Implementation of a table-level row cache.
It only caches point queries done through the `DB::Get` interface, queries done through the `Iterator` interface will completely skip the cache.

Supports snapshots and merge operations.

Test Plan: Ran `make valgrind_check commit-prereq`

Reviewers: igor, philipp, sdong

Reviewed By: sdong

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D39849
2015-06-23 10:25:45 -07:00
krad de85e4cadf Introduce WAL recovery consistency levels
Summary:
The "one size fits all" approach with WAL recovery will only introduce inconvenience for our varied clients as we go forward. The current recovery is a bit heuristic. We introduce the following levels of consistency while replaying the WAL.

1. RecoverAfterRestart (kTolerateCorruptedTailRecords)

This mocks the current recovery mode.

2. RecoverAfterCleanShutdown (kAbsoluteConsistency)

This is ideal for unit test and cases where the store is shutdown cleanly. We tolerate no corruption or incomplete writes.

3. RecoverPointInTime (kPointInTimeRecovery)

This is ideal when using devices with controller cache or file systems which can loose data on restart. We recover upto the point were is no corruption or incomplete write.

4. RecoverAfterDisaster (kSkipAnyCorruptRecord)

This is ideal mode to recover data. We tolerate corruption and incomplete writes, and we hop over those sections that we cannot make sense of salvaging as many records as possible.

Test Plan:
(1) Run added unit test to cover all levels.
(2) Run make check.

Reviewers: leveldb, sdong, igor

Subscribers: yoshinorim, dhruba

Differential Revision: https://reviews.facebook.net/D38487
2015-06-22 15:28:12 -07:00
Islam AbdelRahman 530534fceb Fix trivial move merge
Summary: Fixing bad merge

Test Plan: make -j64 check (this is not enough to verify the fix)

Reviewers: igor, sdong

Reviewed By: sdong

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D40521
2015-06-22 15:20:30 -07:00
krad 7015fd81c4 Add read_nanos to IOStatsContext.
Summary: MyRocks need a mechanism to track read outliers. We need to expose this
stat.

Test Plan: None

Reviewers: sdong

CC: leveldb

Task ID: #7152512

Blame Rev:
2015-06-22 11:09:35 -07:00
Aaron Feldman 7160f5d80c Fix broken gflags link
Summary: Fix broken gflags link

Test Plan: Follow the link

Reviewers: igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D40503
2015-06-22 09:31:52 -07:00
Venkatesh Radhakrishnan e1d3c7dbe4 Fixing valgrind error in checkpoint_test
Summary: Fixed a valgrind issue in checkpoint_test

Test Plan: valgrind on checkpoint_test

Reviewers: igor, anthony

Reviewed By: anthony

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40455
2015-06-19 20:21:23 -07:00
Michael Callahan 3bdec09cb7 Remove ldb_tests.py from make check until it is working again.
Summary: Recent checkin added ldb_test.py to the make check target but the test fails.  Remove it again for now and make task.

Test Plan: No more ldb_tests.py running

Reviewers: igor, anthony

Reviewed By: anthony

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D40449
2015-06-19 17:41:49 -07:00
Michael Callahan 15325bf55b First version of rocksdb_dump and rocksdb_undump.
Summary: Hack up rocksdb_dump and rocksdb_undump utilities to get this task rolling/promote discussion.

Test Plan: Dump/undump databases recursively to see if nothing is lost.

Reviewers: sdong, yhchiang, rven, anthony, kradhakrishnan, igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D37269
2015-06-19 16:24:36 -07:00
Venkatesh Radhakrishnan 04251e1e3a Add wal files to Checkpoint for multiple column families.
Summary:
When there are multiple column families, the flush in
GetLiveFiles is not atomic, so that there are entries in the wal files
which are needed to get a consisten RocksDB. We now add the log files to
the checkpoint.

Test Plan:
CheckpointCF - This test forces more data to be written to
the other column families after the flush of the first column family but
before the second.

Reviewers: igor, yhchiang, IslamAbdelRahman, anthony, kradhakrishnan, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40323
2015-06-19 16:08:31 -07:00
Aaron Feldman 18cc5018b7 Fix memory leaks in PinnedUsageTest
Summary: See title

Test Plan: Run valgrind ./cache_test

Reviewers: igor

Reviewed By: igor

Subscribers: anthony, dhruba

Differential Revision: https://reviews.facebook.net/D40419
2015-06-19 09:43:08 -07:00
Igor Canadi bf03f59c11 Disable CompressLevelCompaction() if Zlib is not supported
Summary: CompressLevelCompaction() depends on Zlib. We should skip it when zlib is not present.

Test Plan: `make check` without zlib

Reviewers: yhchiang

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40401
2015-06-18 18:46:26 -07:00
Yueh-Hsuan Chiang df719d4964 Make autovector_test runnable in ROCKSDB_LITE
Summary: Make autovector_test runnable in ROCKSDB_LITE

Test Plan: autovector_test

Reviewers: sdong, rven, anthony, kradhakrishnan, IslamAbdelRahman, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40245
2015-06-18 15:58:00 -07:00
Yueh-Hsuan Chiang 4d6d47688c Block geodb_test in ROCKSDB_LITE
Summary:
Block geodb_test in ROCKSDB_LITE as geodb is not supported
in ROCKSDB_LITE

Test Plan: geodb_test

Reviewers: sdong, rven, anthony, kradhakrishnan, IslamAbdelRahman, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40335
2015-06-18 15:57:19 -07:00
Yueh-Hsuan Chiang 71b438c4a6 Remove unused target --- compactor_test
Summary:
Remove compactor_test, which depends on a directory not exist
in our code base.
    make compactor_test
    GEN      util/build_version.cc
    GEN      util/build_version.cc
    make: *** No rule to make target `utilities/compaction/compactor_test.o', needed by `compactor_test'.  Stop.

Test Plan: verify the output message of make compactor_test

Reviewers: rven, anthony, kradhakrishnan, igor, IslamAbdelRahman, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40341
2015-06-18 15:54:52 -07:00
Yueh-Hsuan Chiang eade498bda Block utilities/write_batch_with_index in ROCKSDB_LITE
Summary:
Block utilities/write_batch_with_index in ROCKSDB_LITE as we
don't include anly utilities in ROCKSDB_LITE

Test Plan: write_batch_with_index_test

Reviewers: rven, anthony, kradhakrishnan, IslamAbdelRahman, igor, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40347
2015-06-18 15:54:05 -07:00
Igor Canadi 760e9a94de Fail DB::Open() when the requested compression is not available
Summary:
Currently RocksDB silently ignores this issue and doesn't compress the data. Based on discussion, we agree that this is pretty bad because it can cause confusion for our users.

This patch fails DB::Open() if we don't support the compression that is specified in the options.

Test Plan: make check with LZ4 not present. If Snappy is not present all tests will just fail because Snappy is our default library. We should make Snappy the requirement, since without it our default DB::Open() fails.

Reviewers: sdong, MarkCallaghan, rven, yhchiang

Reviewed By: yhchiang

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D39687
2015-06-18 14:55:05 -07:00
Aaron Feldman 69bb210d58 Add Cache.GetPinnedUsageUsage()
Summary:
  Add the funcion Cache.GetPinnedUsage() to return the memory size of entries
  that are in use by the system (that is, all the entries not in the LRU list).

Test Plan:
  Run ./cache_test and examine PinnedUsageTest.

Reviewers: tnovak, igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D40305
2015-06-18 13:56:31 -07:00
Islam AbdelRahman 4eabbdb7ec Skip bottommost level compaction if possible
Summary:
This is https://reviews.facebook.net/D39999 but after introducing an option to force compaction the bottom most level

Changes in this patch
- Introduce force_bottommost_level_compaction to CompactRangeOptions that force compacting bottommost level during compaction
- Skip bottommost level compaction if we dont have a compaction filter and force_bottommost_level_compaction options is not set

Although tests pass on my machine but I suspect that there maybe some tests that I am not aware of that  should use force_bottommost_level_compaction to pass in a deterministic way

Test Plan:
make check
adding new tests

Reviewers: igor, sdong, yhchiang

Reviewed By: yhchiang

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D40059
2015-06-18 11:03:31 -07:00
Igor Canadi 4b8bb62f0a Don't dump DBOptions for each column family
Summary: Currently we dump DBOptions for each column family options we dump. This leads to duplicate lines in our LOG file. This diff fixes that.

Test Plan: Check out the LOG

Reviewers: sdong, rven, yhchiang

Reviewed By: yhchiang

Subscribers: IslamAbdelRahman, yoshinorim, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D39729
2015-06-18 10:15:54 -07:00
Poornima Chozhiyath Raman 176f0bedce Merge branch 'master' of github.com:facebook/rocksdb
D40233: Replace %llu with format macros in ParsedInternalKey::DebugString())
2015-06-18 10:12:55 -07:00
Yueh-Hsuan Chiang bb1c74ce18 Fixed a bug of CompactionStats in multi-level universal compaction case
Summary:
Universal compaction can involves in multiple levels.  However,
the current implementation of bytes_readn and bytes_readnp1
(and some other stats with postfix `n` and `np1`) assumes compaction
can only have two levels.

This patch fixes this bug and redefines bytes_readn and bytes_readnp1:
* bytes_readnp1: the number of bytes read in the compaction output level.
* bytes_readn: the total number of bytes read minus bytes_readnp1

Test Plan: Add a test in compaction_job_stats_test

Reviewers: igor, sdong, rven, anthony, kradhakrishnan, IslamAbdelRahman

Reviewed By: IslamAbdelRahman

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40239
2015-06-17 23:40:34 -07:00
Poornima Chozhiyath Raman a66b8157df Merge branch 'master' of github.com:facebook/rocksdb
D40233: Replace %llu with format macros in ParsedInternalKey::DebugString())
2015-06-17 20:44:52 -07:00
Poornima Chozhiyath Raman f06be62fd2 Replace %llu with format macros in ParsedInternalKey::DebugString())
Test Plan: successfully compiled the code

Reviewers: sdong

Reviewed By: sdong

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D40233
2015-06-17 20:44:26 -07:00
Igor Canadi 2dc3910b5e Add --benchmark_write_rate_limit option to db_bench
Summary:
So far, we benchmarked RocksDB by writing as fast as possible. With this change, we're able to limit our write throughput, which should help us better understand how RocksDB performes under varying write workloads.

Specifically, I'm currently interested in the shape of the graph that has write throughput on one axis and write rate on another. This should help us with designing our stall system, as we have started to do with D36351.

Test Plan:
    $ ./db_bench --benchmarks=fillrandom --benchmark_write_rate_limit=1000000
    fillrandom   :     118.523 micros/op 8437 ops/sec;    0.9 MB/s
    $ ./db_bench --benchmarks=fillrandom --benchmark_write_rate_limit=2000000
    fillrandom   :      59.136 micros/op 16910 ops/sec;    1.9 MB/s

Reviewers: MarkCallaghan, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D39759
2015-06-17 16:44:52 -07:00
Islam AbdelRahman 12e030a992 Use CompactRangeOptions for CompactRange
Summary:
This diff update DB::CompactRange to use RangeCompactionOptions instead of using multiple parameters
Old CompactRange is still available but deprecated

Test Plan:
make all check
make rocksdbjava
USE_CLANG=1 make all
OPT=-DROCKSDB_LITE make release

Reviewers: sdong, yhchiang, igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D40209
2015-06-17 14:36:14 -07:00
Igor Canadi c89369f57c Move dockerbuild.sh to build_tools/
Summary: That's where we keep build tools :)

Test Plan: none

Reviewers: sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D39741
2015-06-17 14:09:12 -07:00
Igor Canadi 4716ab4d16 Merge pull request #638 from HolodovAlexander/master
C api: human-readable statistics
2015-06-17 13:16:20 -07:00
Igor Canadi 25d600569d Clean up InstallSuperVersion
Summary:
We go to great lengths to make sure MaybeScheduleFlushOrCompaction() is called outside of write thread. But anyway, it's still called in the mutex, so it's not that much cheaper.

This diff removes the "optimization" and cleans up the code a bit.

Test Plan: make check

Reviewers: rven, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40113
2015-06-17 12:37:59 -07:00
Yueh-Hsuan Chiang 1369f015ee Only initialize the ThreadStatusData when necessary.
Summary:
Before this patch, any function call to ThreadStatusUtil might automatically initialize and register the thread status data.  However, if it is the user-thread making this call, the allocated thread-status-data will never be released as such threads are not managed by rocksdb.

In this patch, I remove the automatic-initialization part.  Thread-status data is only initialized and uninitialized in Env during the thread creation and destruction.

Test Plan:
db_test
thread_list_test
listener_test

Reviewers: igor, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40017
2015-06-17 11:21:18 -07:00
Yueh-Hsuan Chiang 1a08d0beb5 Block c_test in ROCKSDB_LITE
Summary: Block c_test in ROCKSDB_LITE as it's not supported in ROCKSDB_LITE.

Test Plan: c_test

Reviewers: sdong, rven, anthony, kradhakrishnan, IslamAbdelRahman, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40257
2015-06-17 10:54:51 -07:00
sdong 40f562e747 Allow GetApproximateSize() to include mem table size if it is skip list memtable
Summary:
Add an option in GetApproximateSize() so that the result will include estimated sizes in mem tables.
To implement it, implement an estimated count from the beginning to a key in skip list. The approach is to count to find the entry, how many Next() is issued from each level, and sum them with a weight that is <branching factor> ^ <level>.

Test Plan: Add a test case

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D40119
2015-06-16 18:13:23 -07:00
Igor Canadi d59d90bb1f db_bench periodically writes QPS to CSV file
Summary:
This is part of an effort to better understand and optimize RocksDB stalls under high load. I added a feature to db_bench to periodically write QPS to CSV files. That way we can nicely see how our QPS changes in time (especially when DB is stalled) and can do a better job of evaluating our stall system (i.e. we want the QPS to be as constant as possible, as opposed to having bunch of stalls)

Cool part of CSV files is that we can easily graph them -- there are a bunch of tools available.

Test Plan:
Ran ./db_bench --report_interval_seconds=10 --benchmarks=fillrandom --num=10000000
and observed this in report.csv:

secs_elapsed,interval_qps
10,2725860
20,1980480
30,1863456
40,1454359
50,1460389

Reviewers: sdong, MarkCallaghan, rven, yhchiang

Reviewed By: yhchiang

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40047
2015-06-12 14:31:53 -07:00