Find a file
Zhichao Cao ce8e88d2d7 Generate mixed workload with Get, Put, Seek in db_bench (#4788)
Summary:
Based on the specific workload models (key access distribution, value size distribution, and iterator scan length distribution, the QPS variation), the MixGraph benchmark generate the synthetic workload according to these distributions which can reflect the real-world workload characteristics.

After user enable the tracing function, they will get the trace file. By analyzing the trace file with the trace_analyzer tool, user can generate a set of statistic data files including. The *_accessed_key_stats.txt,  *-accessed_value_size_distribution.txt, *-iterator_length_distribution.txt, and *-qps_stats.txt are mainly used to fit the Matlab model fitting. After that, user can get the parameters of the workload distributions (the modeling details are described: [here](https://github.com/facebook/rocksdb/wiki/RocksDB-Trace%2C-Replay%2C-and-Analyzer))

The key access distribution follows the The two-term power model. The probability density function is: `f(x) = ax^{b}+c`. The corresponding parameters are key_dist_a, key_dist_b, and key_dist_c in db_bench

For the value size distribution and iterator scan length distribution, they both follow the Generalized Pareto Distribution. The probability density function is `f(x) = (1/sigma)(1+k*(x-theta)/sigma))^{-1-1/k)`. The parameters are: value_k, value_theta, value_sigma and iter_k, iter_theta, iter_sigma. For more information about the Generalized Pareto Distribution, users can find the [wiki](https://en.wikipedia.org/wiki/Generalized_Pareto_distribution) and [Matalb page](https://www.mathworks.com/help/stats/generalized-pareto-distribution.html)

As for the QPS, it follows the diurnal pattern. So Sine is a good model to fit it. `F(x) = sine_a*sin(sine_b*x + sine_c) + sine_d`. The trace_will tell you the average QPS in the print out resutls, which is sine_d. After user fit the "*-qps_stats.txt" to the Matlab model, user can get the sine_a, sine_b, and sine_c. By using the 4 parameters, user can control the QPS variation including the period, average, changes.

To use the bench mark, user can indicate the following parameters as examples:
```
-benchmarks="mixgraph" -key_dist_a=0.002312 -key_dist_b=0.3467 -value_k=0.9233 -value_sigma=226.4092 -iter_k=2.517 -iter_sigma=14.236 -mix_get_ratio=0.7 -mix_put_ratio=0.25 -mix_seek_ratio=0.05 -sine_mix_rate_interval_milliseconds=500 -sine_a=15000 -sine_b=1 -sine_d=20000
```
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4788

Differential Revision: D13573940

Pulled By: sagar0

fbshipit-source-id: e184c27e07b4f1bc0b436c2be36c5090c1fb0222
2019-01-22 10:44:26 -08:00
buckifier Fix skylark incompatible build files in rocksdb 2019-01-07 13:37:40 -08:00
build_tools Fix spelling errors (#4827) 2019-01-02 11:17:57 -08:00
cache Revert "Move MemoryAllocator option from Cache to BlockBasedTableOpti… (#4697) 2018-11-21 11:29:57 -08:00
cmake Make FindZLIB consistent with official definitions (#4823) 2019-01-02 12:49:57 -08:00
coverage
db Digest ZSTD compression dictionary once when writing SST file (#4849) 2019-01-18 19:12:57 -08:00
docs Insane line length detected (#4813) 2018-12-21 14:54:34 -08:00
env Introduce a CPU time counter in perf_context (#4741) 2018-12-20 12:03:44 -08:00
examples
hdfs Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
include/rocksdb Use chrono::time_point instead of time_t (#4868) 2019-01-16 09:51:05 -08:00
java Fix typos in comments (#4819) 2018-12-26 09:43:56 -08:00
memtable WriteBufferManger doens't cost to cache if no limit is set (#4695) 2018-11-18 16:55:43 -08:00
monitoring Add a new per level counter for block cache hit (#4796) 2018-12-21 13:20:05 -08:00
options Concurrent task limiter for compaction thread control (#4332) 2018-12-13 13:18:28 -08:00
port Detect if Jemalloc is linked with the binary (#4844) 2019-01-03 16:30:12 -08:00
table Digest ZSTD compression dictionary once when writing SST file (#4849) 2019-01-18 19:12:57 -08:00
third-party Support pragma once in all header files and cleanup some warnings (#4339) 2018-09-05 18:13:31 -07:00
tools Generate mixed workload with Get, Put, Seek in db_bench (#4788) 2019-01-22 10:44:26 -08:00
util Digest ZSTD compression dictionary once when writing SST file (#4849) 2019-01-18 19:12:57 -08:00
utilities Digest ZSTD compression dictionary once when writing SST file (#4849) 2019-01-18 19:12:57 -08:00
.clang-format
.gitignore
.lgtm.yml
.travis.yml Fix printf formatting on MacOS (#4533) 2018-10-19 14:46:09 -07:00
appveyor.yml Add RocksJava build to AppVeyor 2019-01-03 10:44:44 -08:00
AUTHORS
CMakeLists.txt Remove some components (#4101) 2019-01-10 13:30:09 -08:00
CODE_OF_CONDUCT.md
CONTRIBUTING.md
COPYING
DEFAULT_OPTIONS_HISTORY.md
DUMP_FORMAT.md
HISTORY.md Update HISTORY.md with new use of ZSTD_CDict (#4901) 2019-01-19 19:17:50 -08:00
INSTALL.md Update the version of the dependencies used by the RocksJava static build (#4761) 2018-12-18 20:25:43 -08:00
issue_template.md
LANGUAGE-BINDINGS.md
LICENSE.Apache
LICENSE.leveldb
Makefile Fix downloaded filename of snappy (#4870) 2019-01-11 10:29:40 -08:00
README.md
ROCKSDB_LITE.md
src.mk Remove some components (#4101) 2019-01-10 13:30:09 -08:00
TARGETS Remove some components (#4101) 2019-01-10 13:30:09 -08:00
thirdparty.inc
USERS.md Adding IOTA Foundation to USERS.MD (#4436) 2018-10-02 10:03:46 -07:00
Vagrantfile
WINDOWS_PORT.md

RocksDB: A Persistent Key-Value Store for Flash and RAM Storage

Linux/Mac Build Status Windows Build status PPC64le Build Status

RocksDB is developed and maintained by Facebook Database Engineering Team. It is built on earlier work on LevelDB by Sanjay Ghemawat (sanjay@google.com) and Jeff Dean (jeff@google.com)

This code is a library that forms the core building block for a fast key value server, especially suited for storing data on flash drives. It has a Log-Structured-Merge-Database (LSM) design with flexible tradeoffs between Write-Amplification-Factor (WAF), Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF). It has multi-threaded compactions, making it specially suitable for storing multiple terabytes of data in a single database.

Start with example usage here: https://github.com/facebook/rocksdb/tree/master/examples

See the github wiki for more explanation.

The public interface is in include/. Callers should not include or rely on the details of any other header files in this package. Those internal APIs may be changed without warning.

Design discussions are conducted in https://www.facebook.com/groups/rocksdb.dev/

License

RocksDB is dual-licensed under both the GPLv2 (found in the COPYING file in the root directory) and Apache 2.0 License (found in the LICENSE.Apache file in the root directory). You may select, at your option, one of the above-listed licenses.