Commit Graph

124 Commits

Author SHA1 Message Date
Peter Dillinger 249eff0f30 Stats for redundant insertions into block cache (#6681)
Summary:
Since read threads do not coordinate on loading data into block
cache, two threads between Lookup and Insert can end up loading and
inserting the same data. This is particularly concerning with
cache_index_and_filter_blocks since those are hot and more likely to
be race targets if ejected from (or not pre-populated in) the cache.

Particularly with moves toward disaggregated / network storage, the cost
of redundant retrieval might be high, and we should at least have some
hard statistics from which we can estimate impact.

Example with full filter thrashing "cliff":

    $ ./db_bench --benchmarks=fillrandom --num=15000000 --cache_index_and_filter_blocks -bloom_bits=10
    ...
    $ ./db_bench --db=/tmp/rocksdbtest-172704/dbbench --use_existing_db --benchmarks=readrandom,stats --num=200000 --cache_index_and_filter_blocks --cache_size=$((130 * 1024 * 1024)) --bloom_bits=10 --threads=16 -statistics 2>&1 | egrep '^rocksdb.block.cache.(.*add|.*redundant)' | grep -v compress | sort
    rocksdb.block.cache.add COUNT : 14181
    rocksdb.block.cache.add.failures COUNT : 0
    rocksdb.block.cache.add.redundant COUNT : 476
    rocksdb.block.cache.data.add COUNT : 12749
    rocksdb.block.cache.data.add.redundant COUNT : 18
    rocksdb.block.cache.filter.add COUNT : 1003
    rocksdb.block.cache.filter.add.redundant COUNT : 217
    rocksdb.block.cache.index.add COUNT : 429
    rocksdb.block.cache.index.add.redundant COUNT : 241
    $ ./db_bench --db=/tmp/rocksdbtest-172704/dbbench --use_existing_db --benchmarks=readrandom,stats --num=200000 --cache_index_and_filter_blocks --cache_size=$((120 * 1024 * 1024)) --bloom_bits=10 --threads=16 -statistics 2>&1 | egrep '^rocksdb.block.cache.(.*add|.*redundant)' | grep -v compress | sort
    rocksdb.block.cache.add COUNT : 1182223
    rocksdb.block.cache.add.failures COUNT : 0
    rocksdb.block.cache.add.redundant COUNT : 302728
    rocksdb.block.cache.data.add COUNT : 31425
    rocksdb.block.cache.data.add.redundant COUNT : 12
    rocksdb.block.cache.filter.add COUNT : 795455
    rocksdb.block.cache.filter.add.redundant COUNT : 130238
    rocksdb.block.cache.index.add COUNT : 355343
    rocksdb.block.cache.index.add.redundant COUNT : 172478
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6681

Test Plan: Some manual testing (above) and unit test covering key metrics is included

Reviewed By: ltamasi

Differential Revision: D21134113

Pulled By: pdillinger

fbshipit-source-id: c11497b5f00f4ffdfe919823904e52d0a1a91d87
2020-04-27 13:20:27 -07:00
Andrew Kryczka e60ea7fe57 fix compiler errors with -DNPERF_CONTEXT (#6642)
Summary:
as titled
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6642

Test Plan:
```
$ EXTRA_CXXFLAGS="-DNPERF_CONTEXT" DEBUG_LEVEL=0 make -j48 db_bench
```

Reviewed By: riversand963

Differential Revision: D20842313

Pulled By: ajkr

fbshipit-source-id: a830cad312ca681591f06749242279503b101df2
2020-04-03 13:24:16 -07:00
Burton Li df62cd5b35 Fix msvc debug test failures (#6579)
Summary:
1. stats_history_test: one slice of stats history is 12526 Bytes, which is greater than original assumption.
![image](https://user-images.githubusercontent.com/17753898/77381970-5a611a80-6d3c-11ea-9d64-59d2e3c04f79.png)
2. table_test: in VerifyBlockAccessTrace function, release trace reader before delete trace file.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6579

Reviewed By: siying

Differential Revision: D20767373

Pulled By: pdillinger

fbshipit-source-id: e8647d665cbe83a3f5429639c6219b50c0912124
2020-04-03 09:54:25 -07:00
sdong fdf882ded2 Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433)
Summary:
When dynamically linking two binaries together, different builds of RocksDB from two sources might cause errors. To provide a tool for user to solve the problem, the RocksDB namespace is changed to a flag which can be overridden in build time.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6433

Test Plan: Build release, all and jtest. Try to build with ROCKSDB_NAMESPACE with another flag.

Differential Revision: D19977691

fbshipit-source-id: aa7f2d0972e1c31d75339ac48478f34f6cfcfb3e
2020-02-20 12:09:57 -08:00
sdong 24c9dce825 Remove include math.h (#6373)
Summary:
We see some odd errors complaining math. However, it doesn't seem that it is needed to be included. Remove the include of math.h. Just removing it from db_bench doesn't seem to break anything. Replacing sqrt from std::sqrt seems to work for histogram.cc
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6373

Test Plan: Watch Travis and appveyor to run.

Differential Revision: D19730068

fbshipit-source-id: d3ad41defcdd9f51c2da1a3673fb258f5dfacf47
2020-02-05 21:00:49 -08:00
sdong e8263dbdaa Apply formatter to recent 200+ commits. (#5830)
Summary:
Further apply formatter to more recent commits.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5830

Test Plan: Run all existing tests.

Differential Revision: D17488031

fbshipit-source-id: 137458fd94d56dd271b8b40c522b03036943a2ab
2019-09-20 12:04:26 -07:00
Maysam Yabandeh 6ec6a4a9a4 Remove snap_refresh_nanos option (#5826)
Summary:
The snap_refresh_nanos option didn't bring much benefit. Remove the feature to simplify the code.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5826

Differential Revision: D17467147

Pulled By: maysamyabandeh

fbshipit-source-id: 4f950b046990d0d1292d7fc04c2ccafaf751c7f0
2019-09-18 20:26:04 -07:00
Shylock Hg 9eb3e1f77d Use delete to disable automatic generated methods. (#5009)
Summary:
Use delete to disable automatic generated methods instead of private, and put the constructor together for more clear.This modification cause the unused field warning, so add unused attribute to disable this warning.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5009

Differential Revision: D17288733

fbshipit-source-id: 8a767ce096f185f1db01bd28fc88fef1cdd921f3
2019-09-11 18:09:00 -07:00
Wilfried Goesgens fbab9913e2 upgrade gtest 1.7.0 => 1.8.1 for json result writing
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/5332

Differential Revision: D17242232

fbshipit-source-id: c0d4646556a1335e51ac7382b986ca7f6ced7b64
2019-09-09 11:24:11 -07:00
git-hulk cdb6334e68 MOD: trim last space and comma in perf context and iostat context ToString()
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/5755

Differential Revision: D17165190

Pulled By: riversand963

fbshipit-source-id: a3a4633961bfe019bf360f97a4c4d36464e7fa0b
2019-09-03 12:27:17 -07:00
jsteemann a2e46eae46 fix compiling with `-DNPERF_CONTEXT` (#5704)
Summary:
This was previously broken, as the performance context-related
macro signatures in file monitoring/perf_context_imp.h
deviated for the case when NPERF_CONTEXT was defined and when it
was not.

Update the macros for the `-DNPERF_CONTEXT` case, so it compiles.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5704

Differential Revision: D16867746

fbshipit-source-id: 05539724cb1f7955ecc42828365836a677759ad9
2019-08-16 14:38:08 -07:00
Maysam Yabandeh 208556ee13 WritePrepared: fix Get without snapshot (#5664)
Summary:
if read_options.snapshot is not set, ::Get will take the last sequence number after taking a super-version and uses that as the sequence number. Theoretically max_eviceted_seq_ could advance this sequence number. This could lead ::IsInSnapshot that will be invoked by the ReadCallback to notice the absence of the snapshot. In this case, the ReadCallback should have passed a non-value to snap_released so that it could be set by the ::IsInSnapshot. The patch does that, and adds a unit test to verify it.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5664

Differential Revision: D16614033

Pulled By: maysamyabandeh

fbshipit-source-id: 06fb3fd4aacd75806ed1a1acec7961f5d02486f2
2019-08-05 13:41:21 -07:00
Zhongyi Xie cfdf2116d3 Exclude StatsHistoryTest.ForceManualFlushStatsCF test from lite mode (#5529)
Summary:
Recent commit 3886dddc3b introduced a new test which is not compatible with lite mode and breaks contrun test:
```
[ RUN      ] StatsHistoryTest.ForceManualFlushStatsCF
monitoring/stats_history_test.cc:642: Failure
Expected: (cfd_stats->GetLogNumber()) < (cfd_test->GetLogNumber()), actual: 15 vs 15
```
This PR excludes the test from lite mode to appease the failing test
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5529

Differential Revision: D16080892

Pulled By: miasantreble

fbshipit-source-id: 2f8a22758f71250cd9f204046404226ddc13b028
2019-07-01 16:37:08 -07:00
Zhongyi Xie 3886dddc3b force flushing stats CF to avoid holding old logs (#5509)
Summary:
WAL records RocksDB writes to all column families. When user flushes a a column family, the old WAL will not accept new writes but cannot be deleted yet because it may still contain live data for other column families. (See https://github.com/facebook/rocksdb/wiki/Write-Ahead-Log#life-cycle-of-a-wal for detailed explanation)
Because of this, if there is a column family that receive very infrequent writes and no manual flush is called for it, it could prevent a lot of WALs from being deleted. PR https://github.com/facebook/rocksdb/pull/5046 introduced persistent stats column family which is a good example of such column families. Depending on the config, it may have long intervals between writes, and user is unaware of it which makes it difficult to call manual flush for it.
This PR addresses the problem for persistent stats column family by forcing a flush for persistent stats column family when 1) another column family is flushed 2) persistent stats column family's log number is the smallest among all column families, this way persistent stats column family will  keep advancing its log number when necessary, allowing RocksDB to delete old WAL files.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5509

Differential Revision: D16045896

Pulled By: miasantreble

fbshipit-source-id: 286837b633e988417f0096ff38384742d3b40ef4
2019-07-01 11:56:43 -07:00
Zhongyi Xie ddd088c8b9 fix rocksdb lite and clang contrun test failures (#5477)
Summary:
recent commit 671d15cbdd introduced some test failures:
```
===== Running stats_history_test
[==========] Running 9 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 9 tests from StatsHistoryTest
[ RUN      ] StatsHistoryTest.RunStatsDumpPeriodSec
monitoring/stats_history_test.cc:63: Failure
dbfull()->SetDBOptions({{"stats_dump_period_sec", "0"}})
Not implemented: Not supported in ROCKSDB LITE

db/db_options_test.cc:28:11: error: unused variable 'kMicrosInSec' [-Werror,-Wunused-const-variable]
const int kMicrosInSec = 1000000;
```
This PR fixes these failures
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5477

Differential Revision: D15871814

Pulled By: miasantreble

fbshipit-source-id: 0a7023914d2c1784d9d2d3f5bfb47310d4855394
2019-06-17 21:16:29 -07:00
Zhongyi Xie 671d15cbdd Persistent Stats: persist stats history to disk (#5046)
Summary:
This PR continues the work in https://github.com/facebook/rocksdb/pull/4748 and https://github.com/facebook/rocksdb/pull/4535 by adding a new DBOption `persist_stats_to_disk` which instructs RocksDB to persist stats history to RocksDB itself. When statistics is enabled, and  both options `stats_persist_period_sec` and `persist_stats_to_disk` are set, RocksDB will periodically write stats to a built-in column family in the following form: key -> (timestamp in microseconds)#(stats name), value -> stats value. The existing API `GetStatsHistory` will detect the current value of `persist_stats_to_disk` and either read from in-memory data structure or from the hidden column family on disk.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5046

Differential Revision: D15863138

Pulled By: miasantreble

fbshipit-source-id: bb82abdb3f2ca581aa42531734ac799f113e931b
2019-06-17 15:21:50 -07:00
Zhongyi Xie d68f9f4580 simplify include directive involving inttypes (#5402)
Summary:
When using `PRIu64` type of printf specifier, current code base does the following:
```
#ifndef __STDC_FORMAT_MACROS
#define __STDC_FORMAT_MACROS
#endif
#include <inttypes.h>
```
However, this can be simplified to
```
#include <cinttypes>
```
as long as flag `-std=c++11` is used.
This should solve issues like https://github.com/facebook/rocksdb/issues/5159
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5402

Differential Revision: D15701195

Pulled By: miasantreble

fbshipit-source-id: 6dac0a05f52aadb55e9728038599d3d2e4b59d03
2019-06-06 13:56:07 -07:00
Siying Dong e9e0101ca4 Move test related files under util/ to test_util/ (#5377)
Summary:
There are too many types of files under util/. Some test related files don't belong to there or just are just loosely related. Mo
ve them to a new directory test_util/, so that util/ is cleaner.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5377

Differential Revision: D15551366

Pulled By: siying

fbshipit-source-id: 0f5c8653832354ef8caa31749c0143815d719e2c
2019-05-30 11:25:51 -07:00
Yuchi Chen 78a6e07c83 Fix compilation errors for 32bits/LITE/ios build. (#5220)
Summary:
When I build RocksDB for 32bits/LITE/iOS environment, some errors like the following.

`
table/block_based_table_reader.cc:971:44: error: implicit conversion loses integer precision: 'uint64_t'
      (aka 'unsigned long long') to 'size_t' (aka 'unsigned long') [-Werror,-Wshorten-64-to-32]
    size_t block_size = props_block_handle.size();
           ~~~~~~~~~~   ~~~~~~~~~~~~~~~~~~~^~~~~~

./util/file_reader_writer.h:177:8: error: private field 'env_' is not used [-Werror,-Wunused-private-field]
  Env* env_;
       ^
`
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5220

Differential Revision: D15023481

Pulled By: siying

fbshipit-source-id: 1b5d121d3016f2b0a8a9a2cc1bd638479357f9f7
2019-04-22 16:02:16 -07:00
Siying Dong 85b2bde3dd Still implement StatisticsImpl::measureTime() (#5181)
Summary:
Since Statistics::measureTime() is deprecated, StatisticsImpl::measureTime() is not implemented. We realized that users might have a wrapped Statistics implementation in which measureTime() is implemented as forwarded to StatisticsImpl, and causes assert failure. In order to make the change less intrusive, we implement StatisticsImpl::measureTime(). We will revisit whether we need to remove it after several releases.

Also, add a test to make sure that a Statistics implementation using the old interface still works.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5181

Differential Revision: D14907089

Pulled By: siying

fbshipit-source-id: 29b6202fd04e30ed6f6adcaeb1000e87f10d1e1a
2019-04-12 11:00:35 -07:00
anand76 fefd4b98c5 Introduce a new MultiGet batching implementation (#5011)
Summary:
This PR introduces a new MultiGet() API, with the underlying implementation grouping keys based on SST file and batching lookups in a file. The reason for the new API is twofold - the definition allows callers to allocate storage for status and values on stack instead of std::vector, as well as return values as PinnableSlices in order to avoid copying, and it keeps the original MultiGet() implementation intact while we experiment with batching.

Batching is useful when there is some spatial locality to the keys being queries, as well as larger batch sizes. The main benefits are due to -
1. Fewer function calls, especially to BlockBasedTableReader::MultiGet() and FullFilterBlockReader::KeysMayMatch()
2. Bloom filter cachelines can be prefetched, hiding the cache miss latency

The next step is to optimize the binary searches in the level_storage_info, index blocks and data blocks, since we could reduce the number of key comparisons if the keys are relatively close to each other. The batching optimizations also need to be extended to other formats, such as PlainTable and filter formats. This also needs to be added to db_stress.

Benchmark results from db_bench for various batch size/locality of reference combinations are given below. Locality was simulated by offsetting the keys in a batch by a stride length. Each SST file is about 8.6MB uncompressed and key/value size is 16/100 uncompressed. To focus on the cpu benefit of batching, the runs were single threaded and bound to the same cpu to eliminate interference from other system events. The results show a 10-25% improvement in micros/op from smaller to larger batch sizes (4 - 32).

Batch   Sizes

1        | 2        | 4         | 8      | 16  | 32

Random pattern (Stride length 0)
4.158 | 4.109 | 4.026 | 4.05 | 4.1 | 4.074        - Get
4.438 | 4.302 | 4.165 | 4.122 | 4.096 | 4.075 - MultiGet (no batching)
4.461 | 4.256 | 4.277 | 4.11 | 4.182 | 4.14        - MultiGet (w/ batching)

Good locality (Stride length 16)
4.048 | 3.659 | 3.248 | 2.99 | 2.84 | 2.753
4.429 | 3.728 | 3.406 | 3.053 | 2.911 | 2.781
4.452 | 3.45 | 2.833 | 2.451 | 2.233 | 2.135

Good locality (Stride length 256)
4.066 | 3.786 | 3.581 | 3.447 | 3.415 | 3.232
4.406 | 4.005 | 3.644 | 3.49 | 3.381 | 3.268
4.393 | 3.649 | 3.186 | 2.882 | 2.676 | 2.62

Medium locality (Stride length 4096)
4.012 | 3.922 | 3.768 | 3.61 | 3.582 | 3.555
4.364 | 4.057 | 3.791 | 3.65 | 3.57 | 3.465
4.479 | 3.758 | 3.316 | 3.077 | 2.959 | 2.891

dbbench command used (on a DB with 4 levels, 12 million keys)-
TEST_TMPDIR=/dev/shm numactl -C 10  ./db_bench.tmp -use_existing_db=true -benchmarks="readseq,multireadrandom" -write_buffer_size=4194304 -target_file_size_base=4194304 -max_bytes_for_level_base=16777216 -num=12000000 -reads=12000000 -duration=90 -threads=1 -compression_type=none -cache_size=4194304000 -batch_size=32 -disable_auto_compactions=true -bloom_bits=10 -cache_index_and_filter_blocks=true -pin_l0_filter_and_index_blocks_in_cache=true -multiread_batched=true -multiread_stride=4
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5011

Differential Revision: D14348703

Pulled By: anand1976

fbshipit-source-id: 774406dab3776d979c809522a67bedac6c17f84b
2019-04-11 14:28:26 -07:00
Simon Grätzer d9d3cacaf5 Add a missing define to monitoring/iostats_context_imp.h (#5136)
Summary:
I think when PR https://github.com/facebook/rocksdb/pull/4889 added the `IOSTATS_CPU_TIMER_GUARD` define to this header file, the noop version in the `#else` branch was forgotten.

Not sure if this is common, but on my MacOS machine it breaks my build
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5136

Differential Revision: D14727727

Pulled By: siying

fbshipit-source-id: 1076e56bdbe6ecda01d461b371dabf7f1593a149
2019-04-02 11:56:18 -07:00
Shi Feng 01e6badbb6 Introduce CPU timers for iterator seek and next (#5076)
Summary:
Introduce CPU timers for iterator seek and next operations. Seek
counter includes SeekToFirst, SeekToLast and SeekForPrev, w/ the
caveat that SeekToLast timer doesn't include some post processing
time if upper bound is defined.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5076

Differential Revision: D14525218

Pulled By: fredfsh

fbshipit-source-id: 03ba25df3b22b06c072621e4de0eacfa1445f0d9
2019-03-26 16:32:13 -07:00
Siying Dong aef763b6d6 Make statistics's stats_level change thread-safe (#5030)
Summary:
Right now, users can change statistics.stats_level while DB is running, but TSAN may report
data race. We make stats_level_ to be atomic, and access them using accessors.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5030

Differential Revision: D14267519

Pulled By: siying

fbshipit-source-id: 37d7ebeff7a43a406230143422a16af899163f73
2019-03-01 10:42:09 -08:00
Siying Dong 5e298f865b Add two more StatsLevel (#5027)
Summary:
Statistics cost too much CPU for some use cases. Add two stats levels
so that people can choose to skip two types of expensive stats, timers and
histograms.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5027

Differential Revision: D14252765

Pulled By: siying

fbshipit-source-id: 75ecec9eaa44c06118229df4f80c366115346592
2019-02-28 10:27:59 -08:00
Zhongyi Xie c4f5d0aa15 add GetStatsHistory to retrieve stats snapshots (#4748)
Summary:
This PR adds public `GetStatsHistory` API to retrieve stats history in the form of an std map. The key of the map is the timestamp in microseconds when the stats snapshot is taken, the value is another std map from stats name to stats value (stored in std string). Two DBOptions are introduced: `stats_persist_period_sec` (default 10 minutes) controls the intervals between two snapshots are taken; `max_stats_history_count` (default 10) controls the max number of history snapshots to keep in memory. RocksDB will stop collecting stats snapshots if `stats_persist_period_sec` is set to 0.

(This PR is the in-memory part of https://github.com/facebook/rocksdb/pull/4535)
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4748

Differential Revision: D13961471

Pulled By: miasantreble

fbshipit-source-id: ac836d401ecb84ea92216bf9966f969dedf4ad04
2019-02-20 15:52:54 -08:00
Zhongyi Xie 00ed41daee Allow copy for PerfContext objects (#4919)
Summary:
Existing implementation of PerfContext does not define copy constructor or assignment operator, which could potentially cause problems when user create copies and resets the builtin one. This PR address the issue by providing these two constructors with deep copy semantics.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4919

Differential Revision: D13960406

Pulled By: miasantreble

fbshipit-source-id: 36aab5aaee65d4480f537e4e22148faa45e8e334
2019-02-05 14:29:08 -08:00
Alexander Zinoviev 32a6dd9a41 Add a new CPU time counter to compaction report (#4889)
Summary:
Measure CPU time consumed for a compaction and report it in the stats report
Enable NowCPUNanos() to work for MacOS
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4889

Differential Revision: D13701276

Pulled By: zinoale

fbshipit-source-id: 5024e5bbccd4dd10fd90d947870237f436445055
2019-01-29 17:24:00 -08:00
Andrew Kryczka 8ec3e72551 Cache dictionary used for decompressing data blocks (#4881)
Summary:
- If block cache disabled or not used for meta-blocks, `BlockBasedTableReader::Rep::uncompression_dict` owns the `UncompressionDict`. It is preloaded during `PrefetchIndexAndFilterBlocks`.
- If block cache is enabled and used for meta-blocks, block cache owns the `UncompressionDict`, which holds dictionary and digested dictionary when needed. It is never prefetched though there is a TODO for this in the code. The cache key is simply the compression dictionary block handle.
- New stats for compression dictionary accesses in block cache: "BLOCK_CACHE_COMPRESSION_DICT_*" and "compression_dict_block_read_count"
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4881

Differential Revision: D13663801

Pulled By: ajkr

fbshipit-source-id: bdcc54044e180855cdcc57639b493b0e016c9a3f
2019-01-23 18:15:47 -08:00
Alexander Zinoviev 80bf8975fd Add a new per level counter for block cache hit (#4796)
Summary:
Add a new per level counter for block cache hits, increase it by one on every successful attempt to get an entry from cache.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4796

Differential Revision: D13513688

Pulled By: zinoale

fbshipit-source-id: 104df038f1232e3356e162eb2d8ca138e34a8281
2018-12-21 13:20:05 -08:00
Siying Dong da1c64b6e7 Introduce a CPU time counter in perf_context (#4741)
Summary:
Introduce the first CPU timing counter, perf_context.get_cpu_nanos. This opens a door to more CPU counters in the future.
Only Posix Env has it implemented using clock_gettime() with CLOCK_THREAD_CPUTIME_ID. How accurate the counter is depends on the platform.
Make PerfStepTimer to take an Env as an argument, and sometimes pass it in. The direct reason is to make the unit tests to use SpecialEnv where we can ingest logic there. But in long term, this is a good change.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4741

Differential Revision: D13287798

Pulled By: siying

fbshipit-source-id: 090361049d9d5095d1d1a369fe1338d2e2e1c73f
2018-12-20 12:03:44 -08:00
Adam Singer a914a1c6dc Add `getMin`, `getMax`, `getCount`, `getSum` to `HistogramData` class object. (#4742)
Summary:
Expose common stats min,max,count,sum via statistics JNI. These stats are not fully exposed on the Java side as is, but are available on the native side.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4742

Differential Revision: D13403766

Pulled By: ajkr

fbshipit-source-id: 5b70f7bd3fb7490aab73dcbd09f13490fce5c773
2018-12-14 14:28:44 -08:00
DorianZheng 2670fe8c73 Get `CompactionJobInfo` from CompactFiles
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/4716

Differential Revision: D13207677

Pulled By: ajkr

fbshipit-source-id: d0ccf5a66df6cbb07288b0c5ebad81fd9df3926b
2018-12-13 14:21:24 -08:00
Anand Ananthabhotla 1b01d23be2 Add PerfContext counters for index/filter block cache stats (#4540)
Summary:
Add counters to track block cache index/filter hits and misses. We currently count aggregate hits and misses, which includes index/filter/data blocks.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4540

Differential Revision: D10459652

Pulled By: anand1976

fbshipit-source-id: 0c59eee7f12f5103dcb6686f0e7995babe63d425
2018-12-07 15:07:56 -08:00
Adam Singer 3fa80f0e85 Remove enable_internal_stats (#4714)
Summary:
Simple patch to address comments in [statistics.h#L65](https://github.com/facebook/rocksdb/blob/master/monitoring/statistics.h#L65|statistics.h#L65)  `TODO(ajkr): clean this up since there are no internal stats anymore`
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4714

Differential Revision: D13208093

Pulled By: ajkr

fbshipit-source-id: 4468badb850592411147539f859082644f5296f6
2018-11-27 12:58:58 -08:00
Adam Singer 1db4a096d4 Test mapping of Histograms and HistogramsNameMap (#4720)
Summary:
Adding sanity check test for mapping of `Histograms` and `HistogramsNameMap`

```
[==========] Running 2 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 2 tests from StatisticsTest
[ RUN      ] StatisticsTest.SanityTickers
[       OK ] StatisticsTest.SanityTickers (0 ms)
[ RUN      ] StatisticsTest.SanityHistograms
[       OK ] StatisticsTest.SanityHistograms (0 ms)
[----------] 2 tests from StatisticsTest (0 ms total)

[----------] Global test environment tear-down
[==========] 2 tests from 1 test case ran. (0 ms total)
[  PASSED  ] 2 tests.
```
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4720

Differential Revision: D13217061

Pulled By: ajkr

fbshipit-source-id: 6427f4e684c36b2f3c3440808b74fee86a364683
2018-11-27 10:48:30 -08:00
Soli f1837595a3 FIX #3278: Move global const object definitions from .h to .cc (#4691)
Summary:
Summary

We should declare constants in headers and define them in source files.
But this commit is only aimed at compound types.

I don't know if it is necessary to do the same thing to fundamental types.

I used this command to find all of the constant definitions in header files.

`find . -name "*.h" | xargs grep -e "^const .*=.*"`

And here is what I found:

```
./db/version_edit.h:const uint64_t kFileNumberMask = 0x3FFFFFFFFFFFFFFF;
./include/rocksdb/env.h:const size_t kDefaultPageSize = 4 * 1024;
./include/rocksdb/statistics.h:const std::vector<std::pair<Tickers, std::string>> TickersNameMap = {
./include/rocksdb/statistics.h:const std::vector<std::pair<Histograms, std::string>> HistogramsNameMap = {
./include/rocksdb/table.h:const uint32_t kPlainTableVariableLength = 0;
./include/rocksdb/utilities/transaction_db.h:const uint32_t kInitialMaxDeadlocks = 5;
./port/port_posix.h:const uint32_t kMaxUint32 = std::numeric_limits<uint32_t>::max();
./port/port_posix.h:const int kMaxInt32 = std::numeric_limits<int32_t>::max();
./port/port_posix.h:const uint64_t kMaxUint64 = std::numeric_limits<uint64_t>::max();
./port/port_posix.h:const int64_t kMaxInt64 = std::numeric_limits<int64_t>::max();
./port/port_posix.h:const size_t kMaxSizet = std::numeric_limits<size_t>::max();
./port/win/port_win.h:const uint32_t kMaxUint32 = UINT32_MAX;
./port/win/port_win.h:const int kMaxInt32 = INT32_MAX;
./port/win/port_win.h:const int64_t kMaxInt64 = INT64_MAX;
./port/win/port_win.h:const uint64_t kMaxUint64 = UINT64_MAX;
./port/win/port_win.h:const size_t kMaxSizet = UINT64_MAX;
./port/win/port_win.h:const size_t kMaxSizet = UINT_MAX;
./port/win/port_win.h:const uint32_t kMaxUint32 = std::numeric_limits<uint32_t>::max();
./port/win/port_win.h:const int kMaxInt32 = std::numeric_limits<int>::max();
./port/win/port_win.h:const uint64_t kMaxUint64 = std::numeric_limits<uint64_t>::max();
./port/win/port_win.h:const int64_t kMaxInt64 = std::numeric_limits<int64_t>::max();
./port/win/port_win.h:const size_t kMaxSizet = std::numeric_limits<size_t>::max();
./port/win/port_win.h:const bool kLittleEndian = true;
./table/cuckoo_table_factory.h:const uint32_t kCuckooMurmurSeedMultiplier = 816922183;
./table/data_block_hash_index.h:const uint8_t kNoEntry = 255;
./table/data_block_hash_index.h:const uint8_t kCollision = 254;
./table/data_block_hash_index.h:const uint8_t kMaxRestartSupportedByHashIndex = 253;
./table/data_block_hash_index.h:const size_t kMaxBlockSizeSupportedByHashIndex = 1u << 16;
./table/data_block_hash_index.h:const double kDefaultUtilRatio = 0.75;
./table/filter_block.h:const uint64_t kNotValid = ULLONG_MAX;
./table/format.h:const int kMagicNumberLengthByte = 8;
./third-party/fbson/FbsonJsonParser.h:const char* const kJsonDelim = " ,]}\t\r\n";
./third-party/fbson/FbsonJsonParser.h:const char* const kWhiteSpace = " \t\n\r";
./third-party/gtest-1.7.0/fused-src/gtest/gtest.h:const BiggestInt kMaxBiggestInt =
./third-party/gtest-1.7.0/fused-src/gtest/gtest.h:const char kDeathTestStyleFlag[] = "death_test_style";
./third-party/gtest-1.7.0/fused-src/gtest/gtest.h:const char kDeathTestUseFork[] = "death_test_use_fork";
./third-party/gtest-1.7.0/fused-src/gtest/gtest.h:const char kInternalRunDeathTestFlag[] = "internal_run_death_test";
./third-party/gtest-1.7.0/fused-src/gtest/gtest.h:const char* pets[] = {"cat", "dog"};
./third-party/gtest-1.7.0/fused-src/gtest/gtest.h:const size_t kProtobufOneLinerMaxLength = 50;
./third-party/gtest-1.7.0/fused-src/gtest/gtest.h:const int kMaxStackTraceDepth = 100;
./third-party/gtest-1.7.0/fused-src/gtest/gtest.h:const T* WithParamInterface<T>::parameter_ = NULL;
./util/coding.h:const unsigned int kMaxVarint64Length = 10;
./util/filename.h:const size_t kFormatFileNumberBufSize = 38;
./util/testutil.h:const SliceTransform* RandomSliceTransform(Random* rnd, int pre_defined = -1);
./util/trace_replay.h:const std::string kTraceMagic = "feedcafedeadbeef";
./util/trace_replay.h:const unsigned int kTraceTimestampSize = 8;
./util/trace_replay.h:const unsigned int kTraceTypeSize = 1;
./util/trace_replay.h:const unsigned int kTracePayloadLengthSize = 4;
./util/trace_replay.h:const unsigned int kTraceMetadataSize =
./utilities/cassandra/serialize.h:const int64_t kCharMask = 0xFFLL;
./utilities/cassandra/serialize.h:const int32_t kBitsPerByte = 8;
```

And these 3 lines are related to this commit:

```
./include/rocksdb/statistics.h:const std::vector<std::pair<Tickers, std::string>> TickersNameMap = {
./include/rocksdb/statistics.h:const std::vector<std::pair<Histograms, std::string>> HistogramsNameMap = {
./util/trace_replay.h:const std::string kTraceMagic = "feedcafedeadbeef";
```

Any comments would be appreciated.
Thanks.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4691

Differential Revision: D13208049

Pulled By: ajkr

fbshipit-source-id: e5ee55fdaec5447fc5798c6721e2821e7cdc0d5b
2018-11-26 21:32:03 -08:00
Zhongyi Xie d6ec288703 Add PerfContextByLevel to provide per level perf context information (#4226)
Summary:
Current implementation of perf context is level agnostic. Making it hard to do performance evaluation for the LSM tree. This PR adds `PerfContextByLevel` to decompose the counters by level.
This will be helpful when analyzing point and range query performance as well as tuning bloom filter
Also replaced __thread with thread_local keyword for perf_context
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4226

Differential Revision: D10369509

Pulled By: miasantreble

fbshipit-source-id: f1ced4e0de5fcebdb7f9cff36164516bc6382d82
2018-10-17 11:19:40 -07:00
cngzhnp 64324e329e Support pragma once in all header files and cleanup some warnings (#4339)
Summary:
As you know, almost all compilers support "pragma once" keyword instead of using include guards. To be keep consistency between header files, all header files are edited.

Besides this, try to fix some warnings about loss of data.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4339

Differential Revision: D9654990

Pulled By: ajkr

fbshipit-source-id: c2cf3d2d03a599847684bed81378c401920ca848
2018-09-05 18:13:31 -07:00
Yanqin Jin 26d67e357e Support group commits of version edits (#3944)
Summary:
This PR supports the group commit of multiple version edit entries corresponding to different column families. Column family drop/creation still cannot be grouped. This PR is a subset of [PR 3752](https://github.com/facebook/rocksdb/pull/3752).
Closes https://github.com/facebook/rocksdb/pull/3944

Differential Revision: D8432536

Pulled By: riversand963

fbshipit-source-id: 8f11bd05193b6c0d9272d82e44b676abfac113cb
2018-06-28 12:34:39 -07:00
Yanqin Jin 7f850b889d Remove 'ALIGNAS' from StatisticsImpl. (#4061)
Summary:
Remove over-alignment on `StatisticsImpl` whose benefit is vague and causes UBSAN check to fail due to `std::make_shared` not respecting the over-alignment requirement.

Test plan
```
$ make clean && COMPILE_WITH_UBSAN=1 OPT=-g make -j16 ubsan_check
```
Closes https://github.com/facebook/rocksdb/pull/4061

Differential Revision: D8656506

Pulled By: riversand963

fbshipit-source-id: db355ae9c7bdd2c9e9c5e63cabba13d8d82cc5f9
2018-06-27 20:59:45 -07:00
Daniel Black 346d1069c3 Align StatisticsImpl / StatisticsData (#4036)
Summary:
Pinned the alignment of StatisticsData to the cacheline size rather than just extending its size (which could go over two cache lines)if unaligned in allocation.

Avoid compile errors in the process as per individual commit messages.

strengthen static_assert to CACHELINE rather than the highest common multiple.
Closes https://github.com/facebook/rocksdb/pull/4036

Differential Revision: D8582844

Pulled By: yiwu-arbug

fbshipit-source-id: 363c37029f28e6093e06c60b987bca9aa204bc71
2018-06-25 22:58:19 -07:00
Tomas Kolda 906a602c2c Build and tests fixes for Solaris Sparc (#4000)
Summary:
Here are some fixes for build on Solaris Sparc.

It is also fixing CRC test on BigEndian platforms.
Closes https://github.com/facebook/rocksdb/pull/4000

Differential Revision: D8455394

Pulled By: ajkr

fbshipit-source-id: c9289a7b541a5628139c6b77e84368e14dc3d174
2018-06-15 12:42:53 -07:00
Andrew Kryczka 508a09fd62 Print histogram count and sum in statistics string
Summary:
Previously it only printed percentiles, even though our histogram keeps track of count and sum (and more). There have been many times we want to know more than the percentiles. For example, we currently want sum of "rocksdb.compression.times.nanos" and sum of "rocksdb.decompression.times.nanos", which would allow us to know the relative cost of compression vs decompression.

This PR adds count and sum to the string printed by `StatisticsImpl::ToString`. This is a bit risky as there are definitely parsers assuming the old format. I will mention it in HISTORY.md and hope for the best...
Closes https://github.com/facebook/rocksdb/pull/3863

Differential Revision: D8038831

Pulled By: ajkr

fbshipit-source-id: 0465b72e4b0cbf18ef965f4efe402601d16d5b5c
2018-05-21 11:12:47 -07:00
Siying Dong 7ccb35f653 In instrumented mutex, take timing once for both of perf_context and statistics
Summary: Closes https://github.com/facebook/rocksdb/pull/3427

Differential Revision: D6827236

Pulled By: siying

fbshipit-source-id: d8a2cc525c90df625510565669f2659014259a8a
2018-05-17 12:56:53 -07:00
Mike Kolupaev affe01b0d5 Improve write time breakdown stats
Summary:
There's a group of stats in PerfContext for profiling the write path. They break down the write time into WAL write, memtable insert, throttling, and everything else. We use these stats a lot for figuring out the cause of slow writes.

These stats got a bit out of date and are now categorizing some interesting things as "everything else", and also do some double counting. This PR fixes it and adds two new stats: time spent waiting for other threads of the batch group, and time spent waiting for scheduling flushes/compactions. Probably these will be enough to explain all the occasional abnormally slow (multiple seconds) writes that we're seeing.
Closes https://github.com/facebook/rocksdb/pull/3602

Differential Revision: D7251562

Pulled By: al13n321

fbshipit-source-id: 0a2d0f5a4fa5677455e1f566da931cb46efe2a0d
2018-04-23 17:58:54 -07:00
David Lai 3be9b36453 comment unused parameters to turn on -Wunused-parameter flag
Summary:
This PR comments out the rest of the unused arguments which allow us to turn on the -Wunused-parameter flag. This is the second part of a codemod relating to https://github.com/facebook/rocksdb/pull/3557.
Closes https://github.com/facebook/rocksdb/pull/3662

Differential Revision: D7426121

Pulled By: Dayvedde

fbshipit-source-id: 223994923b42bd4953eb016a0129e47560f7e352
2018-04-12 17:59:16 -07:00
Fosco Marotto d518fe1da6 uint64_t and size_t changes to compile for iOS
Summary:
In attempting to build a static lib for use in iOS, I ran in to lots of type errors between uint64_t and size_t.  This PR contains the changes I made to get `TARGET_OS=IOS make static_lib` to succeed while also getting Xcode to build successfully with the resulting `librocksdb.a` library imported.

This also compiles for me on macOS and tests fine, but I'm really not sure if I made the correct decisions about where to `static_cast` and where to change types.

Also up for discussion: is iOS worth supporting?  Getting the static lib is just part one, we aren't providing any bridging headers or wrappers like the ObjectiveRocks project, it won't be a great experience.
Closes https://github.com/facebook/rocksdb/pull/3503

Differential Revision: D7106457

Pulled By: gfosco

fbshipit-source-id: 82ac2073de7e1f09b91f6b4faea91d18bd311f8e
2018-03-06 12:43:51 -08:00
Chinmay Kamat 9fc72d6f16 Compilation fixes for powerpc build, -Wparentheses-equality error and missing header guards
Summary:
This pull request contains miscellaneous compilation fixes.

Thanks,
Chinmay
Closes https://github.com/facebook/rocksdb/pull/3462

Differential Revision: D6941424

Pulled By: sagar0

fbshipit-source-id: fe9c26507bf131221f2466740204bff40a15614a
2018-02-09 14:12:43 -08:00
Tamir Duberstein cd5092e168 Suppress unused warnings
Summary:
- Use `__unused__` everywhere
- Suppress unused warnings in Release mode
    + This currently affects non-MSVC builds (e.g. mingw64).
Closes https://github.com/facebook/rocksdb/pull/3448

Differential Revision: D6885496

Pulled By: miasantreble

fbshipit-source-id: f2f6adacec940cc3851a9eee328fafbf61aad211
2018-02-02 12:27:07 -08:00
Andrew Kryczka 5a7e08468a fix ThreadStatus for bottom-pri compaction threads
Summary:
added `ThreadType::BOTTOM_PRIORITY` which is used in the `ThreadStatus` object to indicate the thread is used for bottom-pri compactions. Previously there was a bug where we mislabeled such threads as `ThreadType::LOW_PRIORITY`.
Closes https://github.com/facebook/rocksdb/pull/3270

Differential Revision: D6559428

Pulled By: ajkr

fbshipit-source-id: 96b1a50a9c19492b1a5fd1b77cf7061a6f9f1d1c
2017-12-14 14:57:49 -08:00
anand1976 d394a6bb48 Add a ticker stat for number of keys skipped during iteration
Summary:
This diff adds a new ticker stat, NUMBER_ITER_SKIP, to count the
number of internal keys skipped during iteration. Keys can be skipped
due to deletes, or lower sequence number, or higher sequence number
than the one requested.

Also, fix the issue when StatisticsData is naturally aligned on cacheline boundary,
padding becomes a zero size array, which the Windows compiler doesn't
like. So add a cacheline worth of padding in that case to keep it happy.
We cannot conditionally add padding as gcc doesn't allow using sizeof
in preprocessor directives.
Closes https://github.com/facebook/rocksdb/pull/3177

Differential Revision: D6353897

Pulled By: anand1976

fbshipit-source-id: 441d5a09af9c4e22e7355242dfc0c7b27aa0a6c2
2017-11-20 21:26:37 -08:00
Manuel Ung e03377c7fd Add lock wait time as a perf context counter
Summary:
Adds two new counters:

`key_lock_wait_count` counts how many times a lock was blocked by another transaction and had to wait, instead of being granted the lock immediately.
`key_lock_wait_time` counts the time spent acquiring locks.
Closes https://github.com/facebook/rocksdb/pull/3107

Differential Revision: D6217332

Pulled By: lth

fbshipit-source-id: 55d4f46da5550c333e523263422fd61d6a46deb9
2017-11-06 10:57:19 -08:00
Prashant D 3c208e7616 HistogramStat: Handle divide by zero situation
Summary:
The num() might return cur_num as 0 and we are making sure that
cur_num will not be 0 down the path. The mult variable is being set to
100.0/cur_num which makes program crash when cur_num is 0.
Closes https://github.com/facebook/rocksdb/pull/3105

Differential Revision: D6222594

Pulled By: ajkr

fbshipit-source-id: 986154709897ff4dbbeb0e8aa81eb8c0b2a2db76
2017-11-02 11:41:50 -07:00
Dmitri Smirnov d2a65c59e1 Fix unused var warnings in Release mode
Summary:
MSVC does not support unused attribute at this time. A separate assignment line fixes the issue probably by being counted as usage for MSVC and it no longer complains about unused var.
Closes https://github.com/facebook/rocksdb/pull/3048

Differential Revision: D6126272

Pulled By: maysamyabandeh

fbshipit-source-id: 4907865db45fd75a39a15725c0695aaa17509c1f
2017-10-23 14:27:04 -07:00
Aliaksei Sandryhaila a48a398e7c Use RAII instead of pointers in cf_info_map
Summary:
There is no need for smart pointers in cf_info_map, so use RAII. This should also placate valgrind.
Closes https://github.com/facebook/rocksdb/pull/2943

Differential Revision: D5932941

Pulled By: asandryh

fbshipit-source-id: 2c37df88573a9df2557880a31193926e4425e054
2017-09-28 14:26:47 -07:00
Siying Dong c319792059 Directly refernce perf_context internally.
Summary:
After 7f6c02dda1, the same get_perf_context() is called both of internally and externally. However, I found internally this is not got inlined. I don't know why this is the case, but directly referencing perf_context is the logical way to do.
Closes https://github.com/facebook/rocksdb/pull/2892

Differential Revision: D5843789

Pulled By: siying

fbshipit-source-id: b49777d8809f35847699291bb7f8ea2754c3af49
2017-09-15 17:15:10 -07:00
Andrew Kryczka 3b23b1d8c6 fix HistogramWindowingImpl copy-{assignment,constructor}
Summary:
their arguments had a typo.
Closes https://github.com/facebook/rocksdb/pull/2821

Differential Revision: D5752408

Pulled By: ajkr

fbshipit-source-id: f2d84489c0c615b12a790f04c42d35cc0dccb02d
2017-09-01 11:11:56 -07:00
Andrew Kryczka 3b9a000c15 fix inclusive-exclusiveness of histogram ToString
Summary:
I spent too much time thinking about histograms lately and realized boundary values fall into the lower bucket, not the upper bucket. It's because we're using `std::map::lower_bound` here: 867fe92e5e/monitoring/histogram.cc (L53).  Fixed histogram's `ToString()` to reflect this.
Closes https://github.com/facebook/rocksdb/pull/2817

Differential Revision: D5751159

Pulled By: ajkr

fbshipit-source-id: 67432bb45849eec9b5bcc0d095551dbc0ee81766
2017-09-01 10:56:34 -07:00
Andrew Kryczka 867fe92e5e Scale histogram bucket size by constant factor
Summary:
The goal is to reduce the number of histogram buckets, particularly now that we print these histograms for each column family. I chose 1.5 as the factor. We can adjust it later to either make buckets more granular or make fewer buckets.
Closes https://github.com/facebook/rocksdb/pull/2139

Differential Revision: D4872076

Pulled By: ajkr

fbshipit-source-id: 87790d782a605506c3d24190a028cecbd7aa564a
2017-08-21 17:10:40 -07:00
Andrew Kryczka ed0a4c93ef perf_context measure user bytes read
Summary:
With this PR, we can measure read-amp for queries where perf_context is enabled as follows:

```
SetPerfLevel(kEnableCount);
Get(1, "foo");
double read_amp = static_cast<double>(get_perf_context()->block_read_byte / get_perf_context()->get_read_bytes);
SetPerfLevel(kDisable);
```

Our internal infra enables perf_context for a sampling of queries. So we'll be able to compute the read-amp for the sample set, which can give us a good estimate of read-amp.
Closes https://github.com/facebook/rocksdb/pull/2749

Differential Revision: D5647240

Pulled By: ajkr

fbshipit-source-id: ad73550b06990cf040cc4528fa885360f308ec12
2017-08-18 11:43:33 -07:00
Andrew Kryczka 5de98f2d50 approximate histogram stats to save cpu
Summary:
sounds like we're willing to tradeoff minor inaccuracy in stats for speed. start with histogram stats. ticker stats will be harder (and, IMO, we shouldn't change them in this manner) as many test cases rely on them being exactly correct.
Closes https://github.com/facebook/rocksdb/pull/2720

Differential Revision: D5607884

Pulled By: ajkr

fbshipit-source-id: 1b754cda35ea6b252d1fdd5aa3cfb58866506372
2017-08-11 13:13:12 -07:00
Siying Dong 21696ba502 Replace dynamic_cast<>
Summary:
Replace dynamic_cast<> so that users can choose to build with RTTI off, so that they can save several bytes per object, and get tiny more memory available.
Some nontrivial changes:
1. Add Comparator::GetRootComparator() to get around the internal comparator hack
2. Add the two experiemental functions to DB
3. Add TableFactory::GetOptionString() to avoid unnecessary casting to get the option string
4. Since 3 is done, move the parsing option functions for table factory to table factory files too, to be symmetric.
Closes https://github.com/facebook/rocksdb/pull/2645

Differential Revision: D5502723

Pulled By: siying

fbshipit-source-id: fd13cec5601cf68a554d87bfcf056f2ffa5fbf7c
2017-07-28 16:27:16 -07:00
Siying Dong 3c327ac2d0 Change RocksDB License
Summary: Closes https://github.com/facebook/rocksdb/pull/2589

Differential Revision: D5431502

Pulled By: siying

fbshipit-source-id: 8ebf8c87883daa9daa54b2303d11ce01ab1f6f75
2017-07-15 16:11:23 -07:00
Aaron Gao 0025a36409 revert perf_context and io_stats to __thread
Summary:
https://github.com/facebook/rocksdb/pull/2380 introduces a regression by replacing __thread with ThreadLocalPtr. Revert the thread local implementation back.
Closes https://github.com/facebook/rocksdb/pull/2485

Differential Revision: D5308050

Pulled By: lightmark

fbshipit-source-id: 2676e9c22edf76e8133d3f4c50e2711e11a95480
2017-06-26 15:27:17 -07:00
Siying Dong 5582123dee Sample number of reads per SST file
Summary:
We estimate number of reads per SST files, by updating the counter per file in sampled read requests. This information can later be used to trigger compactions to improve read performacne.
Closes https://github.com/facebook/rocksdb/pull/2417

Differential Revision: D5193528

Pulled By: siying

fbshipit-source-id: b4241c5ad0eaf444b61afb53f8e6290d9f5da2df
2017-06-12 07:12:08 -07:00
Aaron Gao 7f6c02dda1 using ThreadLocalPtr to hide ROCKSDB_SUPPORT_THREAD_LOCAL from public…
Summary:
… headers

https://github.com/facebook/rocksdb/pull/2199 should not reference RocksDB-specific macros (like ROCKSDB_SUPPORT_THREAD_LOCAL in this case) to public headers, `iostats_context.h` and `perf_context.h`. We shouldn't do that because users have to provide these compiler flags when building their binary with RocksDB.

We should hide the thread local global variable inside our implementation and just expose a function api to retrieve these variables. It may break some users for now but good for long term.

make check -j64
Closes https://github.com/facebook/rocksdb/pull/2380

Differential Revision: D5177896

Pulled By: lightmark

fbshipit-source-id: 6fcdfac57f2e2dcfe60992b7385c5403f6dcb390
2017-06-02 17:26:19 -07:00
Andrew Kryczka ac39d6bec5 Core-local statistics
Summary:
This diff changes `StatisticsImpl` from a thread-local approach to a core-local one. The goal is to perform faster aggregations, particularly for applications that have many threads. There should be no behavior change.
Closes https://github.com/facebook/rocksdb/pull/2258

Differential Revision: D5016258

Pulled By: ajkr

fbshipit-source-id: 7d4d165b4a91d8110f0409d113d1be91f22d31a9
2017-05-23 10:42:59 -07:00
Nikhil Benesch 11c5d4741a cross-platform compatibility improvements
Summary:
We've had a couple CockroachDB users fail to build RocksDB on exotic platforms, so I figured I'd try my hand at solving these issues upstream. The problems stem from a) `USE_SSE=1` being too aggressive about turning on SSE4.2, even on toolchains that don't support SSE4.2 and b) RocksDB attempting to detect support for thread-local storage based on OS, even though it can vary by compiler on the same OS.

See the individual commit messages for details. Regarding SSE support, this PR should change virtually nothing for non-CMake based builds. `make`, `PORTABLE=1 make`, `USE_SSE=1 make`, and `PORTABLE=1 USE_SSE=1 make` function exactly as before, except that SSE support will be automatically disabled when a simple SSE4.2-using test program fails to compile, as it does on OpenBSD. (OpenBSD's ports GCC supports SSE4.2, but its binutils do not, so `__SSE_4_2__` is defined but an SSE4.2-using program will fail to assemble.) A warning is emitted in this case. The CMake build is modified to support the same set of options, except that `USE_SSE` is spelled `FORCE_SSE42` because `USE_SSE` is rather useless now that we can automatically detect SSE support, and I figure changing options in the CMake build is less disruptive than changing the non-CMake build.

I've tested these changes on all the platforms I can get my hands on (macOS, Windows MSVC, Windows MinGW, and OpenBSD) and it all works splendidly. Let me know if there's anything you object to—I obviously don't mean to break any of your build pipelines in the process of fixing ours downstream.
Closes https://github.com/facebook/rocksdb/pull/2199

Differential Revision: D5054042

Pulled By: yiwu-arbug

fbshipit-source-id: 938e1fc665c049c02ae15698e1409155b8e72171
2017-05-15 16:15:38 -07:00
Siying Dong d616ebea23 Add GPLv2 as an alternative license.
Summary: Closes https://github.com/facebook/rocksdb/pull/2226

Differential Revision: D4967547

Pulled By: siying

fbshipit-source-id: dd3b58ae1e7a106ab6bb6f37ab5c88575b125ab4
2017-04-27 18:06:12 -07:00
Andrew Kryczka efc361ef7d Add user stats Reset API
Summary:
It resets all the ticker and histogram stats to zero. Needed to change the locking a bit since Reset() is the only operation that manipulates multiple tickers/histograms together, and that operation should be seen as atomic by other operations that access tickers/histograms.
Closes https://github.com/facebook/rocksdb/pull/2213

Differential Revision: D4952232

Pulled By: ajkr

fbshipit-source-id: c0475c3e4c7b940120d53891b69c3091149a0679
2017-04-26 15:57:01 -07:00
Tomas Kolda c1fbf91b2f Fixing Solaris Sparc crash due to cached TLS
Summary:
Workaround for Solaris gcc binary. Program is crashing, because when TLS of perf context that is used twice on same frame, it is damaged thus Segmentation fault.

Issue: #2153
Closes https://github.com/facebook/rocksdb/pull/2187

Differential Revision: D4922274

Pulled By: siying

fbshipit-source-id: 549105ebce9a8ce08a737f4d6b9f2312ebcde9a8
2017-04-20 11:06:09 -07:00
Sagar Vemuri 343b59d6ee Move various string utility functions into string_util
Summary:
This is an effort to club all string related utility functions into one common place, in string_util, so that it is easier for everyone to know what string processing functions are available. Right now they seem to be spread out across multiple modules, like logging and options_helper.

Check the sub-commits for easier reviewing.
Closes https://github.com/facebook/rocksdb/pull/2094

Differential Revision: D4837730

Pulled By: sagar0

fbshipit-source-id: 344278a
2017-04-06 14:54:12 -07:00
Siying Dong d2dce5611a Move some files under util/ to separate dirs
Summary:
Move some files under util/ to new directories env/, monitoring/ options/ and cache/
Closes https://github.com/facebook/rocksdb/pull/2090

Differential Revision: D4833681

Pulled By: siying

fbshipit-source-id: 2fd8bef
2017-04-05 19:09:16 -07:00