Commit Graph

8570 Commits

Author SHA1 Message Date
anand76 afa2420c2b Introduce a new storage specific Env API (#5761)
Summary:
The current Env API encompasses both storage/file operations, as well as OS related operations. Most of the APIs return a Status, which does not have enough metadata about an error, such as whether its retry-able or not, scope (i.e fault domain) of the error etc., that may be required in order to properly handle a storage error. The file APIs also do not provide enough control over the IO SLA, such as timeout, prioritization, hinting about placement and redundancy etc.

This PR separates out the file/storage APIs from Env into a new FileSystem class. The APIs are updated to return an IOStatus with metadata about the error, as well as to take an IOOptions structure as input in order to allow more control over the IO.

The user can set both ```options.env``` and ```options.file_system``` to specify that RocksDB should use the former for OS related operations and the latter for storage operations. Internally, a ```CompositeEnvWrapper``` has been introduced that inherits from ```Env``` and redirects individual methods to either an ```Env``` implementation or the ```FileSystem``` as appropriate. When options are sanitized during ```DB::Open```, ```options.env``` is replaced with a newly allocated ```CompositeEnvWrapper``` instance if both env and file_system have been specified. This way, the rest of the RocksDB code can continue to function as before.

This PR also ports PosixEnv to the new API by splitting it into two - PosixEnv and PosixFileSystem. PosixEnv is defined as a sub-class of CompositeEnvWrapper, and threading/time functions are overridden with Posix specific implementations in order to avoid an extra level of indirection.

The ```CompositeEnvWrapper``` translates ```IOStatus``` return code to ```Status```, and sets the severity to ```kSoftError``` if the io_status is retryable. The error handling code in RocksDB can then recover the DB automatically.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5761

Differential Revision: D18868376

Pulled By: anand1976

fbshipit-source-id: 39efe18a162ea746fabac6360ff529baba48486f
2019-12-13 14:48:41 -08:00
Peter Dillinger 58d46d1915 Add useful idioms to Random API (OneInOpt, PercentTrue) (#6154)
Summary:
And clean up related code, especially in stress test.

(More clean up of db_stress_test_base.cc coming after this.)
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6154

Test Plan: make check, make blackbox_crash_test for a bit

Differential Revision: D18938180

Pulled By: pdillinger

fbshipit-source-id: 524d27621b8dbb25f6dff40f1081e7c00630357e
2019-12-13 14:30:14 -08:00
Levi Tamasi 6d54eb3dc2 Do not create/install new SuperVersion if nothing was deleted during memtable trim (#6169)
Summary:
We have observed an increase in CPU load caused by frequent calls to
`ColumnFamilyData::InstallSuperVersion` from `DBImpl::TrimMemtableHistory`
when using `max_write_buffer_size_to_maintain` to limit the amount of
memtable history maintained for transaction conflict checking. As it turns out,
this is caused by the code creating and installing a new `SuperVersion` even if
no memtables were actually trimmed. The patch adds a check to avoid this.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6169

Test Plan:
Compared `perf` output for

```
./db_bench -benchmarks=randomtransaction -optimistic_transaction_db=1 -statistics -stats_interval_seconds=1 -duration=90 -num=500000 --max_write_buffer_size_to_maintain=16000000 --transaction_set_snapshot=1 --threads=32
```

before and after the change. With the fix, the call chain `rocksdb::DBImpl::TrimMemtableHistory` ->
`rocksdb::ColumnFamilyData::InstallSuperVersion` -> `rocksdb::ThreadLocalPtr::StaticMeta::Scrape`
no longer registers in the `perf` report.

Differential Revision: D19031509

Pulled By: ltamasi

fbshipit-source-id: 02686fce594e5b50eba0710e4b28a9b808c8aa20
2019-12-13 13:29:29 -08:00
Kefu Chai ac304adf46 cmake: do not build tests for Release build and cleanups (#5916)
Summary:
fixes https://github.com/facebook/rocksdb/issues/2445
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5916

Differential Revision: D19031236

fbshipit-source-id: bc3107b6b25a01958677d7cb411b1f381aae91c6
2019-12-13 12:48:06 -08:00
Maysam Yabandeh fec7302a9d Enable unordered_write in stress tests (#6164)
Summary:
With WritePrepared transactions configured with two_write_queues, unordered_write will offer the same guarantees as vanilla rocksdb and thus can be enabled in stress tests.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6164

Test Plan:
```
make -j32 crash_test_with_txn

Differential Revision: D18991899

Pulled By: maysamyabandeh

fbshipit-source-id: eece5e96b4169b67d7931e5c0afca88540a113e1
2019-12-13 10:25:04 -08:00
Levi Tamasi 583c6953d8 Move out valid blobs from the oldest blob files during compaction (#6121)
Summary:
The patch adds logic that relocates live blobs from the oldest N non-TTL
blob files as they are encountered during compaction (assuming the BlobDB
configuration option `enable_garbage_collection` is `true`), where N is defined
as the number of immutable non-TTL blob files multiplied by the value of
a new BlobDB configuration option called `garbage_collection_cutoff`.
(The default value of this parameter is 0.25, that is, by default the valid blobs
residing in the oldest 25% of immutable non-TTL blob files are relocated.)
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6121

Test Plan: Added unit test and tested using the BlobDB mode of `db_bench`.

Differential Revision: D18785357

Pulled By: ltamasi

fbshipit-source-id: 8c21c512a18fba777ec28765c88682bb1a5e694e
2019-12-13 10:13:05 -08:00
Jermy Li c2029f9716 Support concurrent CF iteration and drop (#6147)
Summary:
It's easy to cause coredump when closing ColumnFamilyHandle with unreleased iterators, especially iterators release is controlled by java GC when using JNI.

This patch fixed concurrent CF iteration and drop, we let iterators(actually SuperVersion) hold a ColumnFamilyData reference to prevent the CF from being released too early.

fixed https://github.com/facebook/rocksdb/issues/5982
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6147

Differential Revision: D18926378

fbshipit-source-id: 1dff6d068c603d012b81446812368bfee95a5e15
2019-12-12 19:04:48 -08:00
myasuka 4b74035e40 Correct java docs of RocksDB options (#6123)
Summary:
Correct javadocs of several RocksDB option classes to not mislead RocksJava users.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6123

Differential Revision: D18989044

Pulled By: pdillinger

fbshipit-source-id: a5ac6a415e5311084b10d973d354e6925788f01e
2019-12-12 18:10:03 -08:00
奏之章 c4ce8e637f Fix RangeDeletion bug (#6062)
Summary:
Read keys from a snapshot that a range deletion were added after the snapshot  was created and this range deletion was inside an immutable memtable, we will get wrong key set.
More detail rest in codes.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6062

Differential Revision: D18966785

Pulled By: pdillinger

fbshipit-source-id: 38a60bb1e2d0a1dbfc8ec641617200b6a02b86c3
2019-12-12 15:18:02 -08:00
Connor a844591201 wait pending memtable writes on file ingestion or compact range (#6113)
Summary:
**Summary:**
This PR fixes two unordered_write related issues:
- ingestion job may skip the necessary memtable flush https://github.com/facebook/rocksdb/issues/6026
- compact range may cause memtable is flushed before pending unordered write finished
    1. `CompactRange` triggers memtable flush but doesn't wait for pending-writes
    2.  there are some pending writes but memtable is already flushed
    3.  the memtable related WAL is removed( note that the pending-writes were recorded in that WAL).
    4.  pending-writes write to newer created memtable
    5. there is a restart
    6. missing the previous pending-writes because WAL is removed but they aren't included in SST.

**How to solve:**
- Wait pending memtable writes before ingestion job check memtable key range
- Wait pending memtable writes before flush memtable.
**Note that: `CompactRange` calls `RangesOverlapWithMemtables` too without waiting for pending waits, but I'm not sure whether it affects the correctness.**

**Test Plan:**
make check
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6113

Differential Revision: D18895674

Pulled By: maysamyabandeh

fbshipit-source-id: da22b4476fc7e06c176020e7cc171eb78189ecaf
2019-12-12 14:08:02 -08:00
sdong 814d4e7ce0 Improve instructions to install formatter (#6162)
Summary:
While the instruction of installing "make format" dependencies works on some platforms, it is hard to use for some others. Improve it a little bit.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6162

Test Plan: Run "make format" on an envrionment missing the dependencies and see the instructions printed out

Differential Revision: D18970773

fbshipit-source-id: fd21b31053407cc171a6675f781a556a1c3e8945
2019-12-12 14:04:01 -08:00
Maysam Yabandeh a796c06fef Fix build breakage from lock_guard error (#6161)
Summary:
This change fixes a source issue that caused compile time error which breaks build for many fbcode services in that setup. The size() member function of channel is a const member, so member variables accessed within it are implicitly const as well. This caused error when clang fails to resolve to a constructor that takes std::mutex because the suitable constructor got rejected due to loss of constness for its argument. The fix is to add mutable modifier to the lock_ member of channel.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6161

Differential Revision: D18967685

Pulled By: maysamyabandeh

fbshipit-source-id: 698b6a5153c3c92eeacb842c467aa28cc350d432
2019-12-12 13:50:27 -08:00
Adam Retter b433bbefe9 Add missing mutable DBOptions to RocksJava (#6152)
Summary:
As requested in https://github.com/facebook/rocksdb/issues/6127
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6152

Differential Revision: D18955608

Pulled By: pdillinger

fbshipit-source-id: 3e1367d944e44d5f1675a422f7dd2451c86feb6f
2019-12-12 12:01:19 -08:00
Levi Tamasi 3b607610df Do not update SST <-> blob file mapping if compaction failed
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/6156

Test Plan: Extended unit tests.

Differential Revision: D18943867

Pulled By: ltamasi

fbshipit-source-id: b3669d2dd6af08e987ad1a59d6712ae2514da0b1
2019-12-12 11:30:45 -08:00
Maysam Yabandeh 8613ee2e94 Enable all txn write policies in crash test (#6158)
Summary:
Currently the default txn write policy in crash tests is WRITE_PREPARED. The patch randomly picks the write policy at the start of the crash test.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6158

Test Plan:
```
make -j32 crash_test_with_txn
```

Differential Revision: D18946307

Pulled By: maysamyabandeh

fbshipit-source-id: f77d7a94f99a08791ef9626da153d284bf521950
2019-12-12 10:43:49 -08:00
Levi Tamasi e1dfe80fe0 Mark BlobIndex::DebugString const
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/6157

Test Plan: make check

Differential Revision: D18944259

Pulled By: ltamasi

fbshipit-source-id: 7fb29447b52d801215bd6ab811e229a7fa2c763d
2019-12-11 17:19:43 -08:00
Maysam Yabandeh 1ad6fa9cc7 Enable txn in crash tests (#6155)
Summary:
Start daily crash tests with use_txn flag.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6155

Differential Revision: D18943630

Pulled By: maysamyabandeh

fbshipit-source-id: eea99a6ffd5f57fb9651f6ca7dab8fbf70379c87
2019-12-11 16:01:55 -08:00
Peter Dillinger d0ad3c59d8 Fix c_test:filter for various CACHE_LINE_SIZEs (#6153)
Summary:
This test was recently updated but failed to account for Bloom
schema variance by CACHE_LINE_SIZE. (Since CACHE_LINE_SIZE is not
defined in our C code, the test now simply allows a valid result for any
CACHE_LINE_SIZE, not just the current one.)

Unblock https://github.com/facebook/rocksdb/issues/5932
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6153

Test Plan:
ran unit test with builds TEST_CACHE_LINE_SIZE=128, =256, and
unset (64 on Intel)

Differential Revision: D18936015

Pulled By: pdillinger

fbshipit-source-id: e5e3852f95283d34d624632c1ae8d3adb2f2662c
2019-12-11 15:17:08 -08:00
奏之章 3717a88289 Fix UniversalCompaction trivial move bug (#6067)
Summary:
`curr.level` is `c->inputs_` index, not real level.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6067

Differential Revision: D18935726

fbshipit-source-id: 4354e6e9cd900ca56c96e9d770f0ab6634e45daf
2019-12-11 11:27:53 -08:00
ferhat elmas afdc58d478 Fix typos in history
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/6116

Differential Revision: D18935622

fbshipit-source-id: 59f7a7bc9f0116ae6354ea217896622a34329d3c
2019-12-11 11:04:46 -08:00
Yi Wu 05a86318a7 Remove unused low_pri_write_rate_limiter_ (#6068)
Summary:
`low_pri_write_rate_limiter_` is not being used. Removing. `WriteController` has an internal low_pri rate limiter which is the real rate limiter for low-pri writes.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6068

Test Plan: make

Differential Revision: D18664120

fbshipit-source-id: dfe3e4de033cf3522b67781b383aad7d0936034c
2019-12-11 10:28:33 -08:00
Cheng Chang 77565d7532 Add example to show the effect of Get in snapshot isolation (#6059)
Summary:
Adds example to show the difference of reading from snapshot and from the latest state.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6059

Test Plan: cd examples && make transaction_example && ./transaction_example

Differential Revision: D18797616

fbshipit-source-id: f17a2cb12187092ea243159e6ccf55790859e0c0
2019-12-11 09:56:42 -08:00
Yanqin Jin 383f5071f0 Add SyncWAL to db_stress (#6149)
Summary:
Add SyncWAL to db_stress. Specify with `-sync_wal_one_in=N` so that it will be
called once every N operations on average.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6149

Test Plan:
```
$make db_stress
$./db_stress -sync_wal_one_in=100 -ops_per_thread=100000
```

Differential Revision: D18922529

Pulled By: riversand963

fbshipit-source-id: 4c0b8cb8fa21852722cffd957deddf688f12ea56
2019-12-10 21:55:25 -08:00
sdong 7a99162a74 db_stress: sometimes call CancelAllBackgroundWork() and Close() before closing DB (#6141)
Summary:
CancelAllBackgroundWork() and Close() are frequently used features but we don't cover it in stress test. Simply execute them before closing the DB with 1/2 chance.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6141

Test Plan: Run "db_stress".

Differential Revision: D18900861

fbshipit-source-id: 49b46ccfae120d0f9de3e0543b82fb6d715949d0
2019-12-10 20:04:52 -08:00
Adam Retter 984b6e71d6 Add Visual Studio 2015 to AppVeyor (#5446)
Summary:
This is required to compile on Windows with Visual Studio 2015, which is used for creating the RocksJava releases.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5446

Differential Revision: D18924811

fbshipit-source-id: a183a62e79a2af5aaf59cd08235458a172fe7dcb
2019-12-10 20:02:31 -08:00
Peter Dillinger a653857178 Add PauseBackgroundWork() to db_stress (#6148)
Summary:
Worker thread will occasionally call PauseBackgroundWork(),
briefly sleep (to avoid stalling itself) and then call
ContinueBackgroundWork().
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6148

Test Plan:
some running of 'make blackbox_crash_test' with temporary
printf output to confirm code occasionally reached.

Differential Revision: D18913886

Pulled By: pdillinger

fbshipit-source-id: ae9356a803390929f3165dfb6a00194692ba92be
2019-12-10 15:46:48 -08:00
Adam Simpkins 2bb5fc1280 Add an option to the CMake build to disable building shared libraries (#6122)
Summary:
Add an option to explicitly disable building shared versions of the
RocksDB libraries.  The shared libraries cannot be built in cases where
some dependencies are only available as static libraries.  This allows
still building RocksDB in these situations.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6122

Differential Revision: D18920740

fbshipit-source-id: d24f66d93c68a1e65635e6e0b663bae62c903bca
2019-12-10 15:20:50 -08:00
Yanqin Jin 2b060c1498 Use Env::GetChildren() instead of readdir (#6139)
Summary:
For more portability, switch from readdir to Env::GetChildren() in ldb's
manifest_dump subcommand.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6139

Test Plan:
```
$make check
```
Manually check ldb command.

Differential Revision: D18898197

Pulled By: riversand963

fbshipit-source-id: 92afca379e9fbe78ab70b2eb40d127daad8df5e2
2019-12-10 11:49:09 -08:00
sdong 14c38baca0 db_stress: sometimes validate compact range data (#6140)
Summary:
Right now, in db_stress, compact range is simply executed without any immediate data validation. Add a simply validation which compares hash for all keys within the compact range to stay the same against the same snapshot before and after the compaction.

Also, randomly tune most knobs of CompactRangeOptions.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6140

Test Plan: Run db_stress with "--compact_range_one_in=2000 --compact_range_width=100000000" for a while. Manually ingest some hacky code and observe the error path.

Differential Revision: D18900230

fbshipit-source-id: d96e75bc8c38dd5ec702571ffe7cf5f4ea93ee10
2019-12-10 11:41:50 -08:00
Jermy Li 1dd3194f56 Fix compile error "folly/xx.h file not found" on Mac OS (#6145)
Summary:
Error message when running `make` on Mac OS with master branch (v6.6.0):
```
$ make
$DEBUG_LEVEL is 1
Makefile:168: Warning: Compiling in debug mode. Don't use the resulting binary in production
third-party/folly/folly/synchronization/WaitOptions.cpp:6:10: fatal error: 'folly/synchronization/WaitOptions.h' file not found
#include <folly/synchronization/WaitOptions.h>
         ^
1 error generated.
third-party/folly/folly/synchronization/ParkingLot.cpp:6:10: fatal error: 'folly/synchronization/ParkingLot.h' file not found
#include <folly/synchronization/ParkingLot.h>
         ^
1 error generated.
third-party/folly/folly/synchronization/DistributedMutex.cpp:6:10: fatal error: 'folly/synchronization/DistributedMutex.h' file not found
#include <folly/synchronization/DistributedMutex.h>
         ^
1 error generated.
third-party/folly/folly/synchronization/AtomicNotification.cpp:6:10: fatal error: 'folly/synchronization/AtomicNotification.h' file not found
#include <folly/synchronization/AtomicNotification.h>
         ^
1 error generated.
third-party/folly/folly/detail/Futex.cpp:6:10: fatal error: 'folly/detail/Futex.h' file not found
#include <folly/detail/Futex.h>
         ^
1 error generated.
  GEN      util/build_version.cc
$DEBUG_LEVEL is 1
Makefile:168: Warning: Compiling in debug mode. Don't use the resulting binary in production
third-party/folly/folly/synchronization/WaitOptions.cpp:6:10: fatal error: 'folly/synchronization/WaitOptions.h' file not found
#include <folly/synchronization/WaitOptions.h>
         ^
1 error generated.
third-party/folly/folly/synchronization/ParkingLot.cpp:6:10: fatal error: 'folly/synchronization/ParkingLot.h' file not found
#include <folly/synchronization/ParkingLot.h>
         ^
1 error generated.
third-party/folly/folly/synchronization/DistributedMutex.cpp:6:10: fatal error: 'folly/synchronization/DistributedMutex.h' file not found
#include <folly/synchronization/DistributedMutex.h>
         ^
1 error generated.
third-party/folly/folly/synchronization/AtomicNotification.cpp:6:10: fatal error: 'folly/synchronization/AtomicNotification.h' file not found
#include <folly/synchronization/AtomicNotification.h>
         ^
1 error generated.
third-party/folly/folly/detail/Futex.cpp:6:10: fatal error: 'folly/detail/Futex.h' file not found
#include <folly/detail/Futex.h>
```
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6145

Differential Revision: D18910812

fbshipit-source-id: 5a4475466c2d0601657831a0b48d34316b2f0816
2019-12-10 11:24:11 -08:00
Peter Dillinger 6380df5e10 Vary bloom_bits in db_crashtest (#6103)
Summary:
Especially with non-integral bits/key now supported,
db_crashtest should vary the bloom_bits configuration. The probabilities
look like this:

1/2 chance of a uniform int from 0 to 19. This includes overall 1/40
chance of 0 which disables the bloom filter.

1/2 chance of a float from a lognormal distribution with a median of 10.
This always produces positive values but with a decent chance of < 1
(overall ~1/40) or > 100 (overall ~1/40), the enforced/coerced
implementation limits.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6103

Test Plan:
start 'make blackbox_crash_test' several times and look at
configuration output

Differential Revision: D18734877

Pulled By: pdillinger

fbshipit-source-id: 4a38cb057d3b3fc1327f93199f65b9a9ffbd7316
2019-12-10 08:39:50 -08:00
sdong a68dff5c35 Apply formatter to some recent commits (#6138)
Summary:
Formatter somehow complains some recent lines changed. Apply them to make the formatter happy.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6138

Test Plan: See CI passes.

Differential Revision: D18895950

fbshipit-source-id: 7d1696cf3e3a682bc10a30cdca748a23c6565255
2019-12-09 15:49:49 -08:00
sdong a960287dee db_stress: Some code style improvements (#6137)
Summary:
Two changes:
1. Prevent static variables in a header file
2. Add "override" keyword when virtual functions are overridden.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6137

Test Plan: Build db_stress with or without LITE.

Differential Revision: D18892007

fbshipit-source-id: 295356427a34473b23ed36d6ed4ef3ae35a32db0
2019-12-09 14:38:42 -08:00
Peter Dillinger e43d2c4424 Fix & test rocksdb_filterpolicy_create_bloom_full (#6132)
Summary:
Add overrides needed in FilterPolicy wrapper to fix
rocksdb_filterpolicy_create_bloom_full (see issue https://github.com/facebook/rocksdb/issues/6129). Re-enabled
assertion in BloomFilterPolicy::CreateFilter that was being violated.
Expanded c_test to identify Bloom filter implementations by FP counts.
(Without the fix, updated test will trigger assertion and fail otherwise
without the assertion.)

Fixes https://github.com/facebook/rocksdb/issues/6129
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6132

Test Plan: updated c_test, also run under valgrind.

Differential Revision: D18864911

Pulled By: pdillinger

fbshipit-source-id: 08e81d7b5368b08e501cd402ef5583f2650c19fa
2019-12-09 12:21:14 -08:00
sdong 3c347821b7 Fix thread_local_test failure caused by recent io_uring change (#6136)
Summary:
thread_local_test now fails because it asserts no thread local instance is created when the test started. However, right now a thread local instance might be created when creating PosixEnv as a static variable. Fix the test by relaxing the assumption of starting from 0.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6136

Test Plan: Find an environment where the test fails, and see it passes with the fix applied.

Differential Revision: D18889224

fbshipit-source-id: 7946f3bfea81d236f7bb1554076696705b211b92
2019-12-09 12:03:30 -08:00
Ziyue Yang 7e2f831924 Fix wrong ExtractUserKey usage in BlockBasedTableBuilder::EnterUnbuff… (#6100)
Summary:
BlockBasedTableBuilder uses ExtractUserKey in EnterUnbuffered. This would
cause index filter building error, since user-provided timestamp is supported
by ExtractUserKeyAndStripTimestamp, and it's used in Add. This commit changes
ExtractUserKey to ExtractUserKeyAndStripTimestamp.

A test case is also added by modifying DBBasicTestWithTimestampWithParam_
PutAndGet test in db_basic_test to cover ExtractUserKeyAndStripTimestamp usage
in both kBuffered and kUnbuffered state of BlockBasedTableBuilder.

Before the ExtractUserKeyAndStripTimstamp fix:

```
$ ./db_basic_test --gtest_filter="*PutAndGet*"
Note: Google Test filter = *PutAndGet*
[==========] Running 2 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 2 tests from Timestamp/DBBasicTestWithTimestampWithParam
[ RUN      ] Timestamp/DBBasicTestWithTimestampWithParam.PutAndGet/0
db/db_basic_test.cc:2109: Failure
db_->Get(ropts, cfh, "key" + std::to_string(j), &value)
NotFound:
db/db_basic_test.cc:2109: Failure
db_->Get(ropts, cfh, "key" + std::to_string(j), &value)
NotFound:
db/db_basic_test.cc:2109: Failure
db_->Get(ropts, cfh, "key" + std::to_string(j), &value)
NotFound:
db/db_basic_test.cc:2109: Failure
db_->Get(ropts, cfh, "key" + std::to_string(j), &value)
NotFound:
db/db_basic_test.cc:2109: Failure
db_->Get(ropts, cfh, "key" + std::to_string(j), &value)
NotFound:
[  FAILED  ] Timestamp/DBBasicTestWithTimestampWithParam.PutAndGet/0, where GetParam() = false (1177 ms)
[ RUN      ] Timestamp/DBBasicTestWithTimestampWithParam.PutAndGet/1
[       OK ] Timestamp/DBBasicTestWithTimestampWithParam.PutAndGet/1 (1056 ms)
[----------] 2 tests from Timestamp/DBBasicTestWithTimestampWithParam (2233 ms total)

[----------] Global test environment tear-down
[==========] 2 tests from 1 test case ran. (2233 ms total)
[  PASSED  ] 1 test.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] Timestamp/DBBasicTestWithTimestampWithParam.PutAndGet/0, where GetParam() = false

 1 FAILED TEST
```

After the ExtractUserKeyAndStripTimstamp fix:

```
$ ./db_basic_test --gtest_filter="*PutAndGet*"
Note: Google Test filter = *PutAndGet*
[==========] Running 2 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 2 tests from Timestamp/DBBasicTestWithTimestampWithParam
[ RUN      ] Timestamp/DBBasicTestWithTimestampWithParam.PutAndGet/0
[       OK ] Timestamp/DBBasicTestWithTimestampWithParam.PutAndGet/0 (1417 ms)
[ RUN      ] Timestamp/DBBasicTestWithTimestampWithParam.PutAndGet/1
[       OK ] Timestamp/DBBasicTestWithTimestampWithParam.PutAndGet/1 (1041 ms)
[----------] 2 tests from Timestamp/DBBasicTestWithTimestampWithParam (2458 ms total)

[----------] Global test environment tear-down
[==========] 2 tests from 1 test case ran. (2458 ms total)
[  PASSED  ] 2 tests.
```
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6100

Differential Revision: D18769654

Pulled By: riversand963

fbshipit-source-id: 76c2cf2c9a5e0d85db95d98e812e6af0c2a15c6b
2019-12-09 10:57:02 -08:00
sdong d1ae2c3faf Fix an asan warning caused by the recent io_uring change (#6135)
Summary:
ASAN reports:

internal_repo_rocksdb/repo:db_test - MultiThreaded/MultiThreadedDBTest.MultiThreaded/43: fatal
==2692739==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6130000500ca at pc 0x0000006be780 bp 0x7efef85ccd20 sp 0x7efef85cc4d0
[CONTEXT] === How to use this, how to get the raw stack trace, and more: fburl.com/ASAN ===
[CONTEXT] READ of size 331 at 0x6130000500ca thread T195
[CONTEXT]      #0 db_test_bin+0x6be77f                     __interceptor_strlen.part.35
[CONTEXT]      https://github.com/facebook/rocksdb/issues/1 internal_repo_rocksdb/repo/include/rocksdb/slice.h:55 rocksdb::Slice::Slice(char const*)
[CONTEXT]      https://github.com/facebook/rocksdb/issues/2 internal_repo_rocksdb/repo/env/io_posix.cc:522 rocksdb::PosixRandomAccessFile::MultiRead(rocksdb::ReadRequest*, unsigned long)

I looked at env/io_posix.cc:522 but don't see a reason why the line needs to be there at all, because it is not used before overwritten. So it must be a line that is put there as a bug. Remove it.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6135

Test Plan: Rerun the same test which passes after the fix. Run all the tests and make sure they all pass.

Differential Revision: D18880251

fbshipit-source-id: 3b84ac6a05b67b529c4202e0ceb4c047460f44f2
2019-12-09 10:25:09 -08:00
Peter Dillinger 3a6d9436e8 Use SpecialSkipListFactory in RecalculateScoreAfterPicking (#6125)
Summary:
Test DBTestUniversalCompaction.RecalculateScoreAfterPicking was
flaky on ARM, so it now uses SpecialSkipListFactory (like other tests)
for predictable memtable flushes.

Fixes https://github.com/facebook/rocksdb/issues/5736
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6125

Test Plan:
while ./db_universal_compaction_test; do :; done # for a
while on ARM and on Intel (both Linux)

Differential Revision: D18864821

Pulled By: pdillinger

fbshipit-source-id: 2f3ca0ea66ce420dcd6d41b0ec12377112a5a79f
2019-12-09 09:23:50 -08:00
sdong 7d79b32618 Break db_stress_tool.cc to a list of source files (#6134)
Summary:
db_stress_tool.cc now is a giant file. In order to main it easier to improve and maintain, break it down to multiple source files.
Most classes are turned into their own files. Separate .h and .cc files are created for gflag definiations. Another .h and .cc files are created for some common functions. Some test execution logic that is only loosely related to class StressTest is moved to db_stress_driver.h and db_stress_driver.cc. All the files are located under db_stress_tool/. The directory name is created as such because if we end it with either stress or test, .gitignore will ignore any file under it and makes it prone to issues in developements.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6134

Test Plan: Build under GCC7 with and without LITE on using GNU Make. Build with GCC 4.8. Build with cmake with -DWITH_TOOL=1

Differential Revision: D18876064

fbshipit-source-id: b25d0a7451840f31ac0f5ebb0068785f783fdf7d
2019-12-08 23:51:01 -08:00
suzanwen bac38c992a Isolate building db_bench from tests with `WITH_BENCHMARK_TOOLS` option. (#6098)
Summary:
Isolate `db_bench` from building tests, out of respect for the related comments.
Let building tests yields to `WITH_TEST=ON` AND `CMAKE_BUILD_TYPE=Debug` both,
and building `db_bench` yields to `WITH_BENCHMARK_TOOLS=ON`.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6098

Test Plan: cmake -DCMAKE_BUILD_TYPE=Debug/Release -DWITH_TESTS=ON/OFF -DWITH_BENCHMARK_TOOLS=ON/OFF -DWITH_TOOLS=ON/OFF && make

Differential Revision: D18856891

Pulled By: riversand963

fbshipit-source-id: addbee8ad6abefb877843a313b4630cfab3ce4f0
2019-12-08 21:34:28 -08:00
sdong e3a82bb934 PosixRandomAccessFile::MultiRead() to use I/O uring if supported (#5881)
Summary:
Right now, PosixRandomAccessFile::MultiRead() executes read requests in parallel. In this PR, it leverages I/O Uring library to run it in parallel, even when page cache is enabled. This function will fall back if the kernel version doesn't support it.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5881

Test Plan: Run the unit test on a kernel version supporting it and make sure all tests pass, and run a unit test on kernel version supporting it and see it pass. Before merging, will also run stress test and see it passes.

Differential Revision: D17742266

fbshipit-source-id: e05699c925ac04fdb42379456a4e23e4ebcb803a
2019-12-07 20:55:52 -08:00
Peter Dillinger 6db57bc37f Disable new Bloom filter assertion (#6128)
Summary:
A longstanding bug in our C interface can trigger this
assertion; see issue https://github.com/facebook/rocksdb/issues/6129. Disabling the assertion for now
(for 6.6.0) and will re-enable on fix of that bug.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6128

Differential Revision: D18854899

Pulled By: pdillinger

fbshipit-source-id: 9eb5294b9f11b208dc1a8cc148aaa31e47ff892b
2019-12-06 10:28:02 -08:00
Peter Dillinger ad528fe5ca Disable folly_synchronization_distributed_mutex_test on ARM for now (#6126)
Summary:
This test is crashing on ARM but is not yet production code.
Let's not let it block ARM CI. See PR https://github.com/facebook/rocksdb/issues/5932
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6126

Test Plan:
./folly_synchronization_distributed_mutex_test, on Linux/ARM,
on Linux/x86_64, and with LITE=1 on Linux/x86_64 (also disabled)

Differential Revision: D18836576

Pulled By: pdillinger

fbshipit-source-id: d8a36eea2f048e8330411d994435d1c58a15d978
2019-12-05 15:48:01 -08:00
Adam Simpkins 100b5e69f3 Fix build failure for db_stress tool when building with CMake (#6117)
Summary:
PR https://github.com/facebook/rocksdb/issues/5937 changed the db_stress tool to also require db_stress_tool.cc,
and updated the Makefile but not the CMakeLists.txt file.  This updates
the CMakeLists.txt file so that the CMake build succeeds again.

PR https://github.com/facebook/rocksdb/issues/5950 updated the Makefile build to package db_stress_tool.cc into
its own librocksdb_stress.a library.  I haven't done that here since
there didn't really seem to be much benefit: the Makefile-based build
does not install this library.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6117

Test Plan: Confirmed the CMake build succeeds on an Ubuntu 18.04 system.

Differential Revision: D18835053

Pulled By: riversand963

fbshipit-source-id: 6e2a66834716e73b1eb736d9b7159870defffec5
2019-12-05 15:34:54 -08:00
Jim Meyering cdc431ec81 build_tools/precommit_checker.py: don't hard-code a platform-afflicted python path (#6124)
Summary:
Use `#!/usr/bin/env python2.7` instead.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6124

Test Plan: `J=8 make commit_prereq`

Differential Revision: D18834668

Pulled By: ltamasi

fbshipit-source-id: cec40266cd5bcae8bf6cbe5a564ae78540deccc4
2019-12-05 11:49:17 -08:00
Yanqin Jin 4edb4284e7 Make folly-related targets comply with verbosity (#6120)
Summary:
Before this fix, `make all` will emit full compilation command when building
object files in the third-party/folly directory even if default verbosity is
0 (AM_DEFAULT_VERBOSITY).

Test Plan (devserver):
```
$make all | tee build.log
$make check
```
Check build.log to verify.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6120

Differential Revision: D18795621

Pulled By: riversand963

fbshipit-source-id: 04641a8359cd4fd55034e6e797ed85de29ee2fe2
2019-12-03 16:04:44 -08:00
Connor f32a311f0d Fix compliation error on GCC4.8.2 (#6106)
Summary:
```
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
                 from ./db/merge_context.h:7,
                 from ./db/dbformat.h:16,
                 from ./tools/block_cache_analyzer/block_cache_trace_analyzer.h:12,
                 from tools/block_cache_analyzer/block_cache_trace_analyzer.cc:8:
/usr/include/c++/4.8.2/bits/stl_algo.h: In instantiation of ‘_RandomAccessIterator std::__unguarded_partition(_RandomAccessIterator, _RandomAccessIterator, const _Tp&, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<std::pair<std::basic_string<char>, long unsigned int>*, std::vector<std::pair<std::basic_string<char>, long unsigned int> > >; _Tp = std::pair<std::basic_string<char>, long unsigned int>; _Compare = rocksdb::BlockCacheTraceAnalyzer::WriteSkewness(const string&, const std::vector<long unsigned int>&, rocksdb::TraceType) const::__lambda1]’:
/usr/include/c++/4.8.2/bits/stl_algo.h:2296:78:   required from ‘_RandomAccessIterator std::__unguarded_partition_pivot(_RandomAccessIterator, _RandomAccessIterator, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<std::pair<std::basic_string<char>, long unsigned int>*, std::vector<std::pair<std::basic_string<char>, long unsigned int> > >; _Compare = rocksdb::BlockCacheTraceAnalyzer::WriteSkewness(const string&, const std::vector<long unsigned int>&, rocksdb::TraceType) const::__lambda1]’
/usr/include/c++/4.8.2/bits/stl_algo.h:2337:62:   required from ‘void std::__introsort_loop(_RandomAccessIterator, _RandomAccessIterator, _Size, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<std::pair<std::basic_string<char>, long unsigned int>*, std::vector<std::pair<std::basic_string<char>, long unsigned int> > >; _Size = long int; _Compare = rocksdb::BlockCacheTraceAnalyzer::WriteSkewness(const string&, const std::vector<long unsigned int>&, rocksdb::TraceType) const::__lambda1]’
/usr/include/c++/4.8.2/bits/stl_algo.h:5499:44:   required from ‘void std::sort(_RAIter, _RAIter, _Compare) [with _RAIter = __gnu_cxx::__normal_iterator<std::pair<std::basic_string<char>, long unsigned int>*, std::vector<std::pair<std::basic_string<char>, long unsigned int> > >; _Compare = rocksdb::BlockCacheTraceAnalyzer::WriteSkewness(const string&, const std::vector<long unsigned int>&, rocksdb::TraceType) const::__lambda1’
tools/block_cache_analyzer/block_cache_trace_analyzer.cc:583:79:   required from here
/usr/include/c++/4.8.2/bits/stl_algo.h:2263:35: error: no match for call to ‘(rocksdb::BlockCacheTraceAnalyzer::WriteSkewness(const string&, const std::vector<long unsigned int>&, rocksdb::TraceType) const::__lambda1) (std::pair<std::basic_string<char>, long unsigned int>&, const std::pair<std::basic_string<char>, long unsigned int>&)’
    while (__comp(*__first, __pivot))
                                   ^
tools/block_cache_analyzer/block_cache_trace_analyzer.cc:582:9: note: candidates are:
       [=](std::pair<std::string, uint64_t>& a,
         ^
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
                 from ./db/merge_context.h:7,
                 from ./db/dbformat.h:16,
                 from ./tools/block_cache_analyzer/block_cache_trace_analyzer.h:12,
                 from tools/block_cache_analyzer/block_cache_trace_analyzer.cc:8:
/usr/include/c++/4.8.2/bits/stl_algo.h:2263:35: note: bool (*)(std::pair<std::basic_string<char>, long unsigned int>&, std::pair<std::basic_string<char>, long unsigned int>&) <conversion>
    while (__comp(*__first, __pivot))
                                   ^
/usr/include/c++/4.8.2/bits/stl_algo.h:2263:35: note:   candidate expects 3 arguments, 3 provided
tools/block_cache_analyzer/block_cache_trace_analyzer.cc:583:46: note: rocksdb::BlockCacheTraceAnalyzer::WriteSkewness(const string&, const std::vector<long unsigned int>&, rocksdb::TraceType) const::__lambda1
           std::pair<std::string, uint64_t>& b) { return b.second < a.second; });
                                              ^
tools/block_cache_analyzer/block_cache_trace_analyzer.cc:583:46: note:   no known conversion for argument 2 from ‘const std::pair<std::basic_string<char>, long unsigned int>’ to ‘std::pair<std::basic_string<char>, long unsigned int>&’
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
                 from ./db/merge_context.h:7,
                 from ./db/dbformat.h:16,
                 from ./tools/block_cache_analyzer/block_cache_trace_analyzer.h:12,
                 from tools/block_cache_analyzer/block_cache_trace_analyzer.cc:8:
/usr/include/c++/4.8.2/bits/stl_algo.h:2266:34: error: no match for call to ‘(rocksdb::BlockCacheTraceAnalyzer::WriteSkewness(const string&, const std::vector<long unsigned int>&, rocksdb::TraceType) const::__lambda1) (const std::pair<std::basic_string<char>, long unsigned int>&, std::pair<std::basic_string<char>, long unsigned int>&)’
    while (__comp(__pivot, *__last))
                                  ^
tools/block_cache_analyzer/block_cache_trace_analyzer.cc:582:9: note: candidates are:
       [=](std::pair<std::string, uint64_t>& a,
         ^
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
                 from ./db/merge_context.h:7,
                 from ./db/dbformat.h:16,
                 from ./tools/block_cache_analyzer/block_cache_trace_analyzer.h:12,
                 from tools/block_cache_analyzer/block_cache_trace_analyzer.cc:8:
/usr/include/c++/4.8.2/bits/stl_algo.h:2266:34: note: bool (*)(std::pair<std::basic_string<char>, long unsigned int>&, std::pair<std::basic_string<char>, long unsigned int>&) <conversion>
    while (__comp(__pivot, *__last))
                                  ^
/usr/include/c++/4.8.2/bits/stl_algo.h:2266:34: note:   candidate expects 3 arguments, 3 provided
tools/block_cache_analyzer/block_cache_trace_analyzer.cc:583:46: note: rocksdb::BlockCacheTraceAnalyzer::WriteSkewness(const string&, const std::vector<long unsigned int>&, rocksdb::TraceType) const::__lambda1
           std::pair<std::string, uint64_t>& b) { return b.second < a.second; });
                                              ^
tools/block_cache_analyzer/block_cache_trace_analyzer.cc:583:46: note:   no known conversion for argument 1 from ‘const std::pair<std::basic_string<char>, long unsigned int>’ to ‘std::pair<std::basic_string<char>, long unsigned int>&’
```
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6106

Differential Revision: D18783943

Pulled By: riversand963

fbshipit-source-id: cc7fc10565f0210b9eebf46b95cb4950ec0b15fa
2019-12-03 11:59:21 -08:00
Yanqin Jin fe1147db1c Let DBSecondary close files after catch up (#6114)
Summary:
After secondary instance replays the logs from primary, certain files become
obsolete. The secondary should find these files, evict their table readers from
table cache and close them. If this is not done, the secondary will hold on to
these files and prevent their space from being freed.

Test plan (devserver):
```
$./db_secondary_test --gtest_filter=DBSecondaryTest.SecondaryCloseFiles
$make check
$./db_stress -ops_per_thread=100000 -enable_secondary=true -threads=32 -secondary_catch_up_one_in=10000 -clear_column_family_one_in=1000 -reopen=100
```
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6114

Differential Revision: D18769998

Pulled By: riversand963

fbshipit-source-id: 5d1f151567247196164e1b79d8402fa2045b9120
2019-12-02 17:45:03 -08:00
anand76 16fa6fd2a6 Remove key length assertion LRUHandle::CalcTotalCharge (#6115)
Summary:
Inserting an entry in the block cache with 0 length key is a valid use case. Remove the assertion in ```LRUHandle::CalcTotalCharge```.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6115

Differential Revision: D18769693

Pulled By: anand1976

fbshipit-source-id: 34cc159650300dda6d7273480640478f28392cda
2019-12-02 15:00:07 -08:00
David Palm 048472f620 Add missing DataBlock-releated functions to the C-API (#6101)
Summary:
Adds two missing functions to the C-API:

- `rocksdb_block_based_options_set_data_block_index_type`
- `rocksdb_block_based_options_set_data_block_hash_ratio`

This enables users in other languages to enjoy the new(-ish) feature.

The changes here are partially overlapping with [another PR](https://github.com/facebook/rocksdb/pull/5630) but are more focused on the DataBlock indexing options.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6101

Differential Revision: D18765639

fbshipit-source-id: 4a8947e71b179f26fa1eb83c267dd47ee64ac3b3
2019-12-02 11:00:09 -08:00