Summary:
It seems the flag `-fno-elide-constructors` is incorrectly overwritten in Makefile by 9c2ebcc2c3/Makefile (L243)
Applying the change in PR https://github.com/facebook/rocksdb/issues/11675 shows a lot of missing status checks. This PR adds the missing status checks.
Most of changes are just adding asserts in unit tests. I'll add pr comment around more interesting changes that need review.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/11686
Test Plan: change Makefile as in https://github.com/facebook/rocksdb/issues/11675, and run `ASSERT_STATUS_CHECKED=1 TEST_UINT128_COMPAT=1 ROCKSDB_MODIFY_NPHASH=1 LIB_MODE=static OPT="-DROCKSDB_NAMESPACE=alternative_rocksdb_ns" make V=1 -j24 J=24 check`
Reviewed By: hx235
Differential Revision: D48176132
Pulled By: cbi42
fbshipit-source-id: 6758946cfb1c6ff84c4c1e0ca540d05e6fc390bd
Summary:
In IDE navigation I find it annoying that there are two statistics.h files (etc.) and often land on the wrong one. Here I migrate several headers to use the blah.h <- blah_impl.h <- blah.cc idiom. Although clang-format wants "blah.h" to be the top include for "blah.cc", I think overall this is an improvement.
No public API changes.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/11408
Test Plan: existing tests
Reviewed By: ltamasi
Differential Revision: D45456696
Pulled By: pdillinger
fbshipit-source-id: 809d931253f3272c908cf5facf7e1d32fc507373
Summary:
We haven't been actively mantaining RocksDB LITE recently and the size must have been gone up significantly. We are removing the support.
Most of changes were done through following comments:
unifdef -m -UROCKSDB_LITE `git grep -l ROCKSDB_LITE | egrep '[.](cc|h)'`
by Peter Dillinger. Others changes were manually applied to build scripts, CircleCI manifests, ROCKSDB_LITE is used in an expression and file db_stress_test_base.cc.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/11147
Test Plan: See CI
Reviewed By: pdillinger
Differential Revision: D42796341
fbshipit-source-id: 4920e15fc2060c2cd2221330a6d0e5e65d4b7fe2
Summary:
Valgrind was complaining about the test BackupEngineTest.ExcludeFiles. The cause is backup_engine not being freed similar to https://github.com/facebook/rocksdb/issues/9610.
```
==18228== Command: ./backup_engine_test --gtest_filter=BackupEngineTest.ExcludeFiles
==18228==
Note: Google Test filter = BackupEngineTest.ExcludeFiles
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from BackupEngineTest
[ RUN ] BackupEngineTest.ExcludeFiles
[ OK ] BackupEngineTest.ExcludeFiles (16264 ms)
[----------] 1 test from BackupEngineTest (16273 ms total)
[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (16306 ms total)
[ PASSED ] 1 test.
==18228==
==18228== HEAP SUMMARY:
==18228== in use at exit: 14,099 bytes in 159 blocks
==18228== total heap usage: 255,328 allocs, 255,169 frees, 497,538,546 bytes allocated
==18228==
==18228== 19 bytes in 1 blocks are possibly lost in loss record 4 of 67
==18228== at 0x483BE63: operator new(unsigned long) (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==18228== by 0x1E752D: void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>(char*, char*, std::forward_iterator_tag) [clone .constprop.0] (basic_string.tcc:219)
==18228== by 0x1F1898: _M_construct_aux<char*> (basic_string.h:251)
==18228== by 0x1F1898: _M_construct<char*> (basic_string.h:270)
==18228== by 0x1F1898: basic_string (basic_string.h:455)
==18228== by 0x1F1898: construct<std::__cxx11::basic_string<char>, const std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&> (new_allocator.h:146)
==18228== by 0x1F1898: construct<std::__cxx11::basic_string<char>, const std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&> (alloc_traits.h:483)
==18228== by 0x1F1898: push_back (stl_vector.h:1189)
==18228== by 0x1F1898: rocksdb::(anonymous namespace)::TestFs::NewWritableFile(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rocksdb::FileOptions const&, std::unique_ptr<rocksdb::FSWritableFile, std::default_delete<rocksdb::FSWritableFile> >*, rocksdb::IODebugContext*) (backup_engine_test.cc:208)
==18228== by 0x4B3583: rocksdb::NewWritableFile(rocksdb::FileSystem*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unique_ptr<rocksdb::FSWritableFile, std::default_delete<rocksdb::FSWritableFile> >*, rocksdb::FileOptions const&) (read_write_util.cc:23)
==18228== by 0x31C3A8: rocksdb::DBImpl::CreateWAL(unsigned long, unsigned long, unsigned long, rocksdb::log::Writer**) (db_impl_open.cc:1752)
==18228== by 0x321A8C: rocksdb::DBImpl::Open(rocksdb::DBOptions const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, std::vector<rocksdb::ColumnFamilyHandle*, std::allocator<rocksdb::ColumnFamilyHandle*> >*, rocksdb::DB**, bool, bool) (db_impl_open.cc:1852)
==18228== by 0x322E7F: Open (db_impl_open.cc:1660)
==18228== by 0x322E7F: rocksdb::DB::Open(rocksdb::Options const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rocksdb::DB**) (db_impl_open.cc:1637)
==18228== by 0x1EE1CD: InitializeDBAndBackupEngine (backup_engine_test.cc:724)
==18228== by 0x1EE1CD: rocksdb::(anonymous namespace)::BackupEngineTest::OpenDBAndBackupEngine(bool, bool, rocksdb::(anonymous namespace)::BackupEngineTest::ShareOption) (backup_engine_test.cc:732)
==18228== by 0x217585: rocksdb::(anonymous namespace)::BackupEngineTest_ExcludeFiles_Test::TestBody() (backup_engine_test.cc:4232)
==18228== by 0x296143: HandleSehExceptionsInMethodIfSupported<testing::Test, void> (gtest-all.cc:3899)
==18228== by 0x296143: void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (gtest-all.cc:3935)
==18228== by 0x28A0A5: testing::Test::Run() [clone .part.0] (gtest-all.cc:3973)
==18228== by 0x28A364: Run (gtest-all.cc:3965)
==18228== by 0x28A364: testing::TestInfo::Run() [clone .part.0] (gtest-all.cc:4149)
...
```
Pull Request resolved: https://github.com/facebook/rocksdb/pull/11066
Test Plan: make -j24 J=24 ROCKSDBTESTS_SUBSET=backup_engine_test valgrind_check_some
Reviewed By: ajkr
Differential Revision: D42297791
Pulled By: cbi42
fbshipit-source-id: db67982b27b91cc78e1a9f4a96da0cba7c9785b7
Summary:
We have a request for RocksDB to essentially support
disconnected incremental backup. In other words, if there is limited
or no connectivity to the primary backup dir, we should still be able to
take an incremental backup relative to that primary backup dir,
assuming some metadata about that primary dir is available (and
obviously anticipating primary backup dir will be fully available if
restore is needed).
To support that, this feature allows the API user to "exclude" DB
files from backup. This only applies to files that can be shared
between backups (sst and blob files), and excluded files are
tracked in the backup metadata sufficiently to ensure they are
restored at restore time. At restore time, the user provides
a set of alternate backup directories (as open BackupEngines, which
can be read-only), and excluded files must be found in one of the
backup directories ("included" in some backup).
This feature depends on backup schema version 2 features, though
schema version 2.0 support is not sufficient to read / restore a
backup with exclusions. This change updates the schema version to
2.1 because of this feature, so that it's easy to recognize whether
a RocksDB release supports this feature, while backups not using the
feature are fully compatible with 2.0.
Also in this PR:
* Stacked on https://github.com/facebook/rocksdb/pull/11029
* Allow progress_callback to be empty, not just no-op function, and
recover from exceptions thrown by BackupEngine callbacks.
* The internal-only `AsBackupEngine()` function is working around the
diamond hierarchy of `BackupEngineImplThreadSafe` to get to the
internals, without using confusing features like virtual inheritance.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/11030
Test Plan: unit tests added / updated
Reviewed By: ajkr
Differential Revision: D42004388
Pulled By: pdillinger
fbshipit-source-id: 31b6e533d308a5462e528d9012d650482d974077
Summary:
Previously, the "latest" valid backup would not be updated on delete.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/11029
Test Plan: unit test included (added to an existing test for efficiency)
Reviewed By: hx235
Differential Revision: D41967925
Pulled By: pdillinger
fbshipit-source-id: ca143354d281eb979557ea421902cd26803a1137
Summary:
This change is motivated by ensuring that `ldb update_manifest` or `UpdateManifestForFilesState` can run without expecting files to open when the old temperature is provided (in case the FileSystem strictly interprets non-kUnknown), but ended up fixing a problem in `OfflineManifestWriter` (used by `ldb unsafe_remove_sst_file`) where it would open some SST files during recovery and expect them to match the prior manifest state, even if not required by the intended new state.
Also update BackupEngine to retry with Temperature kUnknown when reading file with potentially "wrong" temperature.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/10796
Test Plan: tests added/updated, that fail before the change(s) and now pass
Reviewed By: jay-zhuang
Differential Revision: D40232645
Pulled By: jay-zhuang
fbshipit-source-id: b5aa2688aecfe0c320b80a7da689b315414c20be
Summary:
Although we've been tracking SST unique IDs in the DB manifest
unconditionally, checking has been opt-in and with an extra pass at DB::Open
time. This changes the behavior of `verify_sst_unique_id_in_manifest` to
check unique ID against manifest every time an SST file is opened through
table cache (normal DB operations), replacing the explicit pass over files
at DB::Open time. This change also enables the option by default and
removes the "EXPERIMENTAL" designation.
One possible criticism is that the option no longer ensures the integrity
of a DB at Open time. This is far from an all-or-nothing issue. Verifying
the IDs of all SST files hardly ensures all the data in the DB is readable.
(VerifyChecksum is supposed to do that.) Also, with
max_open_files=-1 (default, extremely common), all SST files are
opened at DB::Open time anyway.
Implementation details:
* `VerifySstUniqueIdInManifest()` functions are the extra/explicit pass
that is now removed.
* Unit tests that manipulate/corrupt table properties have to opt out of
this check, because that corrupts the "actual" unique id. (And even for
testing we don't currently have a mechanism to set "no unique id"
in the in-memory file metadata for new files.)
* A lot of other unit test churn relates to (a) default checking on, and
(b) checking on SST open even without DB::Open (e.g. on flush)
* Use `FileMetaData` for more `TableCache` operations (in place of
`FileDescriptor`) so that we have access to the unique_id whenever
we might need to open an SST file. **There is the possibility of
performance impact because we can no longer use the more
localized `fd` part of an `FdWithKeyRange` but instead follow the
`file_metadata` pointer. However, this change (possible regression)
is only done for `GetMemoryUsageByTableReaders`.**
* Removed a completely unnecessary constructor overload of
`TableReaderOptions`
Possible follow-up:
* Verification only happens when opening through table cache. Are there
more places where this should happen?
* Improve error message when there is a file size mismatch vs. manifest
(FIXME added in the appropriate place).
* I'm not sure there's a justification for `FileDescriptor` to be distinct from
`FileMetaData`.
* I'm skeptical that `FdWithKeyRange` really still makes sense for
optimizing some data locality by duplicating some data in memory, but I
could be wrong.
* An unnecessary overload of NewTableReader was recently added, in
the public API nonetheless (though unusable there). It should be cleaned
up to put most things under `TableReaderOptions`.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/10532
Test Plan:
updated unit tests
Performance test showing no significant difference (just noise I think):
`./db_bench -benchmarks=readwhilewriting[-X10] -num=3000000 -disable_wal=1 -bloom_bits=8 -write_buffer_size=1000000 -target_file_size_base=1000000`
Before: readwhilewriting [AVG 10 runs] : 68702 (± 6932) ops/sec
After: readwhilewriting [AVG 10 runs] : 68239 (± 7198) ops/sec
Reviewed By: jay-zhuang
Differential Revision: D38765551
Pulled By: pdillinger
fbshipit-source-id: a827a708155f12344ab2a5c16e7701c7636da4c2
Summary:
Some APIs for getting live files, which are used by Checkpoint
and BackupEngine, can optionally trigger and wait for a flush. These
would deadlock when used on a read-only DB. Here we fix that by assuming
the user wants the overall operation to succeed and is OK without
flushing (because the DB is read-only).
Follow-up work: the same or other issues can be hit by directly invoking
some DB functions that are clearly not appropriate for read-only
instance, but are not covered by overrides in DBImplReadOnly and
CompactedDBImpl. These should be fixed to avoid similar problems on
accidental misuse. (Long term, it would be nice to have a DBReadOnly
class without those members, like BackupEngineReadOnly.)
Pull Request resolved: https://github.com/facebook/rocksdb/pull/10569
Test Plan: tests updated to catch regression (hang before the fix)
Reviewed By: riversand963
Differential Revision: D38995759
Pulled By: pdillinger
fbshipit-source-id: f5f8bc7123e13cb45bd393dd974d7d6eda20bc68
Summary:
Moved linux builds to using docker to avoid CI instability caused by dependency installation site down.
Added the `Dockerfile` which is used to build the image.
The build time is also significantly reduced, because no dependencies installation and with using 2xlarge+ instance for slow build (like tsan test).
Also fixed a few issues detected while building this:
* `DestoryDB()` Status not checked for a few tests
* nullptr might be used in `inlineskiplist.cc`
Pull Request resolved: https://github.com/facebook/rocksdb/pull/10496
Test Plan: CI
Reviewed By: ajkr
Differential Revision: D38554200
Pulled By: jay-zhuang
fbshipit-source-id: 16e8fb2bf07b9c84bb27fb18421c4d54f2f248fd
Summary:
Change tiered compaction feature from `bottommost_temperture` to
`last_level_temperture`. The old option is kept for migration purpose only,
which is behaving the same as `last_level_temperture` and it will be removed in
the next release.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/10471
Test Plan: CI
Reviewed By: siying
Differential Revision: D38450621
Pulled By: jay-zhuang
fbshipit-source-id: cc1cdf8bad409376fec0152abc0a64fb72a91527
Summary:
## Problem Summary
RocksDB will acquire the global mutex of db instance for every time when user calls `Write`. When RocksDB schedules a lot of compaction jobs, it will compete the mutex with write thread and it will hurt the write performance.
## Problem Solution:
I want to use log_write_mutex to replace the global mutex in most case so that we do not acquire it in write-thread unless there is a write-stall event or a write-buffer-full event occur.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7516
Test Plan:
1. make check
2. CI
3. COMPILE_WITH_TSAN=1 make db_stress
make crash_test
make crash_test_with_multiops_wp_txn
make crash_test_with_multiops_wc_txn
make crash_test_with_atomic_flush
Reviewed By: siying
Differential Revision: D36908702
Pulled By: riversand963
fbshipit-source-id: 59b13881f4f5c0a58fd3ca79128a396d9cd98efe
Summary:
(PR created for informational/testing purposes only.)
- Fixes lost dynamic updates to GenericRateLimiter bandwidth using `SetBytesPerSecond()`
- Benefit over #10374 is eliminating race conditions with Configurable framework.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/10378
Reviewed By: pdillinger
Differential Revision: D37914865
fbshipit-source-id: d4f566d60ec9726d26932388c61671adf0ee0f30
Summary:
In some case, GetFileSize would be failure in copy_file_cb.
If failure, we can return immediately, the subsequent code
is meaningless, and add a log info let user know that problem
happen here.
Singed-off-by: Yite Gu <ess_gyt@qq.com>
Pull Request resolved: https://github.com/facebook/rocksdb/pull/10176
Reviewed By: cbi42
Differential Revision: D37510888
Pulled By: ajkr
fbshipit-source-id: 044ad8c45852fd19b8cd564b11f65d40c39e296f
Summary:
We saw flakes with the following failure:
```
[ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/1
utilities/backup/backup_engine_test.cc:2667: Failure
Expected: (restore_time) > (0.8 * rate_limited_restore_time), actual: 48269 vs 60470.4
terminate called after throwing an instance of 'testing::internal::GoogleTestFailureException'
what(): utilities/backup/backup_engine_test.cc:2667: Failure
Expected: (restore_time) > (0.8 * rate_limited_restore_time), actual: 48269 vs 60470.4
Received signal 6 (Aborted)
t/run-backup_engine_test-RateLimiting-BackupEngineRateLimitingTestWithParam.RateLimiting-1: line 4: 1032887 Aborted (core dumped) TEST_TMPDIR=$d ./backup_engine_test --gtest_filter=RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/1
```
Investigation revealed we forgot to use the mock time `SystemClock` for
restore rate limiting. Then the test used wall clock time, which made
the execution of "GenericRateLimiter::Request:PostTimedWait"
non-deterministic as wall clock time might have advanced enough that
waiting was not needed.
This PR changes restore rate limiting to use
mock time, which guarantees we always execute
"GenericRateLimiter::Request:PostTimedWait". Then the assertions that
rely on times recorded inside that callback should be robust.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/10271
Test Plan:
Applied the following patch which guaranteed repro before the fix.
Verified the test passes after this PR even with that patch applied.
```
diff --git a/util/rate_limiter.cc b/util/rate_limiter.cc
index f369e3220..6b3ed82fa 100644
--- a/util/rate_limiter.cc
+++ b/util/rate_limiter.cc
@@ -158,6 +158,7 @@ void GenericRateLimiter::SetBytesPerSecond(int64_t bytes_per_second) {
void GenericRateLimiter::Request(int64_t bytes, const Env::IOPriority pri,
Statistics* stats) {
+ usleep(100000);
assert(bytes <= refill_bytes_per_period_.load(std::memory_order_relaxed));
bytes = std::max(static_cast<int64_t>(0), bytes);
TEST_SYNC_POINT("GenericRateLimiter::Request");
```
Reviewed By: hx235
Differential Revision: D37499848
Pulled By: ajkr
fbshipit-source-id: fd790d5a192996be8ba13b656751ccc7d8cb8f6e
Summary:
https://github.com/facebook/rocksdb/issues/9984 changes the behavior of RocksDB: if logger creation failed during `SanitizeOptions()`,
`DB::Open()` will fail. However, since `SanitizeOptions()` is called in `DBImpl::DBImpl()`, we cannot
directly expose the error to caller without some additional work.
This is a first version proposal which:
- Adds a new member `init_logger_creation_s` to `DBImpl` to store the result of init logger creation
- Checks the error during `DB::Open()` and return it to caller if non-ok
This is not very ideal. We can alternatively move the logger creation logic out of the `SanitizeOptions()`.
Since `SanitizeOptions()` is used in other places, we need to check whether this change breaks anything
in case other callers of `SanitizeOptions()` assumes that a logger should be created.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/10223
Test Plan: make check
Reviewed By: pdillinger
Differential Revision: D37321717
Pulled By: riversand963
fbshipit-source-id: 58042358a86369d606549dd9938933dd47591c4b
Summary:
**Context/Summary:**
https://github.com/facebook/rocksdb/pull/9424 added rate-limiting support for user reads, which does not include batched `MultiGet()`s that call `RandomAccessFileReader::MultiRead()`. The reason is that it's harder (compared with RandomAccessFileReader::Read()) to implement the ideal rate-limiting where we first call `RateLimiter::RequestToken()` for allowed bytes to multi-read and then consume those bytes by satisfying as many requests in `MultiRead()` as possible. For example, it can be tricky to decide whether we want partially fulfilled requests within one `MultiRead()` or not.
However, due to a recent urgent user request, we decide to pursue an elementary (but a conditionally ineffective) solution where we accumulate enough rate limiter requests toward the total bytes needed by one `MultiRead()` before doing that `MultiRead()`. This is not ideal when the total bytes are huge as we will actually consume a huge bandwidth from rate-limiter causing a burst on disk. This is not what we ultimately want with rate limiter. Therefore a follow-up work is noted through TODO comments.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/10159
Test Plan:
- Modified existing unit test `DBRateLimiterOnReadTest/DBRateLimiterOnReadTest.NewMultiGet`
- Traced the underlying system calls `io_uring_enter` and verified they are 10 seconds apart from each other correctly under the setting of `strace -ftt -e trace=io_uring_enter ./db_bench -benchmarks=multireadrandom -db=/dev/shm/testdb2 -readonly -num=50 -threads=1 -multiread_batched=1 -batch_size=100 -duration=10 -rate_limiter_bytes_per_sec=200 -rate_limiter_refill_period_us=1000000 -rate_limit_bg_reads=1 -disable_auto_compactions=1 -rate_limit_user_ops=1` where each `MultiRead()` read about 2000 bytes (inspected by debugger) and the rate limiter grants 200 bytes per seconds.
- Stress test:
- Verified `./db_stress (-test_cf_consistency=1/test_batches_snapshots=1) -use_multiget=1 -cache_size=1048576 -rate_limiter_bytes_per_sec=10241024 -rate_limit_bg_reads=1 -rate_limit_user_ops=1` work
Reviewed By: ajkr, anand1976
Differential Revision: D37135172
Pulled By: hx235
fbshipit-source-id: 73b8e8f14761e5d4b77235dfe5d41f4eea968bcd
Summary:
For regular db instance and secondary instance, we return error and refuse to open DB if Logger creation fails.
Our current code allows it, but it is really difficult to debug because
there will be no LOG files. The same for OPTIONS file, which will be explored in another PR.
Furthermore, Arena::AllocateAligned(size_t bytes, size_t huge_page_size, Logger* logger) has an
assertion as the following:
```cpp
#ifdef MAP_HUGETLB
if (huge_page_size > 0 && bytes > 0) {
assert(logger != nullptr);
}
#endif
```
It can be removed.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/9984
Test Plan: make check
Reviewed By: jay-zhuang
Differential Revision: D36347754
Pulled By: riversand963
fbshipit-source-id: 529798c0511d2eaa2f0fd40cf7e61c4cbc6bc57e
Summary:
There are currently some preprocessor checks that assume support for Visual Studio versions older than 2015 (i.e., 0 < _MSC_VER < 1900), although we don't support them any more.
We removed all code that only compiles on those older versions, except third-party/ files.
The ROCKSDB_NOEXCEPT symbol is now obsolete, since it now always gets replaced by noexcept. We removed it.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/10065
Reviewed By: pdillinger
Differential Revision: D36721901
Pulled By: guidotag
fbshipit-source-id: a2892d365ef53cce44a0a7d90dd6b72ee9b5e5f2
Summary:
Added rate limiter and read rate-limiting support to SequentialFileReader. I've updated call sites to SequentialFileReader::Read with appropriate IO priority (or left a TODO and specified IO_TOTAL for now).
The PR is separated into four commits: the first one added the rate-limiting support, but with some fixes in the unit test since the number of request bytes from rate limiter in SequentialFileReader are not accurate (there is overcharge at EOF). The second commit fixed this by allowing SequentialFileReader to check file size and determine how many bytes are left in the file to read. The third commit added benchmark related code. The fourth commit moved the logic of using file size to avoid overcharging the rate limiter into backup engine (the main user of SequentialFileReader).
Pull Request resolved: https://github.com/facebook/rocksdb/pull/9973
Test Plan:
- `make check`, backup_engine_test covers usage of SequentialFileReader with rate limiter.
- Run db_bench to check if rate limiting is throttling as expected: Verified that reads and writes are together throttled at 2MB/s, and at 0.2MB chunks that are 100ms apart.
- Set up: `./db_bench --benchmarks=fillrandom -db=/dev/shm/test_rocksdb`
- Benchmark:
```
strace -ttfe read,write ./db_bench --benchmarks=backup -db=/dev/shm/test_rocksdb --backup_rate_limit=2097152 --use_existing_db
strace -ttfe read,write ./db_bench --benchmarks=restore -db=/dev/shm/test_rocksdb --restore_rate_limit=2097152 --use_existing_db
```
- db bench on backup and restore to ensure no performance regression.
- backup (avg over 50 runs): pre-change: 1.90443e+06 micros/op; post-change: 1.8993e+06 micros/op (improve by 0.2%)
- restore (avg over 50 runs): pre-change: 1.79105e+06 micros/op; post-change: 1.78192e+06 micros/op (improve by 0.5%)
```
# Set up
./db_bench --benchmarks=fillrandom -db=/tmp/test_rocksdb -num=10000000
# benchmark
TEST_TMPDIR=/tmp/test_rocksdb
NUM_RUN=50
for ((j=0;j<$NUM_RUN;j++))
do
./db_bench -db=$TEST_TMPDIR -num=10000000 -benchmarks=backup -use_existing_db | egrep 'backup'
# Restore
#./db_bench -db=$TEST_TMPDIR -num=10000000 -benchmarks=restore -use_existing_db
done > rate_limit.txt && awk -v NUM_RUN=$NUM_RUN '{sum+=$3;sum_sqrt+=$3^2}END{print sum/NUM_RUN, sqrt(sum_sqrt/NUM_RUN-(sum/NUM_RUN)^2)}' rate_limit.txt >> rate_limit_2.txt
```
Reviewed By: hx235
Differential Revision: D36327418
Pulled By: cbi42
fbshipit-source-id: e75d4307cff815945482df5ba630c1e88d064691
Summary:
**Context:**
`BackupEngineRateLimitingTestWithParam.RateLimiting` and `BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup` involve creating backup and restoring of a big database with rate-limiting. Using the normal env with a normal clock requires real elapse of time (13702 - 19848 ms/per test). As suggested in https://github.com/facebook/rocksdb/pull/8722#discussion_r703698603, this PR is to speed it up with SpecialEnv (`time_elapse_only_sleep=true`) where its clock accepts fake elapse of time during rate-limiting (100 - 600 ms/per test)
**Summary:**
- Added TEST_ function to set clock of the default rate limiters in backup engine
- Shrunk testdb by 10 times while keeping it big enough for testing
- Renamed some test variables and reorganized some if-else branch for clarity without changing the test
Pull Request resolved: https://github.com/facebook/rocksdb/pull/9974
Test Plan:
- Run tests pre/post PR the same time to verify the tests are sped up by 90 - 95%
`BackupEngineRateLimitingTestWithParam.RateLimiting`
Pre:
```
[ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/0
[ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/0 (11123 ms)
[ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/1
[ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/1 (9441 ms)
[ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/2
[ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/2 (11096 ms)
[ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/3
[ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/3 (9339 ms)
[ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/4
[ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/4 (11121 ms)
[ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/5
[ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/5 (9413 ms)
[ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/6
[ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/6 (11185 ms)
[ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/7
[ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/7 (9511 ms)
[----------] 8 tests from RateLimiting/BackupEngineRateLimitingTestWithParam (82230 ms total)
```
Post:
```
[ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/0
[ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/0 (395 ms)
[ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/1
[ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/1 (564 ms)
[ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/2
[ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/2 (358 ms)
[ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/3
[ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/3 (567 ms)
[ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/4
[ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/4 (173 ms)
[ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/5
[ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/5 (176 ms)
[ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/6
[ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/6 (191 ms)
[ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/7
[ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/7 (177 ms)
[----------] 8 tests from RateLimiting/BackupEngineRateLimitingTestWithParam (2601 ms total)
```
`BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup`
Pre:
```
[ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/0
[ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/0 (7275 ms)
[ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/1
[ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/1 (3961 ms)
[ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/2
[ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/2 (7117 ms)
[ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/3
[ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/3 (3921 ms)
[ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/4
[ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/4 (19862 ms)
[ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/5
[ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/5 (10231 ms)
[ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/6
[ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/6 (19848 ms)
[ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/7
[ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/7 (10372 ms)
[----------] 8 tests from RateLimiting/BackupEngineRateLimitingTestWithParam (82587 ms total)
```
Post:
```
[ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/0
[ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/0 (157 ms)
[ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/1
[ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/1 (152 ms)
[ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/2
[ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/2 (160 ms)
[ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/3
[ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/3 (158 ms)
[ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/4
[ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/4 (155 ms)
[ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/5
[ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/5 (151 ms)
[ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/6
[ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/6 (146 ms)
[ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/7
[ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/7 (153 ms)
[----------] 8 tests from RateLimiting/BackupEngineRateLimitingTestWithParam (1232 ms total)
```
Reviewed By: pdillinger
Differential Revision: D36336345
Pulled By: hx235
fbshipit-source-id: 724c6ba745f95f56d4440a6d2f1e4512a2987589
Summary:
ToString() is created as some platform doesn't support std::to_string(). However, we've already used std::to_string() by mistake for 16 months (in db/db_info_dumper.cc). This commit just remove ToString().
Pull Request resolved: https://github.com/facebook/rocksdb/pull/9955
Test Plan: Watch CI tests
Reviewed By: riversand963
Differential Revision: D36176799
fbshipit-source-id: bdb6dcd0e3a3ab96a1ac810f5d0188f684064471
Summary:
Right now we still don't fully use std::numeric_limits but use a macro, mainly for supporting VS 2013. Right now we only support VS 2017 and up so it is not a problem. The code comment claims that MinGW still needs it. We don't have a CI running MinGW so it's hard to validate. since we now require C++17, it's hard to imagine MinGW would still build RocksDB but doesn't support std::numeric_limits<>.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/9954
Test Plan: See CI Runs.
Reviewed By: riversand963
Differential Revision: D36173954
fbshipit-source-id: a35a73af17cdcae20e258cdef57fcf29a50b49e0
Summary:
Various renaming and fixes to get rid of remaining uses of
"backupable" which is terminology leftover from the original, flawed
design of BackupableDB. Now any DB can be backed up, using BackupEngine.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/9792
Test Plan: CI
Reviewed By: ajkr
Differential Revision: D35334386
Pulled By: pdillinger
fbshipit-source-id: 2108a42b4575c8cccdfd791c549aae93ec2f3329