Commit graph

329 commits

Author SHA1 Message Date
Peter Dillinger 4750421ece Replace most typedef with using= (#8751)
Summary:
Old typedef syntax is confusing

Most but not all changes with

    perl -pi -e 's/typedef (.*) ([a-zA-Z0-9_]+);/using $2 = $1;/g' list_of_files
    make format

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8751

Test Plan: existing

Reviewed By: zhichao-cao

Differential Revision: D30745277

Pulled By: pdillinger

fbshipit-source-id: 6f65f0631c3563382d43347896020413cc2366d9
2021-09-07 11:31:59 -07:00
Peter Dillinger 13ded69484 Built-in support for generating unique IDs, bug fix (#8708)
Summary:
Env::GenerateUniqueId() works fine on Windows and on POSIX
where /proc/sys/kernel/random/uuid exists. Our other implementation is
flawed and easily produces collision in a new multi-threaded test.
As we rely more heavily on DB session ID uniqueness, this becomes a
serious issue.

This change combines several individually suitable entropy sources
for reliable generation of random unique IDs, with goal of uniqueness
and portability, not cryptographic strength nor maximum speed.

Specifically:
* Moves code for getting UUIDs from the OS to port::GenerateRfcUuid
rather than in Env implementation details. Callers are now told whether
the operation fails or succeeds.
* Adds an internal API GenerateRawUniqueId for generating high-quality
128-bit unique identifiers, by combining entropy from three "tracks":
  * Lots of info from default Env like time, process id, and hostname.
  * std::random_device
  * port::GenerateRfcUuid (when working)
* Built-in implementations of Env::GenerateUniqueId() will now always
produce an RFC 4122 UUID string, either from platform-specific API or
by converting the output of GenerateRawUniqueId.

DB session IDs now use GenerateRawUniqueId while DB IDs (not as
critical) try to use port::GenerateRfcUuid but fall back on
GenerateRawUniqueId with conversion to an RFC 4122 UUID.

GenerateRawUniqueId is declared and defined under env/ rather than util/
or even port/ because of the Env dependency.

Likely follow-up: enhance GenerateRawUniqueId to be faster after the
first call and to guarantee uniqueness within the lifetime of a single
process (imparting the same property onto DB session IDs).

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8708

Test Plan:
A new mini-stress test in env_test checks the various public
and internal APIs for uniqueness, including each track of
GenerateRawUniqueId individually. We can't hope to verify anywhere close
to 128 bits of entropy, but it can at least detect flaws as bad as the
old code. Serial execution of the new tests takes about 350 ms on
my machine.

Reviewed By: zhichao-cao, mrambacher

Differential Revision: D30563780

Pulled By: pdillinger

fbshipit-source-id: de4c9ff4b2f581cf784fcedb5f39f16e5185c364
2021-08-30 15:20:41 -07:00
Peter Dillinger 318fe6941a Add port::GetProcessID() (#8693)
Summary:
Useful in some places for object uniqueness across processes.
Currently used for generating a host-wide identifier of Cache objects
but expected to be used soon in some unique id generation code.

`int64_t` is chosen for return type because POSIX uses signed integer type,
usually `int`, for `pid_t` and Windows uses `DWORD`, which is `uint32_t`.

Future work: avoid copy-pasted declarations in port_*.h, perhaps with
port_common.h always included from port.h

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8693

Test Plan: manual for now

Reviewed By: ajkr, anand1976

Differential Revision: D30492876

Pulled By: pdillinger

fbshipit-source-id: 39fc2788623cc9f4787866bdb67a4d183dde7eef
2021-08-24 17:46:14 -07:00
Burton Li 9b0a32f802 Support dynamic sector size in alignment validation for Windows. (#8613)
Summary:
- Use dynamic section size when calling IsSectorAligned()
- Support relative path for GetSectorSize().
- Move buffer and sector alignment check to assert for better retail performance.
- Typo fixes.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8613

Reviewed By: ajkr

Differential Revision: D30136082

Pulled By: mrambacher

fbshipit-source-id: e8cb849befdcae4fea99de5ed5dd6565e612425f
2021-08-16 07:31:57 -07:00
Peter Dillinger aeb913dd01 Standardize on GCC for TSAN conditional compilation (#8543)
Summary:
In https://github.com/facebook/rocksdb/issues/8539 I accidentally only checked for GCC TSAN, which is
what I tested locally, while CircleCI and FB CI use clang TSAN. Related:
other existing code like in stack_trace.cc only check for clang TSAN.

I've now standardized these to the GCC convention in port/lang.h, so now

    #ifdef __SANITIZE_THREAD__

can check for any TSAN (assuming lang.h include)

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8543

Test Plan:
Put an assert(false) in slice_test and look for the NOTE
about "signal-unsafe call", both GCC and clang. Eventually, CircleCI
TSAN in https://github.com/facebook/rocksdb/issues/8538

Reviewed By: zhichao-cao

Differential Revision: D29728483

Pulled By: pdillinger

fbshipit-source-id: 8a3b8015c2ed48078214c3ee17146a2c3f11c9f7
2021-07-15 23:50:00 -07:00
Peter Dillinger a53d6d25e0 Improve support for valgrind error on reachable (#8503)
Summary:
MyRocks apparently uses valgrind to check for unreachable
unfreed data, which is stricter than our valgrind checks. Internal ref:
D29257815

This patch adds valgrind support to STATIC_AVOID_DESTRUCTION so that it's
not reported with those stricter checks.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8503

Test Plan:
make valgrind_test
Also, with modified VALGRIND_OPTS (see Makefile), more kinds of
failures seen before than after this commit.

Reviewed By: ajkr, yizhang82

Differential Revision: D29597784

Pulled By: pdillinger

fbshipit-source-id: 360de157a176aec4d1be99ca20d160ecd47c0873
2021-07-12 17:00:27 -07:00
Andrew Kryczka bac399449d jemalloc_helper: Limit the mm_malloc.h hack to glibc on linux (#8425)
Summary:
Original author: kraj (https://github.com/facebook/rocksdb/issues/8413)

We have a hack to ensure clang's `posix_memalign()` hack works to be
compatible with glibc's `posix_memalign()` declaration. Our side of the
hack is irrelevant and should be omitted when not using glibc.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8425

Reviewed By: mrambacher

Differential Revision: D29239029

Pulled By: ajkr

fbshipit-source-id: 12b900f50a4823b880a6558f25d8590dbfc0aa26
2021-06-29 08:40:02 -07:00
Lucian Petrut 390c5246d2 Allow using WindowsThread with Mingw (#8108)
Summary:
Allow using WindowsThread with Mingw

Most Mingw builds require Posix threads in order to use std::thread.
As per https://github.com/facebook/rocksdb/issues/7764, this is not always the case.

That being considered, we're going to improve the Mingw thread model
checks.

Closes: https://github.com/facebook/rocksdb/issues/7764
Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8108

Reviewed By: jay-zhuang

Differential Revision: D27365778

Pulled By: mrambacher

fbshipit-source-id: 2c15b1f04ae90e1e3a25a33e86ceb779224a9529
2021-06-29 06:52:08 -07:00
Peter Dillinger 311a544c2a Use deleters to label cache entries and collect stats (#8297)
Summary:
This change gathers and publishes statistics about the
kinds of items in block cache. This is especially important for
profiling relative usage of cache by index vs. filter vs. data blocks.
It works by iterating over the cache during periodic stats dump
(InternalStats, stats_dump_period_sec) or on demand when
DB::Get(Map)Property(kBlockCacheEntryStats), except that for
efficiency and sharing among column families, saved data from
the last scan is used when the data is not considered too old.

The new information can be seen in info LOG, for example:

    Block cache LRUCache@0x7fca62229330 capacity: 95.37 MB collections: 8 last_copies: 0 last_secs: 0.00178 secs_since: 0
    Block cache entry stats(count,size,portion): DataBlock(7092,28.24 MB,29.6136%) FilterBlock(215,867.90 KB,0.888728%) FilterMetaBlock(2,5.31 KB,0.00544%) IndexBlock(217,180.11 KB,0.184432%) WriteBuffer(1,256.00 KB,0.262144%) Misc(1,0.00 KB,0%)

And also through DB::GetProperty and GetMapProperty (here using
ldb just for demonstration):

    $ ./ldb --db=/dev/shm/dbbench/ get_property rocksdb.block-cache-entry-stats
    rocksdb.block-cache-entry-stats.bytes.data-block: 0
    rocksdb.block-cache-entry-stats.bytes.deprecated-filter-block: 0
    rocksdb.block-cache-entry-stats.bytes.filter-block: 0
    rocksdb.block-cache-entry-stats.bytes.filter-meta-block: 0
    rocksdb.block-cache-entry-stats.bytes.index-block: 178992
    rocksdb.block-cache-entry-stats.bytes.misc: 0
    rocksdb.block-cache-entry-stats.bytes.other-block: 0
    rocksdb.block-cache-entry-stats.bytes.write-buffer: 0
    rocksdb.block-cache-entry-stats.capacity: 8388608
    rocksdb.block-cache-entry-stats.count.data-block: 0
    rocksdb.block-cache-entry-stats.count.deprecated-filter-block: 0
    rocksdb.block-cache-entry-stats.count.filter-block: 0
    rocksdb.block-cache-entry-stats.count.filter-meta-block: 0
    rocksdb.block-cache-entry-stats.count.index-block: 215
    rocksdb.block-cache-entry-stats.count.misc: 1
    rocksdb.block-cache-entry-stats.count.other-block: 0
    rocksdb.block-cache-entry-stats.count.write-buffer: 0
    rocksdb.block-cache-entry-stats.id: LRUCache@0x7f3636661290
    rocksdb.block-cache-entry-stats.percent.data-block: 0.000000
    rocksdb.block-cache-entry-stats.percent.deprecated-filter-block: 0.000000
    rocksdb.block-cache-entry-stats.percent.filter-block: 0.000000
    rocksdb.block-cache-entry-stats.percent.filter-meta-block: 0.000000
    rocksdb.block-cache-entry-stats.percent.index-block: 2.133751
    rocksdb.block-cache-entry-stats.percent.misc: 0.000000
    rocksdb.block-cache-entry-stats.percent.other-block: 0.000000
    rocksdb.block-cache-entry-stats.percent.write-buffer: 0.000000
    rocksdb.block-cache-entry-stats.secs_for_last_collection: 0.000052
    rocksdb.block-cache-entry-stats.secs_since_last_collection: 0

Solution detail - We need some way to flag what kind of blocks each
entry belongs to, preferably without changing the Cache API.
One of the complications is that Cache is a general interface that could
have other users that don't adhere to whichever convention we decide
on for keys and values. Or we would pay for an extra field in the Handle
that would only be used for this purpose.

This change uses a back-door approach, the deleter, to indicate the
"role" of a Cache entry (in addition to the value type, implicitly).
This has the added benefit of ensuring proper code origin whenever we
recognize a particular role for a cache entry; if the entry came from
some other part of the code, it will use an unrecognized deleter, which
we simply attribute to the "Misc" role.

An internal API makes for simple instantiation and automatic
registration of Cache deleters for a given value type and "role".

Another internal API, CacheEntryStatsCollector, solves the problem of
caching the results of a scan and sharing them, to ensure scans are
neither excessive nor redundant so as not to harm Cache performance.

Because code is added to BlocklikeTraits, it is pulled out of
block_based_table_reader.cc into its own file.

This is a reformulation of https://github.com/facebook/rocksdb/issues/8276, without the type checking option
(could still be added), and with actual stat gathering.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8297

Test Plan: manual testing with db_bench, and a couple of basic unit tests

Reviewed By: ltamasi

Differential Revision: D28488721

Pulled By: pdillinger

fbshipit-source-id: 472f524a9691b5afb107934be2d41d84f2b129fb
2021-05-19 16:51:13 -07:00
sdong e19908cba6 Refactor kill point (#8241)
Summary:
Refactor kill point to one single class, rather than several extern variables. The intention was to drop unflushed data before killing to simulate some job, and I tried to a pointer to fault ingestion fs to the killing class, but it ended up with harder than I thought. Perhaps we'll need to do this in another way. But I thought the refactoring itself is good so I send it out.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8241

Test Plan: make release and run crash test for a while.

Reviewed By: anand1976

Differential Revision: D28078486

fbshipit-source-id: f9182c1455f52e6851c13f88a21bade63bcec45f
2021-05-05 15:50:29 -07:00
Mr-Leshiy c2c7d5e916 Fix cast-function-type warning (#8230)
Summary:
Fixing cast-function-type which is appears during the following build:
```bash
cmake ..  -DFAIL_ON_WARNINGS=ON -DCMAKE_C_COMPILER=x86_64-w64-mingw32-gcc -DCMAKE_CXX_COMPILER=x86_64-w64-mingw32-g++ -DCMAKE_SYSTEM_NAME=Windows
make rocksdb
```
Here is the log:
```
/home/leshiy/Work/rocksdb/port/win/env_win.cc: In constructor ‘rocksdb::port::WinClock::WinClock()’:
/home/leshiy/Work/rocksdb/port/win/env_win.cc:92:9: error: cast between incompatible function types from ‘FARPROC’ {aka ‘long long int (*)()’} to ‘rocksdb::port::WinClock::FnGetSystemTimePreciseAsFileTime’ {aka ‘void (*)(_FILETIME*)’} [-Werror=cast-function-type]
   92 |         (FnGetSystemTimePreciseAsFileTime)GetProcAddress(
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   93 |             module, "GetSystemTimePreciseAsFileTime");
      |             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1plus: all warnings being treated as errors
make[2]: *** [CMakeFiles/rocksdb.dir/build.make:4337: CMakeFiles/rocksdb.dir/port/win/env_win.cc.obj] Error 1
make[1]: *** [CMakeFiles/Makefile2:83: CMakeFiles/rocksdb.dir/all] Error 2
make: *** [Makefile:91: all] Error 2
```

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8230

Reviewed By: jay-zhuang

Differential Revision: D28000215

Pulled By: mrambacher

fbshipit-source-id: 874782cf48f70470e3fbd9097585bf42e810ca61
2021-04-26 10:13:55 -07:00
Adam Retter 90e245697f Fix Windows strcmp for Unicode (#8190)
Summary:
The code for strcmp that was present does work when compiled for Windows unicode file paths.

Needs backporting to:
* 6.17.fb
* 6.18.fb
* 6.19.fb

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8190

Reviewed By: akankshamahajan15

Differential Revision: D27765588

Pulled By: jay-zhuang

fbshipit-source-id: 89f8a5ac61fd7edc758340dfd335b0a5f96dae6e
2021-04-16 12:11:16 -07:00
David Carlier 88c8f7a090 stack trace freebsd update. using native api to get the process (#8144)
Summary:
full name.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8144

Reviewed By: ajkr

Differential Revision: D27581146

Pulled By: riversand963

fbshipit-source-id: 7d4cbde02a07aa4676e35aeb60c3d6f1f492a3cd
2021-04-06 00:28:47 -07:00
wolfkdy 63748c2204 On ARM platform, use yield op to relax CPU. See issue 7376 (#7438)
Summary:
see https://github.com/facebook/rocksdb/issues/7376.
The `wfe` op on ARM platform is not suitable to relax CPU. Use `yield` op.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7438

Reviewed By: riversand963

Differential Revision: D24063427

Pulled By: jay-zhuang

fbshipit-source-id: b0ebc5590d7555bd21b30f15cd59f84dc006367a
2021-03-26 18:13:24 -07:00
Jay Zhuang 45c65d6dcf Use thread-safe strerror_r() to get error message (#8087)
Summary:
`strerror()` is not thread-safe, using `strerror_r()` instead. The API could be different on the different platforms, used the code from 0deef031cb/folly/String.cpp (L457)

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8087

Reviewed By: mrambacher

Differential Revision: D27267151

Pulled By: jay-zhuang

fbshipit-source-id: 4b8856d1ec069d5f239b764750682c56e5be9ddb
2021-03-24 23:07:27 -07:00
mrambacher 3dff28cf9b Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033)
Summary:
For performance purposes, the lower level routines were changed to use a SystemClock* instead of a std::shared_ptr<SystemClock>.  The shared ptr has some performance degradation on certain hardware classes.

For most of the system, there is no risk of the pointer being deleted/invalid because the shared_ptr will be stored elsewhere.  For example, the ImmutableDBOptions stores the Env which has a std::shared_ptr<SystemClock> in it.  The SystemClock* within the ImmutableDBOptions is essentially a "short cut" to gain access to this constant resource.

There were a few classes (PeriodicWorkScheduler?) where the "short cut" property did not hold.  In those cases, the shared pointer was preserved.

Using db_bench readrandom perf_level=3 on my EC2 box, this change performed as well or better than 6.17:

6.17: readrandom   :      28.046 micros/op 854902 ops/sec;   61.3 MB/s (355999 of 355999 found)
6.18: readrandom   :      32.615 micros/op 735306 ops/sec;   52.7 MB/s (290999 of 290999 found)
PR: readrandom   :      27.500 micros/op 871909 ops/sec;   62.5 MB/s (367999 of 367999 found)

(Note that the times for 6.18 are prior to revert of the SystemClock).

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8033

Reviewed By: pdillinger

Differential Revision: D27014563

Pulled By: mrambacher

fbshipit-source-id: ad0459eba03182e454391b5926bf5cdd45657b67
2021-03-15 04:34:11 -07:00
vrqq ad25b1afb9 Update win_logger.cc : assert failed when return value not checked. (-DROCKSDB_ASSERT_STATUS_CHECKED) (#7955)
Summary:
Ignore return value on WinLogger::CloseInternal() when build with -DROCKSDB_ASSERT_STATUS_CHECKED on windows.

It's a good way to ignore check here?

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7955

Reviewed By: jay-zhuang

Differential Revision: D26524145

Pulled By: ajkr

fbshipit-source-id: f2f643e94cde9772617c68b658fb529fffebd8ce
2021-02-18 16:34:10 -08:00
mrambacher 4a09d632c4 Remove Legacy and Custom FileWrapper classes from header files (#7851)
Summary:
Removed the uses of the Legacy FileWrapper classes from the source code.  The wrappers were creating an additional layer of indirection/wrapping, as the Env already has a FileSystem.

Moved the Custom FileWrapper classes into the CustomEnv, as these classes are really for the private use the the CustomEnv class.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7851

Reviewed By: anand1976

Differential Revision: D26114816

Pulled By: mrambacher

fbshipit-source-id: db32840e58d969d3a0fa6c25aaf13d6dcdc74150
2021-01-28 22:10:32 -08:00
mrambacher 12f1137355 Add a SystemClock class to capture the time functions of an Env (#7858)
Summary:
Introduces and uses a SystemClock class to RocksDB.  This class contains the time-related functions of an Env and these functions can be redirected from the Env to the SystemClock.

Many of the places that used an Env (Timer, PerfStepTimer, RepeatableThread, RateLimiter, WriteController) for time-related functions have been changed to use SystemClock instead.  There are likely more places that can be changed, but this is a start to show what can/should be done.  Over time it would be nice to migrate most (if not all) of the uses of the time functions from the Env to the SystemClock.

There are several Env classes that implement these functions.  Most of these have not been converted yet to SystemClock implementations; that will come in a subsequent PR.  It would be good to unify many of the Mock Timer implementations, so that they behave similarly and be tested similarly (some override Sleep, some use a MockSleep, etc).

Additionally, this change will allow new methods to be introduced to the SystemClock (like https://github.com/facebook/rocksdb/issues/7101 WaitFor) in a consistent manner across a smaller number of classes.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7858

Reviewed By: pdillinger

Differential Revision: D26006406

Pulled By: mrambacher

fbshipit-source-id: ed10a8abbdab7ff2e23d69d85bd25b3e7e899e90
2021-01-25 22:09:11 -08:00
Adam Retter 4926b33742 Improvements to Env::GetChildren (#7819)
Summary:
The main improvement here is to not include `.` or `..` in the results of `Env::GetChildren`. The occurrence of `.` or `..`; it is non-portable, dependent on the Operating System and the File System. See: https://www.gnu.org/software/libc/manual/html_node/Reading_002fClosing-Directory.html

There were lots of duplicate checks spread through the RocksDB codebase previously to skip `.` and `..`. This new removes the need for those at the source.

Also some minor fixes to `Env::GetChildren`:
* Improve error handling in POSIX implementation
* Remove unnecessary array allocation on Windows
* Fix struct name for Windows Non-UTF-8 API

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7819

Reviewed By: ajkr

Differential Revision: D25837394

Pulled By: jay-zhuang

fbshipit-source-id: 1e137e7218d38b450af9c083f73d5357abcbba2e
2021-01-09 09:44:34 -08:00
mrambacher e628f59e87 Create a CustomEnv class; Add WinFileSystem; Make LegacyFileSystemWrapper private (#7703)
Summary:
This PR does the following:
-> Creates a WinFileSystem class.  This class is the Windows equivalent of the PosixFileSystem and will be used on Windows systems.
-> Introduces a CustomEnv class.  A CustomEnv is an Env that takes a FileSystem as constructor argument.  I believe there will only ever be two implementations of this class (PosixEnv and WinEnv).  There is still a CustomEnvWrapper class that takes an Env and a FileSystem and wraps the Env calls with the input Env but uses the FileSystem for the FileSystem calls
-> Eliminates the public uses of the LegacyFileSystemWrapper.

With this change in place, there are effectively the following patterns of Env:
- "Base Env classes" (PosixEnv, WinEnv).  These classes implement the core Env functions (e.g. Threads) and have a hard-coded input FileSystem.  These classes inherit from CompositeEnv, implement the core Env functions (threads) and delegate the FileSystem-like calls to the input file system.
- Wrapped Composite Env classes (MemEnv).  These classes take in an Env and a FileSystem.  The core env functions are re-directed to the wrapped env.  The file system calls are redirected to the input file system
- Legacy Wrapped Env classes.  These classes take in an Env input (but no FileSystem).  The core env functions are re-directed to the wrapped env.  A "Legacy File System" is created using this env and the file system calls directed to the env itself.

With these changes in place, the PosixEnv becomes a singleton -- there is only ever one created.  Any other use of the PosixEnv is via another wrapped env.  This cleans up some of the issues with the env construction and destruction.

Additionally, there were places in the code that required had an Env when they required a FileSystem.  Many of these places would wrap the Env with a LegacyFileSystemWrapper instead of using the env->GetFileSystem().  These places were changed, thereby removing layers of additional redirection (LegacyFileSystem --> Env --> Env::FileSystem).

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7703

Reviewed By: zhichao-cao

Differential Revision: D25762190

Pulled By: anand1976

fbshipit-source-id: 1a088e97fc916f28ac69c149cd1dcad0ab31704b
2021-01-06 10:49:32 -08:00
Adam Retter 62afa968c2 Fix various small build issues, Java API naming (#7776)
Summary:
* Compatibility with older GCC.
* Compatibility with older jemalloc libraries.
* Remove Docker warning when building i686 binaries.
* Fix case inconsistency in Java API naming (potential update to HISTORY.md deferred)

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7776

Reviewed By: akankshamahajan15

Differential Revision: D25607235

Pulled By: pdillinger

fbshipit-source-id: 7ab0fb7fa7a34e97ed0bec991f5081acb095777d
2020-12-18 16:12:26 -08:00
Peter Dillinger 0b06af9146 Warn about practically unfixable TSAN warnings in stack trace (#7723)
Summary:
TSAN reports that our stack trace handler makes unsafe calls
during a signal handler. I just tried fixing some of them and I don't
think it's fixable unless we can get away from using FILE stdio. Even if
we can use lower level functions only, I'm not sure it's fixed.

I also tried suppressing the reports with function and file level TSAN
suppression, but that doesn't seem to work, perhaps because the
violation is reported on the callee, not the caller.

So I added a warning to be printed whenever these violations would be
reported that they are practically ignorable.

Internal ref: T77844138

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7723

Test Plan:
run external_sst_file_test with seeded abort(), with TSAN
(TSAN warnings + new warning) and without TSAN (no warning, just stack
trace).

Reviewed By: akankshamahajan15

Differential Revision: D25228011

Pulled By: pdillinger

fbshipit-source-id: 3eda1d6e7ca3cdc64076cf99ae954168837d2818
2020-12-01 10:17:56 -08:00
Adam Retter 4c336c6912 Fix jemalloc compliation problem on macOS (#7624)
Summary:
Closes https://github.com/facebook/rocksdb/issues/7269

I have only tested this on macOS, let's see what CI makes of it for the other platforms...

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7624

Reviewed By: ajkr

Differential Revision: D24834305

Pulled By: pdillinger

fbshipit-source-id: ba818d8424297ccebd18ed854b044764c2dbab5f
2020-11-17 15:29:35 -08:00
Yanqin Jin 394210f280 Remove unused includes (#7604)
Summary:
This is a PR generated **semi-automatically** by an internal tool to remove unused includes and `using` statements.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7604

Test Plan: make check

Reviewed By: ajkr

Differential Revision: D24579392

Pulled By: riversand963

fbshipit-source-id: c4bfa6c6b08da1de186690d37eb73d8fff45aecd
2020-10-28 23:22:27 -07:00
Koby Kahane 3e745053b7 Fix MSVC-related build issues (#7439)
Summary:
This PR addresses some build and functional issues on MSVC targets, as a step towards an eventual goal of having RocksDB build successfully for Windows on ARM64.

Addressed issues include:
- BitsSetToOne and CountTrailingZeroBits do not compile on non-x64 MSVC targets. A fallback implementation of BitsSetToOne when Intel intrinsics are not available is added, based on the C++20 `<bit>` popcount implementation in Microsoft's STL.
- The implementation of FloorLog2 for MSVC targets (including x64) gives incorrect results. The unit test easily detects this, but CircleCI is currently configured to only run a specific set of tests for Windows CMake builds, so this seems to have been unnoticed.
- AsmVolatilePause does not use YieldProcessor on Windows ARM64 targets, even though it is available.
- When CondVar::TimedWait calls Microsoft STL's condition_variable::wait_for, it can potentially trigger a bug (just recently fixed in the upcoming VS 16.8's STL) that deadlocks various tests that wait for a timer to execute, since `Timer::Run` doesn't get a chance to execute before being blocked by the test function acquiring the mutex.
- In c_test, `GetTempDir` assumes a POSIX-style temp path.
- `NormalizePath` did not eliminate consecutive POSIX-style path separators on Windows, resulting in test failures in e.g., wal_manager_test.
- Various other test failures.

In a followup PR I hope to modify CircleCI's config.yml to invoke all RocksDB unit tests in Windows CMake builds with CTest, instead of the current use of `run_ci_db_test.ps1` which requires individual tests to be specified and is missing many of the existing tests.

Notes from peterd: FloorLog2 is not yet used in production code (it's for something in progress). I also added a few more inexpensive platform-dependent tests to Windows CircleCI runs. And included facebook/folly#1461 as requested

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7439

Reviewed By: jay-zhuang

Differential Revision: D24021563

Pulled By: pdillinger

fbshipit-source-id: 0ec2027c0d6a494d8a0fe38d9667fc2f7e29f7e7
2020-10-01 09:23:04 -07:00
Levi Tamasi 30fb9dd50f Introduce a helper method UncompressData (#7434)
Summary:
The patch introduces a helper method in `util/compression.h` called `UncompressData`
that dispatches calls to the correct uncompression method based on type, and changes
`UncompressBlockContentsForCompressionType` and `Benchmark::Uncompress` in
`db_bench` so they are implemented in terms of the new method. This eliminates
some code duplication. (`Benchmark::Compress` is also updated to use the previously
introduced `CompressData` helper.)

In addition, the patch brings the implementation of `Snappy_Uncompress` into sync with
the other uncompression methods by making the method compute the buffer size and allocate
the buffer itself. Finally, the patch eliminates some potentially risky back-and-forth conversions
between various unsigned and signed integer types by exposing the size of the allocated buffer
as a `size_t` instead of an `int`.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7434

Test Plan:
`make check`
`./db_bench -benchmarks=compress,uncompress --compression_type ...`

Reviewed By: riversand963

Differential Revision: D23900011

Pulled By: ltamasi

fbshipit-source-id: b25df63ceec4639889be94acb22eb53e530c54e0
2020-09-25 09:01:45 -07:00
Xavier Deguillard 249f2b59a0 build: make it compile with @mode/win (#7406)
Summary:
While rocksdb can compile on both macOS and Linux with Buck, it couldn't be
compiled on Windows. The only way to compile it on Windows was with the CMake
build.

To keep the multi-platform complexity low, I've simply included all the Windows
bits in the TARGETS file, and added large #if blocks when not on Windows, the
same was done on the posix specific files.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7406

Test Plan:
On my devserver:
  buck test //rocksdb/...
On Windows:
  buck build mode/win //rocksdb/src:rocksdb_lib

Reviewed By: pdillinger

Differential Revision: D23874358

Pulled By: xavierd

fbshipit-source-id: 8768b5d16d7e8f44b5ca1e2483881ca4b24bffbe
2020-09-23 12:55:54 -07:00
Andrew Kryczka 312f23c92d build fixes for GNU/kFreeBSD (#6992)
Summary:
Upstream https://salsa.debian.org/mariadb-team/mariadb-10.4/-/blob/master/debian/patches/rocksdb-kfreebsd.patch
by jrtc27.

Fixes https://github.com/facebook/rocksdb/issues/5223.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6992

Reviewed By: zhichao-cao

Differential Revision: D22084150

Pulled By: ajkr

fbshipit-source-id: 1822311ba16f112a15065b2180ce89d36af9cafc
2020-06-18 09:51:28 -07:00
Peter Dillinger c7432cc3c0 Fix more defects reported by Coverity Scan (#6935)
Summary:
Mostly uninitialized values: some probably written before use, but some seem like bugs. Also, destructor needs to be virtual, and possible use-after-free in test
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6935

Test Plan: make check

Reviewed By: siying

Differential Revision: D21885484

Pulled By: pdillinger

fbshipit-source-id: e2e7cb0a0cf196f2b55edd16f0634e81f6cc8e08
2020-06-04 15:35:08 -07:00
Lucian Petrut 172adce767 Posix threads (#6865)
Summary:
Rocksdb is using the c++11 std::threads feature. The issue is that
MINGW only supports it when using Posix threads.

This change will allow rocksdb::port::WindowsThread to be replaced
with std::thread, which in turn will allow Rocksdb to be cross
compiled using MINGW.

At the same time, we'll have to use GetCurrentProcessId instead of _getpid.

Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6865

Reviewed By: cheng-chang

Differential Revision: D21864285

Pulled By: ajkr

fbshipit-source-id: 0982eed313e7d34d351b1364c1ccc722da473205
2020-06-03 12:01:57 -07:00
Peter Dillinger 31da5e34c1 C++20 compatibility (#6697)
Summary:
Based on https://github.com/facebook/rocksdb/issues/6648 (CLA Signed), but heavily modified / extended:

* Implicit capture of this via [=] deprecated in C++20, and [=,this] not standard before C++20 -> now using explicit capture lists
* Implicit copy operator deprecated in gcc 9 -> add explicit '= default' definition
* std::random_shuffle deprecated in C++17 and removed in C++20 -> migrated to a replacement in RocksDB random.h API
* Add the ability to build with different std version though -DCMAKE_CXX_STANDARD=11/14/17/20 on the cmake command line
* Minimal rebuild flag of MSVC is deprecated and is forbidden with /std:c++latest (C++20)
* Added MSVC 2019 C++11 & MSVC 2019 C++20 in AppVeyor
* Added GCC 9 C++11 & GCC9 C++20 in Travis
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6697

Test Plan: make check and CI

Reviewed By: cheng-chang

Differential Revision: D21020318

Pulled By: pdillinger

fbshipit-source-id: 12311be5dbd8675a0e2c817f7ec50fa11c18ab91
2020-04-20 13:24:25 -07:00
Yanqin Jin 243852ec15 Add IsDirectory() to Env and FS (#6711)
Summary:
IsDirectory() is a common API to check whether a path is a regular file or
directory.
POSIX: call stat() and use S_ISDIR(st_mode)
Windows: PathIsDirectoryA() and PathIsDirectoryW()
HDFS: FileSystem.IsDirectory()
Java: File.IsDirectory()
...
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6711

Test Plan: make check

Reviewed By: anand1976

Differential Revision: D21053520

Pulled By: riversand963

fbshipit-source-id: 680aadfd8ce982b63689190cf31b3145d5a89e27
2020-04-17 14:39:18 -07:00
anand76 5c19a441c4 Fault injection in db_stress (#6538)
Summary:
This PR implements a fault injection mechanism for injecting errors in reads in db_stress. The FaultInjectionTestFS is used for this purpose. A thread local structure is used to track the errors, so that each db_stress thread can independently enable/disable error injection and verify observed errors against expected errors. This is initially enabled only for Get and MultiGet, but can be extended to iterator as well once its proven stable.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6538

Test Plan:
crash_test
make check

Reviewed By: riversand963

Differential Revision: D20714347

Pulled By: anand1976

fbshipit-source-id: d7598321d4a2d72bda0ced57411a337a91d87dc7
2020-04-10 17:21:26 -07:00
Andrew Gallagher fd3ddaf90d Fix jemalloc forward declarations (#6613)
Summary:
Add `nothrow` attribute to match declarations in jemalloc.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/6613

Reviewed By: igorsugak

Differential Revision: D20749490

fbshipit-source-id: 9ac8df27f7b4268f27b32b130c23ce8a1f772b3a
2020-03-31 11:38:51 -07:00
Cheng Chang ee50b8d499 Be able to decrease background thread's CPU priority when creating database backup (#6602)
Summary:
When creating a database backup, the background threads will not only consume IO resources by copying files, but also consuming CPU such as by computing checksums. During peak times, the CPU consumption by the background threads might affect online queries.

This PR makes it possible to decrease CPU priority of these threads when creating a new backup.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6602

Test Plan: make check

Reviewed By: siying, zhichao-cao

Differential Revision: D20683216

Pulled By: cheng-chang

fbshipit-source-id: 9978b9ed9488e8ce135e90ca083e5b4b7221fd84
2020-03-28 19:07:25 -07:00
Cheng Chang 4fc216649d Support direct IO in RandomAccessFileReader::MultiRead (#6446)
Summary:
By supporting direct IO in RandomAccessFileReader::MultiRead, the benefits of parallel IO (IO uring) and direct IO can be combined.

In direct IO mode, read requests are aligned and merged together before being issued to RandomAccessFile::MultiRead, so blocks in the original requests might share the same underlying buffer, the shared buffers are returned in `aligned_bufs`, which is a new parameter of the `MultiRead` API.

For example, suppose alignment requirement for direct IO is 4KB, one request is (offset: 1KB, len: 1KB), another request is (offset: 3KB, len: 1KB), then since they all belong to page (offset: 0, len: 4KB), `MultiRead` only reads the page with direct IO into a buffer on heap, and returns 2 Slices referencing regions in that same buffer. See `random_access_file_reader_test.cc` for more examples.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6446

Test Plan: Added a new test `random_access_file_reader_test.cc`.

Reviewed By: anand1976

Differential Revision: D20097518

Pulled By: cheng-chang

fbshipit-source-id: ca48a8faf9c3af146465c102ef6b266a363e78d1
2020-03-20 16:33:26 -07:00
Peter Dillinger 43dde332cb Share kPageSize (and other small tweaks) (#6443)
Summary:
Make kPageSize extern const size_t (used in draft https://github.com/facebook/rocksdb/issues/6427)
Make kLitteEndian constexpr bool
Clarify a couple of comments
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6443

Test Plan: make check, CI

Differential Revision: D20044558

Pulled By: pdillinger

fbshipit-source-id: e0c5cc13229c82726280dc0ddcba4078346b8418
2020-02-22 08:01:36 -08:00
sdong fdf882ded2 Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433)
Summary:
When dynamically linking two binaries together, different builds of RocksDB from two sources might cause errors. To provide a tool for user to solve the problem, the RocksDB namespace is changed to a flag which can be overridden in build time.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6433

Test Plan: Build release, all and jtest. Try to build with ROCKSDB_NAMESPACE with another flag.

Differential Revision: D19977691

fbshipit-source-id: aa7f2d0972e1c31d75339ac48478f34f6cfcfb3e
2020-02-20 12:09:57 -08:00
Huisheng Liu 2fdd8087ce Implement getfreespace for WinEnv (#6265)
Summary:
A new interface method Env::GetFreeSpace was added in https://github.com/facebook/rocksdb/issues/4164. It needs to be implemented for Windows port. Some error_handler_test cases fail on Windows because recovery cannot succeed without free space being reported.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6265

Differential Revision: D19303065

fbshipit-source-id: 1f1a83e53f334284781cf61feabc996e87b945ca
2020-01-07 13:56:13 -08:00
John Ericson c16b087427 Work around weird unused errors with Mingw (#6075)
Summary:
From the reset of the code, it looks this this maybe can be unconditionally given the attribute? But I couldn't test with MSVC so I defensively put under CPP.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6075

Differential Revision: D18723749

fbshipit-source-id: 45fc8732c28dd29aab1644225d68f3c6f39bd69b
2019-11-26 21:42:29 -08:00
Andrew Kryczka b00761eea6 Fix block cache ID uniqueness for Windows builds (#5844)
Summary:
Since we do not evict a file's blocks from block cache before that file
is deleted, we require a file's cache ID prefix is both unique and
non-reusable. However, the Windows functionality we were relying on only
guaranteed uniqueness. That meant a newly created file could be assigned
the same cache ID prefix as a deleted file. If the newly created file
had block offsets matching the deleted file, full cache keys could be
exactly the same, resulting in obsolete data blocks returned from cache
when trying to read from the new file.

We noticed this when running on FAT32 where compaction was writing out
of order keys due to reading obsolete blocks from its input files. The
functionality is documented as behaving the same on NTFS, although I
wasn't able to repro it there.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5844

Test Plan:
we had a reliable repro of out-of-order keys on FAT32 that
was fixed by this change

Differential Revision: D17752442

fbshipit-source-id: 95d983f9196cf415f269e19293b97341edbf7e00
2019-10-11 18:19:31 -07:00
sdong e8263dbdaa Apply formatter to recent 200+ commits. (#5830)
Summary:
Further apply formatter to more recent commits.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5830

Test Plan: Run all existing tests.

Differential Revision: D17488031

fbshipit-source-id: 137458fd94d56dd271b8b40c522b03036943a2ab
2019-09-20 12:04:26 -07:00
Peter Dillinger 68626249c3 Refactor/consolidate legacy Bloom implementation details (#5784)
Summary:
Refactoring to consolidate implementation details of legacy
Bloom filters. This helps to organize and document some related,
obscure code.

Also added make/cpp var TEST_CACHE_LINE_SIZE so that it's easy to
compile and run unit tests for non-native cache line size. (Fixed a
related test failure in db_properties_test.)
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5784

Test Plan:
make check, including Recently added Bloom schema unit tests
(in ./plain_table_db_test && ./bloom_test), and including with
TEST_CACHE_LINE_SIZE=128U and TEST_CACHE_LINE_SIZE=256U. Tested the
schema tests with temporary fault injection into new implementations.

Some performance testing with modified unit tests suggest a small to moderate
improvement in speed.

Differential Revision: D17381384

Pulled By: pdillinger

fbshipit-source-id: ee42586da996798910fc45ac0b6289147f16d8df
2019-09-16 16:17:09 -07:00
Maysam Yabandeh 638d239507 Charge block cache for cache internal usage (#5797)
Summary:
For our default block cache, each additional entry has extra memory overhead. It include LRUHandle (72 bytes currently) and the cache key (two varint64, file id and offset). The usage is not negligible. For example for block_size=4k, the overhead accounts for an extra 2% memory usage for the cache. The patch charging the cache for the extra usage, reducing untracked memory usage outside block cache. The feature is enabled by default and can be disabled by passing kDontChargeCacheMetadata to the cache constructor.
This PR builds up on https://github.com/facebook/rocksdb/issues/4258
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5797

Test Plan:
- Existing tests are updated to either disable the feature when the test has too much dependency on the old way of accounting the usage or increasing the cache capacity to account for the additional charge of metadata.
- The Usage tests in cache_test.cc are augmented to test the cache usage under kFullChargeCacheMetadata.

Differential Revision: D17396833

Pulled By: maysamyabandeh

fbshipit-source-id: 7684ccb9f8a40ca595e4f5efcdb03623afea0c6f
2019-09-16 15:26:21 -07:00
Peter Dillinger d3a6726f02 Revert changes from PR#5784 accidentally in PR#5780 (#5810)
Summary:
This will allow us to fix history by having the code changes for PR#5784 properly attributed to it.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5810

Differential Revision: D17400231

Pulled By: pdillinger

fbshipit-source-id: 2da8b1cdf2533cfedb35b5526eadefb38c291f09
2019-09-16 11:38:53 -07:00
Peter Dillinger aa2486b23c Refactor some confusing logic in PlainTableReader
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/5780

Test Plan: existing plain table unit test

Differential Revision: D17368629

Pulled By: pdillinger

fbshipit-source-id: f25409cdc2f39ebe8d5cbb599cf820270e6b5d26
2019-09-13 10:26:36 -07:00
Shylock Hg 9eb3e1f77d Use delete to disable automatic generated methods. (#5009)
Summary:
Use delete to disable automatic generated methods instead of private, and put the constructor together for more clear.This modification cause the unused field warning, so add unused attribute to disable this warning.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5009

Differential Revision: D17288733

fbshipit-source-id: 8a767ce096f185f1db01bd28fc88fef1cdd921f3
2019-09-11 18:09:00 -07:00
Yi Wu 849a8c0ae0 fix sign compare warnings (#5651)
Summary:
Fix -Wsign-compare warnings for gcc9.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5651

Test Plan: Tested with ubuntu19.10+gcc9

Differential Revision: D16567428

fbshipit-source-id: 730b2704d42ba0c4e4ea946a3199bbb34be4c25c
2019-07-30 14:12:54 -07:00
Tomas Kolda 6e8a1354a7 Fix regression - 100% CPU - Regression for Windows 7 (#5557)
Summary:
Fixes https://github.com/facebook/rocksdb/issues/5552
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5557

Differential Revision: D16266329

fbshipit-source-id: a8f6b50298a6f7c8d6c7e172bb26dd7eb6bd8a4d
2019-07-15 12:19:49 -07:00
Andrew Kryczka 09ea5d8944 Fix clang build with jemalloc (#5522)
Summary:
Fixes the below build failure for clang compiler using glibc and jemalloc.

Platform: linux x86-64
Compiler: clang version 6.0.0-1ubuntu2
Build failure:
```
$ CXX=clang++ CC=clang USE_CLANG=1 WITH_JEMALLOC_FLAG=1 JEMALLOC=1 EXTRA_LDFLAGS="-L/home/andrew/jemalloc/lib/" EXTRA_CXXFLAGS="-I/home/andrew/jemalloc/include/" make check -j12
...
  CC       memory/jemalloc_nodump_allocator.o
In file included from memory/jemalloc_nodump_allocator.cc:6:
In file included from ./memory/jemalloc_nodump_allocator.h:11:
In file included from ./port/jemalloc_helper.h:16:
/usr/include/clang/6.0.0/include/mm_malloc.h:39:16: error: 'posix_memalign' is missing exception specification 'throw()'
extern "C" int posix_memalign(void **__memptr, size_t __alignment, size_t __size);
               ^
/home/andrew/jemalloc/include/jemalloc/jemalloc.h:388:26: note: expanded from macro 'posix_memalign'
#  define posix_memalign je_posix_memalign
                         ^
/home/andrew/jemalloc/include/jemalloc/jemalloc.h:77:29: note: expanded from macro 'je_posix_memalign'
#  define je_posix_memalign posix_memalign
                            ^
/home/andrew/jemalloc/include/jemalloc/jemalloc.h:232:38: note: previous declaration is here
JEMALLOC_EXPORT int JEMALLOC_NOTHROW    je_posix_memalign(void **memptr,
                                        ^
/home/andrew/jemalloc/include/jemalloc/jemalloc.h:77:29: note: expanded from macro 'je_posix_memalign'
#  define je_posix_memalign posix_memalign
                            ^
1 error generated.
Makefile:1972: recipe for target 'memory/jemalloc_nodump_allocator.o' failed
make: *** [memory/jemalloc_nodump_allocator.o] Error 1
```
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5522

Differential Revision: D16069869

Pulled By: miasantreble

fbshipit-source-id: c489bbc993adee194b9a550134c6237a264bc443
2019-07-02 13:02:12 -07:00
Andrew Kryczka 0d57d93a06 Support jemalloc compiled with --with-jemalloc-prefix (#5521)
Summary:
Previously, if the jemalloc was built with nonempty string for
`--with-jemalloc-prefix`, then `HasJemalloc()` would return false on
Linux, so jemalloc would not be used at runtime. On Mac, it would cause
a linker failure due to no definitions found for the weak functions
declared in "port/jemalloc_helper.h". This should be a rare problem
because (1) on Linux the default `--with-jemalloc-prefix` value is the
empty string, and (2) Homebrew's build explicitly sets
`--with-jemalloc-prefix` to the empty string.

However, there are cases where `--with-jemalloc-prefix` is nonempty.
For example, when building jemalloc from source on Mac, the default
setting is `--with-jemalloc-prefix=je_`. Such jemalloc builds should be
usable by RocksDB.

The fix is simple. Defining `JEMALLOC_MANGLE` before including
"jemalloc.h" causes it to define unprefixed symbols that are aliases for
each of the prefixed symbols. Thanks to benesch for figuring this out
and explaining it to me.

Fixes https://github.com/facebook/rocksdb/issues/1462.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5521

Test Plan:
build jemalloc with prefixed symbols:

```
$ ./configure --with-jemalloc-prefix=lol
$ make
```

compile rocksdb against it:

```
$ WITH_JEMALLOC_FLAG=1 JEMALLOC=1 EXTRA_LDFLAGS="-L/home/andrew/jemalloc/lib/" EXTRA_CXXFLAGS="-I/home/andrew/jemalloc/include/" make -j12 ./db_bench
```

run db_bench and verify jemalloc actually used:

```
$ ./db_bench -benchmarks=fillrandom -statistics=true -dump_malloc_stats=true -stats_dump_period_sec=1
$ grep jemalloc /tmp/rocksdbtest-1000/dbbench/LOG
2019/06/29-12:20:52.088658 7fc5fb7f6700 [_impl/db_impl.cc:837] ___ Begin jemalloc statistics ___
...
```

Differential Revision: D16092758

fbshipit-source-id: c2c358346190ed62ceb2a3547a6c4c180b12f7c4
2019-07-02 12:07:01 -07:00
Huisheng Liu b47cfec5d0 fix compilation error on MSVC (#5458)
Summary:
"__attribute__((__weak__))" was introduced in port\jemalloc_helper.h. It's not supported by Microsoft VS 2015, resulting in compile error. This fix adds a #if branch to work around the compile issue.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5458

Differential Revision: D15827285

fbshipit-source-id: 8c5f7ad31de1ac677bd96f16c4450767de834beb
2019-06-14 11:28:13 -07:00
Siying Dong 000b9ec217 Move some logging related files to logging/ (#5387)
Summary:
Many logging related source files are under util/. It will be more structured if they are together.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5387

Differential Revision: D15579036

Pulled By: siying

fbshipit-source-id: 3850134ed50b8c0bb40a0c8ae1f184fa4081303f
2019-05-31 17:23:59 -07:00
Siying Dong 8843129ece Move some memory related files from util/ to memory/ (#5382)
Summary:
Move arena, allocator, and memory tools under util to a separate memory/ directory.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5382

Differential Revision: D15564655

Pulled By: siying

fbshipit-source-id: 9cd6b5d0d3d52b39606e19221fa154596e5852a5
2019-05-30 17:44:09 -07:00
Siying Dong e9e0101ca4 Move test related files under util/ to test_util/ (#5377)
Summary:
There are too many types of files under util/. Some test related files don't belong to there or just are just loosely related. Mo
ve them to a new directory test_util/, so that util/ is cleaner.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5377

Differential Revision: D15551366

Pulled By: siying

fbshipit-source-id: 0f5c8653832354ef8caa31749c0143815d719e2c
2019-05-30 11:25:51 -07:00
Sagar Vemuri dda474399a Remove PATENTS text from a few straggler files (#5326)
Summary:
Remove PATENTS related wording from a few stragglers which still reference the old PATENTS file.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5326

Differential Revision: D15423297

Pulled By: sagar0

fbshipit-source-id: 4babcddfc120b7d2fed6eb3898287cf8012bf8ea
2019-05-21 16:22:35 -07:00
Andrew Kryczka 8272a6de57 Optionally wait on bytes_per_sync to smooth I/O (#5183)
Summary:
The existing implementation does not guarantee bytes reach disk every `bytes_per_sync` when writing SST files, or every `wal_bytes_per_sync` when writing WALs. This can cause confusing behavior for users who enable this feature to avoid large syncs during flush and compaction, but then end up hitting them anyways.

My understanding of the existing behavior is we used `sync_file_range` with `SYNC_FILE_RANGE_WRITE` to submit ranges for async writeback, such that we could continue processing the next range of bytes while that I/O is happening. I believe we can preserve that benefit while also limiting how far the processing can get ahead of the I/O, which prevents huge syncs from happening when the file finishes.

Consider this `sync_file_range` usage: `sync_file_range(fd_, 0, static_cast<off_t>(offset + nbytes), SYNC_FILE_RANGE_WAIT_BEFORE | SYNC_FILE_RANGE_WRITE)`. Expanding the range to start at 0 and adding the `SYNC_FILE_RANGE_WAIT_BEFORE` flag causes any pending writeback (like from a previous call to `sync_file_range`) to finish before it proceeds to submit the latest `nbytes` for writeback. The latest `nbytes` are still written back asynchronously, unless processing exceeds I/O speed, in which case the following `sync_file_range` will need to wait on it.

There is a second change in this PR to use `fdatasync` when `sync_file_range` is unavailable (determined statically) or has some known problem with the underlying filesystem (determined dynamically).

The above two changes only apply when the user enables a new option, `strict_bytes_per_sync`.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5183

Differential Revision: D14953553

Pulled By: siying

fbshipit-source-id: 445c3862e019fb7b470f9c7f314fc231b62706e9
2019-04-22 11:51:39 -07:00
datonli f0edf9d575 #5145 , rename port/dirent.h to port/port_dirent.h to avoid compile err when use port dir as header dir output (#5152)
Summary:
mv port/dirent.h to port/port_dirent.h to avoid compile err when use port dir as header dir output
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5152

Differential Revision: D14779409

Pulled By: siying

fbshipit-source-id: d4162c47c979c6e8cc6a9e601802864ab3768ecb
2019-04-04 11:38:19 -07:00
Burton Li d1edf4eced Format env_win coding style by google c++ style guide (#5096)
Summary:
The existing code for env_win src and header file doesn't fully followed the recommended code style (https://google.github.io/styleguide/cppguide.html#Functions). Fix it for better readability.
anand1976 siying
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5096

Differential Revision: D14585358

Pulled By: anand1976

fbshipit-source-id: 7ce35ffe9e922f5c1421b0bbaa5fce7abad57617
2019-03-27 16:38:20 -07:00
Yanqin Jin 9358178edc Support for single-primary, multi-secondary instances (#4899)
Summary:
This PR allows RocksDB to run in single-primary, multi-secondary process mode.
The writer is a regular RocksDB (e.g. an `DBImpl`) instance playing the role of a primary.
Multiple `DBImplSecondary` processes (secondaries) share the same set of SST files, MANIFEST, WAL files with the primary. Secondaries tail the MANIFEST of the primary and apply updates to their own in-memory state of the file system, e.g. `VersionStorageInfo`.

This PR has several components:
1. (Originally in #4745). Add a `PathNotFound` subcode to `IOError` to denote the failure when a secondary tries to open a file which has been deleted by the primary.

2. (Similar to #4602). Add `FragmentBufferedReader` to handle partially-read, trailing record at the end of a log from where future read can continue.

3. (Originally in #4710 and #4820). Add implementation of the secondary, i.e. `DBImplSecondary`.
3.1 Tail the primary's MANIFEST during recovery.
3.2 Tail the primary's MANIFEST during normal processing by calling `ReadAndApply`.
3.3 Tailing WAL will be in a future PR.

4. Add an example in 'examples/multi_processes_example.cc' to demonstrate the usage of secondary RocksDB instance in a multi-process setting. Instructions to run the example can be found at the beginning of the source code.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4899

Differential Revision: D14510945

Pulled By: riversand963

fbshipit-source-id: 4ac1c5693e6012ad23f7b4b42d3c374fecbe8886
2019-03-26 16:45:31 -07:00
Burton Li 88d85b6820 fix NowNanos overflow (#5062)
Summary:
The original implementation of WinEnvIO::NowNanos() has a constant data overflow by:
li.QuadPart *= std::nano::den;
As a result, the api provides a incorrect result.
e.g.:
li.QuadPart=13477844301545
std::nano::den=1e9

The fix uses pre-computed nano_seconds_per_period_ to present the nano seconds per performance counter period, in the case if nano::den is divisible by perf_counter_frequency_. Otherwise it falls back to use high_resolution_clock.
siying ajkr
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5062

Differential Revision: D14426842

Pulled By: anand1976

fbshipit-source-id: 127f1daf423dd4b30edd0dcf8ea0466f468bec12
2019-03-21 15:18:36 -07:00
Levi Tamasi 34f8ac0c99 Make adaptivity of LRU cache mutexes configurable (#5054)
Summary:
The patch adds a new config option to LRUCacheOptions that enables
users to choose whether to use an adaptive mutex for the LRU block
cache (on platforms where adaptive mutexes are supported). The default
is true if RocksDB is compiled with -DROCKSDB_DEFAULT_TO_ADAPTIVE_MUTEX,
false otherwise.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5054

Differential Revision: D14542749

Pulled By: ltamasi

fbshipit-source-id: 0065715ab6cf91f10444b737fed8c8aee6a8a0d2
2019-03-20 12:33:44 -07:00
Zhongyi Xie fdc72a5c5d add OptionType kInt32T and kInt64T
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/5061

Differential Revision: D14418581

Pulled By: miasantreble

fbshipit-source-id: be7f90e16586666ddd0cce36971e403782ab0892
2019-03-12 13:49:52 -07:00
Yi Wu 8a1ecd1982 Fix build failures due to missing JEMALLOC_CXX_THROW macro (#5053)
Summary:
JEMALLOC_CXX_THROW is not defined for earlier versions of jemalloc (e.g. 3.6), causing builds to fail on some platforms. Fixing it. Closes #4869
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5053

Differential Revision: D14390034

Pulled By: sagar0

fbshipit-source-id: b2b7a03cd377201ef385eb521f65bae85c558055
2019-03-08 17:06:13 -08:00
Yi Wu 77a8d4d476 Detect if Jemalloc is linked with the binary (#4844)
Summary:
Declare Jemalloc non-standard APIs as weak symbols, so that if Jemalloc is linked with the binary, these symbols will be replaced by Jemalloc's, otherwise they will be nullptr. This is similar to how folly detect jemalloc, but we assume the main program use jemalloc as long as jemalloc is linked: https://github.com/facebook/folly/blob/master/folly/memory/Malloc.h#L147
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4844

Differential Revision: D13574934

Pulled By: yiwu-arbug

fbshipit-source-id: 7ea871beb1be7d5a1259cc38f9b78078793db2db
2019-01-03 16:30:12 -08:00
Sagar Vemuri dc3528077a Update all unique/shared_ptr instances to be qualified with namespace std (#4638)
Summary:
Ran the following commands to recursively change all the files under RocksDB:
```
find . -type f -name "*.cc" -exec sed -i 's/ unique_ptr/ std::unique_ptr/g' {} +
find . -type f -name "*.cc" -exec sed -i 's/<unique_ptr/<std::unique_ptr/g' {} +
find . -type f -name "*.cc" -exec sed -i 's/ shared_ptr/ std::shared_ptr/g' {} +
find . -type f -name "*.cc" -exec sed -i 's/<shared_ptr/<std::shared_ptr/g' {} +
```
Running `make format` updated some formatting on the files touched.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4638

Differential Revision: D12934992

Pulled By: sagar0

fbshipit-source-id: 45a15d23c230cdd64c08f9c0243e5183934338a8
2018-11-09 11:19:58 -08:00
Wilfried Goesgens 5d809ecef7 Add compile time option to work with utf8 filename strings (#4469)
Summary:
The default behaviour of rocksdb is to use the `*A(` windows API functions.
These accept filenames in the currently configured system encoding,
be it Latin 1, utf8 or whatever.
If the Application intends to completely work with utf8 strings internally,
converting these to that codepage properly isn't even always possible.
Thus this patch adds a switch to use the `*W(` functions, which accept
UTF-16 filenames, and uses C++11 features to translate the
UTF8 containing std::string to an UTF16 containing std::wstring.

This feature is a compile time options, that can be enabled by setting `WITH_WINDOWS_UTF8_FILENAMES` to true.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4469

Differential Revision: D10356011

Pulled By: yiwu-arbug

fbshipit-source-id: 27b6ae9171f209085894cdf80069e8a896642044
2018-10-11 23:24:28 -07:00
Dmitry Alimov e13d8dcbbb Fix typos in comments (#4456)
Summary:
Fix some typos in the comments
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4456

Differential Revision: D10209214

Pulled By: miasantreble

fbshipit-source-id: dff857ba60396bc95126e635db96d7dc8330d2cb
2018-10-04 20:46:50 -07:00
Constantin Belyaev 2353c5c821 Fix cross-filesystem checkpoint on Windows (#4365)
Summary:
Now port/win_env.cc do check error for cross device link creation.
Fixes #4364
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4365

Differential Revision: D9833144

Pulled By: ajkr

fbshipit-source-id: be7555e510f4b8d2196d843841606a6cfada7644
2018-09-14 10:28:39 -07:00
cngzhnp 64324e329e Support pragma once in all header files and cleanup some warnings (#4339)
Summary:
As you know, almost all compilers support "pragma once" keyword instead of using include guards. To be keep consistency between header files, all header files are edited.

Besides this, try to fix some warnings about loss of data.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4339

Differential Revision: D9654990

Pulled By: ajkr

fbshipit-source-id: c2cf3d2d03a599847684bed81378c401920ca848
2018-09-05 18:13:31 -07:00
Yanqin Jin bb5dcea98e Add path to WritableFileWriter. (#4039)
Summary:
We want to sample the file I/O issued by RocksDB and report the function calls. This requires us to include the file paths otherwise it's hard to tell what has been going on.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4039

Differential Revision: D8670178

Pulled By: riversand963

fbshipit-source-id: 97ee806d1c583a2983e28e213ee764dc6ac28f7a
2018-08-23 10:12:58 -07:00
Dmitri Smirnov ab22cf349e Implement Env::NumFileLinks (#4221)
Summary:
Although delete scheduler implementation allows for the interface not to be supported, the delete_scheduler_test does not allow for that.
Address compiler warnings
Make sst_dump_test use test directory structure as the current execution directory may not be writiable.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4221

Differential Revision: D9210152

Pulled By: siying

fbshipit-source-id: 381a74511e969ecb8089d5c4b4df87dc30c8df63
2018-08-09 14:29:11 -07:00
Siying Dong a5e851e113 Reformatting some recent changes (#4161)
Summary:
Lint is not happy with some new code recently committed. Format them.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4161

Differential Revision: D8940582

Pulled By: siying

fbshipit-source-id: c9b43b1ef8c88b5e923911058b44eb77234b36b7
2018-07-20 14:43:38 -07:00
Dmitri Smirnov 78ab11cd71 Return new operator for Status allocations for Windows (#4128)
Summary: Windows requires new/delete for memory allocations to be overriden. Refactor to be less intrusive.

Differential Revision: D8878047

Pulled By: siying

fbshipit-source-id: 35f2b5fec2f88ea48c9be926539c6469060aab36
2018-07-19 15:09:06 -07:00
Yanqin Jin 524c6e6b72 Add file name info to SequentialFileReader. (#4026)
Summary:
We potentially need this information for tracing, profiling and diagnosis.
Closes https://github.com/facebook/rocksdb/pull/4026

Differential Revision: D8555214

Pulled By: riversand963

fbshipit-source-id: 4263e06c00b6d5410b46aa46eb4e358ff2161dd2
2018-06-21 08:42:24 -07:00
Dmitri Smirnov 61d69d450d Hide jemalloc aligned allocation functions into .cc (#4025)
Summary:
so they could be overriden
Closes https://github.com/facebook/rocksdb/pull/4025

Differential Revision: D8526287

Pulled By: siying

fbshipit-source-id: 9537b299dc907b4d1eeaf77a8784b13cb058280d
2018-06-19 17:12:06 -07:00
shpala 9c7da963bc Fix a crash in WinEnvIO::GetSectorSize (#3975)
Summary:
Fix a crash in `WinEnvIO::GetSectorSize` that happens on old Windows systems (e.g Windows 7).
On old Windows systems that don't support querying StorageAccessAlignmentProperty using IOCTL_STORAGE_QUERY_PROPERTY, the flow calls a different DeviceIoControl with nullptr as lpBytesReturned.
When the code reaches this point, we get an access violation.
Closes https://github.com/facebook/rocksdb/pull/3975

Differential Revision: D8385186

Pulled By: ajkr

fbshipit-source-id: fae4c9b4b0a52c8a10182e1b35bcaa30dc393bbb
2018-06-12 13:45:18 -07:00
Zhongyi Xie f1592a06c2 run make format for PR 3838 (#3954)
Summary:
PR https://github.com/facebook/rocksdb/pull/3838 made some changes that triggers lint warnings.
Run `make format` to fix formatting as suggested by siying .
Also piggyback two changes:
1) fix singleton destruction order for windows and posix env
2) fix two clang warnings
Closes https://github.com/facebook/rocksdb/pull/3954

Differential Revision: D8272041

Pulled By: miasantreble

fbshipit-source-id: 7c4fd12bd17aac13534520de0c733328aa3c6c9f
2018-06-05 12:58:02 -07:00
Dmitri Smirnov f4b72d7056 Provide a way to override windows memory allocator with jemalloc for ZSTD
Summary:
Windows does not have LD_PRELOAD mechanism to override all memory allocation functions and ZSTD makes use of C-tuntime calloc. During flushes and compactions default system allocator fragments and the system slows down considerably.

For builds with jemalloc we employ an advanced ZSTD context creation API that re-directs memory allocation to jemalloc. To reduce the cost of context creation on each block we cache ZSTD context within the block based table builder while a new SST file is being built, this will help all platform builds including those w/o jemalloc. This avoids system allocator fragmentation and improves the performance.

The change does not address random reads and currently on Windows reads with ZSTD regress as compared with SNAPPY compression.
Closes https://github.com/facebook/rocksdb/pull/3838

Differential Revision: D8229794

Pulled By: miasantreble

fbshipit-source-id: 719b622ab7bf4109819bc44f45ec66f0dd3ee80d
2018-06-04 12:12:48 -07:00
Dmitri Smirnov 3db8504cde Catchup with posix features
Summary:
Catch up with Posix features
  NewWritableRWFile must fail when file does not exists
  Implement Env::Truncate()
  Adjust Env options optimization functions
  Implement MemoryMappedBuffer on Windows.
Closes https://github.com/facebook/rocksdb/pull/3857

Differential Revision: D8053610

Pulled By: ajkr

fbshipit-source-id: ccd0d46c29648a9f6f496873bc1c9d6c5547487e
2018-05-24 15:13:04 -07:00
Kefu Chai c465509379 port_posix: use posix_memalign() for aligned_alloc
Summary:
to workaround issue of http://tracker.ceph.com/issues/21422 .
and in tcmalloc aligned_alloc and posix_memalign() are basically the
same thing. the same applies to GNU glibc.

fixes #3175

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
Closes https://github.com/facebook/rocksdb/pull/3862

Differential Revision: D8147930

Pulled By: yiwu-arbug

fbshipit-source-id: 355afe93c4dd0a96a0d711ef190e8b86fbe8d11d
2018-05-24 12:13:16 -07:00
Dmitri Smirnov 934f96de27 Better destroydb
Summary:
Delete archive directory before WAL folder
  since archive may be contained as a subfolder.
  Also improve loop readability.
Closes https://github.com/facebook/rocksdb/pull/3797

Differential Revision: D7866378

Pulled By: riversand963

fbshipit-source-id: 0c45d97677ce6fbefa3f8d602ef5e2a2a925e6f5
2018-05-03 16:13:09 -07:00
Dmitri Smirnov acb61b7a52 Adjust pread/pwrite to return Status
Summary:
Returning bytes_read causes the caller to call GetLastError()
  to report failure but the lasterror may be overwritten by then
  so we lose the error code.
  Fix up CMake file to include xpress source code only when needed.
  Fix warning for the uninitialized var.
Closes https://github.com/facebook/rocksdb/pull/3795

Differential Revision: D7832935

Pulled By: anand1976

fbshipit-source-id: 4be21affb9b85d361b96244f4ef459f492b7cb2b
2018-05-01 13:42:46 -07:00
Dmitri Smirnov 61785c73ed CloseHandle docs says that the return is non-zero, does not say TRUE(1)
Summary:
say it is TRUE(1). Add assert.
Closes https://github.com/facebook/rocksdb/pull/3630

Differential Revision: D7346895

Pulled By: ajkr

fbshipit-source-id: a46075aa4dd89f32520230606adecccecc874cdf
2018-03-20 18:43:02 -07:00
Bruce Mitchener 0de710f5b8 Use nullptr instead of NULL / 0 more consistently.
Summary: Closes https://github.com/facebook/rocksdb/pull/3569

Differential Revision: D7170968

Pulled By: yiwu-arbug

fbshipit-source-id: 308a6b7dd358a04fd9a7de3d927bfd8abd57d348
2018-03-07 12:42:12 -08:00
Dmitri Smirnov c364eb42b5 Windows cumulative patch
Summary:
This patch addressed several issues.
  Portability including db_test std::thread -> port::Thread Cc: @
  and %z to ROCKSDB portable macro. Cc: maysamyabandeh

  Implement Env::AreFilesSame

  Make the implementation of file unique number more robust

  Get rid of C-runtime and go directly to Windows API when dealing
  with file primitives.

  Implement GetSectorSize() and aling unbuffered read on the value if
  available.

  Adjust Windows Logger for the new interface, implement CloseImpl() Cc: anand1976

  Fix test running script issue where $status var was of incorrect scope
  so the failures were swallowed and not reported.

  DestroyDB() creates a logger and opens a LOG file in the directory
  being cleaned up. This holds a lock on the folder and the cleanup is
  prevented. This fails one of the checkpoin tests. We observe the same in production.
  We close the log file in this change.

 Fix DBTest2.ReadAmpBitmapLiveInCacheAfterDBClose failure where the test
 attempts to open a directory with NewRandomAccessFile which does not
 work on Windows.
  Fix DBTest.SoftLimit as it is dependent on thread timing. CC: yiwu-arbug
Closes https://github.com/facebook/rocksdb/pull/3552

Differential Revision: D7156304

Pulled By: siying

fbshipit-source-id: 43db0a757f1dfceffeb2b7988043156639173f5b
2018-03-06 11:57:43 -08:00
Andrew Kryczka 5d68243e61 Comment out unused variables
Summary:
Submitting on behalf of another employee.
Closes https://github.com/facebook/rocksdb/pull/3557

Differential Revision: D7146025

Pulled By: ajkr

fbshipit-source-id: 495ca5db5beec3789e671e26f78170957704e77e
2018-03-05 13:13:41 -08:00
Dmitri Smirnov 7eb292da14 Fix a memory leak in WindowsThread
Summary:
_endthreadex does not return and thus objects
  for stack destructors do not run. This creates a memory leak.
  We remove the calls since _enthreadex called automatically after the
  threadproc returns i.e. thread exits.
Closes https://github.com/facebook/rocksdb/pull/3542

Differential Revision: D7088713

Pulled By: ajkr

fbshipit-source-id: 749ecafc6a9572f587f76e516547e07734349a54
2018-02-26 13:46:12 -08:00
Igor Sugak aba3409740 Back out "[codemod] - comment out unused parameters"
Reviewed By: igorsugak

fbshipit-source-id: 4a93675cc1931089ddd574cacdb15d228b1e5f37
2018-02-22 12:43:17 -08:00
David Lai f4a030ce81 - comment out unused parameters
Reviewed By: everiq, igorsugak

Differential Revision: D7046710

fbshipit-source-id: 8e10b1f1e2aecebbfb229c742e214db887e5a461
2018-02-22 09:44:23 -08:00
Siying Dong ef29d2a234 Explictly fail writes if key or value is not smaller than 4GB
Summary:
Right now, users will encounter unexpected bahavior if they use key or value larger than 4GB. We should explicitly fail the queriers.
Closes https://github.com/facebook/rocksdb/pull/3484

Differential Revision: D6953895

Pulled By: siying

fbshipit-source-id: b60491e1af064fc5d52971956661f6c18ceac24f
2018-02-09 14:57:54 -08:00
Tamir Duberstein cd5092e168 Suppress unused warnings
Summary:
- Use `__unused__` everywhere
- Suppress unused warnings in Release mode
    + This currently affects non-MSVC builds (e.g. mingw64).
Closes https://github.com/facebook/rocksdb/pull/3448

Differential Revision: D6885496

Pulled By: miasantreble

fbshipit-source-id: f2f6adacec940cc3851a9eee328fafbf61aad211
2018-02-02 12:27:07 -08:00
Adam Retter a53c571d2d FreeBSD build support for RocksDB and RocksJava
Summary:
Tested on a clean FreeBSD 11.01 x64.

Closes https://github.com/facebook/rocksdb/pull/1423
Closes https://github.com/facebook/rocksdb/pull/3357

Differential Revision: D6705868

Pulled By: sagar0

fbshipit-source-id: cbccbbdafd4f42922512ca03619a5d5583a425fd
2018-01-11 13:29:55 -08:00
Siying Dong ccc095a016 Speed up BlockTest.BlockReadAmpBitmap
Summary:
BlockTest.BlockReadAmpBitmap is too slow and times out in some environments. Speed it up by:
(1) improve the way the verification is done. With this it is 5 times faster
(2) run fewer tests for large blocks. This cut it down by another 10 times.
Now it can finish in similar time as other tests.
Closes https://github.com/facebook/rocksdb/pull/3313

Differential Revision: D6643711

Pulled By: siying

fbshipit-source-id: c2397d666eab5421a78ca87e1e45491e0f832a6d
2018-01-02 10:41:28 -08:00
burtonli b5c99cc908 Disable onboard cache for compaction output
Summary:
FILE_FLAG_WRITE_THROUGH is for disabling device on-board cache in windows API, which should be disabled if user doesn't need system cache.
There was a perf issue related with this, we found during memtable flush, the high percentile latency jumps significantly. During profiling, we found those high latency (P99.9) read requests got queue-jumped by write requests from memtable flush and takes 80ms or even more time to wait, even when SSD overall IO throughput is relatively low.

After enabling FILE_FLAG_WRITE_THROUGH, we rerun the test found high percentile latency drops a lot without observable impact on writes.

Scenario 1: 40MB/s + 40MB/s  R/W compaction throughput

 Original | FILE_FLAG_WRITE_THROUGH | Percentage reduction
---------------------------------------------------------------
P99.9 | 56.897 ms | 35.593 ms | -37.4%
P99 | 3.905 ms | 3.896 ms | -2.8%

Scenario 2:  14MB/s + 14MB/s R/W compaction throughput, cohosted with 100+ other rocksdb instances have manually triggered memtable flush operations (memtable is tiny), creating a lot of randomized the small file writes operations during test.

Original | FILE_FLAG_WRITE_THROUGH | Percentage reduction
---------------------------------------------------------------
P99.9 | 86.227   ms | 50.436 ms | -41.5%
P99 | 8.415   ms | 3.356 ms | -60.1%
Closes https://github.com/facebook/rocksdb/pull/3225

Differential Revision: D6624174

Pulled By: miasantreble

fbshipit-source-id: 321b86aee9d74470840c70e5d0d4fa9880660a91
2017-12-21 18:41:34 -08:00
Dmitri Smirnov fe608e32ab Fix a race condition in WindowsThread (port::Thread)
Summary:
Fix a race condition when we create a thread and immediately destroy
 This case should be supported.
  What happens is that the thread function needs the Data instance
  to actually run but has no shared ownership and must rely on the
  WindowsThread instance to continue existing.
  To address this we change unique_ptr to shared_ptr and then
  acquire an additional refcount for the threadproc which destroys it
  just before the thread exit.
  We choose to allocate shared_ptr instance on the heap as this allows
  the original thread to continue w/o waiting for the new thread to start
  running.
Closes https://github.com/facebook/rocksdb/pull/3240

Differential Revision: D6511324

Pulled By: yiwu-arbug

fbshipit-source-id: 4633ff7996daf4d287a9fe34f60c1dd28cf4ff36
2017-12-07 13:42:53 -08:00
Shaohua Li 33c7d4ccd9 Make writable_file_max_buffer_size dynamic
Summary:
The DBOptions::writable_file_max_buffer_size can be changed dynamically.
Closes https://github.com/facebook/rocksdb/pull/3053

Differential Revision: D6152720

Pulled By: shligit

fbshipit-source-id: aa0c0cfcfae6a54eb17faadb148d904797c68681
2017-10-31 13:56:35 -07:00
Dmitri Smirnov 682db81385 Enable cacheline_aligned_alloc() to allocate from jemalloc if enabled.
Summary:
Reuse WITH_JEMALLOC option in preparation for module search unification.
  Move jemalloc overrides into a separate .cc
  Remote obsolete JEMALLOC_NOINIT option.
Closes https://github.com/facebook/rocksdb/pull/3078

Differential Revision: D6174826

Pulled By: yiwu-arbug

fbshipit-source-id: 9970a0289b4490272d15853920d9d7531af91140
2017-10-27 13:27:12 -07:00
Dmitri Smirnov d2a65c59e1 Fix unused var warnings in Release mode
Summary:
MSVC does not support unused attribute at this time. A separate assignment line fixes the issue probably by being counted as usage for MSVC and it no longer complains about unused var.
Closes https://github.com/facebook/rocksdb/pull/3048

Differential Revision: D6126272

Pulled By: maysamyabandeh

fbshipit-source-id: 4907865db45fd75a39a15725c0695aaa17509c1f
2017-10-23 14:27:04 -07:00