rocksdb

Commit Graph

Author	SHA1	Message	Date
Levi Tamasi	8dd4bf6cff	Separate the handling of value types in SaveValue (#10840 ) Summary: Currently, the code in `SaveValue` that handles `kTypeValue` and `kTypeBlobIndex` (and more recently, `kTypeWideColumnEntity`) is mostly shared. This made sense originally; however, by now the handling of these three value types has diverged significantly. The patch makes the logic cleaner and also eliminates quite a bit of branching by giving each value type its own `case` and removing a fall-through. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10840 Test Plan: `make check` Reviewed By: riversand963 Differential Revision: D40568420 Pulled By: ltamasi fbshipit-source-id: 2e614606afd1c3d9c76d9b5f1efa0959fc174103	2022-10-21 10:05:46 -07:00
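The shape of this cleanup can be illustrated with a small standalone sketch. It is not the RocksDB `SaveValue` code: the enumerator names are taken from the summary above, while `DescribeEntry` and the strings it returns are hypothetical and exist only to show each value type getting its own `case` with no fall-through.

```cpp
#include <string>

// Names mirror the value types mentioned in the commit summary; the
// underlying numeric values are arbitrary in this sketch.
enum class ValueType { kTypeValue, kTypeBlobIndex, kTypeWideColumnEntity };

// Hypothetical handler: every value type has its own case and returns
// directly, so no branch depends on falling through into shared code.
std::string DescribeEntry(ValueType type) {
  switch (type) {
    case ValueType::kTypeValue:
      return "plain value";
    case ValueType::kTypeBlobIndex:
      return "blob reference";
    case ValueType::kTypeWideColumnEntity:
      return "wide-column entity";
  }
  return "unknown";  // only reached for values outside the enum
}
```

Calling `DescribeEntry(ValueType::kTypeBlobIndex)` takes exactly one branch, without the per-type `if` checks that shared code would need.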
dependabot[bot]	2564215e35	Bump nokogiri from 1.13.6 to 1.13.9 in /docs (#10842 ) Summary: Bumps [nokogiri](https://github.com/sparklemotion/nokogiri) from 1.13.6 to 1.13.9. Per the upstream release notes, 1.13.9 updates the vendored libxml2 to v2.10.3 (addressing CVE-2022-2309, CVE-2022-40304, and CVE-2022-40303), libxslt to v1.1.37, and zlib to 1.2.13 (addressing CVE-2022-37434, which did not affect Nokogiri), and fixes possible segfaults in `Nokogiri::XML::Namespace` objects under GC compaction and in `Document#remove_namespaces!`. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10842 Reviewed By: siying Differential Revision: D40579643 Pulled By: ajkr fbshipit-source-id: 45035f691035cdbb111dc0b36489c4e91fe31cae	2022-10-20 22:13:41 -07:00
Jay Zhuang	1663f77d2a	Fix no internal time recorded for small preclude_last_level (#10829 ) Summary: When the `preclude_last_level_data_seconds` or `preserve_internal_time_seconds` is smaller than 100 (seconds), no seqno->time information was recorded. Also make sure all data will be compacted to the last level even if there's no write to record the time information. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10829 Test Plan: added unittest Reviewed By: siying Differential Revision: D40443934 Pulled By: jay-zhuang fbshipit-source-id: 2ecf1361daf9f3e5c3385aee6dc924fa59e2813a	2022-10-20 17:11:38 -07:00
Levi Tamasi	865d5576ad	Support providing the default column separately when serializing columns (#10839 ) Summary: The patch makes it possible to provide the value of the default column separately when calling `WideColumnSerialization::Serialize`. This eliminates the need to construct a new `WideColumns` vector in certain cases (for example, it will come in handy when implementing `Merge`). Pull Request resolved: https://github.com/facebook/rocksdb/pull/10839 Test Plan: `make check` Reviewed By: riversand963 Differential Revision: D40561448 Pulled By: ltamasi fbshipit-source-id: 69becdd510e6a83ab1feb956c12772110e1040d6	2022-10-20 16:00:58 -07:00
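A rough sketch of the idea, under stated assumptions: the `name=value;` encoding, `AppendColumn`, and `SerializeWithDefault` below are all hypothetical and do not reflect the real `WideColumnSerialization::Serialize` signature or on-disk format. The default column is assumed to be the one with the empty name. The point is only that the caller passes the default value on its own instead of building a merged vector first.

```cpp
#include <string>
#include <string_view>
#include <utility>
#include <vector>

using Column = std::pair<std::string_view, std::string_view>;  // (name, value)

// Hypothetical toy encoding: "name=value;" per column.
void AppendColumn(std::string& out, std::string_view name,
                  std::string_view value) {
  out.append(name);
  out.push_back('=');
  out.append(value);
  out.push_back(';');
}

// The default column (empty name in this sketch) is passed separately, so the
// caller does not have to copy it into a new vector with the other columns.
std::string SerializeWithDefault(std::string_view default_value,
                                 const std::vector<Column>& other_columns) {
  std::string out;
  AppendColumn(out, "", default_value);
  for (const auto& [name, value] : other_columns) {
    AppendColumn(out, name, value);
  }
  return out;
}
```

A caller that already holds the non-default columns can reuse that vector unchanged and supply only the new default value, which is the situation the summary describes for `Merge`.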
Andrew Kryczka	33ceea9b76	Add DB property for fast block cache stats collection (#10832 ) Summary: This new property allows users to trigger the background block cache stats collection mode through the `GetProperty()` and `GetMapProperty()` APIs. The background mode has much lower overhead at the expense of returning stale values in more cases. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10832 Test Plan: updated unit test Reviewed By: pdillinger Differential Revision: D40497883 Pulled By: ajkr fbshipit-source-id: bdcc93402f426463abb2153756aad9e295447343	2022-10-20 15:04:29 -07:00
Peter Dillinger	7555243bcf	Refactor ShardedCache for more sharing, static polymorphism (#10801 ) Summary: The motivations for this change include * Free up space in ClockHandle so that we can add data for secondary cache handling while still keeping within single cache line (64 byte) size. * This change frees up space by eliminating the need for the `hash` field by making the fixed-size key itself a hash, using a 128-bit bijective (lossless) hash. * Generally more customizability of ShardedCache (such as hashing) without worrying about virtual call overheads * ShardedCache now uses static polymorphism (template) instead of dynamic polymorphism (virtual overrides) for the CacheShard. No obvious performance benefit is seen from the change (as mostly expected; most calls to virtual functions in CacheShard could already be optimized to static calls), but offers more flexibility without incurring the runtime cost of adhering to a common interface (without type parameters or static callbacks). * You'll also notice less `reinterpret_cast`ing and other boilerplate in the Cache implementations, as this can go in ShardedCache. More detail: * Don't have LRUCacheShard maintain `std::shared_ptr<SecondaryCache>` copies (extra refcount) when LRUCache can be in charge of keeping a `shared_ptr`. * Renamed `capacity_mutex_` to `config_mutex_` to better represent the scope of what it guards. * Some preparation for 64-bit hash and indexing in LRUCache, but didn't include the full change because of slight performance regression. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10801 Test Plan: Unit test updates were non-trivial because of major changes to the ClockCacheShard interface in handling of key vs. hash. 
Performance: Create with `TEST_TMPDIR=/dev/shm ./db_bench -benchmarks=fillrandom -num=30000000 -disable_wal=1 -bloom_bits=16` Test with ``` TEST_TMPDIR=/dev/shm ./db_bench -benchmarks=readrandom[-X1000] -readonly -num=30000000 -bloom_bits=16 -cache_index_and_filter_blocks=1 -cache_size=610000000 -duration 20 -threads=16 ``` Before: `readrandom [AVG 150 runs] : 321147 (± 253) ops/sec` After: `readrandom [AVG 150 runs] : 321530 (± 326) ops/sec` So possibly ~0.1% improvement. And with `-cache_type=hyper_clock_cache`: Before: `readrandom [AVG 30 runs] : 614126 (± 7978) ops/sec` After: `readrandom [AVG 30 runs] : 645349 (± 8087) ops/sec` So roughly 5% improvement! Reviewed By: anand1976 Differential Revision: D40252236 Pulled By: pdillinger fbshipit-source-id: ff8fc70ef569585edc95bcbaaa0386f61355ae5b	2022-10-18 22:06:57 -07:00
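To make the static-versus-dynamic polymorphism point concrete, here is a minimal sketch, not the actual `ShardedCache` code. `MapShard`, `ShardedCacheSketch`, and the `Insert`/`Lookup` signatures are hypothetical. The shard type is a template parameter, so calls into it are resolved at compile time and no virtual base class is needed.

```cpp
#include <array>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical shard type. Any class with Insert/Lookup of this shape works;
// it does not inherit from a common base.
class MapShard {
 public:
  void Insert(const std::string& key, int value) { map_[key] = value; }
  const int* Lookup(const std::string& key) const {
    auto it = map_.find(key);
    return it == map_.end() ? nullptr : &it->second;
  }

 private:
  std::unordered_map<std::string, int> map_;
};

// The shard type is chosen at compile time, so the calls below are ordinary
// (inlinable) member calls rather than virtual dispatch.
template <class Shard, std::size_t kNumShards = 4>
class ShardedCacheSketch {
 public:
  void Insert(const std::string& key, int value) {
    GetShard(key).Insert(key, value);
  }
  const int* Lookup(const std::string& key) const {
    return GetShard(key).Lookup(key);
  }

 private:
  // Hashing lives in the shared wrapper, so it can be customized in one place.
  static std::size_t ShardIndex(const std::string& key) {
    return std::hash<std::string>{}(key) % kNumShards;
  }
  Shard& GetShard(const std::string& key) { return shards_[ShardIndex(key)]; }
  const Shard& GetShard(const std::string& key) const {
    return shards_[ShardIndex(key)];
  }

  std::array<Shard, kNumShards> shards_;
};
```

Swapping in a different shard implementation means instantiating `ShardedCacheSketch<OtherShard>`; the wrapper code is shared while each instantiation stays free of virtual-call overhead.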
Yueh-Hsuan Chiang	e267909ecf	Enable a multi-level db to smoothly migrate to FIFO via DB::Open (#10348 ) Summary: FIFO compaction can theoretically open a DB with any compaction style. However, the current code only allows FIFO compaction to open a DB with a single level. This PR relaxes the limitation of FIFO compaction and allows it to open a DB with multiple levels. Below is the read / write / compaction behavior: * The read behavior is untouched, and it works like a regular rocksdb instance. * The write behavior is untouched as well. When a FIFO compacted DB is opened with multiple levels, all new files will still be in level 0, and no files will be moved to a different level. * Compaction logic is extended. It will first identify the bottom-most non-empty level. Then, it will delete the oldest file in that level. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10348 Test Plan: Added a new test to verify the migration from level to FIFO where the db has multiple levels. Extended existing test cases in db_test and db_basic_test to also verify all entries of a key after reopening the DB with FIFO compaction. Reviewed By: jay-zhuang Differential Revision: D40233744 fbshipit-source-id: 6cc011d6c3467e6bfb9b6a4054b87619e69815e1	2022-10-18 14:38:13 -07:00
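The extended compaction logic (find the bottom-most non-empty level, then delete the oldest file there) can be sketched as below. This is an illustration only: `FileMeta`, `Pick`, and `PickFileToDelete` are hypothetical names, and using a `creation_time` field to decide which file is "oldest" is an assumption of the sketch, not something the summary specifies.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

struct FileMeta {
  uint64_t file_number;
  uint64_t creation_time;  // assumption: a smaller value means an older file
};

struct Pick {
  int level;
  uint64_t file_number;
};

// levels[0] is level 0. Scan upward from the bottom for the first non-empty
// level and choose the oldest file in it; return nullopt if the DB is empty.
std::optional<Pick> PickFileToDelete(
    const std::vector<std::vector<FileMeta>>& levels) {
  for (int level = static_cast<int>(levels.size()) - 1; level >= 0; --level) {
    const std::vector<FileMeta>& files = levels[level];
    if (files.empty()) {
      continue;
    }
    const FileMeta* oldest = &files.front();
    for (const FileMeta& f : files) {
      if (f.creation_time < oldest->creation_time) {
        oldest = &f;
      }
    }
    return Pick{level, oldest->file_number};
  }
  return std::nullopt;
}
```

Because new files keep landing in level 0, repeated picks drain the lower levels first, so in this sketch a migrated DB ends up holding data only in level 0.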
Peter Dillinger	e466173d5c	Print stack traces on frozen tests in CI (#10828 ) Summary: Instead of existing calls to ps from gnu_parallel, call a new wrapper that does ps, looks for unit test like processes, and uses pstack or gdb to print thread stack traces. Also, using `ps -wwf` instead of `ps -wf` ensures output is not cut off. For security, CircleCI runs with security restrictions on ptrace (/proc/sys/kernel/yama/ptrace_scope = 1), and this change adds a work-around to `InstallStackTraceHandler()` (only used by testing tools) to allow any process from the same user to debug it. (I've also touched >100 files to ensure all the unit tests call this function.) Pull Request resolved: https://github.com/facebook/rocksdb/pull/10828 Test Plan: local manual + temporary infinite loop in a unit test to observe in CircleCI Reviewed By: hx235 Differential Revision: D40447634 Pulled By: pdillinger fbshipit-source-id: 718a4c4a5b54fa0f9af2d01a446162b45e5e84e1	2022-10-18 00:35:35 -07:00
Peter Dillinger	8367f0d2d7	Improve / refactor anonymous mmap capabilities (#10810 ) Summary: The motivation for this change is a planned feature (related to HyperClockCache) that will depend on a large array that can essentially grow automatically, up to some bound, without the pointer address changing and with guaranteed zero-initialization of the data. Anonymous mmaps provide such functionality, and this change provides an internal API for that. The other existing use of anonymous mmap in RocksDB is for allocating in huge pages. That code and other related Arena code used some awkward non-RAII and pre-C++11 idioms, so I cleaned up much of that as well, with RAII, move semantics, constexpr, etc. More specifics: * Minimize conditional compilation * Add Windows support for anonymous mmaps * Use std::deque instead of std::vector for more efficient bag Pull Request resolved: https://github.com/facebook/rocksdb/pull/10810 Test Plan: unit test added for new functionality Reviewed By: riversand963 Differential Revision: D40347204 Pulled By: pdillinger fbshipit-source-id: ca83fcc47e50fabf7595069380edd2954f4f879c	2022-10-17 17:10:16 -07:00
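For readers unfamiliar with anonymous mmaps, a minimal POSIX-only sketch of an RAII wrapper follows. `AnonMapping` is a hypothetical name and not the internal API this commit adds; the sketch also skips the Windows and huge-page paths mentioned above. It relies on two properties of `mmap` with `MAP_ANONYMOUS`: the returned address stays fixed for the lifetime of the mapping, and the memory reads as zero until written.

```cpp
#include <sys/mman.h>

#include <cstddef>
#include <utility>

// Hypothetical RAII wrapper over a POSIX anonymous mapping. The whole range
// is reserved up front; physical pages are typically committed only when
// first touched, which is what lets a large array "grow" in place.
class AnonMapping {
 public:
  explicit AnonMapping(std::size_t length) : length_(length) {
    void* p = mmap(nullptr, length, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    addr_ = (p == MAP_FAILED) ? nullptr : p;
  }

  // Move-only: ownership of the mapping transfers, the source becomes empty.
  AnonMapping(AnonMapping&& other) noexcept
      : addr_(other.addr_), length_(other.length_) {
    other.addr_ = nullptr;
    other.length_ = 0;
  }
  AnonMapping(const AnonMapping&) = delete;
  AnonMapping& operator=(const AnonMapping&) = delete;
  AnonMapping& operator=(AnonMapping&&) = delete;

  ~AnonMapping() {
    if (addr_ != nullptr) {
      munmap(addr_, length_);
    }
  }

  // Returns nullptr if the mapping failed or was moved from.
  unsigned char* data() const { return static_cast<unsigned char*>(addr_); }
  std::size_t size() const { return length_; }

 private:
  void* addr_ = nullptr;
  std::size_t length_ = 0;
};
```

A caller would construct `AnonMapping m(upper_bound_bytes)` once and index into `m.data()` as the array fills up; no reallocation or explicit zeroing is needed.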
Levi Tamasi	11c0d1310e	Do not adjust test_batches_snapshots to avoid mixing runs (#10830 ) Summary: This is a small follow-up to https://github.com/facebook/rocksdb/pull/10821. The goal of that PR was to hold `test_batches_snapshots` fixed across all `db_stress` invocations; however, that patch didn't address the case when `test_batches_snapshots` is unset due to a conflicting `enable_compaction_filter` or `prefix_size` setting. This PR updates the logic so the other parameter is sanitized instead in the case of such conflicts. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10830 Reviewed By: riversand963 Differential Revision: D40444548 Pulled By: ltamasi fbshipit-source-id: 0331265704904b729262adec37139292fcbb7805	2022-10-17 14:32:59 -07:00
Peter Dillinger	8142223b1b	Git ignore .clangd/ (#10817 ) Summary: Used for IDE integration Pull Request resolved: https://github.com/facebook/rocksdb/pull/10817 Test Plan: CI Reviewed By: riversand963 Differential Revision: D40348563 Pulled By: pdillinger fbshipit-source-id: ae2151017de7df6afc55363276105a7dac53683c	2022-10-17 08:33:58 -07:00
Jay Zhuang	8124bc3526	Enable preclude_last_level_data_seconds in stress test (#10824 ) Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/10824 Reviewed By: siying Differential Revision: D40390535 Pulled By: jay-zhuang fbshipit-source-id: 700803a1aff8a1e77c038740d87931577e79bcf6	2022-10-16 09:28:43 -07:00
Levi Tamasi	2f3042d732	Check wide columns in TestIterateAgainstExpected (#10820 ) Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/10820 Reviewed By: riversand963 Differential Revision: D40363653 Pulled By: ltamasi fbshipit-source-id: d347547d8cdd3f8926b35b6af4d1fa0f827e4a10	2022-10-14 14:25:05 -07:00
Levi Tamasi	3cd78bce1e	Temporarily disable mixing batched and non-batched runs (#10821 ) Summary: We have recently made some stress test improvements that rely on decoding the "value base" from the values stored in the database. This logic does not currently support the case when some KVs are written by a non-batched ops run and some by a batched ops run. The patch temporarily disables mixing these two. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10821 Reviewed By: riversand963 Differential Revision: D40367326 Pulled By: ltamasi fbshipit-source-id: 66f2e0cbc097ab6b1f9e4b39b833bd466f1aaab5	2022-10-13 18:00:30 -07:00
Levi Tamasi	eae3a686ee	Check wide columns in TestIterate (#10818 ) Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/10818 Test Plan: Tested using some simple blackbox crash test runs in the various modes (non-batched, batched, CF consistency). Reviewed By: riversand963 Differential Revision: D40349527 Pulled By: ltamasi fbshipit-source-id: 2918bc26adbbeac314beaa958aafe770b01e5cc6	2022-10-13 12:06:36 -07:00
Peter Dillinger	1ee747d795	Deflake^2 DBBloomFilterTest.OptimizeFiltersForHits (#10816 ) Summary: This reverts https://github.com/facebook/rocksdb/issues/10792 and uses a different strategy to stabilize the test: remove the unnecessary randomness by providing a constant seed for shuffling keys. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10816 Test Plan: `gtest-parallel ./db_bloom_filter_test -r1000 --gtest_filter=ForHits` Reviewed By: jay-zhuang Differential Revision: D40347957 Pulled By: pdillinger fbshipit-source-id: a270e157485cbd94ed03b80cdd21b954ebd57d57	2022-10-13 09:08:09 -07:00
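The stabilization strategy (a constant seed for shuffling keys) looks roughly like the sketch below. `ShuffledKeys` is a hypothetical helper, not code from the test, and the use of `std::mt19937` with `std::shuffle` is an assumption about how one might do this with the standard library.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <random>
#include <vector>

// Returns the keys 0..n-1 in a shuffled order that depends only on `seed`.
// With a fixed seed, every run of the same build sees the same order, which
// removes one source of run-to-run variation from a test.
std::vector<int> ShuffledKeys(std::size_t n, uint32_t seed) {
  std::vector<int> keys(n);
  std::iota(keys.begin(), keys.end(), 0);
  std::mt19937 rng(seed);
  std::shuffle(keys.begin(), keys.end(), rng);
  return keys;
}
```

Note that the exact permutation produced by `std::shuffle` may differ between standard library implementations; the guarantee here is repeatability within one build, not a specific order.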
Peter Dillinger	a2eea18fc9	Fix file modes (#10815 ) Summary: *.sh files need execute permission. Benchmark-linux failing in CircleCI due to https://github.com/facebook/rocksdb/issues/10803 Pull Request resolved: https://github.com/facebook/rocksdb/pull/10815 Test Plan: CI Reviewed By: ltamasi Differential Revision: D40346922 Pulled By: pdillinger fbshipit-source-id: 658f185b5d2e906ee50e1de1b12f27fa9968ba5d	2022-10-13 09:00:37 -07:00
Mark Callaghan	6ff0c204cb	Several small improvements (#10803 ) Summary: This has several small improvements. benchmark.sh * add BYTES_PER_SYNC as an env variable * use --prepopulate_block_cache when O_DIRECT is used * use --undefok to list options that don't work for all 7.x releases * print "failure" in report.tsv when a benchmark fails * parse the slightly different throughput line used by db_bench for multireadrandom * remove the trailing comma for BlobDB size before printing it in report.tsv * use the last line of the output from /bin/time as there can be more than one line when db_bench has a non-zero exit * fix more bash lint warnings * add ",stats" to the --benchmark=... lines to get stats at the end of each benchmark benchmark_compare.sh * run revrange immediately after fillseq to let compaction debt get removed * add --multiread_batched when --benchmarks=multireadrandom is used * use --benchmarks=overwriteandwait when supported to get a more accurate measure of write-amp Pull Request resolved: https://github.com/facebook/rocksdb/pull/10803 Test Plan: Run it for leveled, universal and BlobDB Reviewed By: jay-zhuang Differential Revision: D40278315 Pulled By: mdcallag fbshipit-source-id: 793134ddc7d48d05a07436cd8942c375a23983a7	2022-10-12 15:13:28 -07:00
Levi Tamasi	23b7dc2f4f	Check columns in CfConsistencyStressTest::VerifyDb (#10804 ) Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/10804 Reviewed By: riversand963 Differential Revision: D40279057 Pulled By: ltamasi fbshipit-source-id: 9efc3dae7f5eaab162d55a41c58c2535b0a53054	2022-10-12 11:43:34 -07:00
Levi Tamasi	85399b14f7	Consider wide columns when checksumming in the stress tests (#10788 ) Summary: There are two places in the stress test code where we compute the CRC for a range of KVs for the purposes of checking consistency, namely in the CF consistency test (to make sure CFs contain the same data), and when performing `CompactRange` (to make sure the pre- and post-compaction states are equivalent). The patch extends the logic so that wide columns are also considered in both cases. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10788 Test Plan: Tested using some simple blackbox crash test runs. Reviewed By: riversand963 Differential Revision: D40191134 Pulled By: ltamasi fbshipit-source-id: 542c21cac9077c6d225780deb210319bb5eee955	2022-10-11 14:40:25 -07:00
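Why the columns need to be part of the checksum can be shown with a toy example. Everything here is hypothetical: `Entry`, `Mix`, and `ChecksumRange` are illustrative names, and an FNV-1a style hash stands in for the CRC used by the stress test to keep the sketch short.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

struct Entry {
  std::string key;
  std::string value;  // value of the default column
  std::vector<std::pair<std::string, std::string>> columns;  // other columns
};

// FNV-1a style mixing over the bytes of `s`, followed by its length so that
// adjacent fields cannot blur into each other. Stand-in for a real CRC.
uint64_t Mix(uint64_t h, const std::string& s) {
  constexpr uint64_t kPrime = 1099511628211ULL;
  for (unsigned char c : s) {
    h = (h ^ c) * kPrime;
  }
  return (h ^ static_cast<uint64_t>(s.size())) * kPrime;
}

// Checksums a range of entries; wide columns are covered only on request.
uint64_t ChecksumRange(const std::vector<Entry>& entries,
                       bool include_columns) {
  uint64_t h = 14695981039346656037ULL;  // FNV-1a 64-bit offset basis
  for (const Entry& e : entries) {
    h = Mix(h, e.key);
    h = Mix(h, e.value);
    if (include_columns) {
      for (const auto& [name, val] : e.columns) {
        h = Mix(h, name);
        h = Mix(h, val);
      }
    }
  }
  return h;
}
```

Two ranges that differ only in a non-default column produce the same checksum when `include_columns` is false and different ones when it is true, which is the gap this commit closes for the CF consistency and `CompactRange` checks.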
Jay Zhuang	5a5f21c489	Allow the last level data moving up to penultimate level (#10782 ) Summary: Lock the penultimate level for the whole compaction inputs range, so any key in that compaction is safe to move up from the last level to penultimate level. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10782 Reviewed By: siying Differential Revision: D40231540 Pulled By: siying fbshipit-source-id: ca115cc8b4018b35d797329fa85a19b06cc8c13e	2022-10-10 22:50:34 -07:00
Peter Dillinger	2d0380adbe	Allow manifest fix-up without requiring prior state (#10796 ) Summary: This change is motivated by ensuring that `ldb update_manifest` or `UpdateManifestForFilesState` can run without expecting files to open when the old temperature is provided (in case the FileSystem strictly interprets non-kUnknown), but ended up fixing a problem in `OfflineManifestWriter` (used by `ldb unsafe_remove_sst_file`) where it would open some SST files during recovery and expect them to match the prior manifest state, even if not required by the intended new state. Also update BackupEngine to retry with Temperature kUnknown when reading file with potentially "wrong" temperature. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10796 Test Plan: tests added/updated, that fail before the change(s) and now pass Reviewed By: jay-zhuang Differential Revision: D40232645 Pulled By: jay-zhuang fbshipit-source-id: b5aa2688aecfe0c320b80a7da689b315414c20be	2022-10-10 17:59:17 -07:00
Hui Xiao	f6a0065d54	Allow Flush(sync=true) not supported in DB::Open() and db_stress (#10784 ) Summary: Context: https://github.com/facebook/rocksdb/pull/10698 made `Flush(sync=true)` required for `DB::Open()` (to pass the original but now deleted assertion `impl->TEST_WALBufferIsEmpty()` under `manual_wal_flush=true`; see the https://github.com/facebook/rocksdb/pull/10698 summary for more) as well as for db_stress to pass. However, RocksDB users may not implement SyncWAL() (used in Flush(sync=true)). Therefore, this PR replaces those calls in DB::Open and db_stress, in line with https://github.com/facebook/rocksdb/blob/main/db/db_impl/db_impl_open.cc#L1883-L1887 and https://github.com/facebook/rocksdb/blob/main/db_stress_tool/db_stress_test_base.cc#L847-L849 Pull Request resolved: https://github.com/facebook/rocksdb/pull/10784 Test Plan: make check Reviewed By: anand1976 Differential Revision: D40193354 Pulled By: anand1976 fbshipit-source-id: e80d53880799ae01bdd717641d07997d3bfe2b54	2022-10-10 15:52:10 -07:00
akankshamahajan	ebf8c454fd	Provide support for async_io with tailing iterators (#10781 ) Summary: Provide support for async_io if ReadOptions.tailing is set to true. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10781 Test Plan: - Update unit tests - Ran db_bench: ./db_bench --benchmarks="readrandom" --use_existing_db --use_tailing_iterator=1 --async_io=1 Reviewed By: anand1976 Differential Revision: D40128882 Pulled By: anand1976 fbshipit-source-id: 55e17855536871a5c47e2de92d238ae005c32d01	2022-10-10 15:48:48 -07:00
Levi Tamasi	5182bf3f83	Skip column validation for non-value types when iter_start_ts is set (#10799 ) Summary: When the `iter_start_ts` read option is set, iterator exposes internal keys. This also includes tombstones, which by definition do not have a value (or columns). The patch makes sure we skip the wide-column consistency check in this case. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10799 Test Plan: Tested using a simple blackbox crash test with timestamps enabled. Reviewed By: jay-zhuang, riversand963 Differential Revision: D40235628 fbshipit-source-id: 49519fb55d8fe2bb9249ced809f7a81bff2b9df2	2022-10-10 15:07:07 -07:00
Changyu Bi	a6ce1955b1	Fix flaky test ShuttingDownNotBlockStalledWrites (#10800 ) Summary: DBTest::ShuttingDownNotBlockStalledWrites is flaky, added new sync point dependency to fix it. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10800 Test Plan: gtest-parallel --repeat=1000 ./db_test --gtest_filter="*ShuttingDownNotBlockStalledWrites" Reviewed By: jay-zhuang Differential Revision: D40239116 Pulled By: jay-zhuang fbshipit-source-id: 8c2d7e7df58f202d287bd9f5c9b60b7eff270d0c	2022-10-10 13:58:55 -07:00
Jay Zhuang	62ba5c8034	Deflake DBBloomFilterTest.OptimizeFiltersForHits (#10792 ) Summary: The test may fail because the L5 files may only cover small portion of the whole key range. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10792 Test Plan: ``` gtest-parallel ./db_bloom_filter_test --gtest_filter=DBBloomFilterTest.OptimizeFiltersForHits -r 1000 -w 100 ``` Reviewed By: siying Differential Revision: D40217600 Pulled By: siying fbshipit-source-id: 18db549184bccf5e513eaa7e31ab17385b71ef71	2022-10-10 12:34:25 -07:00
anand76	fac7a31c95	Fix a few errors in async IO blog post (#10795 ) Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/10795 Reviewed By: jay-zhuang, akankshamahajan15 Differential Revision: D40229329 fbshipit-source-id: 7ec5347e0a8a52f80a0a9cc2a0c17b094736d6d9	2022-10-10 10:47:07 -07:00
Qingping Wang	a45e6878f3	fix issue 10751 (#10765 ) Summary: Fix https://github.com/facebook/rocksdb/issues/10751 where a stalled write could be blocked forever during DB shutdown. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10765 Reviewed By: ajkr Differential Revision: D40110069 Pulled By: ajkr fbshipit-source-id: 598c05777db9be85913a0a85e421b3295ecdff5e	2022-10-10 09:46:09 -07:00
Jay Zhuang	c401f285c3	Add option `preserve_internal_time_seconds` to preserve the time info (#10747 ) Summary: Add option `preserve_internal_time_seconds` to preserve the internal time information. It's mostly for the migration of the existing data to tiered storage (`preclude_last_level_data_seconds`). When the tiering feature is just enabled, the existing data won't have the time information to decide if it's hot or cold. Enabling this feature will start collecting and preserving the time information for the new data. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10747 Reviewed By: siying Differential Revision: D39910141 Pulled By: siying fbshipit-source-id: 25c21638e37b1a7c44006f636b7d714fe7242138	2022-10-07 18:49:40 -07:00
anand76	f366f90bdb	Blog post for asynchronous IO (#10789 ) Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/10789 Reviewed By: akankshamahajan15 Differential Revision: D40198988 Pulled By: akankshamahajan15 fbshipit-source-id: 5db74f12dd8854f6288fbbf8775c8e759778c307	2022-10-07 17:42:48 -07:00
Yanqin Jin	11943e8b27	Exclude timestamp when checking compaction boundaries (#10787 ) Summary: When checking if a range [start, end) overlaps with a compaction whose range is [start1, end1), always exclude timestamp from start, end, start1 and end1, otherwise some versions of one user key may be compacted to bottommost layer while others remain in the original level. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10787 Test Plan: make check Reviewed By: ltamasi Differential Revision: D40187672 Pulled By: ltamasi fbshipit-source-id: 81226267fd3e33ffa79665c62abadf2ebec45496	2022-10-07 14:11:23 -07:00
Levi Tamasi	7af47c532b	Verify wide columns during prefix scan in stress tests (#10786 ) Summary: The patch adds checks to the `{NonBatchedOps,BatchedOps,CfConsistency}StressTest::TestPrefixScan` methods to make sure the wide columns exposed by the iterators are as expected (based on the value base encoded into the iterator value). It also makes some code hygiene improvements in these methods. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10786 Test Plan: Ran some simple blackbox tests in the various modes (non-batched, batched, CF consistency). Reviewed By: riversand963 Differential Revision: D40163623 Pulled By: riversand963 fbshipit-source-id: 72f4c3b51063e48c15f974c4ec64d751d3ed0a83	2022-10-07 11:17:57 -07:00
Yanqin Jin	943247b76e	Expand stress test coverage for min_write_buffer_number_to_merge (#10785 ) Summary: As title. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10785 Test Plan: CI Reviewed By: ltamasi Differential Revision: D40162583 Pulled By: ltamasi fbshipit-source-id: 4e01f9b682f397130e286cf5d82190b7973fa3c1	2022-10-06 18:08:19 -07:00
Jay Zhuang	23fa5b7789	Use `sstableKeyCompare()` for compaction output boundary check (#10763 ) Summary: To make it consistent with the compaction picker which uses the `sstableKeyCompare()` to pick the overlap files. For example, without this change, it may cut L1 files like: ``` L1: [2-21] [22-30] L2: [1-10] [21-30] ``` Because "21" on L1 is smaller than "21" on L2. But for compaction, these 2 files are overlapped. `sstableKeyCompare()` also takes range deletes into consideration, which may cut a file for the same key. It also makes the `max_compaction_bytes` calculation more accurate for cases like above, where the overlapped bytes were underestimated. Also make sure the 2 keys won't be split into 2 files because of reaching `max_compaction_bytes`. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10763 Reviewed By: cbi42 Differential Revision: D39971904 Pulled By: cbi42 fbshipit-source-id: bcc309e9c3dc61a8f50667a6f633e6132c0154a8	2022-10-06 15:54:58 -07:00
Levi Tamasi	d6d8c007ff	Verify columns in NonBatchedOpsStressTest::VerifyDb (#10783 ) Summary: As the first step of covering the wide-column functionality of iterators in our stress tests, the patch adds verification logic to `NonBatchedOpsStressTest::VerifyDb` that checks whether the iterator's value and columns are in sync. Note: I plan to update the other types of stress tests and add similar verification for prefix scans etc. in separate PRs. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10783 Test Plan: Ran some simple blackbox crash tests. Reviewed By: riversand963 Differential Revision: D40152370 Pulled By: riversand963 fbshipit-source-id: 8f9d17d7af5da58ccf1bd2057cab53cc9645ac35	2022-10-06 15:07:16 -07:00
Peter Dillinger	b205c6d029	Fix bug in HyperClockCache ApplyToEntries; cleanup (#10768 ) Summary: We have seen some rare crash test failures in HyperClockCache, and the source could certainly be a bug fixed in this change, in ClockHandleTable::ConstApplyToEntriesRange. It wasn't properly accounting for the fact that incrementing the acquire counter could be ineffective, due to parallel updates. (When incrementing the acquire counter is ineffective, it is incorrect to then decrement it.) This change includes some other minor clean-up in HyperClockCache, and adds stats_dump_period_sec with a much lower period to the crash test. This should be the primary caller of ApplyToEntries, in collecting cache entry stats. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10768 Test Plan: haven't been able to reproduce the failure, but should be in a better state (bug fix and improved crash test) Reviewed By: anand1976 Differential Revision: D40034747 Pulled By: anand1976 fbshipit-source-id: a06fcefe146e17ee35001984445cedcf3b63eb68	2022-10-06 14:54:21 -07:00
Andrew Kryczka	f461e064ed	Address feedback on recent recovery testing blog post (#10780 ) Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/10780 Reviewed By: hx235 Differential Revision: D40120327 Pulled By: hx235 fbshipit-source-id: 08b43a11cee11743b4428dd2a9aff44270668e05	2022-10-05 15:31:04 -07:00
Yanqin Jin	4d82b94896	Sanitize min_write_buffer_number_to_merge to 1 with atomic_flush (#10773 ) Summary: With current implementation, within the same RocksDB instance, all column families with non-empty memtables will be scheduled for flush if RocksDB determines that any column family needs to be flushed, e.g. memtable full, write buffer manager, etc., if atomic flush is enabled. Not doing so can lead to data loss and inconsistency when WAL is disabled, which is a common setting when atomic flush is enabled. Therefore, setting a per-column-family knob, min_write_buffer_number_to_merge to a value greater than 1 is not compatible with atomic flush, and should be sanitized during column family creation and db open. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10773 Test Plan: Reproduce: D39993203 has detailed steps. Run the test with and without the fix. Reviewed By: cbi42 Differential Revision: D40077955 Pulled By: cbi42 fbshipit-source-id: 451a9179eb531ac42eaccf40b451b9dec4085240	2022-10-05 12:24:39 -07:00
Changyu Bi	eca47fb696	Ignore kBottommostFiles compaction logic when allow_ingest_behind (#10767 ) Summary: fix for https://github.com/facebook/rocksdb/issues/10752 where RocksDB could be in an infinite compaction loop (with compaction reason kBottommostFiles) if allow_ingest_behind is enabled and the bottommost level is unfilled. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10767 Test Plan: Added a unit test to reproduce the compaction loop. Reviewed By: ajkr Differential Revision: D40031861 Pulled By: ajkr fbshipit-source-id: 71c4b02931fbe507a847632905404c9b8fa8c96b	2022-10-05 09:27:14 -07:00
Andrew Kryczka	00d697bdc5	blog post: Verifying crash-recovery with lost buffered writes (#10775 ) Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/10775 Reviewed By: hx235 Differential Revision: D40090300 Pulled By: hx235 fbshipit-source-id: 1358f0a4a1583b49548305cfd1477e520c8985ba	2022-10-04 23:24:54 -07:00
Changyu Bi	ffde463a5f	Cleanup SuperVersion in Iterator::Refresh() (#10770 ) Summary: Fix a bug in Iterator::Refresh() where the local SV it obtained could be obsolete upon return, and should be cleaned up. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10770 Test Plan: added a unit test to reproduce the issue. Reviewed By: ajkr Differential Revision: D40063809 Pulled By: ajkr fbshipit-source-id: 619e728eb0f1ac9540b4d0ad38e43acc37a514b2	2022-10-04 22:23:24 -07:00
Yanqin Jin	edda219fc3	Manual flush with `wait=false` should not stall when writes stopped (#10001 ) Summary: When `FlushOptions::wait` is set to false, manual flush should not stall forever. If the database has already stopped writes, then the thread calling `DB::Flush()` with `FlushOptions::wait=false` should not enter the `DBImpl::write_thread_`. To prevent this, we should do a check at the beginning and return `TryAgain()` Resolves: https://github.com/facebook/rocksdb/issues/9892 Pull Request resolved: https://github.com/facebook/rocksdb/pull/10001 Reviewed By: siying Differential Revision: D36422303 Pulled By: siying fbshipit-source-id: 723bd3065e8edc4f17c82449d0d6b95a2381ac0a	2022-10-04 16:43:01 -07:00
Jay Zhuang	f007ad8b4f	RoundRobin TTL compaction (#10725 ) Summary: For RoundRobin compaction, the data should be mostly sorted per level and within level. Use normal compaction picker for RR until all expired data is compacted. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10725 Reviewed By: ajkr Differential Revision: D39771069 Pulled By: jay-zhuang fbshipit-source-id: 7ccf88d7c093fad5673bda73a7b08cc4757780cd	2022-10-04 14:53:32 -07:00
Varun Sharma	626eaa4189	ci: add GitHub token permissions for workflow (#10549 ) Summary: This PR adds minimum token permissions for the GITHUB_TOKEN in GitHub Actions workflows using https://github.com/step-security/secure-workflows. GitHub recommends defining minimum GITHUB_TOKEN permissions for securing GitHub Actions workflows - https://github.blog/changelog/2021-04-20-github-actions-control-permissions-for-github_token/ - https://docs.github.com/en/actions/security-guides/automatic-token-authentication#modifying-the-permissions-for-the-github_token - The Open Source Security Foundation (OpenSSF) [Scorecards](https://github.com/ossf/scorecard) treats not setting token permissions as a high-risk issue This project is part of the top 100 critical projects as per OpenSSF (https://github.com/ossf/wg-securing-critical-projects), so fixing the token permissions to improve security. Before the change: `GITHUB_TOKEN` has `write` permissions for multiple scopes, e.g. https://github.com/facebook/rocksdb/runs/7936368166?check_suite_focus=true#step:1:19 After the change: `GITHUB_TOKEN` will have minimum permissions needed for the jobs. Signed-off-by: Varun Sharma <varunsh@stepsecurity.io> Pull Request resolved: https://github.com/facebook/rocksdb/pull/10549 Reviewed By: ajkr Differential Revision: D38923184 Pulled By: jay-zhuang fbshipit-source-id: 0c48f98fe90665e53724f57a7d3b01dd80f34a93	2022-10-04 12:10:30 -07:00
Peter Dillinger	5f4391dda2	Some clean-up of secondary cache (#10730 ) Summary: This is intended as a step toward possibly separating secondary cache integration from the Cache implementation as much as possible, to (hopefully) minimize code duplication in adding secondary cache support to HyperClockCache. * Major clarifications to API docs of secondary cache compatible parts of Cache. For example, previously the docs seemed to suggest that Wait() was not needed if IsReady()==true. And it wasn't clear what operations were actually supported on pending handles. * Add some assertions related to these requirements, such as that we don't Release() before Wait() (which would leak a secondary cache handle). * Fix a leaky abstraction with dummy handles, which are supposed to be internal to the Cache. Previously, these just used value=nullptr to indicate dummy handle, which meant that they could be confused with legitimate value=nullptr cases like cache reservations. Also fixed blob_source_test which was relying on this leaky abstraction. * Drop "incomplete" terminology, which was another name for "pending". * Split handle flags into "mutable" ones requiring mutex and "immutable" ones which do not. Because of single-threaded access to pending handles, the "Is Pending" flag can be in the "immutable" set. This allows removal of a TSAN work-around and removing a mutex acquire-release in IsReady(). * Remove some unnecessary handling of charges on handles of failed lookups. Keeping total_charge=0 means no special handling needed. (Removed one unnecessary mutex acquire/release.) * Simplify handling of dummy handle in Lookup(). There is no need to explicitly Ref & Release w/Erase if we generally overwrite the dummy anyway. (Removed one mutex acquire/release, a call to Release().) Intended follow-up: * Clarify APIs in secondary_cache.h * Doesn't SecondaryCacheResultHandle transfer ownership of the Value() on success (implementations should not release the value in destructor)? 
* Does Wait() need to be called if IsReady() == true? (This would be different from Cache.) * Do Value() and Size() have undefined behavior if IsReady() == false? * Why have a custom API for what is essentially a std::future<std::pair<void, size_t>>? Improve unit testing of standalone handle case * Apparent null `e` bug in `free_standalone_handle` case * Clean up secondary cache testing in lru_cache_test * Why does TestSecondaryCacheResultHandle hold on to a Cache::Handle? * Why does TestSecondaryCacheResultHandle::Wait() do nothing? Shouldn't it establish the post-condition IsReady() == true? * (Assuming that is sorted out...) Shouldn't TestSecondaryCache::WaitAll simply wait on each handle in order (no casting required)? How about making that the default implementation? * Why does TestSecondaryCacheResultHandle::Size() check Value() first? If the API is intended to be returning 0 before IsReady(), then that is weird but should at least be documented. Otherwise, if it's intended to be undefined behavior, we should assert IsReady(). * Consider replacing "standalone" and "dummy" entries with a single kind of "weak" entry that deletes its value when it reaches zero refs. Suppose you are using compressed secondary cache and have two iterators at similar places. It will probably be common for one iterator to have standalone results pinned (out of cache) when the second iterator needs those same blocks and has to re-load them from secondary cache and duplicate the memory. Combining the dummy and the standalone should fix this. 
Pull Request resolved: https://github.com/facebook/rocksdb/pull/10730 Test Plan: existing tests (minor update), and crash test with sanitizers and secondary cache Performance test for any regressions in LRUCache (primary only): Create DB with ``` TEST_TMPDIR=/dev/shm ./db_bench -benchmarks=fillrandom -num=30000000 -disable_wal=1 -bloom_bits=16 ``` Test before & after (run at same time) with ``` TEST_TMPDIR=/dev/shm ./db_bench -benchmarks=readrandom[-X100] -readonly -num=30000000 -bloom_bits=16 -cache_index_and_filter_blocks=1 -cache_size=233000000 -duration 30 -threads=16 ``` Before: readrandom [AVG 100 runs] : 22234 (± 63) ops/sec; 1.6 (± 0.0) MB/sec After: readrandom [AVG 100 runs] : 22197 (± 64) ops/sec; 1.6 (± 0.0) MB/sec That's within 0.2%, which is not significant by the confidence intervals. Reviewed By: anand1976 Differential Revision: D39826010 Pulled By: anand1976 fbshipit-source-id: 3202b4a91f673231c97648ae070e502ae16b0f44	2022-10-03 22:23:38 -07:00
Levi Tamasi	3ae00dec90	Disable ingestion in stress tests when PutEntity is used (#10769 ) Summary: `SstFileWriter` currently does not support the `PutEntity` API, so in `TestIngestExternalFile` all key-values are written using regular `Put`s. This violates the assumption that whether or not a key corresponds to a plain old key-value or a wide-column entity can be determined by solely looking at the "value base" used when generating the value. The patch fixes this issue by disabling ingestion when `PutEntity` is enabled in the stress tests. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10769 Test Plan: Ran a simple blackbox stress test. Reviewed By: akankshamahajan15 Differential Revision: D40042132 Pulled By: ltamasi fbshipit-source-id: 93e75ff55545b7b69fa4ddef1d96093c961158a0	2022-10-03 18:09:56 -07:00
Changyu Bi	8b430e01dc	Add iterator refresh to stress test (#10766 ) Summary: added calls to `Iterator::Refresh()` in `NonBatchedOpsStressTest::TestIterateAgainstExpected()`. The testing key range is locked in `TestIterateAgainstExpected` so I do not expect this change to provide thorough stress test to `Iterator::Refresh()`. However, it can still be helpful for catching bugs like https://github.com/facebook/rocksdb/issues/10739. Will add calls to refresh in `TestIterate` once we support iterator refresh with snapshots. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10766 Test Plan: `python3 tools/db_crashtest.py whitebox --simple --verify_iterator_with_expected_state_one_in=2` Reviewed By: ajkr Differential Revision: D40008320 Pulled By: ajkr fbshipit-source-id: cec93b07f915ef6476d41c1fee9b23c115188085	2022-10-03 16:22:39 -07:00
akankshamahajan	ae0f9c3339	Add new property in IOOptions to skip recursing through directories and list only files during GetChildren. (#10668 ) Summary: Add new property "do_not_recurse" in IOOptions for underlying file system to skip iteration of directories during DB::Open if there are no sub directories and list only files. By default this property is set to false. This property is set true currently in the code where RocksDB is sure only files are needed during DB::Open. Provided support in PosixFileSystem to use "do_not_recurse". TestPlan: - Existing tests Pull Request resolved: https://github.com/facebook/rocksdb/pull/10668 Reviewed By: anand1976 Differential Revision: D39471683 Pulled By: akankshamahajan15 fbshipit-source-id: 90e32f0b86d5346d53bc2714d3a0e7002590527f	2022-10-03 10:59:45 -07:00
Changyu Bi	9f2363f4c4	User-defined timestamp support for `DeleteRange()` (#10661 ) Summary: Add user-defined timestamp support for range deletion. The new API is `DeleteRange(opt, cf, begin_key, end_key, ts)`. Most of the change is to update the comparator to compare without timestamp. Other than that, major changes are - internal range tombstone data structures (`FragmentedRangeTombstoneList`, `RangeTombstone`, etc.) to store timestamps. - Garbage collection of range tombstones and range tombstone covered keys during compaction. - Get()/MultiGet() to return the timestamp of a range tombstone when needed. - Get/Iterator with range tombstones bounded by readoptions.timestamp. - timestamp crash test now issues DeleteRange by default. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10661 Test Plan: - Added unit test: `make check` - Stress test: `python3 tools/db_crashtest.py --enable_ts whitebox --readpercent=57 --prefixpercent=4 --writepercent=25 -delpercent=5 --iterpercent=5 --delrangepercent=4` - Ran `db_bench` to measure regression when timestamp is not enabled. The tests are for write (with some range deletion) and iterate with DB fitting in memory: `./db_bench --benchmarks=fillrandom,seekrandom --writes_per_range_tombstone=200 --max_write_buffer_number=100 --min_write_buffer_number_to_merge=100 --writes=500000 --reads=500000 --seek_nexts=10 --disable_auto_compactions -disable_wal=true --max_num_range_tombstones=1000`. Did not see consistent regression in no timestamp case. Results in micros/op: main: fillrandom 2.58, seekrandom 10.96; PR 10661: fillrandom 2.68, seekrandom 10.63. Reviewed By: riversand963 Differential Revision: D39441192 Pulled By: cbi42 fbshipit-source-id: f05aca3c41605caf110daf0ff405919f300ddec2	2022-09-30 16:13:03 -07:00


11841 Commits
