rocksdb

mirror of https://github.com/facebook/rocksdb.git synced 2024-11-27 02:44:18 +00:00

Author	SHA1	Message	Date
changyubi	fec5c8deb8	Remove NUMA setting for benchmark-linux (#11180 ) Summary: benchmark-linux is failing on main branch after https://github.com/facebook/rocksdb/issues/11074 with the following error msg: ``` /usr/bin/time -f '%e %U %S' -o /tmp/benchmark-results/8.0.0/benchmark_overwriteandwait.t1.s0.log.time numactl --interleave=all timeout 1200 ./db_bench --benchmarks=overwrite,waitforcompaction,stats --use_existing_db=1 --sync=0 --level0_file_num_compaction_trigger=4 --level0_slowdown_writes_trigger=20 --level0_stop_writes_trigger=30 --max_background_jobs=4 --max_write_buffer_number=8 --undefok=use_blob_cache,use_shared_block_and_blob_cache,blob_cache_size,blob_cache_numshardbits,prepopulate_blob_cache,multiread_batched,cache_low_pri_pool_ratio,prepopulate_block_cache --db=/tmp/rocksdb-benchmark-datadir --wal_dir=/tmp/rocksdb-benchmark-datadir --num=20000000 --key_size=20 --value_size=400 --block_size=8192 --cache_size=10737418240 --cache_numshardbits=6 --compression_max_dict_bytes=0 --compression_ratio=0.5 --compression_type=none --bytes_per_sync=1048576 --cache_index_and_filter_blocks=1 --cache_high_pri_pool_ratio=0.5 --cache_low_pri_pool_ratio=0 --benchmark_write_rate_limit=0 --write_buffer_size=16777216 --target_file_size_base=16777216 --max_bytes_for_level_base=67108864 --verify_checksum=1 --delete_obsolete_files_period_micros=62914560 --max_bytes_for_level_multiplier=8 --statistics=0 --stats_per_interval=1 --stats_interval_seconds=20 --report_interval_seconds=1 --histogram=1 --memtablerep=skip_list --bloom_bits=10 --open_files=-1 --subcompactions=1 --compaction_style=0 --num_levels=8 --min_level_to_compress=3 --level_compaction_dynamic_level_bytes=true --pin_l0_filter_and_index_blocks_in_cache=1 --duration=600 --threads=1 --merge_operator="put" --seed=1675372532 --report_file=/tmp/benchmark-results/8.0.0/benchmark_overwriteandwait.t1.s0.log.r.csv 2>&1 | tee -a /tmp/benchmark-results/8.0.0/benchmark_overwriteandwait.t1.s0.log /usr/bin/time: cannot run numactl: No such file or directory ``` This PR removes the newly added NUMA setting. Pull Request resolved: https://github.com/facebook/rocksdb/pull/11180 Test Plan: check next main branch run for benchmark-linux Reviewed By: ajkr Differential Revision: D42975930 Pulled By: cbi42 fbshipit-source-id: f084d39aeba9877c0752502e879c5e612b507653	2023-02-02 15:15:09 -08:00
Alan Paxton	6781009ee8	CI Benchmarking. Small configuration changes based on performance analysis. (#11074 ) Summary: First, we made a small reduction in DURATION_RW as runs were exceeding 1 hour and colliding with subsequent runs. See Mark Callaghan’s blog post at http://smalldatum.blogspot.com/2023/01/variance-in-rocksdb-benchmarks-on-cloud.html Configuration parameters which are not consistent with the following email from Mark (see the blog post for more context) have been updated. Where Mark has defined the parameter and we haven't, we define it explicitly. We will need to further monitor for an expected reduction in variance of test times: To match what I did: --- nsecs=1800 dbdir=/data/m/rx resultdir=bm.lc.nt1.cm1.d0 env WRITE_BUFFER_SIZE_MB=16 TARGET_FILE_SIZE_BASE_MB=16 MAX_BYTES_FOR_LEVEL_BASE_MB=64 MAX_BACKGROUND_JOBS=4 NUM_KEYS=20000000 CACHE_SIZE_MB=10240 DURATION_RW=$nsecs DURATION_RO=$nsecs MB_WRITE_PER_SEC=2 NUM_THREADS=1 COMPRESSION_TYPE=none CACHE_INDEX_AND_FILTER_BLOCKS=1 VALUE_SIZE=400 NUMA=1 MIN_LEVEL_TO_COMPRESS=3 COMPACTION_STYLE=leveled bash benchmark_compare.sh $dbdir $resultdir 7.8.fb env WRITE_BUFFER_SIZE_MB=16 TARGET_FILE_SIZE_BASE_MB=16 MAX_BYTES_FOR_LEVEL_BASE_MB=64 MAX_BACKGROUND_JOBS=4 NUM_KEYS=200000000 CACHE_SIZE_MB=10240 DURATION_RW=$nsecs DURATION_RO=$nsecs MB_WRITE_PER_SEC=2 NUM_THREADS=1 COMPRESSION_TYPE=lz4 CACHE_INDEX_AND_FILTER_BLOCKS=1 VALUE_SIZE=400 NUMA=1 MIN_LEVEL_TO_COMPRESS=3 COMPACTION_STYLE=leveled bash benchmark_compare.sh $dbdir $resultdir 7.8.fb Pull Request resolved: https://github.com/facebook/rocksdb/pull/11074 Reviewed By: ajkr Differential Revision: D42969668 Pulled By: cbi42 fbshipit-source-id: 1ea4e6a3901be4016108f93817eb58f74baac21a	2023-02-02 11:11:40 -08:00
Andrew Kryczka	6af16ac7c1	Update HISTORY.md for #11136 (#11177 ) Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/11177 Reviewed By: cbi42 Differential Revision: D42948946 Pulled By: ajkr fbshipit-source-id: 783d3d9007faaa036923a0364cdd0bfbd8e78062	2023-02-02 07:50:55 -08:00
Levi Tamasi	df680b24ef	Clean up InvokeFilterIfNeeded a bit (#11174 ) Summary: The patch makes some code quality enhancements in `CompactionIterator::InvokeFilterIfNeeded` including the renaming of `filter` (which is most likely a remnant of the days before the `FilterV2` API when the compaction filter used to return a boolean) to `decision`, the removal of some outdated comments, the elimination of an `error` flag which was only used in one failure case out of many, as well as some small stylistic improvements. (Some of the above will also come in handy when adding compaction filter support for wide-column entities.) Pull Request resolved: https://github.com/facebook/rocksdb/pull/11174 Test Plan: `make check` Reviewed By: akankshamahajan15 Differential Revision: D42901408 Pulled By: ltamasi fbshipit-source-id: ab382d59a4990c5dfe1cee219d49e1d80902b666	2023-02-01 10:03:07 -08:00
Andrew Kryczka	071c33846d	Allow canceling manual compaction while waiting for conflicting compaction (#11165 ) Summary: This PR adds logic to the `RunManualCompaction()` loop to check for cancellation before waiting on any conflicting compactions to finish. In case of cancellation, `RunManualCompaction()` no longer waits on conflicting compactions. Pull Request resolved: https://github.com/facebook/rocksdb/pull/11165 Test Plan: repro test case Reviewed By: cbi42 Differential Revision: D42864058 Pulled By: ajkr fbshipit-source-id: ea4dd1a8f294abe212905495a8fbe8f07fca3f5a	2023-01-31 16:57:49 -08:00
Levi Tamasi	753d4d5078	Support using GetEntity as a verification method in the non-batched stress tests (#11144 ) Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/11144 Test Plan: Ran a simple blackbox crash test. Reviewed By: akankshamahajan15 Differential Revision: D42791464 Pulled By: ltamasi fbshipit-source-id: 8eb6e62f0bc47f709816136ff3ded0a41d04fab8	2023-01-31 10:17:48 -08:00
Levi Tamasi	a82021c3d0	Fix a bug where GetEntity would expose a blob reference (#11162 ) Summary: The patch fixes a feature interaction bug between BlobDB and the `GetEntity` API: without the patch, `GetEntity` would return the blob reference (wrapped into a single-column entity) instead of the actual blob value. Pull Request resolved: https://github.com/facebook/rocksdb/pull/11162 Test Plan: `make check` Reviewed By: akankshamahajan15 Differential Revision: D42854092 Pulled By: ltamasi fbshipit-source-id: f750d0ff57def107da16f545077ddce9860ff21a	2023-01-31 09:59:25 -08:00
Peter Dillinger	94e3beec77	Cleanup, improve, stress test LockWAL() (#11143 ) Summary: The previous API comments for LockWAL didn't provide much about why you might want to use it, and didn't really meet what one would infer its contract was. Also, LockWAL was not in db_stress / crash test. In this change: * Implement a counting semantics for LockWAL()+UnlockWAL(), so that they can safely be used concurrently across threads or recursively within a thread. This should make the API much less bug-prone and easier to use. * Make sure no UnlockWAL() is needed after non-OK LockWAL() (to match RocksDB conventions) * Make UnlockWAL() reliably return non-OK when there's no matching LockWAL() (for debug-ability) * Clarify API comments on LockWAL(), UnlockWAL(), FlushWAL(), and SyncWAL(). Their exact meanings are not obvious, and I don't think it's appropriate to talk about implementation mutexes in the API comments, but about what operations might block each other. * Add LockWAL()/UnlockWAL() to db_stress and crash test, mostly to check for assertion failures, but also checks that latest seqno doesn't change while WAL is locked. This is simpler to add when LockWAL() is allowed in multiple threads. * Remove unnecessary use of sync points in test DBWALTest::LockWal. There was a bug during development of above changes that caused this test to fail sporadically, with and without this sync point change. Pull Request resolved: https://github.com/facebook/rocksdb/pull/11143 Test Plan: unit tests added / updated, added to stress/crash test Reviewed By: ajkr Differential Revision: D42848627 Pulled By: pdillinger fbshipit-source-id: 6d976c51791941a31fd8fbf28b0f82e888d9f4b4	2023-01-30 22:52:30 -08:00
sdong	36174d89a6	DB Stress to fix a false assertion (#11164 ) Summary: Seeing this error in stress test: db_stress: internal_repo_rocksdb/repo/db_stress_tool/db_stress_test_base.cc:2459: void rocksdb::StressTest::Open(rocksdb::SharedState *): Assertion `txn_db_ == nullptr' failed. Received signal 6 (Aborted) ...... It doesn't appear that txn_db_ is set to nullptr at all. We set it here. Pull Request resolved: https://github.com/facebook/rocksdb/pull/11164 Test Plan: Run db_stress transaction and non-transaction with low kill rate and see restarting without assertion Reviewed By: ajkr Differential Revision: D42855662 fbshipit-source-id: 06816d37cce9c94a81cb54ab238fb73aa102ed46	2023-01-30 19:45:47 -08:00
Yu Zhang	24ac53d81a	Use user key on sst file for blob verification for Get and MultiGet (#11105 ) Summary: Use the user key on sst file for blob verification for `Get` and `MultiGet` instead of the user key passed from caller. Add tests for `Get` and `MultiGet` operations when user defined timestamp feature is enabled in a BlobDB. Pull Request resolved: https://github.com/facebook/rocksdb/pull/11105 Test Plan: make V=1 db_blob_basic_test ./db_blob_basic_test --gtest_filter="DBBlobTestWithTimestamp.*" Reviewed By: ltamasi Differential Revision: D42716487 Pulled By: jowlyzhang fbshipit-source-id: 5987ecbb7e56ddf46d2467a3649369390789506a	2023-01-30 10:21:21 -08:00
akankshamahajan	79e57a39a3	Move ExternalSSTTestEnv to FileSystemWrapper (#11139 ) Summary: Migrate ExternalSSTTestEnv to FileSystemWrapper Pull Request resolved: https://github.com/facebook/rocksdb/pull/11139 Reviewed By: anand1976 Differential Revision: D42780180 Pulled By: akankshamahajan15 fbshipit-source-id: 9a4448c9fe5186b518235fe11e1a34dcad897cdd	2023-01-27 14:51:39 -08:00
sdong	4720ba4391	Remove RocksDB LITE (#11147 ) Summary: We haven't been actively maintaining RocksDB LITE recently and the size must have gone up significantly. We are removing the support. Most of the changes were done through the following command: unifdef -m -UROCKSDB_LITE `git grep -l ROCKSDB_LITE | egrep '[.](cc|h)'` by Peter Dillinger. Other changes were manually applied to build scripts, CircleCI manifests, places where ROCKSDB_LITE is used in an expression, and file db_stress_test_base.cc. Pull Request resolved: https://github.com/facebook/rocksdb/pull/11147 Test Plan: See CI Reviewed By: pdillinger Differential Revision: D42796341 fbshipit-source-id: 4920e15fc2060c2cd2221330a6d0e5e65d4b7fe2	2023-01-27 13:14:19 -08:00
Yu Zhang	6943ff6e50	Remove deprecated util functions in options_util.h (#11126 ) Summary: Remove the util functions in options_util.h that have previously been marked deprecated. Pull Request resolved: https://github.com/facebook/rocksdb/pull/11126 Test Plan: `make check` Reviewed By: ltamasi Differential Revision: D42757496 Pulled By: jowlyzhang fbshipit-source-id: 2a138a3c207d0e0e0bbb4d99548cf2cadb44bcfb	2023-01-27 11:10:53 -08:00
Andrew Kryczka	97c1024d3e	Include db_stress verification method in failure message (#11133 ) Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/11133 Test Plan: - ran it a few times on a mismatching DB+expected state; verified error messages look right: ``` Verification failed for column family 0 key 000000000000D553000000000000014C0000000000000142 (163988): value_from_db: , value_from_expected: 25E7B53421202322, msg: GetMergeOperands verification: Value not found: NotFound: Verification failed for column family 0 key 000000000000AAE2787878 (131123): value_from_db: , value_from_expected: B2A69C18B6B7B4B5BABBB8B9BEBFBCBDA2A3A0A1A6A7A4A5, msg: Iterator verification: Value not found: NotFound: Verification failed for column family 0 key 00000000000080C6000000000000004C78787878 (98409): value_from_db: , value_from_expected: 67AB7E1E636261606F6E6D6C6B6A6968, msg: Get verification: Value not found: NotFound: ``` Reviewed By: hx235 Differential Revision: D42757072 Pulled By: ajkr fbshipit-source-id: b0a4a0aaa5be5d110434324853ac92aaa6972d89	2023-01-27 07:45:25 -08:00
Changyu Bi	c94c8fcbd4	Remove deprecated FileSystem::Load() (#11122 ) Summary: Users should use FileSystem::CreateFromString() instead. Pull Request resolved: https://github.com/facebook/rocksdb/pull/11122 Reviewed By: ajkr Differential Revision: D42727580 Pulled By: cbi42 fbshipit-source-id: c68b17bb82ba9dee46ba23b677d87ecf0a1e06c8	2023-01-26 20:20:58 -08:00
Karim TAAM	a1e92bd956	use verify checksum option in block based table reader Open() (#11099 ) Summary: ## Description In https://github.com/facebook/rocksdb/issues/11002 we found that when RocksDB is used with the `verify checksum` read option set to false, the verification is done anyway. Analyzing the code along the stack trace, I saw that at the level of https://github.com/facebook/rocksdb/compare/main...matkt:feature/use-verify-checksum-in-block-based-table-reader?expand=1#diff-57ed8c49db2bdd4db7618646a177397674bbf25beacacecb104070071d30129f we are not keeping all the options and we drop `verify_checksum`. The comment in this class suggests that it should be handled: https://github.com/facebook/rocksdb/compare/main...matkt:feature/use-verify-checksum-in-block-based-table-reader?expand=1#diff-57ed8c49db2bdd4db7618646a177397674bbf25beacacecb104070071d30129fL581 This PR just adds the line to handle `verify checksum`. ## Tests - Running unit tests - Test without setting `verify checksum` and verifying that we are calling the checksum code - Test by setting `verify checksum` to true and verifying that we are calling the checksum code - Test by setting `verify checksum` to false and verifying that we are not calling the checksum code Pull Request resolved: https://github.com/facebook/rocksdb/pull/11099 Reviewed By: cbi42 Differential Revision: D42679881 Pulled By: ajkr fbshipit-source-id: c7dd10768282fd0699f7e1bf397ceb7adbea4ab6	2023-01-26 17:38:59 -08:00
Andrew Kryczka	b44cbbf709	Fix GetMergeOperands() returning MergeInProgress (#11136 ) Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/11136 Test Plan: the provided unit test used to fail due to `GetMergeOperands()` returning `Status::MergeInProgress()`; it passes now because the `GetMergeOperands()` call returns `Status::OK()` Reviewed By: pdillinger Differential Revision: D42759198 Pulled By: ajkr fbshipit-source-id: 878f9f40ccc1d7e2fe7b1352814bae3a49c19939	2023-01-26 15:11:19 -08:00
dependabot[bot]	dcf93b7b3e	Bump commonmarker from 0.23.6 to 0.23.7 in /docs (#11128 ) Summary: Bumps [commonmarker](https://github.com/gjtorikian/commonmarker) from 0.23.6 to 0.23.7. Release notes for v0.23.7 (sourced from commonmarker's releases): C API stable test (gjtorikian/commonmarker#201); update to 29.0.gfm.7 (gjtorikian/commonmarker#224). Full changelog: https://github.com/gjtorikian/commonmarker/compare/v0.23.6...v0.23.7 Pull Request resolved: https://github.com/facebook/rocksdb/pull/11128 Reviewed By: ajkr Differential Revision: D42752086 Pulled By: cbi42 fbshipit-source-id: 6992b6f1096400a6b10b79fe36bf955fec841b71	2023-01-26 12:07:52 -08:00
Levi Tamasi	a6cfdd4eda	Fix the HISTORY.md entry related to the removed statistics (#11140 ) Summary: Some histograms were incorrectly categorized as tickers. Pull Request resolved: https://github.com/facebook/rocksdb/pull/11140 Reviewed By: anand1976 Differential Revision: D42780030 Pulled By: ltamasi fbshipit-source-id: 5aca8ec5baad8f73676aaa9d6cdbbd2a619c8a89	2023-01-26 10:38:45 -08:00
akankshamahajan	986c5b9d4e	Migrate TestEnv in listener_test.cc to FileSystemWrapper (#11125 ) Summary: Migrate derived classes from EnvWrapper to FileSystemWrapper so we can eventually deprecate the storage methods in Env. Pull Request resolved: https://github.com/facebook/rocksdb/pull/11125 Test Plan: CircleCI jobs Reviewed By: anand1976 Differential Revision: D42732241 Pulled By: akankshamahajan15 fbshipit-source-id: c89a70a79fcfb13e158bf8919b1a87a9de133222	2023-01-25 22:42:22 -08:00
sdong	e808858ae0	Remove Stats related to compressed block cache (#11135 ) Summary: Since compressed block cache is removed, those stats are not needed. They are removed in a different PR in case there is a problem with it. The stats are removed in the same way as in https://github.com/facebook/rocksdb/pull/11131/ . HISTORY.md was already updated by mistake, and it will be correct after merging this PR. Pull Request resolved: https://github.com/facebook/rocksdb/pull/11135 Test Plan: Watch CI Reviewed By: ltamasi Differential Revision: D42757616 fbshipit-source-id: bd7cb782585c8535ce5784295225c376f3011f35	2023-01-25 15:37:50 -08:00
Levi Tamasi	6da2e20df3	Remove more obsolete statistics (#11131 ) Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/11131 Test Plan: `make check` Reviewed By: pdillinger Differential Revision: D42753997 Pulled By: ltamasi fbshipit-source-id: ce8b84c1e55374257e93ed74fd255c9b759723ce	2023-01-25 15:14:13 -08:00
Heiko Becker	88edfbfb5e	Fix build with gcc 13 by including <cstdint> (#11118 ) Summary: Like other versions before, gcc 13 moved some includes around and as a result <cstdint> is no longer transitively included [1]. Explicitly include it for uint{32,64}_t. [1] https://gcc.gnu.org/gcc-13/porting_to.html#header-dep-changes Pull Request resolved: https://github.com/facebook/rocksdb/pull/11118 Reviewed By: cbi42 Differential Revision: D42711356 Pulled By: ajkr fbshipit-source-id: 5ea257b85b7017f40fd8fdbce965336da95c55b2	2023-01-25 14:30:32 -08:00
Andrew Kryczka	6a5071ceb5	Support PutEntity in trace analyzer (#11127 ) Summary: Add the most basic support such that trace_analyzer commands no longer fail with ``` Cannot process the write batch in the trace Cannot process the TraceRecord PutEntityCF not implemented Cannot process the trace ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/11127 Reviewed By: cbi42 Differential Revision: D42732319 Pulled By: ajkr fbshipit-source-id: 162d8a31318672a46539b1b042ec25f69b25c4ed	2023-01-25 14:27:02 -08:00
Peter Dillinger	546e213c4f	Fix DelayWrite() calls for two_write_queues (#11130 ) Summary: PR https://github.com/facebook/rocksdb/issues/11020 fixed a case where it was easy to deadlock the DB with LockWAL() but introduced a bug showing up as a rare assertion failure in the stress test. Specifically, `assert(w->state == STATE_INIT)` in `WriteThread::LinkOne()` called from `BeginWriteStall()`, `DelayWrite()`, `WriteImplWALOnly()`. I haven't been able to generate a unit test that reproduces this failure but I believe the root cause is that DelayWrite() was never meant to be re-entrant, only called from the DB's write_thread_ leader. https://github.com/facebook/rocksdb/issues/11020 introduced a call to DelayWrite() from the nonmem_write_thread_ group leader. This fix is to make DelayWrite() apply to the specific write queue that it is being called from (inject a dummy write stall entry to the head of the appropriate write queue). WriteController is re-entrant, based on polling and state changes signalled with bg_cv_, so can manage stalling two queues. The only anticipated complication (called out by Andrew in previous PR) is that we don't want timed write delays being injected in parallel for the two queues, because that diminishes the intended throttling effect. Thus, we only allow timed delays for the primary write queue. HISTORY not updated because this is intended for the same release where the bug was introduced. Pull Request resolved: https://github.com/facebook/rocksdb/pull/11130 Test Plan: Although I was not able to reproduce the assertion failure, I was able to reproduce a distinct flaw with what I believe is the same root cause: a kind of deadlock if both write queues need to wake up from stopped writes. Only one will be waiting on bg_cv_ (the other waiting in `LinkOne()` for the write queue to open up), so a single SignalAll() will only unblock one of the queues, with the other re-instating the stop until another signal on bg_cv_. 
A simple unit test is added for this case. Will also run crash_test_with_multiops_wc_txn for a while looking for issues. Reviewed By: ajkr Differential Revision: D42749330 Pulled By: pdillinger fbshipit-source-id: 4317dd899a93d57c26fd5af7143038f82d4d4d1b	2023-01-25 14:18:27 -08:00
Peter Dillinger	9afa0f05ad	Remove deprecated Env::LoadEnv() (#11121 ) Summary: Can use Env::CreateFromString() instead Pull Request resolved: https://github.com/facebook/rocksdb/pull/11121 Test Plan: unit tests updated Reviewed By: cbi42 Differential Revision: D42723813 Pulled By: pdillinger fbshipit-source-id: 5d4b5b10225dfdaf662f5f8049ee965a05d3edc9	2023-01-25 12:08:49 -08:00
Levi Tamasi	99e559533d	Remove some deprecated/obsolete statistics from the API (#11123 ) Summary: These tickers/histograms have been obsolete (and not populated) for a long time. The patch removes them from the API completely. Note that this means that the numeric values of the remaining tickers change in the C++ code as they get shifted up. This should be OK: the values of some existing tickers have changed many times over the years as items have been added in the middle. (In contrast, the convention in the Java bindings is to keep the ids, which are not guaranteed to be the same as the ids on the C++ side, the same across releases.) Pull Request resolved: https://github.com/facebook/rocksdb/pull/11123 Test Plan: `make check` Reviewed By: akankshamahajan15 Differential Revision: D42727793 Pulled By: ltamasi fbshipit-source-id: e058a155a20b05b45f53e67ee380aece1b43b6c5	2023-01-24 20:56:15 -08:00
anand76	bcbab59c55	Migrate ErrorEnv from EnvWrapper to FileSystemWrapper (#11124 ) Summary: Migrate ErrorEnv from EnvWrapper to FileSystemWrapper so we can eventually deprecate the storage methods in Env. Pull Request resolved: https://github.com/facebook/rocksdb/pull/11124 Reviewed By: akankshamahajan15 Differential Revision: D42727791 Pulled By: anand1976 fbshipit-source-id: e8362ad624dc28e55c99fc35eda12866755f62c6	2023-01-24 17:14:35 -08:00
sdong	2800aa069a	Remove compressed block cache (#11117 ) Summary: Compressed block cache is replaced by compressed secondary cache. Remove the feature. Pull Request resolved: https://github.com/facebook/rocksdb/pull/11117 Test Plan: See CI passes Reviewed By: pdillinger Differential Revision: D42700164 fbshipit-source-id: 6cbb24e460da29311150865f60ecb98637f9f67d	2023-01-24 17:09:19 -08:00
Peter Dillinger	4a9185340d	A better contract for best_efforts_recovery (#11085 ) Summary: Capture more of the original intent at a high level, without getting bogged down in low-level details. The old text made some weak promises about handling of LOCK files. There should be no specific concern for LOCK files, because we already rely on LockFile() to create the file if it's not present already. And the lock file is generally size 0, so we don't have to worry about truncation. Added a unit test. Pull Request resolved: https://github.com/facebook/rocksdb/pull/11085 Test Plan: existing tests, and a new one. Reviewed By: siying Differential Revision: D42713233 Pulled By: pdillinger fbshipit-source-id: 2fce7c974d35fac065037c9c4c7326a59c9fe340	2023-01-24 12:55:03 -08:00
Changyu Bi	e0ea0dc6bd	Improve documentation for `allow_ingest_behind` (#11119 ) Summary: update documentation to mention that only universal compaction is supported. Pull Request resolved: https://github.com/facebook/rocksdb/pull/11119 Reviewed By: ajkr Differential Revision: D42715986 Pulled By: cbi42 fbshipit-source-id: 91b145d3318334cb92857c5c0ffc0efed6fa4363	2023-01-24 12:12:19 -08:00
Hui Xiao	86fa2592be	Fix data race on `ColumnFamilyData::flush_reason` by letting FlushRequest/Job owns flush_reason instead of CFD (#11111 ) Summary: Context: Concurrent flushes on the same CF can each set `ColumnFamilyData::flush_reason` before the other flush finishes. A symptom is that one CF has a different flush_reason from the others even though all of them are in an atomic flush: `db_stress: db/db_impl/db_impl_compaction_flush.cc:423: rocksdb::Status rocksdb::DBImpl::AtomicFlushMemTablesToOutputFiles(const rocksdb::autovector<rocksdb::DBImpl::BGFlushArg>&, bool, rocksdb::JobContext, rocksdb::LogBuffer, rocksdb::Env::Priority): Assertion cfd->GetFlushReason() == cfds[0]->GetFlushReason() failed. ` Summary: As suggested by ltamasi, we now refactor and let FlushRequest/Job own flush_reason, as there is no good way to define `ColumnFamilyData::flush_reason` in the face of concurrent flushes on the same CF (which wasn't the case a long time ago when `ColumnFamilyData::flush_reason` was first introduced). Tests: - new unit test - make check - aggressive crash test rehearsal Pull Request resolved: https://github.com/facebook/rocksdb/pull/11111 Reviewed By: ajkr Differential Revision: D42644600 Pulled By: hx235 fbshipit-source-id: 8589c8184869d3415e5b780c887f877818a5ebaf	2023-01-24 09:54:04 -08:00
Hui Xiao	7e7548477c	Update HISTORY.md/version.h/format compatiblity test for 7.10 release (#11114 ) Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/11114 Reviewed By: ajkr Differential Revision: D42685234 Pulled By: hx235 fbshipit-source-id: 79908a66ab9052a2552f080049065462ebf2f94c	2023-01-23 13:26:11 -08:00
Andrew Kryczka	b7fbcefda8	Add API to limit blast radius of merge operator failure (#11092 ) Summary: Prior to this PR, `FullMergeV2()` can only return `false` to indicate failure, which causes any operation invoking it to fail. During a compaction, such a failure causes the compaction to fail and causes the DB to irreversibly enter read-only mode. Some users asked for a way to allow the merge operator to fail without such widespread damage. To limit the blast radius of merge operator failures, this PR introduces the `MergeOperationOutput::op_failure_scope` API. When unpopulated (`kDefault`) or set to `kTryMerge`, the merge operator failure handling is the same as before. When set to `kMustMerge`, merge operator failure still causes failure to operations that must merge (`Get()`, iterator, `MultiGet()`, etc.). However, under `kMustMerge`, flushes/compactions can survive merge operator failures by outputting the unmerged input operands. Pull Request resolved: https://github.com/facebook/rocksdb/pull/11092 Reviewed By: siying Differential Revision: D42525673 Pulled By: ajkr fbshipit-source-id: 951dc3bf190f86347dccf3381be967565cda52ee	2023-01-20 14:40:30 -08:00
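The failure-scope rules described in #11092 can be summarized as a small decision table. The sketch below is a simplified model, not RocksDB code: the enumerator names mirror the ones in the commit message, while `CompactionOutcome`, `OnMergeFailureInCompaction`, and `ReadFailsOnMergeFailure` are hypothetical helpers introduced only for illustration.

```cpp
// Simplified stand-in for the failure scope named in the commit message.
enum class OpFailureScope { kDefault, kTryMerge, kMustMerge };

// Hypothetical result type: what a flush/compaction does after a
// merge operator failure.
enum class CompactionOutcome { kFails, kEmitsUnmergedOperands };

// Models the behavior described above: only kMustMerge lets a
// flush/compaction survive by writing out the unmerged operands;
// kDefault and kTryMerge keep the old behavior (the job fails).
CompactionOutcome OnMergeFailureInCompaction(OpFailureScope scope) {
  return scope == OpFailureScope::kMustMerge
             ? CompactionOutcome::kEmitsUnmergedOperands
             : CompactionOutcome::kFails;
}

// Operations that must produce a merged value (Get(), iterators,
// MultiGet(), etc.) fail regardless of the scope.
bool ReadFailsOnMergeFailure(OpFailureScope /*scope*/) { return true; }
```

In this model the scope only changes what background jobs do; a read that hits a failing merge still returns an error in every case.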
akankshamahajan	bde65052c4	Enhance async scan prefetch unit tests (#11087 ) Summary: Add more coverage in unit tests for async scan. The added unit test fails without PR https://github.com/facebook/rocksdb/pull/10939. Pull Request resolved: https://github.com/facebook/rocksdb/pull/11087 Test Plan: CircleCI jobs status for new unit tests. Reviewed By: anand1976 Differential Revision: D42487931 Pulled By: akankshamahajan15 fbshipit-source-id: d59ed7666599bd0d2733ac5d76bd70984b54c5a9	2023-01-20 10:17:57 -08:00
codeoos	f4a5446cab	Fix error maybe-uninitialized #11100 (#11101 ) Summary: In issue [11100](https://github.com/facebook/rocksdb/issues/11100), I tried to upgrade the dependencies of [BaikalDB](https://github.com/baidu/BaikalDB) and its toolchain to gcc-12. When building rocksdb v6.26.0 (maybe I can use a newer version), the compiler reported the error "maybe-uninitialized" in the file trace_replay/trace_replay.cc. I found that it can be fixed very easily, so I made this pull request. Pull Request resolved: https://github.com/facebook/rocksdb/pull/11101 Reviewed By: ajkr Differential Revision: D42583031 Pulled By: cbi42 fbshipit-source-id: 7f399f09441a30fe88b83cec5e2fd9885bad5c06	2023-01-19 13:59:48 -08:00
leipeng	a5bcbcd8be	remove unused InternalIteratorBase::is_mutable_ (#11104 ) Summary: `InternalIteratorBase::is_mutable_` is not used any more, remove it. Pull Request resolved: https://github.com/facebook/rocksdb/pull/11104 Reviewed By: ajkr Differential Revision: D42582747 Pulled By: cbi42 fbshipit-source-id: d30bf75151fc8414df0ae112a6ec4943b5b7330b	2023-01-19 13:28:58 -08:00
Peter Dillinger	fd911f9655	Upgrade xxhash.h to latest dev (#11098 ) Summary: Upgrading xxhash.h to latest dev version as of 1/17/2023, which is d7197ddea81364a539051f116ca77926100fc77f This should improve performance on some ARM machines. I allowed some of our RocksDB-specific changes to be made obsolete where it seemed appropriate, for example * xxhash.h has its own fallthrough marker (which I hope works for us) * As in https://github.com/Cyan4973/xxHash/pull/549 Merging and resolving conflicts one way or the other was all that went into this diff. Except I had to mix the two sides around `defined(__loongarch64)` How I did the upgrade (for future reference), so that I could use usual merge conflict resolution: ``` # New branch to help with merging git checkout -b xxh_merge_base # Check out RocksDB revision before last xxhash.h upgrade git reset --hard 22161b7547652af82a5dc67458de9ca8946ac83d^ # Create a commit with the raw base version from xxHash repo (from xxHash repo) git show 2c611a76f914828bed675f0f342d6c4199ffee1e:xxhash.h > ../rocksdb/util/xxhash.h # In RocksDB repo git commit -a # Merge in the last xxhash.h upgrade git merge `22161b7547` # Resolve conflict using committed version git show 22161b7547652af82a5dc67458de9ca8946ac83d:util/xxhash.h > util/xxhash.h git commit -a # Catch up to upstream git merge upstream/main # Create a different branch for applying raw upgrade git checkout -b xxh_upgrade_2023 # Find the RocksDB commit we made for the raw base version from xxHash git log main..HEAD # Rewind to it git reset --hard `2428b727a9` # Copy in latest raw version (from xxHash repo) cat xxhash.h > ../rocksdb/util/xxhash.h # Merge in RocksDB changes, use typical tools for conflict resolution git merge xxh_merge_base ``` Branch https://github.com/facebook/rocksdb/tree/xxhash_merge_base can be used as a base for future xxhash merges. 
Fixes https://github.com/facebook/rocksdb/issues/11073 Pull Request resolved: https://github.com/facebook/rocksdb/pull/11098 Test Plan: existing tests (e.g. Bloom filter schema stability tests) Also seems to include a small performance boost on my Intel dev machine, using `./db_bench --benchmarks=xxh3[-X50] 2>&1 \| egrep -o 'operations;.*' \| sort` Fastest out of 50 runs, before: 15477.3 MB/s Fastest out of 50 runs, after: 15850.7 MB/s, and 11 more runs faster than the "before" number Slowest out of 50 runs, before: 12267.5 MB/s Slowest out of 50 runs, after: 13897.1 MB/s More repetitions show the distinction is repeatable Reviewed By: hx235 Differential Revision: D42560010 Pulled By: pdillinger fbshipit-source-id: c43ee52f1c5fe0ba3d6d6e4eebb22ded5f5492ea	2023-01-19 12:07:50 -08:00
Changyu Bi	e9d6a0d7ce	Fix asan failure caused by range tombstone start key use-after-free (#11106 ) Summary: the `last_tombstone_start_user_key` variable in `BuildTable()` and in `CompactionOutputs::AddRangeDels()` may point to a start key that is freed if user-defined timestamp is enabled. This was causing ASAN failure and this PR fixes this issue. Pull Request resolved: https://github.com/facebook/rocksdb/pull/11106 Test Plan: Added UT for repro. Reviewed By: ajkr Differential Revision: D42590862 Pulled By: cbi42 fbshipit-source-id: c493265ececdf89636d801d55ae929806c4d4b2c	2023-01-18 16:38:07 -08:00
akankshamahajan	bd4b8d6487	Fix crash in block_cache_trace_analyzer if reference key is null in case of MultiGet (#11042 ) Summary: Same as title Error: ``` block_cache_trace_analyzer: ./db/dbformat.h:421: uint64_t rocksdb::GetInternalKeySeqno(const rocksdb::Slice&): Assertion `n >= kNumInternalBytes' failed. Aborted (core dumped) ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/11042 Test Plan: - Added new unit test which fails without the fix. - Also ran manually on traces to confirm. Reviewed By: anand1976 Differential Revision: D42481587 Pulled By: akankshamahajan15 fbshipit-source-id: 7c33eb03a4a4d8ffbabcfbe0efa1e4d11bde3ba2	2023-01-18 13:24:37 -08:00
Changyu Bi	4d0f9a995c	Consider TTL compaction file cutting earlier to prevent small output file (#11075 ) Summary: in `CompactionOutputs::ShouldStopBefore()`, TTL-related states, `cur_files_to_cut_for_ttl_` and `next_files_to_cut_for_ttl_`, are not updated if the function returns early. This can cause unnecessary compaction output file cuttings and hence produce smaller output files, which may hurt write amp. See the example in the unit test for how this "unnecessary file cutting" can happen. This PR fixes this issue by moving the code for updating TTL states earlier in `CompactionOutputs::ShouldStopBefore()` so that the states are updated for each key. Pull Request resolved: https://github.com/facebook/rocksdb/pull/11075 Test Plan: - Added new unit test. Reviewed By: hx235 Differential Revision: D42398739 Pulled By: cbi42 fbshipit-source-id: 09fab66679c1a734abcfc31bcea33dd9aeb9dbc7	2023-01-17 16:42:41 -08:00
Changyu Bi	6a82b68788	Avoid counting extra range tombstone compensated size in `AddRangeDels()` (#11091 ) Summary: in `CompactionOutputs::AddRangeDels()`, range tombstones with the same start and end key but different sequence numbers all contribute to compensated range tombstone size. This PR removes this redundancy. This PR also includes a fix from https://github.com/facebook/rocksdb/issues/11067 where a range tombstone that is not within a file's range was being added to the file. This fixes an assertion failure for `icmp.Compare(start, end) <= 0` in VersionSet::ApproximateSize() when calculating compensated range tombstone size. Assertions and a comment/essay was added to reason that no such range tombstone will be added after this fix. Pull Request resolved: https://github.com/facebook/rocksdb/pull/11091 Test Plan: - Added unit tests - Stress test with small key range: `python3 tools/db_crashtest.py blackbox --simple --max_key=100 --interval=600 --write_buffer_size=262144 --target_file_size_base=256 --max_bytes_for_level_base=262144 --block_size=128 --value_size_mult=33 --subcompactions=10` Reviewed By: ajkr Differential Revision: D42521588 Pulled By: cbi42 fbshipit-source-id: 5bda3fe38997995314e1f7592319af12b69bc4f8	2023-01-17 12:47:44 -08:00
Changyu Bi	f515d9d203	Revert #10802 Consider range tombstone in compaction output file cutting (#11089 ) Summary: This reverts commit `f02c708aa3` since it introduced several bugs (see https://github.com/facebook/rocksdb/issues/11078 and https://github.com/facebook/rocksdb/issues/11067 for attempts to fix them) and I do not have high confidence that I can fix all of them and ensure there are no further ones before the next release branch cut. Some existing issues were also found during bug fixing. We will work on them and try to merge it into the release after. Pull Request resolved: https://github.com/facebook/rocksdb/pull/11089 Test Plan: existing CI. Reviewed By: ajkr Differential Revision: D42505972 Pulled By: cbi42 fbshipit-source-id: 2f66dcde6b85dc94977b317c2ce513872cfbc153	2023-01-13 12:28:21 -08:00
leipeng	3941c34950	db_bench: let -benchmark=compact respect -subcompactions (#11077 ) Summary: When running `-benchmarks=compact`, `-subcompactions` does not take effect. The `-subcompactions` option comment says it is for L0-L1 compactions, so it is natural to extend it to CompactRangeOptions.max_subcompactions. This PR sets CompactRangeOptions.max_subcompactions = FLAGS_subcompactions. Pull Request resolved: https://github.com/facebook/rocksdb/pull/11077 Reviewed By: akankshamahajan15 Differential Revision: D42506251 Pulled By: ajkr fbshipit-source-id: f77c9a99d32ff7af59f3c452c9e16aaeb0360304	2023-01-13 11:47:26 -08:00
Wenlong Zhang	1cfe3528a2	support loongarch64 for rocksdb (#10036 ) Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/10036 Reviewed By: hx235 Differential Revision: D42424074 Pulled By: ajkr fbshipit-source-id: 004adb75005a26bd01c5d568d1ec6ac442cd59dd	2023-01-13 08:42:44 -08:00
anand76	a510880346	Add a unit test for async prefetch fix in #11049 (#11084 ) Summary: Add a unit test in prefetch_test for https://github.com/facebook/rocksdb/issues/11049 Pull Request resolved: https://github.com/facebook/rocksdb/pull/11084 Test Plan: Verify the test fails without https://github.com/facebook/rocksdb/issues/11049 and passes with it Reviewed By: akankshamahajan15 Differential Revision: D42485828 Pulled By: anand1976 fbshipit-source-id: ae512f2d121745a1f5212645a9b58868976c1f83	2023-01-12 18:09:07 -08:00
Peter Dillinger	9f7801c5f1	Major Cache refactoring, CPU efficiency improvement (#10975 ) Summary: This is several refactorings bundled into one to avoid having to incrementally re-modify uses of Cache several times. Overall, there are breaking changes to Cache class, and it becomes more of low-level interface for implementing caches, especially block cache. New internal APIs make using Cache cleaner than before, and more insulated from block cache evolution. Hopefully, this is the last really big block cache refactoring, because of rather effectively decoupling the implementations from the uses. This change also removes the EXPERIMENTAL designation on the SecondaryCache support in Cache. It seems reasonably mature at this point but still subject to change/evolution (as I warn in the API docs for Cache). The high-level motivation for this refactoring is to minimize code duplication / compounding complexity in adding SecondaryCache support to HyperClockCache (in a later PR). Other benefits listed below. * static_cast lines of code +29 -35 (net removed 6) * reinterpret_cast lines of code +6 -32 (net removed 26) ## cache.h and secondary_cache.h * Always use CacheItemHelper with entries instead of just a Deleter. There are several motivations / justifications: * Simpler for implementations to deal with just one Insert and one Lookup. * Simpler and more efficient implementation because we don't have to track which entries are using helpers and which are using deleters * Gets rid of hack to classify cache entries by their deleter. Instead, the CacheItemHelper includes a CacheEntryRole. This simplifies a lot of code (cache_entry_roles.h almost eliminated). Fixes https://github.com/facebook/rocksdb/issues/9428. * Makes it trivial to adjust SecondaryCache behavior based on kind of block (e.g. don't re-compress filter blocks). * It is arguably less convenient for many direct users of Cache, but direct users of Cache are now rare with introduction of typed_cache.h (below). 
* I considered and rejected an alternative approach in which we reduce customizability by assuming each secondary cache compatible value starts with a Slice referencing the uncompressed block contents (already true or mostly true), but we apparently intend to stack secondary caches. Saving an entry from a compressed secondary to a lower tier requires custom handling offered by SaveToCallback, etc. * Make CreateCallback part of the helper and introduce CreateContext to work with it (alternative to https://github.com/facebook/rocksdb/issues/10562). This cleans up the interface while still allowing context to be provided for loading/parsing values into primary cache. This model works for async lookup in BlockBasedTable reader (reader owns a CreateContext) under the assumption that it always waits on secondary cache operations to finish. (Otherwise, the CreateContext could be destroyed while async operation depending on it continues.) This likely contributes most to the observed performance improvement because it saves an std::function backed by a heap allocation. * Use char* for serialized data, e.g. in SaveToCallback, where void* was confusingly used. (We use `char*` for serialized byte data all over RocksDB, with many advantages over `void*`. `memcpy` etc. are legacy APIs that should not be mimicked.) * Add a type alias Cache::ObjectPtr = void*, so that we can better indicate the intent of the void* when it is to be the object associated with a Cache entry. Related: started (but did not complete) a refactoring to move away from "value" of a cache entry toward "object" or "obj". (It is confusing to call Cache a key-value store (like DB) when it is really storing arbitrary in-memory objects, not byte strings.) * Remove unnecessary key param from DeleterFn. This is good for efficiency in HyperClockCache, which does not directly store the cache key in memory. (Alternative to https://github.com/facebook/rocksdb/issues/10774) * Add allocator to Cache DeleterFn. 
This is a kind of future-proofing change in case we get more serious about using the Cache allocator for memory tracked by the Cache. Right now, only the uncompressed block contents are allocated using the allocator, and a pointer to that allocator is saved as part of the cached object so that the deleter can use it. (See CacheAllocationPtr.) If in the future we are able to "flatten out" our Cache objects some more, it would be good not to have to track the allocator as part of each object. * Removes legacy `ApplyToAllCacheEntries` and changes `ApplyToAllEntries` signature for Deleter->CacheItemHelper change. ## typed_cache.h Adds various "typed" interfaces to the Cache as internal APIs, so that most uses of Cache can use simple type safe code without casting and without explicit deleters, etc. Almost all of the non-test, non-glue code uses of Cache have been migrated. (Follow-up work: CompressedSecondaryCache deserves deeper attention to migrate.) This change expands RocksDB's internal usage of metaprogramming and SFINAE (https://en.cppreference.com/w/cpp/language/sfinae). The existing usages of Cache are divided up at a high level into these new interfaces. See updated existing uses of Cache for examples of how these are used. * PlaceholderCacheInterface - Used for making cache reservations, with entries that have a charge but no value. * BasicTypedCacheInterface<TValue> - Used for primary cache storage of objects of type TValue, which can be cleaned up with std::default_delete<TValue>. The role is provided by TValue::kCacheEntryRole or given in an optional template parameter. * FullTypedCacheInterface<TValue, TCreateContext> - Used for secondary cache compatible storage of objects of type TValue. In addition to BasicTypedCacheInterface constraints, we require TValue::ContentSlice() to return persistable data. This simplifies usage for the normal case of simple secondary cache compatibility (can give you a Slice to the data already in memory). 
In addition to TCreateContext performing the role of Cache::CreateContext, it is also expected to provide a factory function for creating TValue. * For each of these, there's a "Shared" version (e.g. FullTypedSharedCacheInterface) that holds a shared_ptr to the Cache, rather than assuming external ownership by holding only a raw `Cache*`. These interfaces introduce specific handle types for each interface instantiation, so that it's easy to see what kind of object is controlled by a handle. (Ultimately, this might not be worth the extra complexity, but it seems OK so far.) Note: I attempted to make the cache 'charge' automatically inferred from the cache object type, such as by expecting an ApproximateMemoryUsage() function, but this is not so clean because there are cases where we need to compute the charge ahead of time and don't want to re-compute it. ## block_cache.h This header is essentially the replacement for the old block_like_traits.h. It includes various things to support block cache access with typed_cache.h for block-based table. ## block_based_table_reader.cc Before this change, accessing the block cache here was an awkward mix of static polymorphism (template TBlocklike) and switch-case on a dynamic BlockType value. This change mostly unifies on static polymorphism, relying on minor hacks in block_cache.h to distinguish variants of Block. We still check BlockType in some places (especially for stats, which could be improved in follow-up work) but at least the BlockType is a static constant from the template parameter. (No more awkward partial redundancy between static and dynamic info.) This likely contributes to the overall performance improvement, but hasn't been tested in isolation. The other key source of simplification here is a more unified system of creating block cache objects: for directly populating from primary cache and for promotion from secondary cache. Both use BlockCreateContext, for context and for factory functions. 
## block_based_table_builder.cc, cache_dump_load_impl.cc Before this change, warming caches was super ugly code. Both of these source files had switch statements to basically transition from the dynamic BlockType world to the static TBlocklike world. None of that mess is needed anymore as there's a new, untyped WarmInCache function that handles all the details just as promotion from SecondaryCache would. (Fixes `TODO akanksha: Dedup below code` in block_based_table_builder.cc.) ## Everything else Mostly just updating Cache users to use new typed APIs when reasonably possible, or changed Cache APIs when not. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10975 Test Plan: tests updated Performance test setup similar to https://github.com/facebook/rocksdb/issues/10626 (by cache size, LRUCache when not "hyper" for HyperClockCache): 34MB 1thread base.hyper -> kops/s: 0.745 io_bytes/op: 2.52504e+06 miss_ratio: 0.140906 max_rss_mb: 76.4844 34MB 1thread new.hyper -> kops/s: 0.751 io_bytes/op: 2.5123e+06 miss_ratio: 0.140161 max_rss_mb: 79.3594 34MB 1thread base -> kops/s: 0.254 io_bytes/op: 1.36073e+07 miss_ratio: 0.918818 max_rss_mb: 45.9297 34MB 1thread new -> kops/s: 0.252 io_bytes/op: 1.36157e+07 miss_ratio: 0.918999 max_rss_mb: 44.1523 34MB 32thread base.hyper -> kops/s: 7.272 io_bytes/op: 2.88323e+06 miss_ratio: 0.162532 max_rss_mb: 516.602 34MB 32thread new.hyper -> kops/s: 7.214 io_bytes/op: 2.99046e+06 miss_ratio: 0.168818 max_rss_mb: 518.293 34MB 32thread base -> kops/s: 3.528 io_bytes/op: 1.35722e+07 miss_ratio: 0.914691 max_rss_mb: 264.926 34MB 32thread new -> kops/s: 3.604 io_bytes/op: 1.35744e+07 miss_ratio: 0.915054 max_rss_mb: 264.488 233MB 1thread base.hyper -> kops/s: 53.909 io_bytes/op: 2552.35 miss_ratio: 0.0440566 max_rss_mb: 241.984 233MB 1thread new.hyper -> kops/s: 62.792 io_bytes/op: 2549.79 miss_ratio: 0.044043 max_rss_mb: 241.922 233MB 1thread base -> kops/s: 1.197 io_bytes/op: 2.75173e+06 miss_ratio: 0.103093 max_rss_mb: 241.559 
233MB 1thread new -> kops/s: 1.199 io_bytes/op: 2.73723e+06 miss_ratio: 0.10305 max_rss_mb: 240.93 233MB 32thread base.hyper -> kops/s: 1298.69 io_bytes/op: 2539.12 miss_ratio: 0.0440307 max_rss_mb: 371.418 233MB 32thread new.hyper -> kops/s: 1421.35 io_bytes/op: 2538.75 miss_ratio: 0.0440307 max_rss_mb: 347.273 233MB 32thread base -> kops/s: 9.693 io_bytes/op: 2.77304e+06 miss_ratio: 0.103745 max_rss_mb: 569.691 233MB 32thread new -> kops/s: 9.75 io_bytes/op: 2.77559e+06 miss_ratio: 0.103798 max_rss_mb: 552.82 1597MB 1thread base.hyper -> kops/s: 58.607 io_bytes/op: 1449.14 miss_ratio: 0.0249324 max_rss_mb: 1583.55 1597MB 1thread new.hyper -> kops/s: 69.6 io_bytes/op: 1434.89 miss_ratio: 0.0247167 max_rss_mb: 1584.02 1597MB 1thread base -> kops/s: 60.478 io_bytes/op: 1421.28 miss_ratio: 0.024452 max_rss_mb: 1589.45 1597MB 1thread new -> kops/s: 63.973 io_bytes/op: 1416.07 miss_ratio: 0.0243766 max_rss_mb: 1589.24 1597MB 32thread base.hyper -> kops/s: 1436.2 io_bytes/op: 1357.93 miss_ratio: 0.0235353 max_rss_mb: 1692.92 1597MB 32thread new.hyper -> kops/s: 1605.03 io_bytes/op: 1358.04 miss_ratio: 0.023538 max_rss_mb: 1702.78 1597MB 32thread base -> kops/s: 280.059 io_bytes/op: 1350.34 miss_ratio: 0.023289 max_rss_mb: 1675.36 1597MB 32thread new -> kops/s: 283.125 io_bytes/op: 1351.05 miss_ratio: 0.0232797 max_rss_mb: 1703.83 Almost uniformly improving over base revision, especially for hot paths with HyperClockCache, up to 12% higher throughput seen (1597MB, 32thread, hyper). The improvement for that is likely coming from much simplified code for providing context for secondary cache promotion (CreateCallback/CreateContext), and possibly from less branching in block_based_table_reader. And likely a small improvement from not reconstituting key for DeleterFn. Reviewed By: anand1976 Differential Revision: D42417818 Pulled By: pdillinger fbshipit-source-id: f86bfdd584dce27c028b151ba56818ad14f7a432	2023-01-11 14:20:40 -08:00
Changyu Bi	0a2d3b663a	Fix some unit test failure in ExternalSSTFileBasicTest (#11070 ) Summary: valgrind build for `ExternalSSTFileBasicTest/ExternalSSTFileBasicTest.IngestFileWithMixedValueType` and `ExternalSSTFileBasicTest/ExternalSSTFileBasicTest.IngestFileWithGlobalSeqnoPickedSeqno` started failing (see error message in T141554665). I could not repro but I suspect it is due to file ingestion range overlapping with ongoing compaction, which caused a new global seqno being assigned after https://github.com/facebook/rocksdb/issues/10988. Pull Request resolved: https://github.com/facebook/rocksdb/pull/11070 Test Plan: monitor future valgrind tests result. Reviewed By: hx235 Differential Revision: D42319056 Pulled By: cbi42 fbshipit-source-id: acbcd841a2a15e36b278f39ba514f4b9a6ee43ca	2023-01-05 12:10:02 -08:00
Niklas Fiekas	ff04fb154b	Add C API for ReadOptions::async_io (#11062 ) Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/11062 Reviewed By: hx235 Differential Revision: D42297489 Pulled By: ajkr fbshipit-source-id: 03fe1477c1ae1f8af73dc77a6986fdc7025edf4f	2023-01-04 19:36:43 -08:00
ehds	4737e1d41b	fix shared state used after free (#11059 ) Summary: Before this pr, the destruction order is `shared` -> `db_`(StressTest destruction) -> `stress`, but `compaction_filter` of `db_` will hold the `shared` pointer, so `shared` maybe used after free. Pull Request resolved: https://github.com/facebook/rocksdb/pull/11059 Reviewed By: hx235 Differential Revision: D42297366 Pulled By: ajkr fbshipit-source-id: 17b314635359acacd5ba62f9db5f955f451133f7	2023-01-04 19:35:34 -08:00
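The use-after-free in #11059 comes down to object lifetime ordering: anything that holds a raw pointer to `shared` has to be destroyed before `shared` itself. A minimal sketch of the general C++ rule (locals are destroyed in reverse order of declaration) is shown below; `Tracker` and `DestructionOrder` are hypothetical names for illustration and do not reproduce the actual db_stress code or the exact fix.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper that records its name when it is destroyed.
struct Tracker {
  Tracker(std::vector<std::string>* log, std::string name)
      : log_(log), name_(std::move(name)) {}
  ~Tracker() { log_->push_back(name_); }

  std::vector<std::string>* log_;
  std::string name_;
};

// Returns the order in which the two locals were destroyed.
std::vector<std::string> DestructionOrder() {
  std::vector<std::string> log;
  {
    Tracker shared(&log, "shared");  // declared first, destroyed last
    Tracker stress(&log, "stress");  // declared second, destroyed first
  }
  return log;
}
```

Calling `DestructionOrder()` yields `stress` followed by `shared`, so an object declared after `shared` can safely keep a pointer to it until its own destructor has finished.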

11926 commits