mirror of
https://github.com/facebook/rocksdb.git
synced 2024-11-29 09:36:17 +00:00
92dc5f3e67
Summary: I have finally tracked down and fixed a bug affecting AutoHCC that was causing CI crash test assertion failures in AutoHCC when using secondary cache, but I was only able to reproduce locally a couple of times, after very long runs/repetitions. It turns out that the essential feature used by secondary cache to trigger the bug is Insert without keeping a handle, which is otherwise rarely used in RocksDB and not incorporated into cache_bench (also used for targeted correctness stress testing) until this change (new option `-blind_insert_percent`). The problem was in copying some logic from FixedHCC that makes the entry "sharable" but unreferenced once populated, if no reference is to be saved. The problem in AutoHCC is that we can only add the entry to a chain after it is in the sharable state, and it must be removed from the chain while in the "under (de)construction" state and before it is back in the "empty" state. Also, it is possible for Lookup to find entries that are not connected to any chain, by design for efficiency, and for Release to erase_if_last_ref. Therefore, we could have * Thread 1 starts to Insert a cache entry without keeping a ref, and pauses before adding to the chain. * Thread 2 finds it with Lookup optimizations, and then does Release with `erase_if_last_ref=true`, causing it to trigger erasure on the entry. It successfully locks the home chain for the entry and purges any entries pending erasure. It is OK that this entry is not found on the chain, as another thread is allowed to remove it from the chain before we are able to (but after it is marked for (de)construction). And after the purge of the chain, the entry is marked empty. * Thread 1 resumes in adding the slot (presumed entry) to the home chain for what was being inserted, but that now violates invariants and sets up a race or double-chain-reference, as another thread could insert a new entry in the slot and try to insert into a different chain. 
This is easily fixed by holding on to a reference until inserted onto the chain. Pull Request resolved: https://github.com/facebook/rocksdb/pull/12046 Test Plan: As I don't have a reliable local reproducer, I triggered 20 runs of internal CI on fbcode_blackbox_crash_test that were previously failing in AutoHCC with about 1/3 probability, and they all passed. Also re-enabling AutoHCC in the crash test with this change. (Revert https://github.com/facebook/rocksdb/issues/12000) Reviewed By: jowlyzhang Differential Revision: D51016979 Pulled By: pdillinger fbshipit-source-id: 3840fb829d65b97c779d8aed62a4a4a433aeff2b |
||
---|---|---|
.. | ||
behavior_changes | ||
bug_fixes | ||
new_features | ||
performance_improvements | ||
public_api_changes | ||
add.sh | ||
README.txt | ||
release.sh |
Adding release notes
--------------------

When adding release notes for the next release, add a file to one of these
directories:

  unreleased_history/new_features
  unreleased_history/behavior_changes
  unreleased_history/public_api_changes
  unreleased_history/bug_fixes

with a unique name that makes sense for your change, preferably using the .md
extension for syntax highlighting.

There is a script to help, as in

  $ unreleased_history/add.sh unreleased_history/bug_fixes/crash_in_feature.md

or simply

  $ unreleased_history/add.sh

will take you through some prompts.

The file should usually contain one line of markdown, and "* " is not
required, as it will automatically be inserted later if not included at the
start of the first line in the file. Extra newlines or missing trailing
newlines will also be corrected.

The only times release notes should be added directly to HISTORY are if
* A release is being amended or corrected after it is already "cut" but not
tagged, which should be rare.
* A single commit contains a noteworthy change and a patch release version bump

Ordering of entries
-------------------

Within each group, entries will be included using ls sort order, so important
entries could start their file name with a small three digit number like
100pretty_important.md.

The ordering of groups such as new_features vs. public_api_changes is
hard-coded in unreleased_history/release.sh

Updating HISTORY.md with release notes
--------------------------------------

The script unreleased_history/release.sh does this. Run the script before
updating version.h to the next development release, so that the script will
pick up the version being released. You might want to start with

  $ DRY_RUN=1 unreleased_history/release.sh | less

to check for problems and preview the output. Then run

  $ unreleased_history/release.sh

which will git rm some files and modify HISTORY.md. You still need to commit
the changes, or revert with the command reported in the output.

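To make the rules above concrete (automatic "* " prefix, newline correction,
and ls sort order within a group), here is a minimal sketch. It is not the
code in unreleased_history/release.sh: the helper names normalize_note and
collect_group are hypothetical, and forcing LC_ALL=C is an assumption made
here only so that the ls order is stable across machines.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; NOT the actual logic of release.sh.

# normalize_note FILE (hypothetical helper): print the note with leading
# blank lines and extra trailing newlines removed, "* " prepended when
# missing, and exactly one trailing newline.
normalize_note() {
  local body
  # sed deletes leading blank lines; $(...) strips all trailing newlines.
  body="$(sed -e '/./,$!d' "$1")"
  case "$body" in
    "* "*) ;;              # first line already starts with the bullet
    *) body="* $body" ;;   # otherwise insert it
  esac
  printf '%s\n' "$body"
}

# collect_group DIR (hypothetical helper): emit every note of one group in
# ls sort order. LC_ALL=C is an assumption of this sketch.
collect_group() {
  LC_ALL=C ls "$1" | while IFS= read -r name; do
    normalize_note "$1/$name"
  done
}
```

With these helpers, running collect_group on a directory such as
unreleased_history/bug_fixes would list a file named 100pretty_important.md
before crash_in_feature.md, because digits sort before lowercase letters in
the C locale.
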
Why not update HISTORY.md directly?
-----------------------------------

First, it was common to hit unnecessary merge conflicts when adding entries to
HISTORY.md, which slowed development. Second, when a PR was opened before a
release cut and landed after the release cut, it was easy to add the HISTORY
entry to the wrong version's history. This new setup completely fixes both of
those issues, with perhaps slightly more initial work to create each entry.
There is also now an extra step in using `git blame` to map a release note
to its source code implementation, but that is a relatively rare operation.