rocksdb/db_stress_tool
Peter Dillinger b3c54186ab Allow TryAgain in db_stress with optimistic txn, and refactoring (#11653)
Summary:
In rare cases, optimistic transaction commit returns TryAgain. This change tolerates that intentional behavior in db_stress, up to a small number of consecutive occurrences. This way, we don't miss a possible regression with excessive TryAgain, and trying again (rolling back the transaction and redoing it) should have a good, renewed chance of success, as the writes will be associated with fresh sequence numbers.
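
The bounded tolerance described above can be sketched roughly as follows. This is a minimal illustration, not the db_stress code: `Code`, `RunWithBoundedRetries`, and the `max_consecutive_try_again` parameter are hypothetical names, and `Code` is only a stand-in for RocksDB's status type. Each call to `attempt` is assumed to build and commit a brand-new transaction.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for a status code; only the cases needed here.
enum class Code { kOk, kTryAgain, kOtherError };

// Calls `attempt` until it returns something other than kTryAgain, or until
// more than `max_consecutive_try_again` TryAgain results occur in a row.
// Assumption: every call to `attempt` redoes the whole transaction from
// scratch instead of re-committing the old one.
Code RunWithBoundedRetries(const std::function<Code()>& attempt,
                           int max_consecutive_try_again) {
  assert(max_consecutive_try_again >= 0);
  int try_again_count = 0;
  for (;;) {
    Code c = attempt();
    if (c != Code::kTryAgain) {
      return c;  // success or an unrelated error: report as-is
    }
    if (++try_again_count > max_consecutive_try_again) {
      return Code::kTryAgain;  // too many in a row: surface as a failure
    }
  }
}
```

Keeping the limit small means an occasional TryAgain is absorbed, while a run of them is still reported to the caller as a failure.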

Also, some of the APIs were not clear about Transaction semantics, so I have clarified:
* (Best I can tell....) Destroying a Transaction is safe without calling Rollback() (or at least should be). I don't know why it's a common pattern in our test code and examples to rollback before unconditional destruction. Stress test updated not to call Rollback unnecessarily (to test safe destruction).
* Although it is literally what the status asks for, simply calling Commit() again after it returns TryAgain has no chance of success, because the transaction is bound to the DB state at the time of the operations performed before Commit. Similar logic applies to Busy AFAIK. Commit() API comments updated, and unit test in optimistic_transaction_test expanded.
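
To illustrate why re-calling Commit() on the same transaction cannot help, here is a toy model. It is an assumption-laden sketch and not how RocksDB validates commits: `ToyDb`, `ToyTxn`, and `Code::kConflict` are hypothetical, and `kConflict` merely stands in for a TryAgain or Busy result. The only point is that the transaction validates against state captured when its operations were issued, so only a new transaction sees fresh state.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

enum class Code { kOk, kConflict };

// Toy store that stamps every write with an increasing sequence number.
struct ToyDb {
  uint64_t seq = 0;
  std::map<std::string, uint64_t> last_write_seq;
  void Put(const std::string& key) { last_write_seq[key] = ++seq; }
};

// Toy optimistic transaction: each buffered write remembers the DB sequence
// number observed when it was issued, and Commit() validates against it.
class ToyTxn {
 public:
  explicit ToyTxn(ToyDb* db) : db_(db) { assert(db != nullptr); }

  void Put(const std::string& key) { tracked_[key] = db_->seq; }

  Code Commit() {
    // Validation uses the sequence numbers captured in Put(), which never
    // change, so a failed validation fails the same way on every retry.
    for (const auto& entry : tracked_) {
      auto it = db_->last_write_seq.find(entry.first);
      if (it != db_->last_write_seq.end() && it->second > entry.second) {
        return Code::kConflict;
      }
    }
    for (const auto& entry : tracked_) {
      db_->Put(entry.first);
    }
    return Code::kOk;
  }

 private:
  ToyDb* db_;
  std::map<std::string, uint64_t> tracked_;
};
```

In this model, a transaction that loses to a concurrent write keeps returning `kConflict` however often Commit() is called, while a new transaction issuing the same write afterwards commits cleanly.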

Also also, because I can't stop myself, I refactored a good portion of the transaction handling code in db_stress.
* Avoid existing and new copy-paste for most transaction interactions with a new ExecuteTransaction (higher-order) function.
* Use unique_ptr (nicely complements removing unnecessary Rollbacks)
* Abstract out a pattern for safely calling std::terminate() and use it in more places. (The TryAgain errors we saw did not have stack traces because of "terminate called recursively".)
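
A higher-order helper of the kind mentioned above might look roughly like this. The sketch does not reproduce the actual ExecuteTransaction signature in db_stress: `FakeTxn`, `ExecuteTxnSketch`, and `Code` are hypothetical stand-ins, chosen so the example compiles without RocksDB. It shows the caller supplying only the operations, with `unique_ptr` destroying the transaction on every path without an explicit Rollback().

```cpp
#include <cassert>
#include <functional>
#include <memory>

enum class Code { kOk, kError };

// Hypothetical transaction stand-in. It counts live objects and rollbacks so
// that the destruction path is observable.
struct FakeTxn {
  static int live;
  static int rollbacks;
  FakeTxn() { ++live; }
  ~FakeTxn() { --live; }
  Code Commit() { return Code::kOk; }
  void Rollback() { ++rollbacks; }
};
int FakeTxn::live = 0;
int FakeTxn::rollbacks = 0;

// Owns the transaction lifecycle; the caller passes only the operations.
Code ExecuteTxnSketch(const std::function<Code(FakeTxn&)>& ops) {
  assert(ops != nullptr);
  std::unique_ptr<FakeTxn> txn(new FakeTxn());
  Code c = ops(*txn);
  if (c != Code::kOk) {
    return c;  // txn is destroyed here, with no Rollback() call
  }
  return txn->Commit();  // txn is destroyed after the result is computed
}
```

Putting begin, commit, and cleanup in one place is what removes the copy-paste at call sites; each caller shrinks to a lambda containing its reads and writes.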

Intended follow-up: resurrect use of `FLAGS_rollback_one_in` but also include non-trivial cases

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11653

Test Plan:
this is the test :)

Also, temporarily bypassed the new retry logic and boosted the chance of hitting TryAgain. Quickly reproduced the TryAgain error. Then re-enabled the new retry logic, and was not able to hit the error after running for tens of minutes, even with the boosted chances.

Reviewed By: cbi42

Differential Revision: D47882995

Pulled By: pdillinger

fbshipit-source-id: 21eadb1525423340dbf28d17cf166b9583311a0d
2023-07-28 16:25:29 -07:00
CMakeLists.txt Improve comment of ExpectedValue in db stress (#11456) 2023-05-18 09:44:15 -07:00
batched_ops_stress.cc Support parallel read and write/delete to same key in NonBatchedOpsStressTest (#11058) 2023-05-15 15:34:22 -07:00
cf_consistency_stress.cc Extend the stress test coverage of MultiGetEntity (#11336) 2023-03-29 20:35:15 -07:00
db_stress.cc Disable tiered storage + BlobDB stress test (#10699) 2022-09-19 15:39:31 -07:00
db_stress_common.cc Increase the stress test coverage of GetEntity (#11303) 2023-03-17 14:47:29 -07:00
db_stress_common.h Stress/Crash Test for OptimisticTransactionDB (#11513) 2023-06-17 16:27:37 -07:00
db_stress_compaction_filter.h Enable compaction filter for db_stress with user-defined timestamp (#10259) 2022-06-27 11:53:09 -07:00
db_stress_driver.cc Group rocksdb.sst.read.micros stat by IOActivity flush and compaction (#11288) 2023-04-21 09:07:18 -07:00
db_stress_driver.h fix shared state used after free (#11059) 2023-01-04 19:35:34 -08:00
db_stress_env_wrapper.h Group rocksdb.sst.read.micros stat by IOActivity flush and compaction (#11288) 2023-04-21 09:07:18 -07:00
db_stress_gflags.cc Stress/Crash Test for OptimisticTransactionDB (#11513) 2023-06-17 16:27:37 -07:00
db_stress_listener.cc Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
db_stress_listener.h Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
db_stress_shared_state.cc Remove ROCKSDB_SUPPORT_THREAD_LOCAL define because it's a part of C++11 (#10015) 2022-05-18 15:25:19 -07:00
db_stress_shared_state.h Allow TryAgain in db_stress with optimistic txn, and refactoring (#11653) 2023-07-28 16:25:29 -07:00
db_stress_stat.cc Fix Statistics in db_stress (#9260) 2021-12-07 16:24:22 -08:00
db_stress_stat.h Fix Statistics in db_stress (#9260) 2021-12-07 16:24:22 -08:00
db_stress_table_properties_collector.h Fix and detect headers with missing dependencies (#8893) 2021-09-10 10:00:26 -07:00
db_stress_test_base.cc Allow TryAgain in db_stress with optimistic txn, and refactoring (#11653) 2023-07-28 16:25:29 -07:00
db_stress_test_base.h Allow TryAgain in db_stress with optimistic txn, and refactoring (#11653) 2023-07-28 16:25:29 -07:00
db_stress_tool.cc Stress/Crash Test for OptimisticTransactionDB (#11513) 2023-06-17 16:27:37 -07:00
expected_state.cc Improve comment of ExpectedValue in db stress (#11456) 2023-05-18 09:44:15 -07:00
expected_state.h Improve comment of ExpectedValue in db stress (#11456) 2023-05-18 09:44:15 -07:00
expected_value.cc Improve comment of ExpectedValue in db stress (#11456) 2023-05-18 09:44:15 -07:00
expected_value.h Refactor WriteUnpreparedStressTest to be a unit test (#11424) 2023-05-22 12:31:52 -07:00
multi_ops_txns_stress.cc Allow TryAgain in db_stress with optimistic txn, and refactoring (#11653) 2023-07-28 16:25:29 -07:00
multi_ops_txns_stress.h Group rocksdb.sst.read.micros stat by IOActivity flush and compaction (#11288) 2023-04-21 09:07:18 -07:00
no_batched_ops_stress.cc Allow TryAgain in db_stress with optimistic txn, and refactoring (#11653) 2023-07-28 16:25:29 -07:00