rocksdb/unreleased_history/bug_fixes/open_sst_file_leaks.md
Peter Dillinger 485ee4f45c Fix and test for leaks of open SST files (#13117)
Summary:
Follow-up to https://github.com/facebook/rocksdb/issues/13106 which revealed that some SST file readers (in addition to blob files) were being essentially leaked in TableCache (until DB::Close() time). Patched sources of leaks:
* Flush that is not committed (builder.cc)
* Various obsolete SST files picked up by directory scan but not caught by SubcompactionState::Cleanup() cleaning up from some failed compactions. Dozens of unit tests fail without the "backstop" TableCache::Evict() call in PurgeObsoleteFiles().

We also needed to adjust the check for leaks as follows:
* Ok if DB::Open never finished (see comment)
* Ok if deletions are disabled (see comment)
* Allow "quarantined" files to be in table_cache because (presumably) they might become live again.
* Get live files from all live Versions.

Suggested follow-up:
* Potentially delete more obsolete files sooner with a FIXME in db_impl_files.cc. This could potentially be high value because it seems to gate deletion of any/all newer obsolete files on all older compactions finishing.
* Try to catch obsolete files in more places using the VersionSet::obsolete_files_ pipeline rather than relying on them being picked up with directory scan, or deleting them outside of normal mechanisms.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/13117

Test Plan: updated check used in most all unit tests in ASAN build

Reviewed By: hx235

Differential Revision: D65502988

Pulled By: pdillinger

fbshipit-source-id: aa0795a8a09d9ec578d25183fe43e2a35849209c
2024-11-08 10:54:43 -08:00

186 B

  • Fix leaks of some open SST files (until DB::Close()) that are written but never become live due to various failures. (We now have a check for such leaks with no outstanding issues.)