rocksdb/db/db_impl
Peter Dillinger 6de7081cf3 Always verify SST unique IDs on SST file open (#10532)
Summary:
Although we've been tracking SST unique IDs in the DB manifest
unconditionally, checking has been opt-in and with an extra pass at DB::Open
time. This changes the behavior of `verify_sst_unique_id_in_manifest` to
check unique ID against manifest every time an SST file is opened through
table cache (normal DB operations), replacing the explicit pass over files
at DB::Open time. This change also enables the option by default and
removes the "EXPERIMENTAL" designation.

One possible criticism is that the option no longer ensures the integrity
of a DB at Open time. This is far from an all-or-nothing issue. Verifying
the IDs of all SST files hardly ensures all the data in the DB is readable.
(VerifyChecksum is supposed to do that.) Also, with
max_open_files=-1 (default, extremely common), all SST files are
opened at DB::Open time anyway.

Implementation details:
* `VerifySstUniqueIdInManifest()` functions are the extra/explicit pass
that is now removed.
* Unit tests that manipulate/corrupt table properties have to opt out of
this check, because that corrupts the "actual" unique id. (And even for
testing we don't currently have a mechanism to set "no unique id"
in the in-memory file metadata for new files.)
* A lot of other unit test churn relates to (a) default checking on, and
(b) checking on SST open even without DB::Open (e.g. on flush)
* Use `FileMetaData` for more `TableCache` operations (in place of
`FileDescriptor`) so that we have access to the unique_id whenever
we might need to open an SST file. **There is the possibility of
performance impact because we can no longer use the more
localized `fd` part of an `FdWithKeyRange` but instead follow the
`file_metadata` pointer. However, this change (possible regression)
is only done for `GetMemoryUsageByTableReaders`.**
* Removed a completely unnecessary constructor overload of
`TableReaderOptions`

Possible follow-up:
* Verification only happens when opening through table cache. Are there
more places where this should happen?
* Improve error message when there is a file size mismatch vs. manifest
(FIXME added in the appropriate place).
* I'm not sure there's a justification for `FileDescriptor` to be distinct from
`FileMetaData`.
* I'm skeptical that `FdWithKeyRange` really still makes sense for
optimizing some data locality by duplicating some data in memory, but I
could be wrong.
* An unnecessary overload of NewTableReader was recently added, in
the public API nonetheless (though unusable there). It should be cleaned
up to put most things under `TableReaderOptions`.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10532

Test Plan:
updated unit tests

Performance test showing no significant difference (just noise I think):
`./db_bench -benchmarks=readwhilewriting[-X10] -num=3000000 -disable_wal=1 -bloom_bits=8 -write_buffer_size=1000000 -target_file_size_base=1000000`
Before: readwhilewriting [AVG 10 runs] : 68702 (± 6932) ops/sec
After: readwhilewriting [AVG 10 runs] : 68239 (± 7198) ops/sec

Reviewed By: jay-zhuang

Differential Revision: D38765551

Pulled By: pdillinger

fbshipit-source-id: a827a708155f12344ab2a5c16e7701c7636da4c2
2022-09-07 22:52:42 -07:00
..
compacted_db_impl.cc Fix periodic_task unable to re-register the same task type (#10379) 2022-08-25 18:52:37 -07:00
compacted_db_impl.h Don't wait for indirect flush in read-only DB (#10569) 2022-08-29 13:36:23 -07:00
db_impl.cc Skip swaths of range tombstone covered keys in merging iterator (2022 edition) (#10449) 2022-09-02 09:51:19 -07:00
db_impl.h Always verify SST unique IDs on SST file open (#10532) 2022-09-07 22:52:42 -07:00
db_impl_compaction_flush.cc Sync dir containing CURRENT after RenameFile on CURRENT as much as possible (#10573) 2022-08-29 17:35:21 -07:00
db_impl_debug.cc Fix periodic_task unable to re-register the same task type (#10379) 2022-08-25 18:52:37 -07:00
db_impl_experimental.cc Remove unused fields from FileMetaData (temporarily) (#10443) 2022-08-01 17:56:13 -07:00
db_impl_files.cc Do not hold mutex when write keys if not necessary (#7516) 2022-07-21 13:35:36 -07:00
db_impl_open.cc Always verify SST unique IDs on SST file open (#10532) 2022-09-07 22:52:42 -07:00
db_impl_readonly.cc Skip swaths of range tombstone covered keys in merging iterator (2022 edition) (#10449) 2022-09-02 09:51:19 -07:00
db_impl_readonly.h Don't wait for indirect flush in read-only DB (#10569) 2022-08-29 13:36:23 -07:00
db_impl_secondary.cc Skip swaths of range tombstone covered keys in merging iterator (2022 edition) (#10449) 2022-09-02 09:51:19 -07:00
db_impl_secondary.h Don't wait for indirect flush in read-only DB (#10569) 2022-08-29 13:36:23 -07:00
db_impl_write.cc Sync dir containing CURRENT after RenameFile on CURRENT as much as possible (#10573) 2022-08-29 17:35:21 -07:00