rocksdb/db
Zhichao Cao a904c62d28 Using existing crc32c checksum in checksum handoff for Manifest and WAL (#8412)
Summary:
In PR https://github.com/facebook/rocksdb/issues/7523 , checksum handoff is introduced in RocksDB for WAL, Manifest, and SST files. When user enable checksum handoff for a certain type of file, before the data is written to the lower layer storage system, we calculate the checksum (crc32c) of each piece of data and pass the checksum down with the data, such that data verification can be down by the lower layer storage system if it has the capability. However, it cannot cover the whole lifetime of the data in the memory and also it potentially introduces extra checksum calculation overhead.

In this PR, we introduce a new interface in WritableFileWriter::Append, which allows the caller be able to pass the data and the checksum (crc32c) together. In this way, WritableFileWriter can directly use the pass-in checksum (crc32c) to generate the checksum of data being passed down to the storage system. It saves the calculation overhead and achieves higher protection coverage. When a new checksum is added with the data, we use Crc32cCombine https://github.com/facebook/rocksdb/issues/8305 to combine the existing checksum and the new checksum. To avoid the segmenting of data by rate-limiter before it is stored, rate-limiter is called enough times to accumulate enough credits for a certain write. This design only support Manifest and WAL which use log_writer in the current stage.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8412

Test Plan: make check, add new testing cases.

Reviewed By: anand1976

Differential Revision: D29151545

Pulled By: zhichao-cao

fbshipit-source-id: 75e2278c5126cfd58393c67b1efd18dcc7a30772
2021-06-25 00:47:17 -07:00
..
blob Using existing crc32c checksum in checksum handoff for Manifest and WAL (#8412) 2021-06-25 00:47:17 -07:00
compaction Using existing crc32c checksum in checksum handoff for Manifest and WAL (#8412) 2021-06-25 00:47:17 -07:00
db_impl Using existing crc32c checksum in checksum handoff for Manifest and WAL (#8412) 2021-06-25 00:47:17 -07:00
arena_wrapped_db_iter.cc Rename ImmutableOptions variables (#8409) 2021-06-16 16:51:38 -07:00
arena_wrapped_db_iter.h Rename ImmutableOptions variables (#8409) 2021-06-16 16:51:38 -07:00
builder.cc Using existing crc32c checksum in checksum handoff for Manifest and WAL (#8412) 2021-06-25 00:47:17 -07:00
builder.h Added memtable garbage statistics (#8411) 2021-06-18 04:57:27 -07:00
c.cc Expose CompressionOptions::parallel_threads through C API (#8302) 2021-05-17 22:53:04 -07:00
c_test.c Expose CompressionOptions::parallel_threads through C API (#8302) 2021-05-17 22:53:04 -07:00
column_family.cc Use DbSessionId as cache key prefix when secondary cache is enabled (#8360) 2021-06-10 11:02:43 -07:00
column_family.h Use DbSessionId as cache key prefix when secondary cache is enabled (#8360) 2021-06-10 11:02:43 -07:00
column_family_test.cc Add CreateFrom methods to Env/FileSystem (#8174) 2021-06-15 03:43:48 -07:00
compact_files_test.cc Compaction should not move data to up level (#8116) 2021-03-29 17:10:42 -07:00
comparator_db_test.cc
convenience.cc Make ImmutableOptions struct that inherits from ImmutableCFOptions and ImmutableDBOptions (#8262) 2021-05-05 14:00:17 -07:00
corruption_test.cc Add CreateFrom methods to Env/FileSystem (#8174) 2021-06-15 03:43:48 -07:00
cuckoo_table_db_test.cc Revert "Turn on memtable bloom filter by default. (#6584)" (#7939) 2021-02-06 22:34:30 -08:00
db_basic_test.cc Initial support for secondary cache in LRUCache (#8271) 2021-05-13 22:58:40 -07:00
db_block_cache_test.cc Cache warming data blocks during flush (#8242) 2021-06-17 21:56:47 -07:00
db_bloom_filter_test.cc Add table properties for number of entries added to filters (#8323) 2021-05-21 17:11:32 -07:00
db_compaction_filter_test.cc Allow applying `CompactionFilter` outside of compaction (#8243) 2021-05-07 16:01:40 -07:00
db_compaction_test.cc Fix DeleteFilesInRange may cause inconsistent compaction error (#8434) 2021-06-22 09:17:37 -07:00
db_dynamic_level_test.cc Add a SystemClock class to capture the time functions of an Env (#7858) 2021-01-25 22:09:11 -08:00
db_encryption_test.cc Improvements to Env::GetChildren (#7819) 2021-01-09 09:44:34 -08:00
db_filesnapshot.cc Fix `GetLiveFiles()` returning OPTIONS-000000 (#8268) 2021-05-05 12:54:46 -07:00
db_flush_test.cc Added memtable garbage statistics (#8411) 2021-06-18 04:57:27 -07:00
db_info_dumper.cc Fix write-ahead log file size overflow (#7870) 2021-01-19 13:47:48 -08:00
db_info_dumper.h
db_inplace_update_test.cc
db_io_failure_test.cc Add more tests to ASSERT_STATUS_CHECKED (3), API change (#7715) 2021-01-06 14:15:02 -08:00
db_iter.cc Rename ImmutableOptions variables (#8409) 2021-06-16 16:51:38 -07:00
db_iter.h Rename ImmutableOptions variables (#8409) 2021-06-16 16:51:38 -07:00
db_iter_stress_test.cc Make ImmutableOptions struct that inherits from ImmutableCFOptions and ImmutableDBOptions (#8262) 2021-05-05 14:00:17 -07:00
db_iter_test.cc Rename ImmutableOptions variables (#8409) 2021-06-16 16:51:38 -07:00
db_iterator_test.cc Instantiate tests DBIteratorTestForPinnedData (#8051) 2021-03-12 12:31:29 -08:00
db_kv_checksum_test.cc Integrity protection for live updates to WriteBatch (#7748) 2021-01-29 12:18:58 -08:00
db_log_iter_test.cc Add a SystemClock class to capture the time functions of an Env (#7858) 2021-01-25 22:09:11 -08:00
db_logical_block_size_cache_test.cc
db_memtable_test.cc Make ImmutableOptions struct that inherits from ImmutableCFOptions and ImmutableDBOptions (#8262) 2021-05-05 14:00:17 -07:00
db_merge_operand_test.cc Support for Merge in Integrated BlobDB with base values (#8292) 2021-06-10 12:58:37 -07:00
db_merge_operator_test.cc
db_options_test.cc Add ObjectRegistry to ConfigOptions (#8166) 2021-05-11 06:47:22 -07:00
db_properties_test.cc Assert unlimited max_open_files for FIFO compaction. (#8172) 2021-04-14 12:05:47 -07:00
db_range_del_test.cc Fix manual compaction `max_compaction_bytes` under-calculated issue (#8269) 2021-05-21 14:03:44 -07:00
db_secondary_test.cc Add remote compaction public API (#8300) 2021-05-19 21:41:31 -07:00
db_sst_test.cc Fix flaky failure in DBSSTest.DBWithSstFileManagerForBlobFilesWithGC (#8196) 2021-04-15 20:18:57 -07:00
db_statistics_test.cc Add more tests for assert status checked (#7524) 2020-12-22 23:45:58 -08:00
db_table_properties_test.cc Add more tests for assert status checked (#7524) 2020-12-22 23:45:58 -08:00
db_tailing_iter_test.cc Add more tests to ASSERT_STATUS_CHECKED (3), API change (#7715) 2021-01-06 14:15:02 -08:00
db_test.cc Print blob file checksums as hex (#8437) 2021-06-22 09:49:44 -07:00
db_test2.cc Cancel compact range (#8351) 2021-06-07 11:41:31 -07:00
db_test_util.cc Add CreateFrom methods to Env/FileSystem (#8174) 2021-06-15 03:43:48 -07:00
db_test_util.h Cache warming data blocks during flush (#8242) 2021-06-17 21:56:47 -07:00
db_universal_compaction_test.cc Revert "Turn on memtable bloom filter by default. (#6584)" (#7939) 2021-02-06 22:34:30 -08:00
db_wal_test.cc Use DbSessionId as cache key prefix when secondary cache is enabled (#8360) 2021-06-10 11:02:43 -07:00
db_with_timestamp_basic_test.cc Fix a bug in key comparison when index type is kBinarySearchWithFirstKey (#8062) 2021-03-15 17:44:52 -07:00
db_with_timestamp_compaction_test.cc
db_write_buffer_manager_test.cc Stall writes in WriteBufferManager when memory_usage exceeds buffer_size (#7898) 2021-04-21 13:54:02 -07:00
db_write_test.cc Add more tests to ASSERT_STATUS_CHECKED (4) (#7718) 2020-12-22 15:09:39 -08:00
dbformat.cc
dbformat.h Fix some typos in comments (#8066) 2021-03-25 21:18:08 -07:00
dbformat_test.cc
deletefile_test.cc Add more tests to ASSERT_STATUS_CHECKED (4) (#7718) 2020-12-22 15:09:39 -08:00
error_handler.cc All the NoSpace() errors will be handled by regular SetBGError and RecoverFromNoSpace() (#8376) 2021-06-11 14:48:28 -07:00
error_handler.h Fix some typos in comments (#8066) 2021-03-25 21:18:08 -07:00
error_handler_fs_test.cc All the NoSpace() errors will be handled by regular SetBGError and RecoverFromNoSpace() (#8376) 2021-06-11 14:48:28 -07:00
event_helpers.cc Add table properties for number of entries added to filters (#8323) 2021-05-21 17:11:32 -07:00
event_helpers.h
experimental.cc
external_sst_file_basic_test.cc Sync ingested files only if reopen is supported by the FS (#8296) 2021-05-18 19:33:55 -07:00
external_sst_file_ingestion_job.cc Sync ingested files only if reopen is supported by the FS (#8296) 2021-05-18 19:33:55 -07:00
external_sst_file_ingestion_job.h Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
external_sst_file_test.cc Deflake ExternalSSTFileTest.PickedLevelBug (#8307) 2021-05-20 09:29:57 -07:00
fault_injection_test.cc Add more tests for assert status checked (#7524) 2020-12-22 23:45:58 -08:00
file_indexer.cc
file_indexer.h
file_indexer_test.cc
filename_test.cc
flush_job.cc Added memtable garbage statistics (#8411) 2021-06-18 04:57:27 -07:00
flush_job.h Use SST file manager to track blob files as well (#8037) 2021-03-17 20:44:49 -07:00
flush_job_test.cc Use DbSessionId as cache key prefix when secondary cache is enabled (#8360) 2021-06-10 11:02:43 -07:00
flush_scheduler.cc
flush_scheduler.h Include C++ standard library headers instead of C compatibility headers (#8068) 2021-03-19 12:09:47 -07:00
forward_iterator.cc Fix some typos in comments (#8066) 2021-03-25 21:18:08 -07:00
forward_iterator.h
forward_iterator_bench.cc
import_column_family_job.cc Add a SystemClock class to capture the time functions of an Env (#7858) 2021-01-25 22:09:11 -08:00
import_column_family_job.h Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
import_column_family_test.cc No elide constructors (#7798) 2020-12-23 16:55:53 -08:00
internal_stats.cc Pin CacheEntryStatsCollector to fix performance bug (#8385) 2021-06-14 08:15:11 -07:00
internal_stats.h Pin CacheEntryStatsCollector to fix performance bug (#8385) 2021-06-14 08:15:11 -07:00
job_context.h Rename ImmutableOptions variables (#8409) 2021-06-16 16:51:38 -07:00
kv_checksum.h Integrity protection for live updates to WriteBatch (#7748) 2021-01-29 12:18:58 -08:00
listener_test.cc Add more tests for assert status checked (#7524) 2020-12-22 23:45:58 -08:00
log_format.h
log_reader.cc
log_reader.h
log_test.cc Make StringEnv, StringSink, StringSource use FS classes (#7786) 2021-01-04 16:01:01 -08:00
log_writer.cc Using existing crc32c checksum in checksum handoff for Manifest and WAL (#8412) 2021-06-25 00:47:17 -07:00
log_writer.h Include C++ standard library headers instead of C compatibility headers (#8068) 2021-03-19 12:09:47 -07:00
logs_with_prep_tracker.cc
logs_with_prep_tracker.h Include C++ standard library headers instead of C compatibility headers (#8068) 2021-03-19 12:09:47 -07:00
lookup_key.h
malloc_stats.cc
malloc_stats.h
manual_compaction_test.cc Add more tests for assert status checked (#7524) 2020-12-22 23:45:58 -08:00
memtable.cc Make ImmutableOptions struct that inherits from ImmutableCFOptions and ImmutableDBOptions (#8262) 2021-05-05 14:00:17 -07:00
memtable.h Make ImmutableOptions struct that inherits from ImmutableCFOptions and ImmutableDBOptions (#8262) 2021-05-05 14:00:17 -07:00
memtable_list.cc Fix some typos in comments (#8066) 2021-03-25 21:18:08 -07:00
memtable_list.h
memtable_list_test.cc Use DbSessionId as cache key prefix when secondary cache is enabled (#8360) 2021-06-10 11:02:43 -07:00
merge_context.h Add Merge Operator support to WriteBatchWithIndex (#8135) 2021-05-10 12:50:25 -07:00
merge_helper.cc Add support for Merge with base value during Compaction in IntegratedBlobDB (#8445) 2021-06-24 18:11:30 -07:00
merge_helper.h Add support for Merge with base value during Compaction in IntegratedBlobDB (#8445) 2021-06-24 18:11:30 -07:00
merge_helper_test.cc
merge_operator.cc
merge_test.cc MergeHelper::FilterMerge() calling ElapsedNanosSafe() upon exit even … (#7867) 2021-01-21 13:13:02 -08:00
obsolete_files_test.cc Print blob file checksums as hex (#8437) 2021-06-22 09:49:44 -07:00
options_file_test.cc No elide constructors (#7798) 2020-12-23 16:55:53 -08:00
output_validator.cc
output_validator.h Add remote compaction public API (#8300) 2021-05-19 21:41:31 -07:00
perf_context_test.cc Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
periodic_work_scheduler.cc Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
periodic_work_scheduler.h Add a SystemClock class to capture the time functions of an Env (#7858) 2021-01-25 22:09:11 -08:00
periodic_work_scheduler_test.cc Add a SystemClock class to capture the time functions of an Env (#7858) 2021-01-25 22:09:11 -08:00
pinned_iterators_manager.h
plain_table_db_test.cc Make ImmutableOptions struct that inherits from ImmutableCFOptions and ImmutableDBOptions (#8262) 2021-05-05 14:00:17 -07:00
pre_release_callback.h
prefix_test.cc Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
range_del_aggregator.cc
range_del_aggregator.h Fix some typos in comments (#8066) 2021-03-25 21:18:08 -07:00
range_del_aggregator_bench.cc Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
range_del_aggregator_test.cc
range_tombstone_fragmenter.cc Added memtable garbage statistics (#8411) 2021-06-18 04:57:27 -07:00
range_tombstone_fragmenter.h Added memtable garbage statistics (#8411) 2021-06-18 04:57:27 -07:00
range_tombstone_fragmenter_test.cc
read_callback.h
repair.cc Use DbSessionId as cache key prefix when secondary cache is enabled (#8360) 2021-06-10 11:02:43 -07:00
repair_test.cc Use SST file manager to track blob files as well (#8037) 2021-03-17 20:44:49 -07:00
snapshot_checker.h
snapshot_impl.cc
snapshot_impl.h Fix some typos in comments (#8066) 2021-03-25 21:18:08 -07:00
table_cache.cc Use DbSessionId as cache key prefix when secondary cache is enabled (#8360) 2021-06-10 11:02:43 -07:00
table_cache.h Use DbSessionId as cache key prefix when secondary cache is enabled (#8360) 2021-06-10 11:02:43 -07:00
table_properties_collector.cc Apply `sample_for_compression` to all block-based tables (#8105) 2021-03-25 15:00:45 -07:00
table_properties_collector.h Make it possible to apply only a subrange of table property collectors (#8298) 2021-05-17 18:28:39 -07:00
table_properties_collector_test.cc Make it possible to apply only a subrange of table property collectors (#8298) 2021-05-17 18:28:39 -07:00
transaction_log_impl.cc Add more tests for assert status checked (#7524) 2020-12-22 23:45:58 -08:00
transaction_log_impl.h
trim_history_scheduler.cc
trim_history_scheduler.h
version_builder.cc Handle blob files when options.best_efforts_recovery is true (#8180) 2021-04-19 11:56:14 -07:00
version_builder.h Handle blob files when options.best_efforts_recovery is true (#8180) 2021-04-19 11:56:14 -07:00
version_builder_test.cc Print blob file checksums as hex (#8437) 2021-06-22 09:49:44 -07:00
version_edit.cc Write file temperature information to manifest (#8284) 2021-05-17 15:15:23 -07:00
version_edit.h Write file temperature information to manifest (#8284) 2021-05-17 15:15:23 -07:00
version_edit_handler.cc Fixed manifest_dump issues when printing keys and values containing null characters (#8378) 2021-06-10 12:55:20 -07:00
version_edit_handler.h Fixed manifest_dump issues when printing keys and values containing null characters (#8378) 2021-06-10 12:55:20 -07:00
version_edit_test.cc Make it able to ignore WAL related VersionEdits in older versions (#7873) 2021-01-19 19:27:53 -08:00
version_set.cc Using existing crc32c checksum in checksum handoff for Manifest and WAL (#8412) 2021-06-25 00:47:17 -07:00
version_set.h Remove obsolete method VersionSet::VerifyCompactionFileConsistency (#8449) 2021-06-23 13:28:34 -07:00
version_set_test.cc Print blob file checksums as hex (#8437) 2021-06-22 09:49:44 -07:00
wal_edit.cc
wal_edit.h
wal_edit_test.cc
wal_manager.cc Make types of Immutable/Mutable Options fields match that of the underlying Option (#8176) 2021-04-22 20:43:54 -07:00
wal_manager.h
wal_manager_test.cc Use DbSessionId as cache key prefix when secondary cache is enabled (#8360) 2021-06-10 11:02:43 -07:00
write_batch.cc Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
write_batch_base.cc
write_batch_internal.h Integrity protection for live updates to WriteBatch (#7748) 2021-01-29 12:18:58 -08:00
write_batch_test.cc Make ImmutableOptions struct that inherits from ImmutableCFOptions and ImmutableDBOptions (#8262) 2021-05-05 14:00:17 -07:00
write_callback.h
write_callback_test.cc Add more tests for assert status checked (#7524) 2020-12-22 23:45:58 -08:00
write_controller.cc Revamp WriteController (#8064) 2021-03-18 09:47:31 -07:00
write_controller.h Revamp WriteController (#8064) 2021-03-18 09:47:31 -07:00
write_controller_test.cc Revamp WriteController (#8064) 2021-03-18 09:47:31 -07:00
write_thread.cc Stall writes in WriteBufferManager when memory_usage exceeds buffer_size (#7898) 2021-04-21 13:54:02 -07:00
write_thread.h typo: fix typo in db/write_thread's state (#8423) 2021-06-18 17:14:51 -07:00