rocksdb/utilities/backup/backup_engine_impl.h

37 lines
1.4 KiB
C
Raw Normal View History

Begin forward compatibility for new backup meta schema (#8069) Summary: This does not add any new public APIs or published functionality, but adds the ability to read and use (and in tests, write) backups with a new meta file schema, based on the old schema but not forward-compatible (before this change). The new schema enables some capabilities not in the old: * Explicit versioning, so that users get clean error messages the next time we want to break forward compatibility. * Ignoring unrecognized fields (with warning), so that new non-critical features can be added without breaking forward compatibility. * Rejecting future "non-ignorable" fields, so that new features critical to some use-cases could potentially be added outside of linear schema versions, with broken forward compatibility. * Fields at the end of the meta file, such as for checksum of the meta file's contents (up to that point) * New optional 'size' field for each file, which is checked when present * Optionally omitting 'crc32' field, so that we aren't required to have a crc32c checksum for files to take a backup. (E.g. to support backup via hard links and to better support file custom checksums.) Because we do not have a JSON parser and to share code, the new schema is simply derived from the old schema. BackupEngine code is updated to allow missing checksums in some places, and to make that easier, `has_checksum` and `verify_checksum_after_work` are eliminated. Empty `checksum_hex` indicates checksum is unknown. I'm not too afraid of regressing on data integrity, because (a) we have pretty good test coverage of corruption detection in backups, and (b) we are increasingly relying on the DB itself for data integrity rather than it being an exclusive feature of backups. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8069 Test Plan: new unit tests, added to crash test (some local run with boosted backup probability) Reviewed By: ajkr Differential Revision: D27139824 Pulled By: pdillinger fbshipit-source-id: 9e0e4decfb42bb84783d64d2d246456d97e8e8c5
2021-03-20 03:14:38 +00:00
// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
// This source code is licensed under both the GPLv2 (found in the
// COPYING file in the root directory) and Apache 2.0 License
// (found in the LICENSE.Apache file in the root directory).
#pragma once
#ifndef ROCKSDB_LITE
#include "rocksdb/utilities/backup_engine.h"
Begin forward compatibility for new backup meta schema (#8069) Summary: This does not add any new public APIs or published functionality, but adds the ability to read and use (and in tests, write) backups with a new meta file schema, based on the old schema but not forward-compatible (before this change). The new schema enables some capabilities not in the old: * Explicit versioning, so that users get clean error messages the next time we want to break forward compatibility. * Ignoring unrecognized fields (with warning), so that new non-critical features can be added without breaking forward compatibility. * Rejecting future "non-ignorable" fields, so that new features critical to some use-cases could potentially be added outside of linear schema versions, with broken forward compatibility. * Fields at the end of the meta file, such as for checksum of the meta file's contents (up to that point) * New optional 'size' field for each file, which is checked when present * Optionally omitting 'crc32' field, so that we aren't required to have a crc32c checksum for files to take a backup. (E.g. to support backup via hard links and to better support file custom checksums.) Because we do not have a JSON parser and to share code, the new schema is simply derived from the old schema. BackupEngine code is updated to allow missing checksums in some places, and to make that easier, `has_checksum` and `verify_checksum_after_work` are eliminated. Empty `checksum_hex` indicates checksum is unknown. I'm not too afraid of regressing on data integrity, because (a) we have pretty good test coverage of corruption detection in backups, and (b) we are increasingly relying on the DB itself for data integrity rather than it being an exclusive feature of backups. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8069 Test Plan: new unit tests, added to crash test (some local run with boosted backup probability) Reviewed By: ajkr Differential Revision: D27139824 Pulled By: pdillinger fbshipit-source-id: 9e0e4decfb42bb84783d64d2d246456d97e8e8c5
2021-03-20 03:14:38 +00:00
namespace ROCKSDB_NAMESPACE {
New backup meta schema, with file temperatures (#9660) Summary: The primary goal of this change is to add support for backing up and restoring (applying on restore) file temperature metadata, without committing to either the DB manifest or the FS reported "current" temperatures being exclusive "source of truth". To achieve this goal, we need to add temperature information to backup metadata, which requires updated backup meta schema. Fortunately I prepared for this in https://github.com/facebook/rocksdb/issues/8069, which began forward compatibility in version 6.19.0 for this kind of schema update. (Previously, backup meta schema was not extensible! Making this schema update public will allow some other "nice to have" features like taking backups with hard links, and avoiding crc32c checksum computation when another checksum is already available.) While schema version 2 is newly public, the default schema version is still 1. Until we change the default, users will need to set to 2 to enable features like temperature data backup+restore. New metadata like temperature information will be ignored with a warning in versions before this change and since 6.19.0. The metadata is considered ignorable because a functioning DB can be restored without it. Some detail: * Some renaming because "future schema" is now just public schema 2. * Initialize some atomics in TestFs (linter reported) * Add temperature hint support to SstFileDumper (used by BackupEngine) Pull Request resolved: https://github.com/facebook/rocksdb/pull/9660 Test Plan: related unit test majorly updated for the new functionality, including some shared testing support for tracking temperatures in a FS. Some other tests and testing hooks into production code also updated for making the backup meta schema change public. Reviewed By: ajkr Differential Revision: D34686968 Pulled By: pdillinger fbshipit-source-id: 3ac1fa3e67ee97ca8a5103d79cc87d872c1d862a
2022-03-18 18:06:17 +00:00
struct TEST_BackupMetaSchemaOptions {
Begin forward compatibility for new backup meta schema (#8069) Summary: This does not add any new public APIs or published functionality, but adds the ability to read and use (and in tests, write) backups with a new meta file schema, based on the old schema but not forward-compatible (before this change). The new schema enables some capabilities not in the old: * Explicit versioning, so that users get clean error messages the next time we want to break forward compatibility. * Ignoring unrecognized fields (with warning), so that new non-critical features can be added without breaking forward compatibility. * Rejecting future "non-ignorable" fields, so that new features critical to some use-cases could potentially be added outside of linear schema versions, with broken forward compatibility. * Fields at the end of the meta file, such as for checksum of the meta file's contents (up to that point) * New optional 'size' field for each file, which is checked when present * Optionally omitting 'crc32' field, so that we aren't required to have a crc32c checksum for files to take a backup. (E.g. to support backup via hard links and to better support file custom checksums.) Because we do not have a JSON parser and to share code, the new schema is simply derived from the old schema. BackupEngine code is updated to allow missing checksums in some places, and to make that easier, `has_checksum` and `verify_checksum_after_work` are eliminated. Empty `checksum_hex` indicates checksum is unknown. I'm not too afraid of regressing on data integrity, because (a) we have pretty good test coverage of corruption detection in backups, and (b) we are increasingly relying on the DB itself for data integrity rather than it being an exclusive feature of backups. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8069 Test Plan: new unit tests, added to crash test (some local run with boosted backup probability) Reviewed By: ajkr Differential Revision: D27139824 Pulled By: pdillinger fbshipit-source-id: 9e0e4decfb42bb84783d64d2d246456d97e8e8c5
2021-03-20 03:14:38 +00:00
std::string version = "2";
bool crc32c_checksums = false;
bool file_sizes = true;
std::map<std::string, std::string> meta_fields;
std::map<std::string, std::string> file_fields;
std::map<std::string, std::string> footer_fields;
};
// Modifies the BackupEngine(Impl) to write backup meta files using the
// unpublished schema version 2, for the life of this object (not backup_dir).
New backup meta schema, with file temperatures (#9660) Summary: The primary goal of this change is to add support for backing up and restoring (applying on restore) file temperature metadata, without committing to either the DB manifest or the FS reported "current" temperatures being exclusive "source of truth". To achieve this goal, we need to add temperature information to backup metadata, which requires updated backup meta schema. Fortunately I prepared for this in https://github.com/facebook/rocksdb/issues/8069, which began forward compatibility in version 6.19.0 for this kind of schema update. (Previously, backup meta schema was not extensible! Making this schema update public will allow some other "nice to have" features like taking backups with hard links, and avoiding crc32c checksum computation when another checksum is already available.) While schema version 2 is newly public, the default schema version is still 1. Until we change the default, users will need to set to 2 to enable features like temperature data backup+restore. New metadata like temperature information will be ignored with a warning in versions before this change and since 6.19.0. The metadata is considered ignorable because a functioning DB can be restored without it. Some detail: * Some renaming because "future schema" is now just public schema 2. * Initialize some atomics in TestFs (linter reported) * Add temperature hint support to SstFileDumper (used by BackupEngine) Pull Request resolved: https://github.com/facebook/rocksdb/pull/9660 Test Plan: related unit test majorly updated for the new functionality, including some shared testing support for tracking temperatures in a FS. Some other tests and testing hooks into production code also updated for making the backup meta schema change public. Reviewed By: ajkr Differential Revision: D34686968 Pulled By: pdillinger fbshipit-source-id: 3ac1fa3e67ee97ca8a5103d79cc87d872c1d862a
2022-03-18 18:06:17 +00:00
// TEST_BackupMetaSchemaOptions offers some customization for testing.
void TEST_SetBackupMetaSchemaOptions(
BackupEngine *engine, const TEST_BackupMetaSchemaOptions &options);
Begin forward compatibility for new backup meta schema (#8069) Summary: This does not add any new public APIs or published functionality, but adds the ability to read and use (and in tests, write) backups with a new meta file schema, based on the old schema but not forward-compatible (before this change). The new schema enables some capabilities not in the old: * Explicit versioning, so that users get clean error messages the next time we want to break forward compatibility. * Ignoring unrecognized fields (with warning), so that new non-critical features can be added without breaking forward compatibility. * Rejecting future "non-ignorable" fields, so that new features critical to some use-cases could potentially be added outside of linear schema versions, with broken forward compatibility. * Fields at the end of the meta file, such as for checksum of the meta file's contents (up to that point) * New optional 'size' field for each file, which is checked when present * Optionally omitting 'crc32' field, so that we aren't required to have a crc32c checksum for files to take a backup. (E.g. to support backup via hard links and to better support file custom checksums.) Because we do not have a JSON parser and to share code, the new schema is simply derived from the old schema. BackupEngine code is updated to allow missing checksums in some places, and to make that easier, `has_checksum` and `verify_checksum_after_work` are eliminated. Empty `checksum_hex` indicates checksum is unknown. I'm not too afraid of regressing on data integrity, because (a) we have pretty good test coverage of corruption detection in backups, and (b) we are increasingly relying on the DB itself for data integrity rather than it being an exclusive feature of backups. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8069 Test Plan: new unit tests, added to crash test (some local run with boosted backup probability) Reviewed By: ajkr Differential Revision: D27139824 Pulled By: pdillinger fbshipit-source-id: 9e0e4decfb42bb84783d64d2d246456d97e8e8c5
2021-03-20 03:14:38 +00:00
Use SpecialEnv to speed up some slow BackupEngineRateLimitingTestWithParam (#9974) Summary: **Context:** `BackupEngineRateLimitingTestWithParam.RateLimiting` and `BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup` involve creating backup and restoring of a big database with rate-limiting. Using the normal env with a normal clock requires real elapse of time (13702 - 19848 ms/per test). As suggested in https://github.com/facebook/rocksdb/pull/8722#discussion_r703698603, this PR is to speed it up with SpecialEnv (`time_elapse_only_sleep=true`) where its clock accepts fake elapse of time during rate-limiting (100 - 600 ms/per test) **Summary:** - Added TEST_ function to set clock of the default rate limiters in backup engine - Shrunk testdb by 10 times while keeping it big enough for testing - Renamed some test variables and reorganized some if-else branch for clarity without changing the test Pull Request resolved: https://github.com/facebook/rocksdb/pull/9974 Test Plan: - Run tests pre/post PR the same time to verify the tests are sped up by 90 - 95% `BackupEngineRateLimitingTestWithParam.RateLimiting` Pre: ``` [ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/0 [ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/0 (11123 ms) [ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/1 [ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/1 (9441 ms) [ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/2 [ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/2 (11096 ms) [ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/3 [ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/3 (9339 ms) [ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/4 [ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/4 (11121 ms) [ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/5 [ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/5 (9413 ms) [ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/6 [ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/6 (11185 ms) [ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/7 [ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/7 (9511 ms) [----------] 8 tests from RateLimiting/BackupEngineRateLimitingTestWithParam (82230 ms total) ``` Post: ``` [ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/0 [ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/0 (395 ms) [ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/1 [ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/1 (564 ms) [ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/2 [ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/2 (358 ms) [ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/3 [ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/3 (567 ms) [ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/4 [ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/4 (173 ms) [ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/5 [ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/5 (176 ms) [ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/6 [ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/6 (191 ms) [ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/7 [ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/7 (177 ms) [----------] 8 tests from RateLimiting/BackupEngineRateLimitingTestWithParam (2601 ms total) ``` `BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup` Pre: ``` [ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/0 [ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/0 (7275 ms) [ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/1 [ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/1 (3961 ms) [ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/2 [ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/2 (7117 ms) [ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/3 [ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/3 (3921 ms) [ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/4 [ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/4 (19862 ms) [ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/5 [ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/5 (10231 ms) [ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/6 [ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/6 (19848 ms) [ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/7 [ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/7 (10372 ms) [----------] 8 tests from RateLimiting/BackupEngineRateLimitingTestWithParam (82587 ms total) ``` Post: ``` [ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/0 [ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/0 (157 ms) [ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/1 [ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/1 (152 ms) [ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/2 [ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/2 (160 ms) [ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/3 [ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/3 (158 ms) [ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/4 [ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/4 (155 ms) [ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/5 [ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/5 (151 ms) [ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/6 [ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/6 (146 ms) [ RUN ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/7 [ OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/7 (153 ms) [----------] 8 tests from RateLimiting/BackupEngineRateLimitingTestWithParam (1232 ms total) ``` Reviewed By: pdillinger Differential Revision: D36336345 Pulled By: hx235 fbshipit-source-id: 724c6ba745f95f56d4440a6d2f1e4512a2987589
2022-05-16 17:54:02 +00:00
// Modifies the BackupEngine(Impl) to use specified clocks for backup and
// restore rate limiters created by default if not specified by users for
// test speedup.
void TEST_SetDefaultRateLimitersClock(
BackupEngine* engine,
const std::shared_ptr<SystemClock>& backup_rate_limiter_clock = nullptr,
const std::shared_ptr<SystemClock>& restore_rate_limiter_clock = nullptr);
Begin forward compatibility for new backup meta schema (#8069) Summary: This does not add any new public APIs or published functionality, but adds the ability to read and use (and in tests, write) backups with a new meta file schema, based on the old schema but not forward-compatible (before this change). The new schema enables some capabilities not in the old: * Explicit versioning, so that users get clean error messages the next time we want to break forward compatibility. * Ignoring unrecognized fields (with warning), so that new non-critical features can be added without breaking forward compatibility. * Rejecting future "non-ignorable" fields, so that new features critical to some use-cases could potentially be added outside of linear schema versions, with broken forward compatibility. * Fields at the end of the meta file, such as for checksum of the meta file's contents (up to that point) * New optional 'size' field for each file, which is checked when present * Optionally omitting 'crc32' field, so that we aren't required to have a crc32c checksum for files to take a backup. (E.g. to support backup via hard links and to better support file custom checksums.) Because we do not have a JSON parser and to share code, the new schema is simply derived from the old schema. BackupEngine code is updated to allow missing checksums in some places, and to make that easier, `has_checksum` and `verify_checksum_after_work` are eliminated. Empty `checksum_hex` indicates checksum is unknown. I'm not too afraid of regressing on data integrity, because (a) we have pretty good test coverage of corruption detection in backups, and (b) we are increasingly relying on the DB itself for data integrity rather than it being an exclusive feature of backups. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8069 Test Plan: new unit tests, added to crash test (some local run with boosted backup probability) Reviewed By: ajkr Differential Revision: D27139824 Pulled By: pdillinger fbshipit-source-id: 9e0e4decfb42bb84783d64d2d246456d97e8e8c5
2021-03-20 03:14:38 +00:00
} // namespace ROCKSDB_NAMESPACE
#endif // ROCKSDB_LITE