rocksdb/file/file_util.h
Peter Dillinger a53ed91691 Fix/improve temperature handling for file ingestion (#12402)
Summary:
Partly following up on leftovers from https://github.com/facebook/rocksdb/issues/12388

In terms of public API:
* Make it clear that IngestExternalFileArg::file_temperature is just a hint for opening the existing file, though it was previously used for both copy-from temp hint and copy-to temp, which was bizarre.
* Specify how IngestExternalFile assigns temperature to file ingested into DB. (See details in comments.) This approach is not perfect in terms of matching how the DB assigns temperatures, but was the simplest way to get close. The key complication for matching DB temperature assignments is that ingestion files are copied (to a destination temp) before their target level is determined (in general).
* Add a temperature option to SstFileWriter::Open so that files intended for ingestion can be initially written to a chosen temperature.
* Note that "fail_if_not_bottommost_level" is obsolete/confusing use of "bottommost"

In terms of the implementation, there was a similar bit of oddness with the internal CopyFile API, which only took one temperature, ambiguously applicable to the source, destination, or both. This is also fixed.

Eventual suggested follow-up:
* Before copying files for ingestion, determine a tentative level assignment to use for destination temperature, and keep that even if final level assignment happens to be different at commit time (rare).
* More temperature handling for CreateColumnFamilyWithImport and Checkpoints.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12402

Test Plan:
Deeply revamped
ExternalSSTFileBasicTest.IngestWithTemperature to test the new changes. Previously this test was insufficient because it was only looking at temperatures according to the DB manifest. Incorporating FileTemperatureTestFS allows us to also test the temperatures in the storage layer.

Used macros instead of functions for better tracing to critical source location on test failures.

Some enhancements to FileTemperatureTestFS in the process of developing the revamped test.

Reviewed By: jowlyzhang

Differential Revision: D54442794

Pulled By: pdillinger

fbshipit-source-id: 41d9d0afdc073e6a983304c10bbc07c70cc7e995
2024-03-05 16:56:08 -08:00

109 lines
4.3 KiB
C++

// Copyright (c) 2011-present, Facebook, Inc. All rights reserved.
// This source code is licensed under both the GPLv2 (found in the
// COPYING file in the root directory) and Apache 2.0 License
// (found in the LICENSE.Apache file in the root directory).
//
#pragma once
#include <string>
#include "file/filename.h"
#include "options/db_options.h"
#include "rocksdb/env.h"
#include "rocksdb/file_system.h"
#include "rocksdb/sst_file_writer.h"
#include "rocksdb/statistics.h"
#include "rocksdb/status.h"
#include "rocksdb/system_clock.h"
#include "rocksdb/types.h"
#include "trace_replay/io_tracer.h"
namespace ROCKSDB_NAMESPACE {
// use_fsync maps to options.use_fsync, which determines the way that
// the file is synced after copying.
IOStatus CopyFile(FileSystem* fs, const std::string& source,
Temperature src_temp_hint,
std::unique_ptr<WritableFileWriter>& dest_writer,
uint64_t size, bool use_fsync,
const std::shared_ptr<IOTracer>& io_tracer);
IOStatus CopyFile(FileSystem* fs, const std::string& source,
Temperature src_temp_hint, const std::string& destination,
Temperature dst_temp, uint64_t size, bool use_fsync,
const std::shared_ptr<IOTracer>& io_tracer);
inline IOStatus CopyFile(const std::shared_ptr<FileSystem>& fs,
const std::string& source, Temperature src_temp_hint,
const std::string& destination, Temperature dst_temp,
uint64_t size, bool use_fsync,
const std::shared_ptr<IOTracer>& io_tracer) {
return CopyFile(fs.get(), source, src_temp_hint, destination, dst_temp, size,
use_fsync, io_tracer);
}
IOStatus CreateFile(FileSystem* fs, const std::string& destination,
const std::string& contents, bool use_fsync);
inline IOStatus CreateFile(const std::shared_ptr<FileSystem>& fs,
const std::string& destination,
const std::string& contents, bool use_fsync) {
return CreateFile(fs.get(), destination, contents, use_fsync);
}
Status DeleteDBFile(const ImmutableDBOptions* db_options,
const std::string& fname, const std::string& path_to_sync,
const bool force_bg, const bool force_fg);
// TODO(hx235): pass the whole DBOptions intead of its individual fields
IOStatus GenerateOneFileChecksum(
FileSystem* fs, const std::string& file_path,
FileChecksumGenFactory* checksum_factory,
const std::string& requested_checksum_func_name, std::string* file_checksum,
std::string* file_checksum_func_name,
size_t verify_checksums_readahead_size, bool allow_mmap_reads,
std::shared_ptr<IOTracer>& io_tracer, RateLimiter* rate_limiter,
const ReadOptions& read_options, Statistics* stats, SystemClock* clock);
inline IOStatus PrepareIOFromReadOptions(const ReadOptions& ro,
SystemClock* clock, IOOptions& opts) {
if (ro.deadline.count()) {
std::chrono::microseconds now =
std::chrono::microseconds(clock->NowMicros());
// Ensure there is atleast 1us available. We don't want to pass a value of
// 0 as that means no timeout
if (now >= ro.deadline) {
return IOStatus::TimedOut("Deadline exceeded");
}
opts.timeout = ro.deadline - now;
}
if (ro.io_timeout.count() &&
(!opts.timeout.count() || ro.io_timeout < opts.timeout)) {
opts.timeout = ro.io_timeout;
}
opts.rate_limiter_priority = ro.rate_limiter_priority;
opts.io_activity = ro.io_activity;
return IOStatus::OK();
}
inline IOStatus PrepareIOFromWriteOptions(const WriteOptions& wo,
IOOptions& opts) {
opts.rate_limiter_priority = wo.rate_limiter_priority;
opts.io_activity = wo.io_activity;
return IOStatus::OK();
}
// Test method to delete the input directory and all of its contents.
// This method is destructive and is meant for use only in tests!!!
Status DestroyDir(Env* env, const std::string& dir);
inline bool CheckFSFeatureSupport(FileSystem* fs, FSSupportedOps feat) {
int64_t supported_ops = 0;
fs->SupportedOps(supported_ops);
if (supported_ops & (1ULL << feat)) {
return true;
}
return false;
}
} // namespace ROCKSDB_NAMESPACE