avoid returning a number-of-active-keys estimate of nearly 2^64

Summary:
If accumulated_num_non_deletions_ were ever smaller than
accumulated_num_deletions_, the computation of
"accumulated_num_non_deletions_ - accumulated_num_deletions_"
would result in a logically "negative" value, but since
the two operands are unsigned (uint64_t), the result corresponding
to e.g., -1 would 2^64-1.

Instead, return 0 in that case.

Test Plan:
  - ensure "make check" still passes
  - temporarily add an "abort();" call in the new "if"-block, and
      observe that it fails in some test cases.  However, note that
      this case is triggered only when the two numbers are equal.
      Thus, no test case triggers the erroneous behavior this
      change is designed to avoid. If anyone can construct a
      scenario in which that bug would be triggered, I'll be
      happy to add a test case.

Reviewers: ljin, igor, rven, igor.sugak, yhchiang, sdong

Reviewed By: sdong

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D36489
This commit is contained in:
Jim Meyering 2015-04-03 14:46:35 -07:00
parent a7ac6cef1f
commit d2a92c13bc
1 changed files with 10 additions and 6 deletions

View File

@ -655,8 +655,8 @@ void Version::GetColumnFamilyMetaData(ColumnFamilyMetaData* cf_meta) {
uint64_t VersionStorageInfo::GetEstimatedActiveKeys() const {
// Estimation will be not accurate when:
// (1) there is merge keys
// Estimation will be inaccurate when:
// (1) there exist merge keys
// (2) keys are directly overwritten
// (3) deletion on non-existing keys
// (4) low number of samples
@ -664,6 +664,12 @@ uint64_t VersionStorageInfo::GetEstimatedActiveKeys() const {
return 0;
}
if (accumulated_num_non_deletions_ <= accumulated_num_deletions_) {
return 0;
}
uint64_t est = accumulated_num_non_deletions_ - accumulated_num_deletions_;
uint64_t file_count = 0;
for (int level = 0; level < num_levels_; ++level) {
file_count += files_[level].size();
@ -671,11 +677,9 @@ uint64_t VersionStorageInfo::GetEstimatedActiveKeys() const {
if (num_samples_ < file_count) {
// casting to avoid overflowing
return static_cast<uint64_t>(static_cast<double>(
accumulated_num_non_deletions_ - accumulated_num_deletions_) *
static_cast<double>(file_count) / num_samples_);
return (est * static_cast<double>(file_count) / num_samples_);
} else {
return accumulated_num_non_deletions_ - accumulated_num_deletions_;
return est;
}
}