Add a blob-specific cache priority (#10461)

Summary:
RocksDB's `Cache` abstraction currently supports two priority levels for items: high (used for frequently accessed/highly valuable SST metablocks like index/filter blocks) and low (used for SST data blocks). Blobs are typically lower-value targets for caching than data blocks, since 1) with BlobDB, data blocks containing blob references conceptually form an index structure which has to be consulted before we can read the blob value, and 2) cached blobs represent only a single key-value, while cached data blocks generally contain multiple KVs. Since we would like to make it possible to use the same backing cache for the block cache and the blob cache, it would make sense to add a new, lower-than-low cache priority level (bottom level) for blobs so data blocks are prioritized over them.

This task is a part of https://github.com/facebook/rocksdb/issues/10156

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10461

Reviewed By: siying

Differential Revision: D38672823

Pulled By: ltamasi

fbshipit-source-id: 90cf7362036563d79891f47be2cc24b827482743
2022-08-12 17:59:06 -07:00 · 2022-08-12 17:59:06 -07:00 · 275cd80cdb
parent bc575c614c
commit 275cd80cdb
23 changed files with 593 additions and 166 deletions
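
For readers skimming the change, the following is a minimal, self-contained sketch of how the two pool ratios and the three priority levels interact. It is a toy model, not RocksDB code: the `Priority` and `Pool` enums and the functions `RatiosAreValid` and `ChoosePool` are hypothetical names introduced here for illustration. The ratio checks follow the ones this commit adds to `NewLRUCache()`, and the branch order follows `LRUCacheShard::LRU_Insert()` in the diff. The high-pri condition is not part of the shown hunk, so its exact form is an assumption.

```cpp
#include <cassert>
#include <cstddef>

// Toy stand-ins for illustration only; these are not RocksDB types.
enum class Priority { kHigh, kLow, kBottom };
enum class Pool { kHighPri, kLowPri, kBottomPri };

// Mirrors the range checks added to NewLRUCache(): each ratio must lie in
// [0, 1], and the two ratios together must not exceed 1.
bool RatiosAreValid(double high_pri_pool_ratio, double low_pri_pool_ratio) {
  if (high_pri_pool_ratio < 0.0 || high_pri_pool_ratio > 1.0) return false;
  if (low_pri_pool_ratio < 0.0 || low_pri_pool_ratio > 1.0) return false;
  return high_pri_pool_ratio + low_pri_pool_ratio <= 1.0;
}

// Returns the pool an entry would be inserted into, following the branch
// order of LRU_Insert(). Overflow between pools (MaintainPoolSize) is not
// modeled here.
Pool ChoosePool(Priority priority, bool has_hit, double high_pri_pool_ratio,
                double low_pri_pool_ratio) {
  const bool is_high = priority == Priority::kHigh;
  const bool is_low = priority == Priority::kLow;
  // Assumption: the pre-existing high-pri condition, which lies outside the
  // lines shown in the diff.
  if (high_pri_pool_ratio > 0 && (is_high || has_hit)) {
    return Pool::kHighPri;
  }
  // New in this commit: low-pri entries (and high-pri or previously hit
  // entries when there is no high-pri pool) go to the low-pri pool.
  if (low_pri_pool_ratio > 0 && (is_high || is_low || has_hit)) {
    return Pool::kLowPri;
  }
  // Everything else, e.g. blobs inserted with bottom priority.
  return Pool::kBottomPri;
}
```

In this model, an entry inserted with bottom priority that has not yet been hit always lands in the bottom-pri pool, and setting `low_pri_pool_ratio` to 0 sends low-priority entries there as well, which corresponds to the previous two-pool layout.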
--- a/HISTORY.md
+++ b/HISTORY.md
@ -1,13 +1,14 @@
 # Rocksdb Change Log
 ## Unreleased
 ### New Features
- * Added `prepopulate_blob_cache` to ColumnFamilyOptions. If enabled, prepopulate warm/hot blobs which are already in memory into blob cache at the time of flush. On a flush, the blob that is in memory (in memtables) get flushed to the device. If using Direct IO, additional IO is incurred to read this blob back into memory again, which is avoided by enabling this option. This further helps if the workload exhibits high temporal locality, where most of the reads go to recently written data. This also helps in case of the remote file system since it involves network traffic and higher latencies.
+* Added `prepopulate_blob_cache` to ColumnFamilyOptions. If enabled, prepopulate warm/hot blobs which are already in memory into blob cache at the time of flush. On a flush, the blob that is in memory (in memtables) get flushed to the device. If using Direct IO, additional IO is incurred to read this blob back into memory again, which is avoided by enabling this option. This further helps if the workload exhibits high temporal locality, where most of the reads go to recently written data. This also helps in case of the remote file system since it involves network traffic and higher latencies.
 * Support using secondary cache with the blob cache. When creating a blob cache, the user can set a secondary blob cache by configuring `secondary_cache` in LRUCacheOptions.
 * Charge memory usage of blob cache when the backing cache of the blob cache and the block cache are different. If an operation reserving memory for blob cache exceeds the available space left in the block cache at some point (i.e., causing a cache full under `LRUCacheOptions::strict_capacity_limit` = true), creation will fail with `Status::MemoryLimit()`. To opt in to this feature, enable charging `CacheEntryRole::kBlobCache` in `BlockBasedTableOptions::cache_usage_options`.
 * Improve subcompaction range partition so that it is likely to be more even. A more even distribution of subcompactions will improve compaction throughput for some workloads. All input files' index blocks are used to sample some anchor key points from which we pick positions to partition the input range. This would introduce some CPU overhead in compaction preparation phase, if subcompaction is enabled, but it should be a small fraction of the CPU usage of the whole compaction process. This also brings a behavior change: subcompaction number is much more likely to be maxed out than before.
 * Add CompactionPri::kRoundRobin, a compaction picking mode that cycles through all the files with a compact cursor in a round-robin manner. This feature is available since 7.5.
 * Provide support for subcompactions for user_defined_timestamp.
 * Added an option `memtable_protection_bytes_per_key` that turns on memtable per key-value checksum protection. Each memtable entry will be suffixed by a checksum that is computed during writes, and verified in reads/compaction. Detected corruption will be logged and with corruption status returned to user. 
+* Added a blob-specific cache priority level: the bottom level. Blobs are typically lower-value targets for caching than data blocks, since 1) with BlobDB, data blocks containing blob references conceptually form an index structure which has to be consulted before we can read the blob value, and 2) cached blobs represent only a single key-value, while cached data blocks generally contain multiple KVs. The user can specify the new option `low_pri_pool_ratio` in `LRUCacheOptions` to configure the ratio of capacity reserved for low priority cache entries (the remaining ratio is then the space reserved for the bottom level), or pass the new argument `low_pri_pool_ratio` to `NewLRUCache()` to achieve the same effect.

 ### Public API changes
 * Removed Customizable support for RateLimiter and removed its CreateFromString() and Type() functions.
--- a/cache/cache.cc
+++ b/cache/cache.cc
@ -33,6 +33,10 @@ static std::unordered_map<std::string, OptionTypeInfo>
         {offsetof(struct LRUCacheOptions, high_pri_pool_ratio),
          OptionType::kDouble, OptionVerificationType::kNormal,
          OptionTypeFlags::kMutable}},
+        {"low_pri_pool_ratio",
+         {offsetof(struct LRUCacheOptions, low_pri_pool_ratio),
+          OptionType::kDouble, OptionVerificationType::kNormal,
+          OptionTypeFlags::kMutable}},
 };

 static std::unordered_map<std::string, OptionTypeInfo>
--- a/cache/cache_bench_tool.cc
+++ b/cache/cache_bench_tool.cc
@ -304,7 +304,9 @@ class CacheBench {
          FLAGS_cache_size, FLAGS_value_bytes, FLAGS_num_shard_bits,
          false /*strict_capacity_limit*/, kDefaultCacheMetadataChargePolicy);
    } else if (FLAGS_cache_type == "lru_cache") {
-      LRUCacheOptions opts(FLAGS_cache_size, FLAGS_num_shard_bits, false, 0.5);
+      LRUCacheOptions opts(FLAGS_cache_size, FLAGS_num_shard_bits,
+                           false /* strict_capacity_limit */,
+                           0.5 /* high_pri_pool_ratio */);
 #ifndef ROCKSDB_LITE
      if (!FLAGS_secondary_cache_uri.empty()) {
        Status s = SecondaryCache::CreateFromString(
--- a/cache/clock_cache.cc
+++ b/cache/clock_cache.cc
@ -697,8 +697,10 @@ void ClockCache::DisownData() {
 std::shared_ptr<Cache> NewClockCache(
    size_t capacity, int num_shard_bits, bool strict_capacity_limit,
    CacheMetadataChargePolicy metadata_charge_policy) {
-  return NewLRUCache(capacity, num_shard_bits, strict_capacity_limit, 0.5,
-                     nullptr, kDefaultToAdaptiveMutex, metadata_charge_policy);
+  return NewLRUCache(capacity, num_shard_bits, strict_capacity_limit,
+                     /* high_pri_pool_ratio */ 0.5, nullptr,
+                     kDefaultToAdaptiveMutex, metadata_charge_policy,
+                     /* low_pri_pool_ratio */ 0.0);
 }

 std::shared_ptr<Cache> ExperimentalNewClockCache(
--- a/cache/compressed_secondary_cache.cc
+++ b/cache/compressed_secondary_cache.cc
@ -17,17 +17,18 @@ namespace ROCKSDB_NAMESPACE {

 CompressedSecondaryCache::CompressedSecondaryCache(
    size_t capacity, int num_shard_bits, bool strict_capacity_limit,
-    double high_pri_pool_ratio,
+    double high_pri_pool_ratio, double low_pri_pool_ratio,
    std::shared_ptr<MemoryAllocator> memory_allocator, bool use_adaptive_mutex,
    CacheMetadataChargePolicy metadata_charge_policy,
    CompressionType compression_type, uint32_t compress_format_version)
    : cache_options_(capacity, num_shard_bits, strict_capacity_limit,
                     high_pri_pool_ratio, memory_allocator, use_adaptive_mutex,
                     metadata_charge_policy, compression_type,
-                     compress_format_version) {
-  cache_ = NewLRUCache(capacity, num_shard_bits, strict_capacity_limit,
-                       high_pri_pool_ratio, memory_allocator,
-                       use_adaptive_mutex, metadata_charge_policy);
+                     compress_format_version, low_pri_pool_ratio) {
+  cache_ =
+      NewLRUCache(capacity, num_shard_bits, strict_capacity_limit,
+                  high_pri_pool_ratio, memory_allocator, use_adaptive_mutex,
+                  metadata_charge_policy, low_pri_pool_ratio);
 }

 CompressedSecondaryCache::~CompressedSecondaryCache() { cache_.reset(); }
@ -225,11 +226,12 @@ std::shared_ptr<SecondaryCache> NewCompressedSecondaryCache(
    double high_pri_pool_ratio,
    std::shared_ptr<MemoryAllocator> memory_allocator, bool use_adaptive_mutex,
    CacheMetadataChargePolicy metadata_charge_policy,
-    CompressionType compression_type, uint32_t compress_format_version) {
+    CompressionType compression_type, uint32_t compress_format_version,
+    double low_pri_pool_ratio) {
  return std::make_shared<CompressedSecondaryCache>(
      capacity, num_shard_bits, strict_capacity_limit, high_pri_pool_ratio,
-      memory_allocator, use_adaptive_mutex, metadata_charge_policy,
-      compression_type, compress_format_version);
+      low_pri_pool_ratio, memory_allocator, use_adaptive_mutex,
+      metadata_charge_policy, compression_type, compress_format_version);
 }

 std::shared_ptr<SecondaryCache> NewCompressedSecondaryCache(
@ -240,7 +242,7 @@ std::shared_ptr<SecondaryCache> NewCompressedSecondaryCache(
      opts.capacity, opts.num_shard_bits, opts.strict_capacity_limit,
      opts.high_pri_pool_ratio, opts.memory_allocator, opts.use_adaptive_mutex,
      opts.metadata_charge_policy, opts.compression_type,
-      opts.compress_format_version);
+      opts.compress_format_version, opts.low_pri_pool_ratio);
 }

 }  // namespace ROCKSDB_NAMESPACE
--- a/cache/compressed_secondary_cache.h
+++ b/cache/compressed_secondary_cache.h
@ -56,7 +56,7 @@ class CompressedSecondaryCache : public SecondaryCache {
 public:
  CompressedSecondaryCache(
      size_t capacity, int num_shard_bits, bool strict_capacity_limit,
-      double high_pri_pool_ratio,
+      double high_pri_pool_ratio, double low_pri_pool_ratio,
      std::shared_ptr<MemoryAllocator> memory_allocator = nullptr,
      bool use_adaptive_mutex = kDefaultToAdaptiveMutex,
      CacheMetadataChargePolicy metadata_charge_policy =
--- a/cache/compressed_secondary_cache_test.cc
+++ b/cache/compressed_secondary_cache_test.cc
@ -240,9 +240,11 @@ class CompressedSecondaryCacheTest : public testing::Test {
    secondary_cache_opts.num_shard_bits = 0;
    std::shared_ptr<SecondaryCache> secondary_cache =
        NewCompressedSecondaryCache(secondary_cache_opts);
-    LRUCacheOptions lru_cache_opts(1300, 0, /*_strict_capacity_limit=*/false,
-                                   0.5, nullptr, kDefaultToAdaptiveMutex,
-                                   kDefaultCacheMetadataChargePolicy);
+    LRUCacheOptions lru_cache_opts(
+        1300 /* capacity */, 0 /* num_shard_bits */,
+        false /* strict_capacity_limit */, 0.5 /* high_pri_pool_ratio */,
+        nullptr /* memory_allocator */, kDefaultToAdaptiveMutex,
+        kDefaultCacheMetadataChargePolicy);
    lru_cache_opts.secondary_cache = secondary_cache;
    std::shared_ptr<Cache> cache = NewLRUCache(lru_cache_opts);
    std::shared_ptr<Statistics> stats = CreateDBStatistics();
@ -324,9 +326,11 @@ class CompressedSecondaryCacheTest : public testing::Test {
    std::shared_ptr<SecondaryCache> secondary_cache =
        NewCompressedSecondaryCache(secondary_cache_opts);

-    LRUCacheOptions opts(1024, 0, /*_strict_capacity_limit=*/false, 0.5,
-                         nullptr, kDefaultToAdaptiveMutex,
-                         kDefaultCacheMetadataChargePolicy);
+    LRUCacheOptions opts(
+        1024 /* capacity */, 0 /* num_shard_bits */,
+        false /* strict_capacity_limit */, 0.5 /* high_pri_pool_ratio */,
+        nullptr /* memory_allocator */, kDefaultToAdaptiveMutex,
+        kDefaultCacheMetadataChargePolicy);
    opts.secondary_cache = secondary_cache;
    std::shared_ptr<Cache> cache = NewLRUCache(opts);

@ -371,9 +375,11 @@ class CompressedSecondaryCacheTest : public testing::Test {
    std::shared_ptr<SecondaryCache> secondary_cache =
        NewCompressedSecondaryCache(secondary_cache_opts);

-    LRUCacheOptions opts(1200, 0, /*_strict_capacity_limit=*/false, 0.5,
-                         nullptr, kDefaultToAdaptiveMutex,
-                         kDefaultCacheMetadataChargePolicy);
+    LRUCacheOptions opts(
+        1200 /* capacity */, 0 /* num_shard_bits */,
+        false /* strict_capacity_limit */, 0.5 /* high_pri_pool_ratio */,
+        nullptr /* memory_allocator */, kDefaultToAdaptiveMutex,
+        kDefaultCacheMetadataChargePolicy);
    opts.secondary_cache = secondary_cache;
    std::shared_ptr<Cache> cache = NewLRUCache(opts);

@ -430,9 +436,11 @@ class CompressedSecondaryCacheTest : public testing::Test {
    std::shared_ptr<SecondaryCache> secondary_cache =
        NewCompressedSecondaryCache(secondary_cache_opts);

-    LRUCacheOptions opts(1200, 0, /*_strict_capacity_limit=*/false, 0.5,
-                         nullptr, kDefaultToAdaptiveMutex,
-                         kDefaultCacheMetadataChargePolicy);
+    LRUCacheOptions opts(
+        1200 /* capacity */, 0 /* num_shard_bits */,
+        false /* strict_capacity_limit */, 0.5 /* high_pri_pool_ratio */,
+        nullptr /* memory_allocator */, kDefaultToAdaptiveMutex,
+        kDefaultCacheMetadataChargePolicy);
    opts.secondary_cache = secondary_cache;
    std::shared_ptr<Cache> cache = NewLRUCache(opts);

@ -488,9 +496,11 @@ class CompressedSecondaryCacheTest : public testing::Test {
    std::shared_ptr<SecondaryCache> secondary_cache =
        NewCompressedSecondaryCache(secondary_cache_opts);

-    LRUCacheOptions opts(1200, 0, /*_strict_capacity_limit=*/true, 0.5, nullptr,
-                         kDefaultToAdaptiveMutex,
-                         kDefaultCacheMetadataChargePolicy);
+    LRUCacheOptions opts(
+        1200 /* capacity */, 0 /* num_shard_bits */,
+        true /* strict_capacity_limit */, 0.5 /* high_pri_pool_ratio */,
+        nullptr /* memory_allocator */, kDefaultToAdaptiveMutex,
+        kDefaultCacheMetadataChargePolicy);
    opts.secondary_cache = secondary_cache;
    std::shared_ptr<Cache> cache = NewLRUCache(opts);

@ -548,7 +558,7 @@ class CompressedSecondaryCacheTest : public testing::Test {

    using CacheValueChunk = CompressedSecondaryCache::CacheValueChunk;
    std::unique_ptr<CompressedSecondaryCache> sec_cache =
-        std::make_unique<CompressedSecondaryCache>(1000, 0, true, 0.5,
+        std::make_unique<CompressedSecondaryCache>(1000, 0, true, 0.5, 0.0,
                                                   allocator);
    Random rnd(301);
    // 10000 = 8169 + 1769 + 62 , so there should be 3 chunks after split.
@ -600,7 +610,7 @@ class CompressedSecondaryCacheTest : public testing::Test {
    std::string str = str1 + str2 + str3;

    std::unique_ptr<CompressedSecondaryCache> sec_cache =
-        std::make_unique<CompressedSecondaryCache>(1000, 0, true, 0.5);
+        std::make_unique<CompressedSecondaryCache>(1000, 0, true, 0.5, 0.0);
    size_t charge{0};
    CacheAllocationPtr value =
        sec_cache->MergeChunksIntoValue(chunks_head, charge);
@ -626,7 +636,7 @@ class CompressedSecondaryCacheTest : public testing::Test {

    using CacheValueChunk = CompressedSecondaryCache::CacheValueChunk;
    std::unique_ptr<CompressedSecondaryCache> sec_cache =
-        std::make_unique<CompressedSecondaryCache>(1000, 0, true, 0.5,
+        std::make_unique<CompressedSecondaryCache>(1000, 0, true, 0.5, 0.0,
                                                   allocator);
    Random rnd(301);
    // 10000 = 8169 + 1769 + 62 , so there should be 3 chunks after split.
--- a/cache/lru_cache.cc
+++ b/cache/lru_cache.cc
@ -111,14 +111,17 @@ void LRUHandleTable::Resize() {

 LRUCacheShard::LRUCacheShard(
    size_t capacity, bool strict_capacity_limit, double high_pri_pool_ratio,
-    bool use_adaptive_mutex, CacheMetadataChargePolicy metadata_charge_policy,
-    int max_upper_hash_bits,
+    double low_pri_pool_ratio, bool use_adaptive_mutex,
+    CacheMetadataChargePolicy metadata_charge_policy, int max_upper_hash_bits,
    const std::shared_ptr<SecondaryCache>& secondary_cache)
    : capacity_(0),
      high_pri_pool_usage_(0),
+      low_pri_pool_usage_(0),
      strict_capacity_limit_(strict_capacity_limit),
      high_pri_pool_ratio_(high_pri_pool_ratio),
      high_pri_pool_capacity_(0),
+      low_pri_pool_ratio_(low_pri_pool_ratio),
+      low_pri_pool_capacity_(0),
      table_(max_upper_hash_bits),
      usage_(0),
      lru_usage_(0),
@ -129,6 +132,7 @@ LRUCacheShard::LRUCacheShard(
  lru_.next = &lru_;
  lru_.prev = &lru_;
  lru_low_pri_ = &lru_;
+  lru_bottom_pri_ = &lru_;
  SetCapacity(capacity);
 }

@ -192,10 +196,12 @@ void LRUCacheShard::ApplyToSomeEntries(
      index_begin, index_end);
 }

-void LRUCacheShard::TEST_GetLRUList(LRUHandle** lru, LRUHandle** lru_low_pri) {
+void LRUCacheShard::TEST_GetLRUList(LRUHandle** lru, LRUHandle** lru_low_pri,
+                                    LRUHandle** lru_bottom_pri) {
  DMutexLock l(mutex_);
  *lru = &lru_;
  *lru_low_pri = lru_low_pri_;
+  *lru_bottom_pri = lru_bottom_pri_;
 }

 size_t LRUCacheShard::TEST_GetLRUSize() {
@ -214,20 +220,32 @@ double LRUCacheShard::GetHighPriPoolRatio() {
  return high_pri_pool_ratio_;
 }

+double LRUCacheShard::GetLowPriPoolRatio() {
+  DMutexLock l(mutex_);
+  return low_pri_pool_ratio_;
+}
+
 void LRUCacheShard::LRU_Remove(LRUHandle* e) {
  assert(e->next != nullptr);
  assert(e->prev != nullptr);
  if (lru_low_pri_ == e) {
    lru_low_pri_ = e->prev;
  }
+  if (lru_bottom_pri_ == e) {
+    lru_bottom_pri_ = e->prev;
+  }
  e->next->prev = e->prev;
  e->prev->next = e->next;
  e->prev = e->next = nullptr;
  assert(lru_usage_ >= e->total_charge);
  lru_usage_ -= e->total_charge;
+  assert(!e->InHighPriPool() || !e->InLowPriPool());
  if (e->InHighPriPool()) {
    assert(high_pri_pool_usage_ >= e->total_charge);
    high_pri_pool_usage_ -= e->total_charge;
+  } else if (e->InLowPriPool()) {
+    assert(low_pri_pool_usage_ >= e->total_charge);
+    low_pri_pool_usage_ -= e->total_charge;
  }
 }

@ -241,17 +259,34 @@ void LRUCacheShard::LRU_Insert(LRUHandle* e) {
    e->prev->next = e;
    e->next->prev = e;
    e->SetInHighPriPool(true);
+    e->SetInLowPriPool(false);
    high_pri_pool_usage_ += e->total_charge;
    MaintainPoolSize();
-  } else {
-    // Insert "e" to the head of low-pri pool. Note that when
-    // high_pri_pool_ratio is 0, head of low-pri pool is also head of LRU list.
+  } else if (low_pri_pool_ratio_ > 0 &&
+             (e->IsHighPri() || e->IsLowPri() || e->HasHit())) {
+    // Insert "e" to the head of low-pri pool.
    e->next = lru_low_pri_->next;
    e->prev = lru_low_pri_;
    e->prev->next = e;
    e->next->prev = e;
    e->SetInHighPriPool(false);
+    e->SetInLowPriPool(true);
+    low_pri_pool_usage_ += e->total_charge;
+    MaintainPoolSize();
    lru_low_pri_ = e;
+  } else {
+    // Insert "e" to the head of bottom-pri pool.
+    e->next = lru_bottom_pri_->next;
+    e->prev = lru_bottom_pri_;
+    e->prev->next = e;
+    e->next->prev = e;
+    e->SetInHighPriPool(false);
+    e->SetInLowPriPool(false);
+    // If the low-pri pool is empty, lru_low_pri_ also needs to be updated.
+    if (lru_bottom_pri_ == lru_low_pri_) {
+      lru_low_pri_ = e;
+    }
+    lru_bottom_pri_ = e;
  }
  lru_usage_ += e->total_charge;
 }
@ -262,8 +297,20 @@ void LRUCacheShard::MaintainPoolSize() {
    lru_low_pri_ = lru_low_pri_->next;
    assert(lru_low_pri_ != &lru_);
    lru_low_pri_->SetInHighPriPool(false);
+    lru_low_pri_->SetInLowPriPool(true);
    assert(high_pri_pool_usage_ >= lru_low_pri_->total_charge);
    high_pri_pool_usage_ -= lru_low_pri_->total_charge;
+    low_pri_pool_usage_ += lru_low_pri_->total_charge;
+  }
+
+  while (low_pri_pool_usage_ > low_pri_pool_capacity_) {
+    // Overflow last entry in low-pri pool to bottom-pri pool.
+    lru_bottom_pri_ = lru_bottom_pri_->next;
+    assert(lru_bottom_pri_ != &lru_);
+    lru_bottom_pri_->SetInHighPriPool(false);
+    lru_bottom_pri_->SetInLowPriPool(false);
+    assert(low_pri_pool_usage_ >= lru_bottom_pri_->total_charge);
+    low_pri_pool_usage_ -= lru_bottom_pri_->total_charge;
  }
 }

@ -288,6 +335,7 @@ void LRUCacheShard::SetCapacity(size_t capacity) {
    DMutexLock l(mutex_);
    capacity_ = capacity;
    high_pri_pool_capacity_ = capacity_ * high_pri_pool_ratio_;
+    low_pri_pool_capacity_ = capacity_ * low_pri_pool_ratio_;
    EvictFromLRU(0, &last_reference_list);
  }

@ -503,6 +551,13 @@ void LRUCacheShard::SetHighPriorityPoolRatio(double high_pri_pool_ratio) {
  MaintainPoolSize();
 }

+void LRUCacheShard::SetLowPriorityPoolRatio(double low_pri_pool_ratio) {
+  DMutexLock l(mutex_);
+  low_pri_pool_ratio_ = low_pri_pool_ratio;
+  low_pri_pool_capacity_ = capacity_ * low_pri_pool_ratio_;
+  MaintainPoolSize();
+}
+
 bool LRUCacheShard::Release(Cache::Handle* handle, bool erase_if_last_ref) {
  if (handle == nullptr) {
    return false;
@ -634,12 +689,15 @@ std::string LRUCacheShard::GetPrintableOptions() const {
    DMutexLock l(mutex_);
    snprintf(buffer, kBufferSize, "    high_pri_pool_ratio: %.3lf\n",
             high_pri_pool_ratio_);
+    snprintf(buffer + strlen(buffer), kBufferSize - strlen(buffer),
+             "    low_pri_pool_ratio: %.3lf\n", low_pri_pool_ratio_);
  }
  return std::string(buffer);
 }

 LRUCache::LRUCache(size_t capacity, int num_shard_bits,
                   bool strict_capacity_limit, double high_pri_pool_ratio,
+                   double low_pri_pool_ratio,
                   std::shared_ptr<MemoryAllocator> allocator,
                   bool use_adaptive_mutex,
                   CacheMetadataChargePolicy metadata_charge_policy,
@ -653,7 +711,7 @@ LRUCache::LRUCache(size_t capacity, int num_shard_bits,
  for (int i = 0; i < num_shards_; i++) {
    new (&shards_[i]) LRUCacheShard(
        per_shard, strict_capacity_limit, high_pri_pool_ratio,
-        use_adaptive_mutex, metadata_charge_policy,
+        low_pri_pool_ratio, use_adaptive_mutex, metadata_charge_policy,
        /* max_upper_hash_bits */ 32 - num_shard_bits, secondary_cache);
  }
  secondary_cache_ = secondary_cache;
@ -775,7 +833,8 @@ std::shared_ptr<Cache> NewLRUCache(
    double high_pri_pool_ratio,
    std::shared_ptr<MemoryAllocator> memory_allocator, bool use_adaptive_mutex,
    CacheMetadataChargePolicy metadata_charge_policy,
-    const std::shared_ptr<SecondaryCache>& secondary_cache) {
+    const std::shared_ptr<SecondaryCache>& secondary_cache,
+    double low_pri_pool_ratio) {
  if (num_shard_bits >= 20) {
    return nullptr;  // The cache cannot be sharded into too many fine pieces.
  }
@ -783,30 +842,40 @@ std::shared_ptr<Cache> NewLRUCache(
    // Invalid high_pri_pool_ratio
    return nullptr;
  }
+  if (low_pri_pool_ratio < 0.0 || low_pri_pool_ratio > 1.0) {
+    // Invalid low_pri_pool_ratio
+    return nullptr;
+  }
+  if (low_pri_pool_ratio + high_pri_pool_ratio > 1.0) {
+    // Invalid high_pri_pool_ratio and low_pri_pool_ratio combination
+    return nullptr;
+  }
  if (num_shard_bits < 0) {
    num_shard_bits = GetDefaultCacheShardBits(capacity);
  }
  return std::make_shared<LRUCache>(
      capacity, num_shard_bits, strict_capacity_limit, high_pri_pool_ratio,
-      std::move(memory_allocator), use_adaptive_mutex, metadata_charge_policy,
-      secondary_cache);
+      low_pri_pool_ratio, std::move(memory_allocator), use_adaptive_mutex,
+      metadata_charge_policy, secondary_cache);
 }

 std::shared_ptr<Cache> NewLRUCache(const LRUCacheOptions& cache_opts) {
-  return NewLRUCache(
-      cache_opts.capacity, cache_opts.num_shard_bits,
-      cache_opts.strict_capacity_limit, cache_opts.high_pri_pool_ratio,
-      cache_opts.memory_allocator, cache_opts.use_adaptive_mutex,
-      cache_opts.metadata_charge_policy, cache_opts.secondary_cache);
+  return NewLRUCache(cache_opts.capacity, cache_opts.num_shard_bits,
+                     cache_opts.strict_capacity_limit,
+                     cache_opts.high_pri_pool_ratio,
+                     cache_opts.memory_allocator, cache_opts.use_adaptive_mutex,
+                     cache_opts.metadata_charge_policy,
+                     cache_opts.secondary_cache, cache_opts.low_pri_pool_ratio);
 }

 std::shared_ptr<Cache> NewLRUCache(
    size_t capacity, int num_shard_bits, bool strict_capacity_limit,
    double high_pri_pool_ratio,
    std::shared_ptr<MemoryAllocator> memory_allocator, bool use_adaptive_mutex,
-    CacheMetadataChargePolicy metadata_charge_policy) {
+    CacheMetadataChargePolicy metadata_charge_policy,
+    double low_pri_pool_ratio) {
  return NewLRUCache(capacity, num_shard_bits, strict_capacity_limit,
                     high_pri_pool_ratio, memory_allocator, use_adaptive_mutex,
-                     metadata_charge_policy, nullptr);
+                     metadata_charge_policy, nullptr, low_pri_pool_ratio);
 }
 }  // namespace ROCKSDB_NAMESPACE
--- a/cache/lru_cache.h
+++ b/cache/lru_cache.h
@ -74,7 +74,7 @@ struct LRUHandle {
  // The number of external refs to this entry. The cache itself is not counted.
  uint32_t refs;

-  enum Flags : uint8_t {
+  enum Flags : uint16_t {
    // Whether this entry is referenced by the hash table.
    IN_CACHE = (1 << 0),
    // Whether this entry is high priority entry.
@ -89,9 +89,13 @@ struct LRUHandle {
    IS_PENDING = (1 << 5),
    // Whether this handle is still in a lower tier
    IS_IN_SECONDARY_CACHE = (1 << 6),
+    // Whether this entry is low priority entry.
+    IS_LOW_PRI = (1 << 7),
+    // Whether this entry is in low-pri pool.
+    IN_LOW_PRI_POOL = (1 << 8),
  };

-  uint8_t flags;
+  uint16_t flags;

 #ifdef __SANITIZE_THREAD__
  // TSAN can report a false data race on flags, where one thread is writing
@ -122,6 +126,8 @@ struct LRUHandle {
  bool InCache() const { return flags & IN_CACHE; }
  bool IsHighPri() const { return flags & IS_HIGH_PRI; }
  bool InHighPriPool() const { return flags & IN_HIGH_PRI_POOL; }
+  bool IsLowPri() const { return flags & IS_LOW_PRI; }
+  bool InLowPriPool() const { return flags & IN_LOW_PRI_POOL; }
  bool HasHit() const { return flags & HAS_HIT; }
  bool IsSecondaryCacheCompatible() const {
 #ifdef __SANITIZE_THREAD__
@ -144,8 +150,13 @@ struct LRUHandle {
  void SetPriority(Cache::Priority priority) {
    if (priority == Cache::Priority::HIGH) {
      flags |= IS_HIGH_PRI;
+      flags &= ~IS_LOW_PRI;
+    } else if (priority == Cache::Priority::LOW) {
+      flags &= ~IS_HIGH_PRI;
+      flags |= IS_LOW_PRI;
    } else {
      flags &= ~IS_HIGH_PRI;
+      flags &= ~IS_LOW_PRI;
    }
  }

@ -157,6 +168,14 @@ struct LRUHandle {
    }
  }

+  void SetInLowPriPool(bool in_low_pri_pool) {
+    if (in_low_pri_pool) {
+      flags |= IN_LOW_PRI_POOL;
+    } else {
+      flags &= ~IN_LOW_PRI_POOL;
+    }
+  }
+
  void SetHit() { flags |= HAS_HIT; }

  void SetSecondaryCacheCompatible(bool compat) {
@ -298,7 +317,8 @@ class LRUHandleTable {
 class ALIGN_AS(CACHE_LINE_SIZE) LRUCacheShard final : public CacheShard {
 public:
  LRUCacheShard(size_t capacity, bool strict_capacity_limit,
-                double high_pri_pool_ratio, bool use_adaptive_mutex,
+                double high_pri_pool_ratio, double low_pri_pool_ratio,
+                bool use_adaptive_mutex,
                CacheMetadataChargePolicy metadata_charge_policy,
                int max_upper_hash_bits,
                const std::shared_ptr<SecondaryCache>& secondary_cache);
@ -315,6 +335,9 @@ class ALIGN_AS(CACHE_LINE_SIZE) LRUCacheShard final : public CacheShard {
  // Set percentage of capacity reserved for high-pri cache entries.
  void SetHighPriorityPoolRatio(double high_pri_pool_ratio);

+  // Set percentage of capacity reserved for low-pri cache entries.
+  void SetLowPriorityPoolRatio(double low_pri_pool_ratio);
+
  // Like Cache methods, but with an extra "hash" parameter.
  virtual Status Insert(const Slice& key, uint32_t hash, void* value,
                        size_t charge, Cache::DeleterFn deleter,
@ -366,15 +389,19 @@ class ALIGN_AS(CACHE_LINE_SIZE) LRUCacheShard final : public CacheShard {

  virtual std::string GetPrintableOptions() const override;

-  void TEST_GetLRUList(LRUHandle** lru, LRUHandle** lru_low_pri);
+  void TEST_GetLRUList(LRUHandle** lru, LRUHandle** lru_low_pri,
+                       LRUHandle** lru_bottom_pri);

-  //  Retrieves number of elements in LRU, for unit test purpose only.
-  //  Not threadsafe.
+  // Retrieves number of elements in LRU, for unit test purpose only.
+  // Not threadsafe.
  size_t TEST_GetLRUSize();

-  //  Retrieves high pri pool ratio
+  // Retrieves high pri pool ratio
  double GetHighPriPoolRatio();

+  // Retrieves low pri pool ratio
+  double GetLowPriPoolRatio();
+
 private:
  friend class LRUCache;
  // Insert an item into the hash table and, if handle is null, insert into
@ -414,6 +441,9 @@ class ALIGN_AS(CACHE_LINE_SIZE) LRUCacheShard final : public CacheShard {
  // Memory size for entries in high-pri pool.
  size_t high_pri_pool_usage_;

+  // Memory size for entries in low-pri pool.
+  size_t low_pri_pool_usage_;
+
  // Whether to reject insertion if cache reaches its full capacity.
  bool strict_capacity_limit_;

@ -424,6 +454,13 @@ class ALIGN_AS(CACHE_LINE_SIZE) LRUCacheShard final : public CacheShard {
  // Remember the value to avoid recomputing each time.
  double high_pri_pool_capacity_;

+  // Ratio of capacity reserved for low priority cache entries.
+  double low_pri_pool_ratio_;
+
+  // Low-pri pool size, equals to capacity * low_pri_pool_ratio.
+  // Remember the value to avoid recomputing each time.
+  double low_pri_pool_capacity_;
+
  // Dummy head of LRU list.
  // lru.prev is newest entry, lru.next is oldest entry.
  // LRU contains items which can be evicted, ie reference only by cache
@ -432,6 +469,9 @@ class ALIGN_AS(CACHE_LINE_SIZE) LRUCacheShard final : public CacheShard {
  // Pointer to head of low-pri pool in LRU list.
  LRUHandle* lru_low_pri_;

+  // Pointer to head of bottom-pri pool in LRU list.
+  LRUHandle* lru_bottom_pri_;
+
  // ------------^^^^^^^^^^^^^-----------
  // Not frequently modified data members
  // ------------------------------------
@ -466,7 +506,7 @@ class LRUCache
    : public ShardedCache {
 public:
  LRUCache(size_t capacity, int num_shard_bits, bool strict_capacity_limit,
-           double high_pri_pool_ratio,
+           double high_pri_pool_ratio, double low_pri_pool_ratio,
           std::shared_ptr<MemoryAllocator> memory_allocator = nullptr,
           bool use_adaptive_mutex = kDefaultToAdaptiveMutex,
           CacheMetadataChargePolicy metadata_charge_policy =
--- a/cache/lru_cache_test.cc
+++ b/cache/lru_cache_test.cc
@ -41,13 +41,14 @@ class LRUCacheTest : public testing::Test {
  }

  void NewCache(size_t capacity, double high_pri_pool_ratio = 0.0,
+                double low_pri_pool_ratio = 1.0,
                bool use_adaptive_mutex = kDefaultToAdaptiveMutex) {
    DeleteCache();
    cache_ = reinterpret_cast<LRUCacheShard*>(
        port::cacheline_aligned_alloc(sizeof(LRUCacheShard)));
    new (cache_) LRUCacheShard(
        capacity, false /*strict_capcity_limit*/, high_pri_pool_ratio,
-        use_adaptive_mutex, kDontChargeCacheMetadata,
+        low_pri_pool_ratio, use_adaptive_mutex, kDontChargeCacheMetadata,
        24 /*max_upper_hash_bits*/, nullptr /*secondary_cache*/);
  }

@ -76,32 +77,66 @@ class LRUCacheTest : public testing::Test {
  void Erase(const std::string& key) { cache_->Erase(key, 0 /*hash*/); }

  void ValidateLRUList(std::vector<std::string> keys,
-                       size_t num_high_pri_pool_keys = 0) {
+                       size_t num_high_pri_pool_keys = 0,
+                       size_t num_low_pri_pool_keys = 0,
+                       size_t num_bottom_pri_pool_keys = 0) {
    LRUHandle* lru;
    LRUHandle* lru_low_pri;
-    cache_->TEST_GetLRUList(&lru, &lru_low_pri);
+    LRUHandle* lru_bottom_pri;
+    cache_->TEST_GetLRUList(&lru, &lru_low_pri, &lru_bottom_pri);
+
    LRUHandle* iter = lru;
+
+    bool in_low_pri_pool = false;
    bool in_high_pri_pool = false;
+
    size_t high_pri_pool_keys = 0;
+    size_t low_pri_pool_keys = 0;
+    size_t bottom_pri_pool_keys = 0;
+
+    if (iter == lru_bottom_pri) {
+      in_low_pri_pool = true;
+      in_high_pri_pool = false;
+    }
    if (iter == lru_low_pri) {
+      in_low_pri_pool = false;
      in_high_pri_pool = true;
    }
+
    for (const auto& key : keys) {
      iter = iter->next;
      ASSERT_NE(lru, iter);
      ASSERT_EQ(key, iter->key().ToString());
      ASSERT_EQ(in_high_pri_pool, iter->InHighPriPool());
+      ASSERT_EQ(in_low_pri_pool, iter->InLowPriPool());
      if (in_high_pri_pool) {
+        ASSERT_FALSE(iter->InLowPriPool());
        high_pri_pool_keys++;
+      } else if (in_low_pri_pool) {
+        ASSERT_FALSE(iter->InHighPriPool());
+        low_pri_pool_keys++;
+      } else {
+        bottom_pri_pool_keys++;
+      }
+      if (iter == lru_bottom_pri) {
+        ASSERT_FALSE(in_low_pri_pool);
+        ASSERT_FALSE(in_high_pri_pool);
+        in_low_pri_pool = true;
+        in_high_pri_pool = false;
      }
      if (iter == lru_low_pri) {
+        ASSERT_TRUE(in_low_pri_pool);
        ASSERT_FALSE(in_high_pri_pool);
+        in_low_pri_pool = false;
        in_high_pri_pool = true;
      }
    }
    ASSERT_EQ(lru, iter->next);
+    ASSERT_FALSE(in_low_pri_pool);
    ASSERT_TRUE(in_high_pri_pool);
    ASSERT_EQ(num_high_pri_pool_keys, high_pri_pool_keys);
+    ASSERT_EQ(num_low_pri_pool_keys, low_pri_pool_keys);
+    ASSERT_EQ(num_bottom_pri_pool_keys, bottom_pri_pool_keys);
  }

 private:
@ -113,98 +148,219 @@ TEST_F(LRUCacheTest, BasicLRU) {
  for (char ch = 'a'; ch <= 'e'; ch++) {
    Insert(ch);
  }
-  ValidateLRUList({"a", "b", "c", "d", "e"});
+  ValidateLRUList({"a", "b", "c", "d", "e"}, 0, 5);
  for (char ch = 'x'; ch <= 'z'; ch++) {
    Insert(ch);
  }
-  ValidateLRUList({"d", "e", "x", "y", "z"});
+  ValidateLRUList({"d", "e", "x", "y", "z"}, 0, 5);
  ASSERT_FALSE(Lookup("b"));
-  ValidateLRUList({"d", "e", "x", "y", "z"});
+  ValidateLRUList({"d", "e", "x", "y", "z"}, 0, 5);
  ASSERT_TRUE(Lookup("e"));
-  ValidateLRUList({"d", "x", "y", "z", "e"});
+  ValidateLRUList({"d", "x", "y", "z", "e"}, 0, 5);
  ASSERT_TRUE(Lookup("z"));
-  ValidateLRUList({"d", "x", "y", "e", "z"});
+  ValidateLRUList({"d", "x", "y", "e", "z"}, 0, 5);
  Erase("x");
-  ValidateLRUList({"d", "y", "e", "z"});
+  ValidateLRUList({"d", "y", "e", "z"}, 0, 4);
  ASSERT_TRUE(Lookup("d"));
-  ValidateLRUList({"y", "e", "z", "d"});
+  ValidateLRUList({"y", "e", "z", "d"}, 0, 4);
  Insert("u");
-  ValidateLRUList({"y", "e", "z", "d", "u"});
+  ValidateLRUList({"y", "e", "z", "d", "u"}, 0, 5);
  Insert("v");
-  ValidateLRUList({"e", "z", "d", "u", "v"});
+  ValidateLRUList({"e", "z", "d", "u", "v"}, 0, 5);
 }

-TEST_F(LRUCacheTest, MidpointInsertion) {
-  // Allocate 2 cache entries to high-pri pool.
-  NewCache(5, 0.45);
+TEST_F(LRUCacheTest, LowPriorityMidpointInsertion) {
+  // Allocate 2 cache entries to high-pri pool and 3 to low-pri pool.
+  NewCache(5, /* high_pri_pool_ratio */ 0.40, /* low_pri_pool_ratio */ 0.60);

  Insert("a", Cache::Priority::LOW);
  Insert("b", Cache::Priority::LOW);
  Insert("c", Cache::Priority::LOW);
  Insert("x", Cache::Priority::HIGH);
  Insert("y", Cache::Priority::HIGH);
-  ValidateLRUList({"a", "b", "c", "x", "y"}, 2);
+  ValidateLRUList({"a", "b", "c", "x", "y"}, 2, 3);

  // Low-pri entries are inserted to the tail of low-pri list (the midpoint).
  // After lookup, it will move to the tail of the full list.
  Insert("d", Cache::Priority::LOW);
-  ValidateLRUList({"b", "c", "d", "x", "y"}, 2);
+  ValidateLRUList({"b", "c", "d", "x", "y"}, 2, 3);
  ASSERT_TRUE(Lookup("d"));
-  ValidateLRUList({"b", "c", "x", "y", "d"}, 2);
+  ValidateLRUList({"b", "c", "x", "y", "d"}, 2, 3);

  // High-pri entries will be inserted to the tail of full list.
  Insert("z", Cache::Priority::HIGH);
-  ValidateLRUList({"c", "x", "y", "d", "z"}, 2);
+  ValidateLRUList({"c", "x", "y", "d", "z"}, 2, 3);
+}
+
+TEST_F(LRUCacheTest, BottomPriorityMidpointInsertion) {
+  // Allocate 2 cache entries to high-pri pool and 2 to low-pri pool.
+  NewCache(6, /* high_pri_pool_ratio */ 0.35, /* low_pri_pool_ratio */ 0.35);
+
+  Insert("a", Cache::Priority::BOTTOM);
+  Insert("b", Cache::Priority::BOTTOM);
+  Insert("i", Cache::Priority::LOW);
+  Insert("j", Cache::Priority::LOW);
+  Insert("x", Cache::Priority::HIGH);
+  Insert("y", Cache::Priority::HIGH);
+  ValidateLRUList({"a", "b", "i", "j", "x", "y"}, 2, 2, 2);
+
+  // Low-pri entries will be inserted to the tail of low-pri list (the
+  // midpoint). After lookup, 'k' will move to the tail of the full list, and
+  // 'x' will spill over to the low-pri pool.
+  Insert("k", Cache::Priority::LOW);
+  ValidateLRUList({"b", "i", "j", "k", "x", "y"}, 2, 2, 2);
+  ASSERT_TRUE(Lookup("k"));
+  ValidateLRUList({"b", "i", "j", "x", "y", "k"}, 2, 2, 2);
+
+  // High-pri entries will be inserted to the tail of full list. Although y was
+  // inserted with high priority, it got spilled over to the low-pri pool. As
+  // a result, j also got spilled over to the bottom-pri pool.
+  Insert("z", Cache::Priority::HIGH);
+  ValidateLRUList({"i", "j", "x", "y", "k", "z"}, 2, 2, 2);
+  Erase("x");
+  ValidateLRUList({"i", "j", "y", "k", "z"}, 2, 1, 2);
+  Erase("y");
+  ValidateLRUList({"i", "j", "k", "z"}, 2, 0, 2);
+
+  // Bottom-pri entries will be inserted to the tail of bottom-pri list.
+  Insert("c", Cache::Priority::BOTTOM);
+  ValidateLRUList({"i", "j", "c", "k", "z"}, 2, 0, 3);
+  Insert("d", Cache::Priority::BOTTOM);
+  ValidateLRUList({"i", "j", "c", "d", "k", "z"}, 2, 0, 4);
+  Insert("e", Cache::Priority::BOTTOM);
+  ValidateLRUList({"j", "c", "d", "e", "k", "z"}, 2, 0, 4);
+
+  // Low-pri entries will be inserted to the tail of low-pri list (the
+  // midpoint).
+  Insert("l", Cache::Priority::LOW);
+  ValidateLRUList({"c", "d", "e", "l", "k", "z"}, 2, 1, 3);
+  Insert("m", Cache::Priority::LOW);
+  ValidateLRUList({"d", "e", "l", "m", "k", "z"}, 2, 2, 2);
+
+  Erase("k");
+  ValidateLRUList({"d", "e", "l", "m", "z"}, 1, 2, 2);
+  Erase("z");
+  ValidateLRUList({"d", "e", "l", "m"}, 0, 2, 2);
+
+  // Bottom-pri entries will be inserted to the tail of bottom-pri list.
+  Insert("f", Cache::Priority::BOTTOM);
+  ValidateLRUList({"d", "e", "f", "l", "m"}, 0, 2, 3);
+  Insert("g", Cache::Priority::BOTTOM);
+  ValidateLRUList({"d", "e", "f", "g", "l", "m"}, 0, 2, 4);
+
+  // High-pri entries will be inserted to the tail of full list.
+  Insert("o", Cache::Priority::HIGH);
+  ValidateLRUList({"e", "f", "g", "l", "m", "o"}, 1, 2, 3);
+  Insert("p", Cache::Priority::HIGH);
+  ValidateLRUList({"f", "g", "l", "m", "o", "p"}, 2, 2, 2);
 }

 TEST_F(LRUCacheTest, EntriesWithPriority) {
-  // Allocate 2 cache entries to high-pri pool.
-  NewCache(5, 0.45);
+  // Allocate 2 cache entries to high-pri pool and 2 to low-pri pool.
+  NewCache(6, /* high_pri_pool_ratio */ 0.35, /* low_pri_pool_ratio */ 0.35);

  Insert("a", Cache::Priority::LOW);
  Insert("b", Cache::Priority::LOW);
+  ValidateLRUList({"a", "b"}, 0, 2, 0);
+  // Low-pri entries can overflow to bottom-pri pool.
  Insert("c", Cache::Priority::LOW);
-  ValidateLRUList({"a", "b", "c"}, 0);
+  ValidateLRUList({"a", "b", "c"}, 0, 2, 1);

-  // Low-pri entries can take high-pri pool capacity if available
+  // Low-pri entries that overflowed to the bottom-pri pool can take high-pri
+  // pool capacity if available
+  Insert("t", Cache::Priority::LOW);
  Insert("u", Cache::Priority::LOW);
+  ValidateLRUList({"a", "b", "c", "t", "u"}, 0, 2, 3);
  Insert("v", Cache::Priority::LOW);
-  ValidateLRUList({"a", "b", "c", "u", "v"}, 0);
+  ValidateLRUList({"a", "b", "c", "t", "u", "v"}, 0, 2, 4);
+  Insert("w", Cache::Priority::LOW);
+  ValidateLRUList({"b", "c", "t", "u", "v", "w"}, 0, 2, 4);

  Insert("X", Cache::Priority::HIGH);
  Insert("Y", Cache::Priority::HIGH);
-  ValidateLRUList({"c", "u", "v", "X", "Y"}, 2);
+  ValidateLRUList({"t", "u", "v", "w", "X", "Y"}, 2, 2, 2);

-  // High-pri entries can overflow to low-pri pool.
+  // After inserting 'Z', the high-pri entry 'X' got spilled over to the
+  // low-pri pool. The low-pri entry 'v' got spilled over to the bottom-pri
+  // pool.
  Insert("Z", Cache::Priority::HIGH);
-  ValidateLRUList({"u", "v", "X", "Y", "Z"}, 2);
+  ValidateLRUList({"u", "v", "w", "X", "Y", "Z"}, 2, 2, 2);

  // Low-pri entries will be inserted to head of low-pri pool.
  Insert("a", Cache::Priority::LOW);
-  ValidateLRUList({"v", "X", "a", "Y", "Z"}, 2);
+  ValidateLRUList({"v", "w", "X", "a", "Y", "Z"}, 2, 2, 2);

-  // Low-pri entries will be inserted to head of high-pri pool after lookup.
+  // After lookup, the high-pri entry 'Y' got spilled over to the low-pri pool.
+  // The low-pri entry 'X' got spilled over to the bottom-pri pool.
  ASSERT_TRUE(Lookup("v"));
-  ValidateLRUList({"X", "a", "Y", "Z", "v"}, 2);
+  ValidateLRUList({"w", "X", "a", "Y", "Z", "v"}, 2, 2, 2);

-  // High-pri entries will be inserted to the head of the list after lookup.
+  // After lookup, the high-pri entry 'Z' got spilled over to the low-pri pool.
+  // The low-pri entry 'a' got spilled over to the bottom-pri pool.
  ASSERT_TRUE(Lookup("X"));
-  ValidateLRUList({"a", "Y", "Z", "v", "X"}, 2);
+  ValidateLRUList({"w", "a", "Y", "Z", "v", "X"}, 2, 2, 2);
+
+  // After lookup, the low-pri entry 'Z' got promoted back to high-pri pool. The
+  // high-pri entry 'v' got spilled over to the low-pri pool.
  ASSERT_TRUE(Lookup("Z"));
-  ValidateLRUList({"a", "Y", "v", "X", "Z"}, 2);
+  ValidateLRUList({"w", "a", "Y", "v", "X", "Z"}, 2, 2, 2);

  Erase("Y");
-  ValidateLRUList({"a", "v", "X", "Z"}, 2);
+  ValidateLRUList({"w", "a", "v", "X", "Z"}, 2, 1, 2);
  Erase("X");
-  ValidateLRUList({"a", "v", "Z"}, 1);
+  ValidateLRUList({"w", "a", "v", "Z"}, 1, 1, 2);
+
  Insert("d", Cache::Priority::LOW);
  Insert("e", Cache::Priority::LOW);
-  ValidateLRUList({"a", "v", "d", "e", "Z"}, 1);
+  ValidateLRUList({"w", "a", "v", "d", "e", "Z"}, 1, 2, 3);
+
  Insert("f", Cache::Priority::LOW);
  Insert("g", Cache::Priority::LOW);
-  ValidateLRUList({"d", "e", "f", "g", "Z"}, 1);
+  ValidateLRUList({"v", "d", "e", "f", "g", "Z"}, 1, 2, 3);
  ASSERT_TRUE(Lookup("d"));
-  ValidateLRUList({"e", "f", "g", "Z", "d"}, 2);
+  ValidateLRUList({"v", "e", "f", "g", "Z", "d"}, 2, 2, 2);
+
+  // Erase some entries.
+  Erase("e");
+  Erase("f");
+  Erase("Z");
+  ValidateLRUList({"v", "g", "d"}, 1, 1, 1);
+
+  // Bottom-pri entries can take low- and high-pri pool capacity if available
+  Insert("o", Cache::Priority::BOTTOM);
+  ValidateLRUList({"v", "o", "g", "d"}, 1, 1, 2);
+  Insert("p", Cache::Priority::BOTTOM);
+  ValidateLRUList({"v", "o", "p", "g", "d"}, 1, 1, 3);
+  Insert("q", Cache::Priority::BOTTOM);
+  ValidateLRUList({"v", "o", "p", "q", "g", "d"}, 1, 1, 4);
+
+  // High-pri entries can overflow to low-pri pool, and bottom-pri entries will
+  // be evicted.
+  Insert("x", Cache::Priority::HIGH);
+  ValidateLRUList({"o", "p", "q", "g", "d", "x"}, 2, 1, 3);
+  Insert("y", Cache::Priority::HIGH);
+  ValidateLRUList({"p", "q", "g", "d", "x", "y"}, 2, 2, 2);
+  Insert("z", Cache::Priority::HIGH);
+  ValidateLRUList({"q", "g", "d", "x", "y", "z"}, 2, 2, 2);
+
+  // 'g' is bottom-pri before this lookup; it will be inserted to head of
+  // high-pri pool after lookup.
+  ASSERT_TRUE(Lookup("g"));
+  ValidateLRUList({"q", "d", "x", "y", "z", "g"}, 2, 2, 2);
+
+  // High-pri entries will be inserted to head of high-pri pool after lookup.
+  ASSERT_TRUE(Lookup("z"));
+  ValidateLRUList({"q", "d", "x", "y", "g", "z"}, 2, 2, 2);
+
+  // Bottom-pri entries will be inserted to head of high-pri pool after lookup.
+  ASSERT_TRUE(Lookup("d"));
+  ValidateLRUList({"q", "x", "y", "g", "z", "d"}, 2, 2, 2);
+
+  // Bottom-pri entries will be inserted to the tail of bottom-pri list.
+  Insert("m", Cache::Priority::BOTTOM);
+  ValidateLRUList({"x", "m", "y", "g", "z", "d"}, 2, 2, 2);
+
+  // Bottom-pri entries will be inserted to head of high-pri pool after lookup.
+  ASSERT_TRUE(Lookup("m"));
+  ValidateLRUList({"x", "y", "g", "z", "d", "m"}, 2, 2, 2);
 }

 // TODO: FastLRUCache and ClockCache use the same tests. We can probably remove
@ -547,8 +703,9 @@ class TestSecondaryCache : public SecondaryCache {

  explicit TestSecondaryCache(size_t capacity)
      : num_inserts_(0), num_lookups_(0), inject_failure_(false) {
-    cache_ = NewLRUCache(capacity, 0, false, 0.5, nullptr,
-                         kDefaultToAdaptiveMutex, kDontChargeCacheMetadata);
+    cache_ =
+        NewLRUCache(capacity, 0, false, 0.5 /* high_pri_pool_ratio */, nullptr,
+                    kDefaultToAdaptiveMutex, kDontChargeCacheMetadata);
  }
  ~TestSecondaryCache() override { cache_.reset(); }

@ -785,7 +942,10 @@ Cache::CacheItemHelper LRUCacheSecondaryCacheTest::helper_fail_(
    LRUCacheSecondaryCacheTest::DeletionCallback);

 TEST_F(LRUCacheSecondaryCacheTest, BasicTest) {
-  LRUCacheOptions opts(1024, 0, false, 0.5, nullptr, kDefaultToAdaptiveMutex,
+  LRUCacheOptions opts(1024 /* capacity */, 0 /* num_shard_bits */,
+                       false /* strict_capacity_limit */,
+                       0.5 /* high_pri_pool_ratio */,
+                       nullptr /* memory_allocator */, kDefaultToAdaptiveMutex,
                       kDontChargeCacheMetadata);
  std::shared_ptr<TestSecondaryCache> secondary_cache =
      std::make_shared<TestSecondaryCache>(2048);
@ -831,7 +991,10 @@ TEST_F(LRUCacheSecondaryCacheTest, BasicTest) {
 }

 TEST_F(LRUCacheSecondaryCacheTest, BasicFailTest) {
-  LRUCacheOptions opts(1024, 0, false, 0.5, nullptr, kDefaultToAdaptiveMutex,
+  LRUCacheOptions opts(1024 /* capacity */, 0 /* num_shard_bits */,
+                       false /* strict_capacity_limit */,
+                       0.5 /* high_pri_pool_ratio */,
+                       nullptr /* memory_allocator */, kDefaultToAdaptiveMutex,
                       kDontChargeCacheMetadata);
  std::shared_ptr<TestSecondaryCache> secondary_cache =
      std::make_shared<TestSecondaryCache>(2048);
@ -862,7 +1025,10 @@ TEST_F(LRUCacheSecondaryCacheTest, BasicFailTest) {
 }

 TEST_F(LRUCacheSecondaryCacheTest, SaveFailTest) {
-  LRUCacheOptions opts(1024, 0, false, 0.5, nullptr, kDefaultToAdaptiveMutex,
+  LRUCacheOptions opts(1024 /* capacity */, 0 /* num_shard_bits */,
+                       false /* strict_capacity_limit */,
+                       0.5 /* high_pri_pool_ratio */,
+                       nullptr /* memory_allocator */, kDefaultToAdaptiveMutex,
                       kDontChargeCacheMetadata);
  std::shared_ptr<TestSecondaryCache> secondary_cache =
      std::make_shared<TestSecondaryCache>(2048);
@ -909,7 +1075,10 @@ TEST_F(LRUCacheSecondaryCacheTest, SaveFailTest) {
 }

 TEST_F(LRUCacheSecondaryCacheTest, CreateFailTest) {
-  LRUCacheOptions opts(1024, 0, false, 0.5, nullptr, kDefaultToAdaptiveMutex,
+  LRUCacheOptions opts(1024 /* capacity */, 0 /* num_shard_bits */,
+                       false /* strict_capacity_limit */,
+                       0.5 /* high_pri_pool_ratio */,
+                       nullptr /* memory_allocator */, kDefaultToAdaptiveMutex,
                       kDontChargeCacheMetadata);
  std::shared_ptr<TestSecondaryCache> secondary_cache =
      std::make_shared<TestSecondaryCache>(2048);
@ -952,8 +1121,11 @@ TEST_F(LRUCacheSecondaryCacheTest, CreateFailTest) {
 }

 TEST_F(LRUCacheSecondaryCacheTest, FullCapacityTest) {
-  LRUCacheOptions opts(1024, 0, /*_strict_capacity_limit=*/true, 0.5, nullptr,
-                       kDefaultToAdaptiveMutex, kDontChargeCacheMetadata);
+  LRUCacheOptions opts(1024 /* capacity */, 0 /* num_shard_bits */,
+                       true /* strict_capacity_limit */,
+                       0.5 /* high_pri_pool_ratio */,
+                       nullptr /* memory_allocator */, kDefaultToAdaptiveMutex,
+                       kDontChargeCacheMetadata);
  std::shared_ptr<TestSecondaryCache> secondary_cache =
      std::make_shared<TestSecondaryCache>(2048);
  opts.secondary_cache = secondary_cache;
@ -1003,8 +1175,11 @@ TEST_F(LRUCacheSecondaryCacheTest, FullCapacityTest) {
 // if we try to insert block_1 to the block cache, it will always fail. Only
 // block_2 will be successfully inserted into the block cache.
 TEST_F(DBSecondaryCacheTest, TestSecondaryCacheCorrectness1) {
-  LRUCacheOptions opts(4 * 1024, 0, false, 0.5, nullptr,
-                       kDefaultToAdaptiveMutex, kDontChargeCacheMetadata);
+  LRUCacheOptions opts(4 * 1024 /* capacity */, 0 /* num_shard_bits */,
+                       false /* strict_capacity_limit */,
+                       0.5 /* high_pri_pool_ratio */,
+                       nullptr /* memory_allocator */, kDefaultToAdaptiveMutex,
+                       kDontChargeCacheMetadata);
  std::shared_ptr<TestSecondaryCache> secondary_cache(
      new TestSecondaryCache(2048 * 1024));
  opts.secondary_cache = secondary_cache;
@ -1097,7 +1272,10 @@ TEST_F(DBSecondaryCacheTest, TestSecondaryCacheCorrectness1) {
 // insert and cache block_1 in the block cache (this is where it differs
 // from TestSecondaryCacheCorrectness1)
 TEST_F(DBSecondaryCacheTest, TestSecondaryCacheCorrectness2) {
-  LRUCacheOptions opts(6100, 0, false, 0.5, nullptr, kDefaultToAdaptiveMutex,
+  LRUCacheOptions opts(6100 /* capacity */, 0 /* num_shard_bits */,
+                       false /* strict_capacity_limit */,
+                       0.5 /* high_pri_pool_ratio */,
+                       nullptr /* memory_allocator */, kDefaultToAdaptiveMutex,
                       kDontChargeCacheMetadata);
  std::shared_ptr<TestSecondaryCache> secondary_cache(
      new TestSecondaryCache(2048 * 1024));
@ -1187,8 +1365,11 @@ TEST_F(DBSecondaryCacheTest, TestSecondaryCacheCorrectness2) {
 // cache all the blocks in the block cache and there is no secondary cache
 // insertion. 2 lookups are needed for the blocks.
 TEST_F(DBSecondaryCacheTest, NoSecondaryCacheInsertion) {
-  LRUCacheOptions opts(1024 * 1024, 0, false, 0.5, nullptr,
-                       kDefaultToAdaptiveMutex, kDontChargeCacheMetadata);
+  LRUCacheOptions opts(1024 * 1024 /* capacity */, 0 /* num_shard_bits */,
+                       false /* strict_capacity_limit */,
+                       0.5 /* high_pri_pool_ratio */,
+                       nullptr /* memory_allocator */, kDefaultToAdaptiveMutex,
+                       kDontChargeCacheMetadata);
  std::shared_ptr<TestSecondaryCache> secondary_cache(
      new TestSecondaryCache(2048 * 1024));
  opts.secondary_cache = secondary_cache;
@ -1238,8 +1419,11 @@ TEST_F(DBSecondaryCacheTest, NoSecondaryCacheInsertion) {
 }

 TEST_F(DBSecondaryCacheTest, SecondaryCacheIntensiveTesting) {
-  LRUCacheOptions opts(8 * 1024, 0, false, 0.5, nullptr,
-                       kDefaultToAdaptiveMutex, kDontChargeCacheMetadata);
+  LRUCacheOptions opts(8 * 1024 /* capacity */, 0 /* num_shard_bits */,
+                       false /* strict_capacity_limit */,
+                       0.5 /* high_pri_pool_ratio */,
+                       nullptr /* memory_allocator */, kDefaultToAdaptiveMutex,
+                       kDontChargeCacheMetadata);
  std::shared_ptr<TestSecondaryCache> secondary_cache(
      new TestSecondaryCache(2048 * 1024));
  opts.secondary_cache = secondary_cache;
@ -1284,8 +1468,11 @@ TEST_F(DBSecondaryCacheTest, SecondaryCacheIntensiveTesting) {
 // if we try to insert block_1 to the block cache, it will always fail. Only
 // block_2 will be successfully inserted into the block cache.
 TEST_F(DBSecondaryCacheTest, SecondaryCacheFailureTest) {
-  LRUCacheOptions opts(4 * 1024, 0, false, 0.5, nullptr,
-                       kDefaultToAdaptiveMutex, kDontChargeCacheMetadata);
+  LRUCacheOptions opts(4 * 1024 /* capacity */, 0 /* num_shard_bits */,
+                       false /* strict_capacity_limit */,
+                       0.5 /* high_pri_pool_ratio */,
+                       nullptr /* memory_allocator */, kDefaultToAdaptiveMutex,
+                       kDontChargeCacheMetadata);
  std::shared_ptr<TestSecondaryCache> secondary_cache(
      new TestSecondaryCache(2048 * 1024));
  opts.secondary_cache = secondary_cache;
@ -1373,7 +1560,10 @@ TEST_F(DBSecondaryCacheTest, SecondaryCacheFailureTest) {
 }

 TEST_F(LRUCacheSecondaryCacheTest, BasicWaitAllTest) {
-  LRUCacheOptions opts(1024, 2, false, 0.5, nullptr, kDefaultToAdaptiveMutex,
+  LRUCacheOptions opts(1024 /* capacity */, 2 /* num_shard_bits */,
+                       false /* strict_capacity_limit */,
+                       0.5 /* high_pri_pool_ratio */,
+                       nullptr /* memory_allocator */, kDefaultToAdaptiveMutex,
                       kDontChargeCacheMetadata);
  std::shared_ptr<TestSecondaryCache> secondary_cache =
      std::make_shared<TestSecondaryCache>(32 * 1024);
@ -1433,7 +1623,10 @@ TEST_F(LRUCacheSecondaryCacheTest, BasicWaitAllTest) {
 // a sync point callback in TestSecondaryCache::Lookup. We then control the
 // lookup result by setting the ResultMap.
 TEST_F(DBSecondaryCacheTest, TestSecondaryCacheMultiGet) {
-  LRUCacheOptions opts(1 << 20, 0, false, 0.5, nullptr, kDefaultToAdaptiveMutex,
+  LRUCacheOptions opts(1 << 20 /* capacity */, 0 /* num_shard_bits */,
+                       false /* strict_capacity_limit */,
+                       0.5 /* high_pri_pool_ratio */,
+                       nullptr /* memory_allocator */, kDefaultToAdaptiveMutex,
                       kDontChargeCacheMetadata);
  std::shared_ptr<TestSecondaryCache> secondary_cache(
      new TestSecondaryCache(2048 * 1024));
@ -1516,15 +1709,16 @@ class LRUCacheWithStat : public LRUCache {
 public:
  LRUCacheWithStat(
      size_t _capacity, int _num_shard_bits, bool _strict_capacity_limit,
-      double _high_pri_pool_ratio,
+      double _high_pri_pool_ratio, double _low_pri_pool_ratio,
      std::shared_ptr<MemoryAllocator> _memory_allocator = nullptr,
      bool _use_adaptive_mutex = kDefaultToAdaptiveMutex,
      CacheMetadataChargePolicy _metadata_charge_policy =
          kDontChargeCacheMetadata,
      const std::shared_ptr<SecondaryCache>& _secondary_cache = nullptr)
      : LRUCache(_capacity, _num_shard_bits, _strict_capacity_limit,
-                 _high_pri_pool_ratio, _memory_allocator, _use_adaptive_mutex,
-                 _metadata_charge_policy, _secondary_cache) {
+                 _high_pri_pool_ratio, _low_pri_pool_ratio, _memory_allocator,
+                 _use_adaptive_mutex, _metadata_charge_policy,
+                 _secondary_cache) {
    insert_count_ = 0;
    lookup_count_ = 0;
  }
@ -1567,13 +1761,17 @@ class LRUCacheWithStat : public LRUCache {
 #ifndef ROCKSDB_LITE

 TEST_F(DBSecondaryCacheTest, LRUCacheDumpLoadBasic) {
-  LRUCacheOptions cache_opts(1024 * 1024, 0, false, 0.5, nullptr,
+  LRUCacheOptions cache_opts(1024 * 1024 /* capacity */, 0 /* num_shard_bits */,
+                             false /* strict_capacity_limit */,
+                             0.5 /* high_pri_pool_ratio */,
+                             nullptr /* memory_allocator */,
                             kDefaultToAdaptiveMutex, kDontChargeCacheMetadata);
  LRUCacheWithStat* tmp_cache = new LRUCacheWithStat(
      cache_opts.capacity, cache_opts.num_shard_bits,
      cache_opts.strict_capacity_limit, cache_opts.high_pri_pool_ratio,
-      cache_opts.memory_allocator, cache_opts.use_adaptive_mutex,
-      cache_opts.metadata_charge_policy, cache_opts.secondary_cache);
+      cache_opts.low_pri_pool_ratio, cache_opts.memory_allocator,
+      cache_opts.use_adaptive_mutex, cache_opts.metadata_charge_policy,
+      cache_opts.secondary_cache);
  std::shared_ptr<Cache> cache(tmp_cache);
  BlockBasedTableOptions table_options;
  table_options.block_cache = cache;
@ -1644,8 +1842,9 @@ TEST_F(DBSecondaryCacheTest, LRUCacheDumpLoadBasic) {
  tmp_cache = new LRUCacheWithStat(
      cache_opts.capacity, cache_opts.num_shard_bits,
      cache_opts.strict_capacity_limit, cache_opts.high_pri_pool_ratio,
-      cache_opts.memory_allocator, cache_opts.use_adaptive_mutex,
-      cache_opts.metadata_charge_policy, cache_opts.secondary_cache);
+      cache_opts.low_pri_pool_ratio, cache_opts.memory_allocator,
+      cache_opts.use_adaptive_mutex, cache_opts.metadata_charge_policy,
+      cache_opts.secondary_cache);
  std::shared_ptr<Cache> cache_new(tmp_cache);
  table_options.block_cache = cache_new;
  table_options.block_size = 4 * 1024;
@ -1702,13 +1901,17 @@ TEST_F(DBSecondaryCacheTest, LRUCacheDumpLoadBasic) {
 }

 TEST_F(DBSecondaryCacheTest, LRUCacheDumpLoadWithFilter) {
-  LRUCacheOptions cache_opts(1024 * 1024, 0, false, 0.5, nullptr,
+  LRUCacheOptions cache_opts(1024 * 1024 /* capacity */, 0 /* num_shard_bits */,
+                             false /* strict_capacity_limit */,
+                             0.5 /* high_pri_pool_ratio */,
+                             nullptr /* memory_allocator */,
                             kDefaultToAdaptiveMutex, kDontChargeCacheMetadata);
  LRUCacheWithStat* tmp_cache = new LRUCacheWithStat(
      cache_opts.capacity, cache_opts.num_shard_bits,
      cache_opts.strict_capacity_limit, cache_opts.high_pri_pool_ratio,
-      cache_opts.memory_allocator, cache_opts.use_adaptive_mutex,
-      cache_opts.metadata_charge_policy, cache_opts.secondary_cache);
+      cache_opts.low_pri_pool_ratio, cache_opts.memory_allocator,
+      cache_opts.use_adaptive_mutex, cache_opts.metadata_charge_policy,
+      cache_opts.secondary_cache);
  std::shared_ptr<Cache> cache(tmp_cache);
  BlockBasedTableOptions table_options;
  table_options.block_cache = cache;
@ -1806,8 +2009,9 @@ TEST_F(DBSecondaryCacheTest, LRUCacheDumpLoadWithFilter) {
  tmp_cache = new LRUCacheWithStat(
      cache_opts.capacity, cache_opts.num_shard_bits,
      cache_opts.strict_capacity_limit, cache_opts.high_pri_pool_ratio,
-      cache_opts.memory_allocator, cache_opts.use_adaptive_mutex,
-      cache_opts.metadata_charge_policy, cache_opts.secondary_cache);
+      cache_opts.low_pri_pool_ratio, cache_opts.memory_allocator,
+      cache_opts.use_adaptive_mutex, cache_opts.metadata_charge_policy,
+      cache_opts.secondary_cache);
  std::shared_ptr<Cache> cache_new(tmp_cache);
  table_options.block_cache = cache_new;
  table_options.block_size = 4 * 1024;
@ -1873,8 +2077,11 @@ TEST_F(DBSecondaryCacheTest, LRUCacheDumpLoadWithFilter) {

 // Test the option not to use the secondary cache in a certain DB.
 TEST_F(DBSecondaryCacheTest, TestSecondaryCacheOptionBasic) {
-  LRUCacheOptions opts(4 * 1024, 0, false, 0.5, nullptr,
-                       kDefaultToAdaptiveMutex, kDontChargeCacheMetadata);
+  LRUCacheOptions opts(4 * 1024 /* capacity */, 0 /* num_shard_bits */,
+                       false /* strict_capacity_limit */,
+                       0.5 /* high_pri_pool_ratio */,
+                       nullptr /* memory_allocator */, kDefaultToAdaptiveMutex,
+                       kDontChargeCacheMetadata);
  std::shared_ptr<TestSecondaryCache> secondary_cache(
      new TestSecondaryCache(2048 * 1024));
  opts.secondary_cache = secondary_cache;
@ -1965,8 +2172,11 @@ TEST_F(DBSecondaryCacheTest, TestSecondaryCacheOptionBasic) {
 // with new options, which set the lowest_used_cache_tier to
 // kNonVolatileBlockTier. So secondary cache will be used.
 TEST_F(DBSecondaryCacheTest, TestSecondaryCacheOptionChange) {
-  LRUCacheOptions opts(4 * 1024, 0, false, 0.5, nullptr,
-                       kDefaultToAdaptiveMutex, kDontChargeCacheMetadata);
+  LRUCacheOptions opts(4 * 1024 /* capacity */, 0 /* num_shard_bits */,
+                       false /* strict_capacity_limit */,
+                       0.5 /* high_pri_pool_ratio */,
+                       nullptr /* memory_allocator */, kDefaultToAdaptiveMutex,
+                       kDontChargeCacheMetadata);
  std::shared_ptr<TestSecondaryCache> secondary_cache(
      new TestSecondaryCache(2048 * 1024));
  opts.secondary_cache = secondary_cache;
@ -2057,8 +2267,11 @@ TEST_F(DBSecondaryCacheTest, TestSecondaryCacheOptionChange) {
 // Two DB test. We create 2 DBs sharing the same block cache and secondary
 // cache. We disable the secondary cache option for DB2.
 TEST_F(DBSecondaryCacheTest, TestSecondaryCacheOptionTwoDB) {
-  LRUCacheOptions opts(4 * 1024, 0, false, 0.5, nullptr,
-                       kDefaultToAdaptiveMutex, kDontChargeCacheMetadata);
+  LRUCacheOptions opts(4 * 1024 /* capacity */, 0 /* num_shard_bits */,
+                       false /* strict_capacity_limit */,
+                       0.5 /* high_pri_pool_ratio */,
+                       nullptr /* memory_allocator */, kDefaultToAdaptiveMutex,
+                       kDontChargeCacheMetadata);
  std::shared_ptr<TestSecondaryCache> secondary_cache(
      new TestSecondaryCache(2048 * 1024));
  opts.secondary_cache = secondary_cache;
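
The pool behaviour exercised by `BottomPriorityMidpointInsertion` in `cache/lru_cache_test.cc` above can be sketched as a small standalone model. This is an illustrative toy, not the `LRUCacheShard` implementation: the names `ThreePoolLru` and `Pri` are hypothetical, every entry is assumed to have a charge of 1, keys are assumed unique, and the pool capacities are passed as entry counts rather than derived from `high_pri_pool_ratio` / `low_pri_pool_ratio`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <initializer_list>
#include <string>
#include <vector>

// Toy model, not RocksDB code. Every entry has a charge of 1.
enum class Pri { kBottom, kLow, kHigh };

class ThreePoolLru {
 public:
  ThreePoolLru(std::size_t capacity, std::size_t high_cap, std::size_t low_cap)
      : capacity_(capacity), high_cap_(high_cap), low_cap_(low_cap) {
    assert(capacity_ > 0);
  }

  // Assumes `key` is not already present.
  void Insert(const std::string& key, Pri pri) {
    // Make room first so that the new entry itself is never the victim.
    while (Size() + 1 > capacity_) EvictOldest();
    if (pri == Pri::kHigh) {
      high_.push_back(key);
    } else if (pri == Pri::kLow) {
      low_.push_back(key);
    } else {
      bottom_.push_back(key);
    }
    Rebalance();
  }

  // A hit moves the entry to the tail of the high-pri pool.
  bool Lookup(const std::string& key) {
    if (!Erase(key)) return false;
    Insert(key, Pri::kHigh);
    return true;
  }

  bool Erase(const std::string& key) {
    for (auto* pool : {&bottom_, &low_, &high_}) {
      auto it = std::find(pool->begin(), pool->end(), key);
      if (it != pool->end()) {
        pool->erase(it);
        return true;
      }
    }
    return false;
  }

  // Keys from oldest (next eviction victim) to newest.
  std::vector<std::string> Keys() const {
    std::vector<std::string> out(bottom_.begin(), bottom_.end());
    out.insert(out.end(), low_.begin(), low_.end());
    out.insert(out.end(), high_.begin(), high_.end());
    return out;
  }

  std::size_t HighCount() const { return high_.size(); }
  std::size_t LowCount() const { return low_.size(); }
  std::size_t BottomCount() const { return bottom_.size(); }

 private:
  std::size_t Size() const {
    return bottom_.size() + low_.size() + high_.size();
  }

  // Oldest entries live in the bottom pool, then low, then high.
  void EvictOldest() {
    if (!bottom_.empty()) {
      bottom_.pop_front();
    } else if (!low_.empty()) {
      low_.pop_front();
    } else {
      high_.pop_front();
    }
  }

  // Spill the oldest entries of an over-full pool into the next lower pool.
  void Rebalance() {
    while (high_.size() > high_cap_) {
      low_.push_back(high_.front());
      high_.pop_front();
    }
    while (low_.size() > low_cap_) {
      bottom_.push_back(low_.front());
      low_.pop_front();
    }
  }

  std::size_t capacity_, high_cap_, low_cap_;
  std::deque<std::string> bottom_, low_, high_;
};
```

With `ThreePoolLru cache(6, 2, 2)`, replaying the first steps of the test (two bottom, two low, two high inserts, then `Insert("k", Pri::kLow)`, `Lookup("k")`, `Insert("z", Pri::kHigh)`) should yield the same key orders the test validates. Using one deque per pool keeps the spill-over step explicit, at the cost of linear-time `Erase`; the real shard instead keeps a single linked list with boundary pointers such as `lru_low_pri_` and `lru_bottom_pri_`.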
--- a/db/blob/blob_file_builder.cc
+++ b/db/blob/blob_file_builder.cc
@ -404,7 +404,7 @@ Status BlobFileBuilder::PutBlobIntoCacheIfNeeded(const Slice& blob,
    const CacheKey cache_key = base_cache_key.WithOffset(blob_offset);
    const Slice key = cache_key.AsSlice();

-    const Cache::Priority priority = Cache::Priority::LOW;
+    const Cache::Priority priority = Cache::Priority::BOTTOM;

    // Objects to be put into the cache have to be heap-allocated and
    // self-contained, i.e. own their contents. The Cache has to be able to
--- a/db/blob/blob_source.cc
+++ b/db/blob/blob_source.cc
@ -70,7 +70,7 @@ Status BlobSource::PutBlobIntoCache(const Slice& cache_key,
  assert(blob_cache_);

  Status s;
-  const Cache::Priority priority = Cache::Priority::LOW;
+  const Cache::Priority priority = Cache::Priority::BOTTOM;

  // Objects to be put into the cache have to be heap-allocated and
  // self-contained, i.e. own their contents. The Cache has to be able to take
@ -108,7 +108,7 @@ Cache::Handle* BlobSource::GetEntryFromCache(const Slice& key) const {
      return Status::OK();
    };
    cache_handle = blob_cache_->Lookup(key, GetCacheItemHelper(), create_cb,
-                                       Cache::Priority::LOW,
+                                       Cache::Priority::BOTTOM,
                                       true /* wait_for_cache */, statistics_);
  } else {
    cache_handle = blob_cache_->Lookup(key, statistics_);
--- a/db/blob/blob_source_test.cc
+++ b/db/blob/blob_source_test.cc
@ -121,6 +121,8 @@ class BlobSourceTest : public DBTestBase {
    co.capacity = 8 << 20;
    co.num_shard_bits = 2;
    co.metadata_charge_policy = kDontChargeCacheMetadata;
+    co.high_pri_pool_ratio = 0.2;
+    co.low_pri_pool_ratio = 0.2;
    options_.blob_cache = NewLRUCache(co);
    options_.lowest_used_cache_tier = CacheTier::kVolatileTier;

@ -1042,6 +1044,8 @@ class BlobSecondaryCacheTest : public DBTestBase {
    lru_cache_ops_.num_shard_bits = 0;
    lru_cache_ops_.strict_capacity_limit = true;
    lru_cache_ops_.metadata_charge_policy = kDontChargeCacheMetadata;
+    lru_cache_ops_.high_pri_pool_ratio = 0.2;
+    lru_cache_ops_.low_pri_pool_ratio = 0.2;

    secondary_cache_opts_.capacity = 8 << 20;  // 8 MB
    secondary_cache_opts_.num_shard_bits = 0;
@ -1275,7 +1279,13 @@ class BlobSourceCacheReservationTest : public DBTestBase {
    co.capacity = kCacheCapacity;
    co.num_shard_bits = kNumShardBits;
    co.metadata_charge_policy = kDontChargeCacheMetadata;
+
+    co.high_pri_pool_ratio = 0.0;
+    co.low_pri_pool_ratio = 0.0;
    std::shared_ptr<Cache> blob_cache = NewLRUCache(co);
+
+    co.high_pri_pool_ratio = 0.5;
+    co.low_pri_pool_ratio = 0.5;
    std::shared_ptr<Cache> block_cache = NewLRUCache(co);

    options_.blob_cache = blob_cache;
--- a/db/db_block_cache_test.cc
+++ b/db/db_block_cache_test.cc
@ -819,8 +819,8 @@ class MockCache : public LRUCache {

  MockCache()
      : LRUCache((size_t)1 << 25 /*capacity*/, 0 /*num_shard_bits*/,
-                 false /*strict_capacity_limit*/, 0.0 /*high_pri_pool_ratio*/) {
-  }
+                 false /*strict_capacity_limit*/, 0.0 /*high_pri_pool_ratio*/,
+                 0.0 /*low_pri_pool_ratio*/) {}

  using ShardedCache::Insert;

--- a/db/db_test2.cc
+++ b/db/db_test2.cc
@ -651,8 +651,12 @@ TEST_F(DBTest2, SharedWriteBufferLimitAcrossDB) {
 TEST_F(DBTest2, TestWriteBufferNoLimitWithCache) {
  Options options = CurrentOptions();
  options.arena_block_size = 4096;
-  std::shared_ptr<Cache> cache =
-      NewLRUCache(LRUCacheOptions(10000000, 1, false, 0.0));
+  std::shared_ptr<Cache> cache = NewLRUCache(LRUCacheOptions(
+      10000000 /* capacity */, 1 /* num_shard_bits */,
+      false /* strict_capacity_limit */, 0.0 /* high_pri_pool_ratio */,
+      nullptr /* memory_allocator */, kDefaultToAdaptiveMutex,
+      kDontChargeCacheMetadata));
+
  options.write_buffer_size = 50000;  // this is never hit
  // Use a write buffer total size so that the soft limit is about
  // 105000.
--- a/include/rocksdb/cache.h
+++ b/include/rocksdb/cache.h
@ -72,6 +72,17 @@ struct LRUCacheOptions {
  // BlockBasedTableOptions::cache_index_and_filter_blocks_with_high_priority.
  double high_pri_pool_ratio = 0.5;

+  // Percentage of cache reserved for low priority entries.
+  // If greater than zero, the LRU list will be split into a high-pri list, a
+  // low-pri list and a bottom-pri list. High-pri entries will be inserted to
+  // the tail of high-pri list, while low-pri entries will be first inserted to
+  // the low-pri list (the midpoint) and bottom-pri entries will be first
+  // inserted to the bottom-pri list.
+  //
+  //
+  // See also high_pri_pool_ratio.
+  double low_pri_pool_ratio = 0.0;
+
  // If non-nullptr will use this allocator instead of system allocator when
  // allocating memory for cache blocks. Call this method before you start using
  // the cache!
@ -99,11 +110,13 @@ struct LRUCacheOptions {
                  std::shared_ptr<MemoryAllocator> _memory_allocator = nullptr,
                  bool _use_adaptive_mutex = kDefaultToAdaptiveMutex,
                  CacheMetadataChargePolicy _metadata_charge_policy =
-                      kDefaultCacheMetadataChargePolicy)
+                      kDefaultCacheMetadataChargePolicy,
+                  double _low_pri_pool_ratio = 0.0)
      : capacity(_capacity),
        num_shard_bits(_num_shard_bits),
        strict_capacity_limit(_strict_capacity_limit),
        high_pri_pool_ratio(_high_pri_pool_ratio),
+        low_pri_pool_ratio(_low_pri_pool_ratio),
        memory_allocator(std::move(_memory_allocator)),
        use_adaptive_mutex(_use_adaptive_mutex),
        metadata_charge_policy(_metadata_charge_policy) {}
@ -123,7 +136,8 @@ extern std::shared_ptr<Cache> NewLRUCache(
    std::shared_ptr<MemoryAllocator> memory_allocator = nullptr,
    bool use_adaptive_mutex = kDefaultToAdaptiveMutex,
    CacheMetadataChargePolicy metadata_charge_policy =
-        kDefaultCacheMetadataChargePolicy);
+        kDefaultCacheMetadataChargePolicy,
+    double low_pri_pool_ratio = 0.0);

 extern std::shared_ptr<Cache> NewLRUCache(const LRUCacheOptions& cache_opts);

@ -151,10 +165,11 @@ struct CompressedSecondaryCacheOptions : LRUCacheOptions {
      CacheMetadataChargePolicy _metadata_charge_policy =
          kDefaultCacheMetadataChargePolicy,
      CompressionType _compression_type = CompressionType::kLZ4Compression,
-      uint32_t _compress_format_version = 2)
+      uint32_t _compress_format_version = 2, double _low_pri_pool_ratio = 0.0)
      : LRUCacheOptions(_capacity, _num_shard_bits, _strict_capacity_limit,
                        _high_pri_pool_ratio, std::move(_memory_allocator),
-                        _use_adaptive_mutex, _metadata_charge_policy),
+                        _use_adaptive_mutex, _metadata_charge_policy,
+                        _low_pri_pool_ratio),
        compression_type(_compression_type),
        compress_format_version(_compress_format_version) {}
 };
@ -169,7 +184,7 @@ extern std::shared_ptr<SecondaryCache> NewCompressedSecondaryCache(
    CacheMetadataChargePolicy metadata_charge_policy =
        kDefaultCacheMetadataChargePolicy,
    CompressionType compression_type = CompressionType::kLZ4Compression,
-    uint32_t compress_format_version = 2);
+    uint32_t compress_format_version = 2, double low_pri_pool_ratio = 0.0);

 extern std::shared_ptr<SecondaryCache> NewCompressedSecondaryCache(
    const CompressedSecondaryCacheOptions& opts);
@ -196,7 +211,17 @@ class Cache {
 public:
  // Depending on implementation, cache entries with high priority could be less
  // likely to get evicted than low priority entries.
-  enum class Priority { HIGH, LOW };
+  //
+  // The BOTTOM priority is mainly used for blob caching. Blobs are typically
+  // lower-value targets for caching than data blocks, since 1) with BlobDB,
+  // data blocks containing blob references conceptually form an index structure
+  // which has to be consulted before we can read the blob value, and 2) cached
+  // blobs represent only a single key-value, while cached data blocks generally
+  // contain multiple KVs. To make it possible to use the same backing cache
+  // for both the block cache and the blob cache, blobs are inserted with this
+  // bottom priority level, so that data blocks are prioritized over them
+  // when both kinds of entries share a cache.
+  enum class Priority { HIGH, LOW, BOTTOM };

  // A set of callbacks to allow objects in the primary block cache to be
  // be persisted in a secondary cache. The purpose of the secondary cache
--- a/java/rocksjni/lru_cache.cc
+++ b/java/rocksjni/lru_cache.cc
@ -22,12 +22,16 @@ jlong Java_org_rocksdb_LRUCache_newLRUCache(JNIEnv* /*env*/, jclass /*jcls*/,
                                            jlong jcapacity,
                                            jint jnum_shard_bits,
                                            jboolean jstrict_capacity_limit,
-                                            jdouble jhigh_pri_pool_ratio) {
+                                            jdouble jhigh_pri_pool_ratio,
+                                            jdouble jlow_pri_pool_ratio) {
  auto* sptr_lru_cache = new std::shared_ptr<ROCKSDB_NAMESPACE::Cache>(
      ROCKSDB_NAMESPACE::NewLRUCache(
          static_cast<size_t>(jcapacity), static_cast<int>(jnum_shard_bits),
          static_cast<bool>(jstrict_capacity_limit),
-          static_cast<double>(jhigh_pri_pool_ratio)));
+          static_cast<double>(jhigh_pri_pool_ratio),
+          nullptr /* memory_allocator */, rocksdb::kDefaultToAdaptiveMutex,
+          rocksdb::kDontChargeCacheMetadata,
+          static_cast<double>(jlow_pri_pool_ratio)));
  return GET_CPLUSPLUS_POINTER(sptr_lru_cache);
 }

--- a/java/src/main/java/org/rocksdb/LRUCache.java
+++ b/java/src/main/java/org/rocksdb/LRUCache.java
@ -16,7 +16,7 @@ public class LRUCache extends Cache {
   * @param capacity The fixed size capacity of the cache
   */
  public LRUCache(final long capacity) {
-    this(capacity, -1, false, 0.0);
+    this(capacity, -1, false, 0.0, 0.0);
  }

  /**
@ -31,7 +31,7 @@ public class LRUCache extends Cache {
   *     by hash of the key
   */
  public LRUCache(final long capacity, final int numShardBits) {
-    super(newLRUCache(capacity, numShardBits, false,0.0));
+    super(newLRUCache(capacity, numShardBits, false, 0.0, 0.0));
  }

  /**
@ -49,7 +49,7 @@ public class LRUCache extends Cache {
   */
  public LRUCache(final long capacity, final int numShardBits,
                  final boolean strictCapacityLimit) {
-    super(newLRUCache(capacity, numShardBits, strictCapacityLimit,0.0));
+    super(newLRUCache(capacity, numShardBits, strictCapacityLimit, 0.0, 0.0));
  }

  /**
@ -69,14 +69,38 @@ public class LRUCache extends Cache {
   * @param highPriPoolRatio percentage of the cache reserves for high priority
   *     entries
   */
-  public LRUCache(final long capacity, final int numShardBits,
-      final boolean strictCapacityLimit, final double highPriPoolRatio) {
-    super(newLRUCache(capacity, numShardBits, strictCapacityLimit,
-        highPriPoolRatio));
+  public LRUCache(final long capacity, final int numShardBits, final boolean strictCapacityLimit,
+      final double highPriPoolRatio) {
+    super(newLRUCache(capacity, numShardBits, strictCapacityLimit, highPriPoolRatio, 0.0));
  }

-  private native static long newLRUCache(final long capacity,
-      final int numShardBits, final boolean strictCapacityLimit,
-      final double highPriPoolRatio);
+  /**
+   * Create a new cache with a fixed size capacity. The cache is sharded
+   * to 2^numShardBits shards, by hash of the key. The total capacity
+   * is divided and evenly assigned to each shard. If strictCapacityLimit
+   * is set, inserts into the cache will fail when the cache is full. Users can
+   * also set the percentage of the cache reserved for high priority entries and
+   * low priority entries via highPriPoolRatio and lowPriPoolRatio.
+   * numShardBits = -1 means it is automatically determined: every shard
+   * will be at least 512KB and number of shard bits will not exceed 6.
+   *
+   * @param capacity The fixed size capacity of the cache
+   * @param numShardBits The cache is sharded to 2^numShardBits shards,
+   *     by hash of the key
+   * @param strictCapacityLimit insert to the cache will fail when cache is full
+   * @param highPriPoolRatio percentage of the cache reserved for high priority
+   *     entries
+   * @param lowPriPoolRatio percentage of the cache reserved for low priority
+   *     entries
+   */
+  public LRUCache(final long capacity, final int numShardBits, final boolean strictCapacityLimit,
+      final double highPriPoolRatio, final double lowPriPoolRatio) {
+    super(newLRUCache(
+        capacity, numShardBits, strictCapacityLimit, highPriPoolRatio, lowPriPoolRatio));
+  }
+
+  private native static long newLRUCache(final long capacity, final int numShardBits,
+      final boolean strictCapacityLimit, final double highPriPoolRatio,
+      final double lowPriPoolRatio);
  @Override protected final native void disposeInternal(final long handle);
 }
--- a/java/src/test/java/org/rocksdb/LRUCacheTest.java
+++ b/java/src/test/java/org/rocksdb/LRUCacheTest.java
@ -20,9 +20,10 @@ public class LRUCacheTest {
    final long capacity = 80000000;
    final int numShardBits = 16;
    final boolean strictCapacityLimit = true;
-    final double highPriPoolRatio = 0.05;
-    try(final Cache lruCache = new LRUCache(capacity,
-        numShardBits, strictCapacityLimit, highPriPoolRatio)) {
+    final double highPriPoolRatio = 0.5;
+    final double lowPriPoolRatio = 0.5;
+    try (final Cache lruCache = new LRUCache(
+             capacity, numShardBits, strictCapacityLimit, highPriPoolRatio, lowPriPoolRatio)) {
      //no op
      assertThat(lruCache.getUsage()).isGreaterThanOrEqualTo(0);
      assertThat(lruCache.getPinnedUsage()).isGreaterThanOrEqualTo(0);
--- a/memory/memory_allocator_test.cc
+++ b/memory/memory_allocator_test.cc
@ -83,7 +83,7 @@ TEST_P(MemoryAllocatorTest, DatabaseBlockCache) {

  options.create_if_missing = true;
  BlockBasedTableOptions table_options;
-  auto cache = NewLRUCache(1024 * 1024, 6, false, false, allocator_);
+  auto cache = NewLRUCache(1024 * 1024, 6, false, 0.0, allocator_);
  table_options.block_cache = cache;
  options.table_factory.reset(NewBlockBasedTableFactory(table_options));
  DB* db = nullptr;
--- a/table/block_based/block_based_table_factory.cc
+++ b/table/block_based/block_based_table_factory.cc
@ -454,6 +454,7 @@ void BlockBasedTableFactory::InitializeOptions() {
    // It makes little sense to pay overhead for mid-point insertion while the
    // block size is only 8MB.
    co.high_pri_pool_ratio = 0.0;
+    co.low_pri_pool_ratio = 0.0;
    table_options_.block_cache = NewLRUCache(co);
  }
  if (table_options_.block_size_deviation < 0 ||
--- a/tools/benchmark.sh
+++ b/tools/benchmark.sh
@ -192,7 +192,7 @@ if [[ $cache_index_and_filter -eq 0 ]]; then
 elif [[ $cache_index_and_filter -eq 1 ]]; then
  cache_meta_flags="\
  --cache_index_and_filter_blocks=$cache_index_and_filter \
-  --cache_high_pri_pool_ratio=0.5"
+  --cache_high_pri_pool_ratio=0.5 --cache_low_pri_pool_ratio=0"
 else
  echo CACHE_INDEX_AND_FILTER_BLOCKS was $CACHE_INDEX_AND_FILTER_BLOCKS but must be 0 or 1
  exit $EXIT_INVALID_ARGS
--- a/tools/db_bench_tool.cc
+++ b/tools/db_bench_tool.cc
@ -570,6 +570,9 @@ DEFINE_double(cache_high_pri_pool_ratio, 0.0,
              "If > 0.0, we also enable "
              "cache_index_and_filter_blocks_with_high_priority.");

+DEFINE_double(cache_low_pri_pool_ratio, 0.0,
+              "Ratio of block cache reserved for low pri blocks.");
+
 DEFINE_string(cache_type, "lru_cache", "Type of block cache.");

 DEFINE_bool(use_compressed_secondary_cache, false,
@ -589,6 +592,9 @@ DEFINE_double(compressed_secondary_cache_high_pri_pool_ratio, 0.0,
              "If > 0.0, we also enable "
              "cache_index_and_filter_blocks_with_high_priority.");

+DEFINE_double(compressed_secondary_cache_low_pri_pool_ratio, 0.0,
+              "Ratio of block cache reserved for low pri blocks.");
+
 DEFINE_string(compressed_secondary_cache_compression_type, "lz4",
              "The compression algorithm to use for large "
              "values stored in CompressedSecondaryCache.");
@ -3022,11 +3028,12 @@ class Benchmark {
 #ifdef MEMKIND
          FLAGS_use_cache_memkind_kmem_allocator
              ? std::make_shared<MemkindKmemAllocator>()
-              : nullptr
+              : nullptr,
 #else
-          nullptr
+          nullptr,
 #endif
-      );
+          kDefaultToAdaptiveMutex, kDefaultCacheMetadataChargePolicy,
+          FLAGS_cache_low_pri_pool_ratio);
      if (FLAGS_use_cache_memkind_kmem_allocator) {
 #ifndef MEMKIND
        fprintf(stderr, "Memkind library is not linked with the binary.");
@ -3055,6 +3062,8 @@ class Benchmark {
            FLAGS_compressed_secondary_cache_numshardbits;
        secondary_cache_opts.high_pri_pool_ratio =
            FLAGS_compressed_secondary_cache_high_pri_pool_ratio;
+        secondary_cache_opts.low_pri_pool_ratio =
+            FLAGS_compressed_secondary_cache_low_pri_pool_ratio;
        secondary_cache_opts.compression_type =
            FLAGS_compressed_secondary_cache_compression_type_e;
        secondary_cache_opts.compress_format_version =
@ -4296,6 +4305,12 @@ class Benchmark {
        block_based_options.cache_index_and_filter_blocks_with_high_priority =
            true;
      }
+      if (FLAGS_cache_high_pri_pool_ratio + FLAGS_cache_low_pri_pool_ratio >
+          1.0) {
+        fprintf(stderr,
+                "Sum of high_pri_pool_ratio and low_pri_pool_ratio "
+                "cannot exceed 1.0.\n");
+      }
      block_based_options.block_cache = cache_;
      block_based_options.cache_usage_options.options_overrides.insert(
          {CacheEntryRole::kCompressionDictionaryBuildingBuffer,