rocksdb

mirror of https://github.com/facebook/rocksdb.git synced 2024-11-29 09:36:17 +00:00

Author	SHA1	Message	Date
Guido Tagliavini Ponce	57a0e2f304	Clock cache (#10273 ) Summary: This is the initial step in the development of a lock-free clock cache. This PR includes the base hash table design (which we mostly ported over from FastLRUCache) and the clock eviction algorithm. Importantly, it's still _not_ lock-free---all operations use a shard lock. Besides the locking, there are other features left as future work: - Remove keys from the handles. Instead, use 128-bit bijective hashes of them for handle comparisons, probing (we need two 32-bit hashes of the key for double hashing) and sharding (we need one 6-bit hash). - Remove the clock_usage_ field, which is updated on every lookup. Even if it were atomically updated, it could cause memory invalidations across cores. - Middle insertions into the clock list. - A test that exercises the clock eviction policy. - Update the Java API of ClockCache and Java calls to C++. Along the way, we improved the code and comments quality of FastLRUCache. These changes are relatively minor. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10273 Test Plan: ``make -j24 check`` Reviewed By: pdillinger Differential Revision: D37522461 Pulled By: guidotag fbshipit-source-id: 3d70b737dbb70dcf662f00cef8c609750f083943	2022-06-29 21:50:39 -07:00
Guido Tagliavini Ponce	c6055cba30	Calculate table size of FastLRUCache more accurately (#10235 ) Summary: Calculate the required size of the hash table in FastLRUCache more accurately. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10235 Test Plan: ``make -j24 check`` Reviewed By: gitbw95 Differential Revision: D37460546 Pulled By: guidotag fbshipit-source-id: 7945128d6f002832f8ed922ef0151919f4350854	2022-06-27 21:04:59 -07:00
Guido Tagliavini Ponce	3afed7408c	Replace per-shard chained hash tables with open-addressing scheme (#10194 ) Summary: In FastLRUCache, we replace the current chained per-shard hash table by an open-addressing hash table. In particular, this allows us to preallocate all handles. Because all handles are preallocated, this implementation doesn't support strict_capacity_limit = false (i.e., allowing insertions beyond the predefined capacity). This clashes with current assumptions of some tests, namely two tests in cache_test and the crash tests. We have disabled these for now. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10194 Test Plan: ``make -j24 check`` Reviewed By: pdillinger Differential Revision: D37296770 Pulled By: guidotag fbshipit-source-id: 232ff1b8260331d868ebf4e3e5d8ad709390b0ad	2022-06-21 08:45:04 -07:00
Peter Dillinger	1aac814578	Use optimized folly DistributedMutex in LRUCache when available (#10179 ) Summary: folly DistributedMutex is faster than standard mutexes though imposes some static obligations on usage. See https://github.com/facebook/folly/blob/main/folly/synchronization/DistributedMutex.h for details. Here we use this alternative for our Cache implementations (especially LRUCache) for better locking performance, when RocksDB is compiled with folly. Also added information about which distributed mutex implementation is being used to cache_bench output and to DB LOG. Intended follow-up: * Use DMutex in more places, perhaps improving API to support non-scoped locking * Fix linking with fbcode compiler (needs ROCKSDB_NO_FBCODE=1 currently) Credit: Thanks Siying for reminding me about this line of work that was previously left unfinished. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10179 Test Plan: for correctness, existing tests. CircleCI config updated. Also Meta-internal buck build updated. For performance, ran simultaneous before & after cache_bench. Out of three comparison runs, the middle improvement to ops/sec was +21%: Baseline: USE_CLANG=1 DEBUG_LEVEL=0 make -j24 cache_bench (fbcode compiler) ``` Complete in 20.201 s; Rough parallel ops/sec = 1584062 Thread ops/sec = 107176 Operation latency (ns): Count: 32000000 Average: 9257.9421 StdDev: 122412.04 Min: 134 Median: 3623.0493 Max: 56918500 Percentiles: P50: 3623.05 P75: 10288.02 P99: 30219.35 P99.9: 683522.04 P99.99: 7302791.63 ``` New: (add USE_FOLLY=1) ``` Complete in 16.674 s; Rough parallel ops/sec = 1919135 (+21%) Thread ops/sec = 135487 Operation latency (ns): Count: 32000000 Average: 7304.9294 StdDev: 108530.28 Min: 132 Median: 3777.6012 Max: 91030902 Percentiles: P50: 3777.60 P75: 10169.89 P99: 24504.51 P99.9: 59721.59 P99.99: 1861151.83 ``` Reviewed By: anand1976 Differential Revision: D37182983 Pulled By: pdillinger fbshipit-source-id: a17eb05f25b832b6a2c1356f5c657e831a5af8d1	2022-06-17 13:08:45 -07:00
Guido Tagliavini Ponce	f105e1a501	Make the per-shard hash table fixed-size. (#10154 ) Summary: We make the size of the per-shard hash table fixed. The base level of the hash table is now preallocated with the required capacity. The user must provide an estimate of the size of the values. Notice that even though the base level becomes fixed, the chains are still dynamic. Overall, the shard capacity mechanisms haven't changed, so we don't need to test this. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10154 Test Plan: `make -j24 check` Reviewed By: pdillinger Differential Revision: D37124451 Pulled By: guidotag fbshipit-source-id: cba6ac76052fe0ec60b8ff4211b3de7650e80d0c	2022-06-13 20:29:00 -07:00
Guido Tagliavini Ponce	415200d792	Assume fixed size key (#10137 ) Summary: FastLRUCache now only supports 16B keys. The tests have changed to reflect this. Because the unit tests were designed for caches that accept any string as keys, some tests are no longer compatible with FastLRUCache. We have disabled those for runs with FastLRUCache. (We could potentially change all tests to use 16B keys, but we don't because the cache public API does not require this.) Pull Request resolved: https://github.com/facebook/rocksdb/pull/10137 Test Plan: make -j24 check Reviewed By: gitbw95 Differential Revision: D37083934 Pulled By: guidotag fbshipit-source-id: be1719cf5f8364a9a32bc4555bce1a0de3833b0d	2022-06-10 19:12:18 -07:00
sdong	c78a87cd71	Avoid malloc_usable_size() call inside LRU Cache mutex (#10026 ) Summary: In LRU Cache mutex, we sometimes call malloc_usable_size() to calculate memory used by the metadata object. We prevent it by saving the charge + metadata size, rather than charge, inside the metadata itself. Within the mutex, usually only total charge is needed so we don't need to repeat. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10026 Test Plan: Run existing tests. Reviewed By: pdillinger Differential Revision: D36556253 fbshipit-source-id: f60c96d13cde3af77732e5548e4eac4182fa9801	2022-05-24 13:31:16 -07:00
Peter Dillinger	bb87164db3	Fork and simplify LRUCache for developing enhancements (#9917 ) Summary: To support a project to prototype and evaluate algorithmic enhancments and alternatives to LRUCache, here I have separated out LRUCache into internal-only "FastLRUCache" and cut it down to essentials, so that details like secondary cache handling and priorities do not interfere with prototyping. These can be re-integrated later as needed, along with refactoring to minimize code duplication (which would slow down prototyping for now). Pull Request resolved: https://github.com/facebook/rocksdb/pull/9917 Test Plan: unit tests updated to ensure basic functionality has (likely) been preserved Reviewed By: anand1976 Differential Revision: D35995554 Pulled By: pdillinger fbshipit-source-id: d67b20b7ada3b5d3bfe56d897a73885894a1d9db	2022-05-03 12:32:02 -07:00

8 commits