Commit graph

3 commits

Author SHA1 Message Date
akankshamahajan 749b179c04 Remove reallocation of AlignedBuffer in direct_io sync reads if already aligned (#11600)
Summary:
Remove reallocation of AlignedBuffer in direct_io sync reads in RandomAccessFileReader::Read if buffer passed is already aligned.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11600

Test Plan:
Setup: `TEST_TMPDIR=./tmp-db/ ./db_bench -benchmarks=filluniquerandom -disable_auto_compactions=true -target_file_size_base=1048576 -write_buffer_size=1048576 -compression_type=none`
Benchmark: `TEST_TMPDIR=./tmp-db/ perf record ./db_bench --cache_size=8388608 --use_existing_db=true --disable_auto_compactions=true --benchmarks=seekrandom --use_direct_reads=true -use_direct_io_for_flush_and_compaction=true -reads=1000 -seek_nexts=1 -max_auto_readahead_size=131072 -initial_auto_readahead_size=16384 -adaptive_readahead=true -num_file_reads_for_auto_readahead=0`

Perf profile-
Before:
```
8.73% db_bench libc.so.6 [.] __memmove_evex_unaligned_erms
3.34% db_bench [kernel.vmlinux] [k] filemap_get_read_batch
```

After:
```
2.50% db_bench [kernel.vmlinux] [k] filemap_get_read_batch
2.29% db_bench libc.so.6 [.] __memmove_evex_unaligned_erms
```

`make  crash_test -j `with direct_io enabled completed succesfully locally.

Ran few benchmarks with direct_io from seek_nexts varying between 912 to 327680 and different readahead_size parameters and it showed no regression so far.

Reviewed By: ajkr

Differential Revision: D47478598

Pulled By: akankshamahajan15

fbshipit-source-id: 6a48e21cb34696f5d09c22a6311a3a1cb5f9cf33
2023-07-14 20:08:05 -07:00
Peter Dillinger b1b6f87fbe Some small improvements to HyperClockCache (#11601)
Summary:
Stacked on https://github.com/facebook/rocksdb/issues/11572
* Minimize use of std::function and lambdas to minimize chances of
compiler heap-allocating closures (unnecessary stress on allocator). It
appears that converting FindSlot to a template enables inlining the
lambda parameters, avoiding heap allocations.
* Clean up some logic with FindSlot (FIXMEs from https://github.com/facebook/rocksdb/issues/11572)
* Fix handling of rare case of probing all slots, with new unit test.
(Previously Insert would not roll back displacements in that case, which
would kill performance if it were to happen.)
* Add an -early_exit option to cache_bench for gathering memory stats
before deallocation.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11601

Test Plan:
unit test added for probing all slots

## Seeing heap allocations
Run `MALLOC_CONF="stats_print:true" ./cache_bench -cache_type=hyper_clock_cache`
before https://github.com/facebook/rocksdb/issues/11572 vs. after this change. Before, we see this in the
interesting bin statistics:

```
size  nrequests
----  ---------
  32     578460
  64      24340
8192     578460
```
And after:
```
size  nrequests
----  ---------
  32  (insignificant)
  64      24370
8192     579130
```

## Performance test
Build with `make USE_CLANG=1 PORTABLE=0 DEBUG_LEVEL=0 -j32 cache_bench`

Run `./cache_bench -cache_type=hyper_clock_cache -ops_per_thread=5000000`
in before and after configurations, simultaneously:

```
Before: Complete in 33.244 s; Rough parallel ops/sec = 2406442
After:  Complete in 32.773 s; Rough parallel ops/sec = 2441019
```

Reviewed By: jowlyzhang

Differential Revision: D47375092

Pulled By: pdillinger

fbshipit-source-id: 46f0f57257ddb374290a0a38c651764ea60ba410
2023-07-14 16:19:22 -07:00
Peter Dillinger 7a9b264f36 Some fixes to unreleased_history/ (#11504)
Summary:
* Add a "Performance Improvements" section
* Add required copyright headers

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11504

Test Plan: manual

Reviewed By: hx235

Differential Revision: D46405128

Pulled By: pdillinger

fbshipit-source-id: 4f878dfd0170d381d3051a44c13479c860e812c0
2023-06-02 15:55:02 -07:00