11 Commits

Author SHA1 Message Date
Jesse Rosenstock e441a8cb11
perf-counters: Make tests pass on Android (#1653)
* perf_counters_gtest: Make test pass on Android

Tested on Pixel 3 and Pixel 6.  Reduce test to the intersection of
what passes on all platforms.

Pixel 6 doesn't support BRANCHES, and only supports two perf
counters.


---------

Co-authored-by: dominic <510002+dmah42@users.noreply.github.com>
2023-08-21 15:04:50 +01:00
Andy Christiansen 4931aefb51
Fix broken PFM-enabled tests (#1623)
* Add pfm CI actions for bazel

* Fix problems in unit test.

* Undo enabling the CI tests for pfm - github CI machines seemingly do not support performance counters.

* Remove commented code - can be revisited in github history when needed, and there's a comment explaining the rationale behind the new test code.

---------

Co-authored-by: Andy Christiansen <achristiansen@google.com>
Co-authored-by: dominic <510002+dmah42@users.noreply.github.com>
2023-07-07 09:58:16 +01:00
Vy Nguyen fec77322b4
Fix code triggering -Wsign-conversion (#1596)
* Fix code triggering -Wsign-conversion

* more test
2023-05-11 08:40:05 +01:00
Henrique Bucher fbc6efa9b5
Refactoring of PerfCounters infrastructure (#1559)
* Refactoring of PerfCounters infrastructure

The main feature in this pull request is the removal of the static
sharing of PerfCounters and instead creating them at the top
`RunBenchmarks()`  function where all benchmark runners are created. A
single PerfCountersMeasurement object is created and then shared with
all the new BenchmarkRunners objects, one per existing benchmark.
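
As a rough illustration of the ownership change described above (one measurement object created at the top and handed to every runner, instead of a static shared instance), here is a minimal sketch. `Measurement`, `Runner` and `MakeRunners` are hypothetical stand-ins invented for this example; they are not the library's `PerfCountersMeasurement` or `BenchmarkRunner` types.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the object that owns the perf counters.
struct Measurement {
  std::vector<std::string> counter_names;
};

// Hypothetical stand-in for a per-benchmark runner. It borrows the shared
// measurement passed in by the caller rather than using a static instance.
class Runner {
 public:
  Runner(std::string benchmark_name, Measurement* shared)
      : benchmark_name_(std::move(benchmark_name)), shared_(shared) {}

  const std::string& name() const { return benchmark_name_; }
  const Measurement* measurement() const { return shared_; }

 private:
  std::string benchmark_name_;
  Measurement* shared_;  // Not owned; must outlive the runner.
};

// Creates one runner per benchmark name, all pointing at the same
// measurement object.
std::vector<Runner> MakeRunners(const std::vector<std::string>& benchmark_names,
                                Measurement* shared) {
  assert(shared != nullptr);
  std::vector<Runner> runners;
  runners.reserve(benchmark_names.size());
  for (const std::string& name : benchmark_names) {
    runners.emplace_back(name, shared);
  }
  return runners;
}
```

The caller keeps the single `Measurement` alive for as long as the runners exist, which is the design trade-off of sharing by pointer rather than through a static.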

Other features conflated here in this PR are:
- Added BENCHMARK_DONT_OPTIMIZE macro in global scope
- Removal of the `IsValid()` query, being replaced by checking the
  number of remaining counters after validity tests
- Refactoring of all GTests to reflect the changes and new semantics
- extra comments throughout the new code to clarify intent

It was extremely hard to separate all those features into different PRs
as requested, since they are so interdependent, so I'm just
pushing them all together and asking for forgiveness.

This PR comes replacing PRs 1555 and 1558 which have been closed.

* Fixed whitespace issue with clang-format

My clang-format insists on deleting this single white space on line 601
while Github's clang format breaks when it is added. I had to disable
format-on-save to check-in this revert change.
I'm using clang 14.0.6.
2023-03-07 10:27:52 +00:00
Henrique Bucher 2d5012275a
Filter performance counter names, not invalidate all (#1554)
* Filter performance counter names, not invalidate all

Currently, the performance counters are validated while they
are being created and one failure returns NoCounters(), i.e. it
effectively invalidates all the counters.

I would like to propose a new behavior: filter instead. If an
invalid name is added to the counter list, or if that particular
counter is not supported on this platform, that counter is dropped
from the list and an error message is created, while all the
other counters remain active.

This will give testers peace of mind that if one mistake is made
or if something is changed or removed from libpfm, their entire
test will not be invalidated. This feature gives more tolerance
with respect to versioning.

Another positive is that testers can now input a superset of all
desired counters for all platforms they support and just let
Benchmark drop all those that are not supported, although it will
create quite a lot of noise down the line, in which case perhaps
we should drop silently or make a consolidated, single error line
but this was not implemented in this change set.
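
The filtering behavior proposed above can be sketched as follows. This is only an illustration under stated assumptions: `FilterCounterNames` and its `is_supported` predicate are hypothetical names made up for this example, and the sketch does not query libpfm or reproduce the actual change.

```cpp
#include <cassert>
#include <functional>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical helper, not the library's API: keeps every requested counter
// name that `is_supported` accepts and prints one error line per dropped
// name, instead of rejecting the whole list on the first failure.
std::vector<std::string> FilterCounterNames(
    const std::vector<std::string>& requested,
    const std::function<bool(const std::string&)>& is_supported) {
  assert(static_cast<bool>(is_supported));
  std::vector<std::string> kept;
  for (const std::string& name : requested) {
    if (is_supported(name)) {
      kept.push_back(name);
    } else {
      std::cerr << "Dropping unsupported performance counter: " << name
                << "\n";
    }
  }
  return kept;
}
```

With a predicate that rejects a single name, the remaining names come back in their original order, so one bad entry no longer invalidates the rest.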

* Removed unused helper type.
2023-03-02 14:56:13 +00:00
Henrique Bucher 27c1d8ace9
Implement unlimited number of performance counters (#1552)
* Implement unlimited number of performance counters

Linux performance counters will limit the number of hardware
counters per reading group. For that reason the implementation of
PerfCounters is limited to 3. However if only software counters
are added, there is no reason to limit the counters. For hardware
counters, we create multiple groups and store a vector of leaders
in the PerfCounters object. When reading, some extra time is spent
iterating through all the group leaders. However, performance
should be the same as today. Reading is done by
groups and it had to be heavily adjusted with the logic being
moved to PerfCounterValues. I created a test for x86-64 and took
care of filtering out the events in case it runs in a platform
that does not support those counters - the test will not fail. The
current tests were already failing (ReOpenExistingCounters,
CreateExistingMeasurements and MultiThreaded) on the main branch
and they continue to fail after this implementation - I did not
fix those, so as not to conflate everything here.
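
The grouping idea described above can be sketched like this: counter names are split into consecutive groups of bounded size, and the first entry of each group plays the role of the group leader. `GroupCounters` and `max_group_size` are hypothetical names for this illustration; opening the events through the kernel's perf events interface and reading them back is deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits `names` into consecutive groups holding at
// most `max_group_size` entries each. In this sketch the first element of
// every group stands in for that group's leader.
std::vector<std::vector<std::string>> GroupCounters(
    const std::vector<std::string>& names, std::size_t max_group_size) {
  assert(max_group_size > 0);
  std::vector<std::vector<std::string>> groups;
  for (const std::string& name : names) {
    // Start a new group when there is none yet or the last one is full.
    if (groups.empty() || groups.back().size() == max_group_size) {
      groups.emplace_back();
    }
    groups.back().push_back(name);
  }
  return groups;
}
```

Reading would then iterate over the groups, which is the extra per-leader loop the message mentions.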

* Moved the PerfCounterValues::Read() implementation from header to body.

* Added missing implementation of PerfCounters::IsCounterSupported when HAVE_LIBPFM is not defined.

* Changed comments to reflect the implementation

* Removed arg name so it does not generate an error when HAVE_LIBPFM is not defined.

* Made loop counter a const reference for clang-tidy

* Added missing BENCHMARK_EXPORT to PerfCounterValues
2023-03-01 15:30:41 +00:00
Liqiang TAO d0fbf8ac23
Cache PerfCounters instance in PerfCountersMeasurement (#1308)
This patch fixes #1306, by reducing the pinned instances of
PerfCounters.

The issue is caused by creating multiple pinned events in the
same thread; doing so results in Snapshot(PerfCounterValues* values)
failing, and that's now discoverable.
Creating multiple pinned events is currently unsupported behavior.
The error would be detected at read() time, not at
perf_event_open() / ioctl() time.

The unsupported behavior above was confirmed by Stephane Eranian @seranian,
who also pointed out the detection method.

Finished this patch under the guidance of Mircea Trofin @mtrofin.
2022-01-25 10:14:20 +00:00
Dominic Hamon c07a498924
format tests with clang-format (#1282)
2021-11-10 16:22:31 +00:00
Dominic Hamon 6a5bf081d3
prefix macros to avoid clashes (#1186)
2021-06-24 18:21:59 +01:00
Mircea Trofin e539e807da
[PFM] Extend perf counter support to multi-threaded cases. (#1153)
* Extend perf counter support to multi-threaded cases.

* Docs update

* const-ed Snapshot
2021-05-19 09:49:05 +01:00
Mircea Trofin 376ebc2635
Support optional, user-directed collection of performance counters (#1114)
* Support optional, user-directed collection of performance counters

The patch helps an engineer wishing to drill into the root causes
of a regression, for example. Currently, only single-threaded runs
are supported. The feature is a build-time opt-in, and then a runtime
opt-in.

The engineer may run the benchmark executable, passing a list of
performance counter names (using libpfm's naming scheme) at the
command line. The counter values will then be collected and reported
back as UserCounters.

This is different from #240 in that it is a benchmark user opt-in, and
the counter collection is transparent to the benchmark.

Currently, this is only supported on platforms where libpfm is
supported.

libpfm: http://perfmon2.sourceforge.net/

* 'Use' values param in Snapshot when BENCHMARK_OS_WINDOWS

This is to avoid unused parameter warning-as-error

* Added missing include for <vector> in perf_counters.cc

* Moved doc to docs

* Added license blurbs
2021-04-28 09:25:29 +01:00