benchmark

Commit Graph

Author	SHA1	Message	Date
dominic	9885aefb96	get rid of warnings in tests (#1562)	2023-03-06 14:47:54 +00:00
Roman Lebedev	4f8070590c	Console reporter: if statistic produces percents, format it as such (#1221)	2021-09-06 11:33:27 +03:00
Roman Lebedev	45b194e4d4	Introduce Coefficient of variation aggregate (#1220) * Introduce Coefficient of variation aggregate I believe, it is much more useful / use to understand, because it is already normalized by the mean, so it is not affected by the duration of the benchmark, unlike the standard deviation. Example of real-world output: ``` raw.pixls.us-unique/GoPro/HERO6 Black$ ~/rawspeed/build-old/src/utilities/rsbench/rsbench GOPR9172.GPR --benchmark_repetitions=27 --benchmark_display_aggregates_only=true --benchmark_counters_tabular=true 2021-09-03T18:05:56+03:00 Running /home/lebedevri/rawspeed/build-old/src/utilities/rsbench/rsbench Run on (32 X 3596.16 MHz CPU s) CPU Caches: L1 Data 32 KiB (x16) L1 Instruction 32 KiB (x16) L2 Unified 512 KiB (x16) L3 Unified 32768 KiB (x2) Load Average: 7.00, 2.99, 1.85 ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Benchmark Time CPU Iterations CPUTime,s CPUTime/WallTime Pixels Pixels/CPUTime Pixels/WallTime Raws/CPUTime Raws/WallTime WallTime,s ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ GOPR9172.GPR/threads:32/process_time/real_time_mean 11.1 ms 353 ms 27 0.353122 31.9473 12M 33.9879M 1085.84M 2.83232 90.4864 0.0110535 GOPR9172.GPR/threads:32/process_time/real_time_median 11.0 ms 352 ms 27 0.351696 31.9599 12M 34.1203M 1090.11M 2.84336 90.8425 0.0110081 GOPR9172.GPR/threads:32/process_time/real_time_stddev 0.159 ms 4.60 ms 27 4.59539m 0.0462064 0 426.371k 14.9631M 0.0355309 1.24692 158.944u GOPR9172.GPR/threads:32/process_time/real_time_cv 1.44 % 1.30 % 27 0.0130136 1.44633m 0 0.0125448 0.0137802 0.0125448 0.0137802 0.0143795 ``` Fixes https://github.com/google/benchmark/issues/1146 * Be consistent, it's CV, not 'rel std dev'	2021-09-03 18:44:10 +01:00
Roman Lebedev	12dc5eeafc	Statistics: add support for percentage unit in addition to time (#1219) * Statistics: add support for percentage unit in addition to time I think, `stddev` statistic is useful, but confusing. What does it mean if `stddev` of `1ms` is reported? Is that good or bad? If the `median` is `1s`, then that means that the measurements are pretty noise-less. And what if `stddev` of `100ms` is reported? If the `median` is `1s` - awful, if the `median` is `10s` - good. And hurray, there is just the statistic that we need: https://en.wikipedia.org/wiki/Coefficient_of_variation But, naturally, that produces a value in percents, but the statistics are currently hardcoded to produce time. So this refactors things a bit, and allows a percentage unit for statistics. I'm not sure whether or not `benchmark` would be okay with adding this `RSD` statistic by default, but regardless, that is a separate patch. Refs. https://github.com/google/benchmark/issues/1146 * Address review notes	2021-09-03 15:36:56 +01:00
Roman Lebedev	80a62618e8	Introduce per-family instance index (#1165) Much like it makes sense to enumerate all the families, it makes sense to enumerate stuff within families. Alternatively, we could have a global instance index, but i'm not sure why that would be better. This will be useful when the benchmarks are run not in order, for the tools to sort the results properly.	2021-06-02 23:45:41 +03:00
Roman Lebedev	4c2e32f1d0	Introduce "family index" field into JSON output (#1164) It may be useful for those wishing to further post-process JSON results, but it is mainly geared towards better support for run interleaving, where results from the same family may not be close-by in the JSON. While we won't be able to do much about that for outputs, the tools can and perhaps should reorder the results so that at least in their output they are in proper order, not run order. Note that this only counts the families that were filtered-in, so if e.g. there were three families, and we filtered-out the second one, the two families (which were first and third) will have family indexes 0 and 1.	2021-06-02 18:06:45 +03:00
Roman Lebedev	a54ef37aea	Ensure that we print repetition count even when it was specified via flag `--benchmark_repetitions=`	2021-06-02 12:34:00 +03:00
Eric Backus	32a1e39720	Bugfix/wsl selftest fixes. Fixes #839 (#843) * Update AUTHORS and CONTRIBUTORS * Fix WSL self-test failures Some of the benchmark self-tests expect and check for a particular output format from the benchmark library. The numerical values must not be infinity or not-a-number, or the test will report an error. Some of the values are computed bytes-per-second or items-per-second values, so these require that the measured CPU time for the test to be non-zero. But the loop that is being measured was empty, so the measured CPU time for the loop was extremely small. On systems like Windows Subsystem for Linux (WSL) the timer doesn't have enough resolution to measure this, so the measured CPU time was zero. This fix just makes sure that these tests have something within the timing loop, so that the benchmark library will not decide that the loop takes zero CPU time. This makes these tests more robust, and in particular makes them pass on WSL.	2019-07-27 19:02:31 +03:00
BaaMeow	478eafa36b	[JSON] add threads and repetitions to the json output (#748) * [JSON] add threads and repetitions to the json output, for better ide… [Tests] explicitly check for thread == 1 [Tests] specifically mark all repetition checks [JSON] add repetition_index reporting, but only for non-aggregates (i… * [Formatting] Be very, very explicit about pointer alignment so clang-format can not put pointers/references on the wrong side of arguments. [Benchmark::Run] Make sure to use explanatory sentinel variable rather than a magic number. * Do not pass redundant information	2019-03-26 09:53:07 +00:00
Roman Lebedev	58588476ce	Track two more details about runs - the aggregate name, and run name. (#675) This is related to @BaaMeow's work in https://github.com/google/benchmark/pull/616 but is not based on it. Two new fields are tracked, and dumped into JSON: * If the run is an aggregate, the aggregate's name is stored. It can be RMS, BigO, mean, median, stddev, or any custom stat name. * The aggregate-name-less run name is additionally stored. I.e. not some name of the benchmark function, but the actual name, but without the 'aggregate name' suffix. This way one can group/filter all the runs, and filter by the particular aggregate type. I might need this for further tooling improvement. Or maybe not. But this is certainly worthwhile for custom tooling.	2018-09-13 15:08:15 +03:00
Roman Lebedev	8688c5c4cf	Track 'type' of the run - is it an actual measurement, or an aggregate. (#658) This is only exposed in the JSON. Not in CSV, which is deprecated. This is only supposed to track these two states. An additional field could later track which aggregate this is, specifically (statistic name, rms, bigo, ...) The motivation is that we already have ReportAggregatesOnly, but it affects the entire reports, both the display, and the reporters (json files), which isn't ideal. It would be very useful to have a 'display aggregates only' option, both in the library's console reporter, and the python tooling. This will be especially needed for the 'store separate iterations'.	2018-08-28 18:11:36 +03:00
BaaMeow	4c2af07889	(clang-)format all the things (#610) * format all documents according to contributor guidelines and specifications use clang-format on/off to stop formatting when it makes excessively poor decisions * format all tests as well, and mark blocks which change too much	2018-06-01 11:14:19 +01:00
Eric	25acf220a4	Refactor most usages of KeepRunning to use the preferred ranged-for. (#459) Recently the library added a new ranged-for variant of the KeepRunning loop that is much faster. For this reason it should be preferred in all new code. Because a library, its documentation, and its tests should all embody the best practices of using the library, this patch changes all but a few usages of KeepRunning() into for (auto _ : state). The remaining usages in the tests and documentation persist only to document and test behavior that is different between the two formulations. Also note that because the range-for loop requires C++11, the KeepRunning variant has not been deprecated at this time.	2017-10-17 12:17:02 -06:00
Roman Lebedev	3347a20e0e	reporter_output_test: json: iterations is int, not float (#431) May be relevant for flakiness of win builds Noted by @KindDragon	2017-07-31 19:04:02 -06:00
Roman Lebedev	b9be142d1e	Json reporter: don't cast floating-point to int; adjust tooling (#426) * Json reporter: passthrough fp, don't cast it to int; adjust tooling Json output format is generally meant for further processing using some automated tools. Thus, it makes sense not to intentionally limit the precision of the values contained in the report. As it can be seen, FormatKV() for doubles, used %.2f format, which was meant to preserve at least some of the precision. However, before that function is ever called, the doubles were already cast to the integer via RoundDouble()... This is also the case for console reporter, where it makes sense because the screen space is limited, and this reporter, however the CSV reporter does output some decimal digits. Thus i can only conclude that the loss of the precision was not really considered, so i have decided to adjust the code of the json reporter to output the full fp precision. There can be several reasons why that is the right thing to do, the bigger the time_unit used, the greater the precision loss, so i'd say any sort of further processing (like e.g. tools/compare_bench.py does) is best done on the values with most precision. Also, that cast skewed the data away from zero, which i think may or may not result in false-positives/negatives in the output of tools/compare_bench.py * Json reporter: FormatKV(double): address review note * tools/gbench/report.py: skip benchmarks with different time units While it may be useful to teach it to operate on the measurements with different time units, which is now possible since floats are stored, and not the integers, but for now at least doing such a sanity-checking is better than providing misinformation.	2017-07-24 16:13:55 -07:00
Joao Paulo Magalhaes	020bac985b	Extend tabular counter tests to different counter sets.	2017-05-02 23:00:45 +01:00
Joao Paulo Magalhaes	2506044902	Add unit test for counter sets.	2017-05-02 22:14:49 +01:00
Joao Paulo Magalhaes	cf20dc967f	Add test for tabular output of rate counters.	2017-05-02 20:47:41 +01:00
Joao Paulo Magalhaes	c69b385c9c	Add first unit test for benchmark_tabular_counters.	2017-05-02 20:33:28 +01:00

19 Commits