benchmark

mirror of https://github.com/google/benchmark.git synced 2024-11-28 15:34:33 +00:00

Author	SHA1	Message	Date
Roman Lebedev	1b44120cd1	Un-deprecate [SG]et{Item,Byte}sProcessed, re-implement as custom counters. (#676 ) As discussed with @dominichamon and @dbabokin, sugar is nice. Well, maybe not for the health, but it's sweet. Alright, enough puns. A special care needs to be applied not to break csv reporter. UGH. We end up shedding some code over this. We no longer specially pretty-print them, they are printed just like the rest of custom counters. Fixes #627.	2018-09-13 22:03:47 +03:00
Roman Lebedev	58588476ce	Track two more details about runs - the aggregate name, and run name. (#675 ) This is related to @BaaMeow's work in https://github.com/google/benchmark/pull/616 but is not based on it. Two new fields are tracked, and dumped into JSON: * If the run is an aggregate, the aggregate's name is stored. It can be RMS, BigO, mean, median, stddev, or any custom stat name. * The aggregate-name-less run name is additionally stored. I.e. not some name of the benchmark function, but the actual name, but without the 'aggregate name' suffix. This way one can group/filter all the runs, and filter by the particular aggregate type. I might need this for further tooling improvement. Or maybe not. But this is certainly worthwhile for custom tooling.	2018-09-13 15:08:15 +03:00
Roman Lebedev	c614dfc0d4	Display aggregates only. (#665 ) There is a flag `d9cab612e4/src/benchmark.cc (L75-L78)` and a call `d9cab612e4/include/benchmark/benchmark.h (L837-L840)` But that affects everything, every reporter, destination: `d9cab612e4/src/benchmark.cc (L316)` It would be quite useful to have an ability to be more picky. More specifically, i would like to be able to only see the aggregates in the on-screen output, but for the file output to still contain everything. The former is useful in case of a lot of repetition (or even more so if every iteration is reported separately), while the latter is great for tooling. Fixes https://github.com/google/benchmark/issues/664	2018-09-12 16:26:17 +03:00
Roman Lebedev	f274c503e1	Backport LLVM's r341717 "Fix flags used to compile benchmark library with clang-cl" (#673 ) `MSVC` is true for clang-cl, but `"${CMAKE_CXX_COMPILER_ID}" STREQUAL "MSVC"` is false, so we would enable -Wall, which means -Weverything with clang-cl, and we get tons of undesired warnings. Use the simpler condition to fix things. Patch by: Reid Kleckner @rnk	2018-09-10 16:30:40 -04:00
Roman Lebedev	f0901417c8	GetCacheSizesMacOSX(): use consistent types. (#667 ) I have absolutely no way to test this, but this looks obviously-good. This was reported by Tim Northover @TNorthover in http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20180903/584223.html > I think this breaks some 32-bit configurations (well, mine at least). > I was using Clang (from Xcode 10 beta) on macOS and got a bunch of > errors referencing sysinfo.cc:292 and onwards: > /Users/tim/llvm/llvm-project/llvm/utils/benchmark/src/sysinfo.cc:292:47: > error: non-constant-expression cannot be narrowed from type > 'std::__1::array<unsigned long long, 4>::value_type' (aka 'unsigned > long long') to 'size_t' (aka 'unsigned long') in initializer list > [-Wc++11-narrowing] > } Cases[] = {{"hw.l1dcachesize", "Data", 1, CacheCounts[1]}, > ^~~~~~~~~~~~~~ > > The same happens when self-hosting ToT. Unfortunately I couldn't > reproduce the issue on Debian (Clang 6.0.1) even with libc++; I'm not > sure what the difference is.	2018-09-05 12:20:18 +01:00
Roman Lebedev	a7ed76ad78	Travis-ci: attempt to add 32-bit osx xcode build (#669 ) Maybe will show https://github.com/google/benchmark/pull/667 or maybe this is completely wrong.	2018-09-05 12:19:54 +01:00
Changming Sun	305ba313be	Pass name by const-reference instead of by value in class Statistics' constructor (#668 )	2018-09-04 23:46:40 +03:00
pseyfert	fbfc495d7f	add missing closing bracket in --help message (#666 )	2018-09-03 19:45:09 +03:00
Roman Lebedev	5159967520	Mark Set{Items,Bytes}Processed()/{items,bytes}_processed() as deprecated. (#654 ) They are basically proto-version of custom user counters. It does not seem that they do anything that custom user counters don't do. And having two similar entities is not good for generalization. Migration plan: * ``` SetItemsProcessed(<val>) => state.counters.insert({ {"<Name>", benchmark::Counter(<val>, benchmark::Counter::kIsRate)}, ... }); ``` * ``` SetBytesProcessed(<val>) => state.counters.insert({ {"<Name>", benchmark::Counter(<val>, benchmark::Counter::kIsRate, benchmark::Counter::OneK::kIs1024)}, ... }); ``` * ``` <Name>_processed() => state.counters["<Name>"] ``` One thing the custom user counters miss is better support for units of measurement. Refs. https://github.com/google/benchmark/issues/627	2018-08-30 11:59:50 +03:00
Roman Lebedev	caa2fcb19c	Counter(): add 'one thousand' param. (#657 ) * Counter(): add 'one thousand' param. Needed for https://github.com/google/benchmark/pull/654 Custom user counters are quite custom. It is not guaranteed that the user always expects for these to have 1k == 1000. If the counter represents bytes/memory/etc, 1k should be 1024. Some bikeshedding points: 1. Is this sufficient, or do we really want to go full on into custom types with names? I think just the '1000' is sufficient for now. 2. Should there be a helper benchmark::Counter::Counter{1000,1024}() static 'constructor' functions, since these two, by far, will be the most used? 3. In the future, we should be somehow encoding this info into JSON. * Counter(): use std::pair<> to represent 'one thousand' * Counter(): just use a new enum with two values 1000 vs 1024. Simpler is better. If someone comes up with a real reason to need something more advanced, it can be added later on. * Counter: just store the 1000 or 1024 in the One_K values directly * Counter: s/One_K/OneK/	2018-08-29 21:11:06 +03:00
Roman Lebedev	d9cab612e4	[NFC] s/console_reporter/display_reporter/ (#663 ) There are two destinations: * display (console, terminal) and * file. And each of the destinations can be populated with one of the reporters: * console - human-friendly table-like display * json * csv (deprecated) So using the name console_reporter is confusing. Is it talking about the console reporter in the sense of table-like reporter, or in the sense of display destination?	2018-08-29 14:58:54 +03:00
Michael "Croydon" Keck	a0018c3931	Ignore 32 bit build option when using MSVC (#638 )	2018-08-29 12:51:20 +01:00
Roman Lebedev	8688c5c4cf	Track 'type' of the run - is it an actual measurement, or an aggregate. (#658 ) This is only exposed in the JSON. Not in CSV, which is deprecated. This is only supposed to track these two states. An additional field could later track which aggregate this is, specifically (statistic name, rms, bigo, ...) The motivation is that we already have ReportAggregatesOnly, but it affects the entire reports, both the display, and the reporters (json files), which isn't ideal. It would be very useful to have a 'display aggregates only' option, both in the library's console reporter, and the python tooling. This will be especially needed for the 'store separate iterations'.	2018-08-28 18:11:36 +03:00
Roman Lebedev	9a179cb93f	[NFC] Prefix "report(_)?mode" with Aggregation. (#656 ) This only specifically represents handling of reporting of aggregates. Not of anything else. Making it more specific makes the name less generic. This is an issue because i want to add "iteration report mode", so the naming would be conflicting.	2018-08-28 17:19:25 +03:00
Bernhard M. Wiedemann	ede90ba6c8	Make tests pass on 1-core VMs (#653 ) found while working on reproducible builds for openSUSE To reproduce there osc checkout openSUSE:Factory/benchmark && cd $_ osc build -j1 --vm-type=kvm	2018-08-28 17:10:14 +03:00
BaaMeow	af441fc114	properly escape json names (#652 )	2018-08-16 09:47:09 -07:00
Roman Lebedev	94c4d6d5c6	[Tools] Drop compare_bench.py, compare.py is to be used, add U-test docs. (#645 ) As discussed in IRC, time to deduplicate.	2018-08-13 07:42:35 -07:00
Kirill Bobyrev	f85304e4e3	Remove redundant default which causes failures (#649 ) * Remove redundant default which causes failures * Fix old GCC warnings caused by poor analysis * Use __builtin_unreachable * Use BENCHMARK_UNREACHABLE() * Pull __has_builtin to benchmark.h too * Also move compiler identification macro to main header * Move custom compiler identification macro back	2018-08-08 14:39:57 +01:00
Dominic Hamon	d939634b8c	README improvements (#648 ) * Clarifications and cleaning of the core documentation.	2018-07-26 14:29:33 +01:00
Dominic Hamon	f965eab508	Memory management and reporting hooks (#625 ) * Introduce memory manager interface * Add memory stats to JSON reporter and a test * Add comments and switch json output test to int	2018-07-24 15:57:15 +01:00
Dominic Hamon	63e183b389	Add note to tools.md regarding scipy.	2018-07-23 12:08:20 +01:00
Federico Ficarelli	1f35fa4aa7	Update AUTHORS and CONTRIBUTORS (#632 ) Adding myself to AUTHORS and CONTRIBUTORS according to guidelines.	2018-07-09 12:47:16 +01:00
Federico Ficarelli	0c21bc369a	Fix build with Intel compiler (#631 ) * Set -Wno-deprecated-declarations for Intel Intel compiler silently ignores -Wno-deprecated-declarations so warning no. 1786 must be explicitly suppressed. * Make std::int64_t → double casts explicit While std::int64_t → double is a perfectly conformant implicit conversion, Intel compiler warns about it. Make them explicit via static_cast<double>. * Make std::int64_t → int casts explicit Intel compiler warns about emplacing an std::int64_t into an int container. Just make the conversion explicit via static_cast<int>. * Cleanup Intel -Wno-deprecated-declarations workaround logic	2018-07-09 11:45:10 +01:00
Federico Ficarelli	5946795e82	Disable Intel invalid offsetof warning (#629 )	2018-07-03 10:13:22 +01:00
Yoshinari Takaoka	847c006902	fixed Google Test (Primer) Documentation link (#628 )	2018-06-28 10:25:54 +01:00
Roman Lebedev	b123abdcf4	Add Iteration-related Counter::Flags. Fixes #618 (#621 ) Inspired by these [two](`a1ebe07bea`) [bugs](`0891555be5`) in my code due to the lack of those i have found fixed in my code: * `kIsIterationInvariant` - `* state.iterations()` The value is constant for every iteration, and needs to be multiplied by the iteration count. * `kAvgIterations` - `/ state.iterations()` The value is global over all the iterations, and needs to be divided by the iteration count. They play nice with `kIsRate`: * `kIsIterationInvariantRate` * `kAvgIterationsRate`. I'm not sure how meaningful they are when combined with `kAvgThreads`. I guess the `kIsThreadInvariant` can be added, too, for symmetry with `kAvgThreads`.	2018-06-27 15:45:30 +01:00
Dominic Hamon	d8584bda67	Use EXPECT_DOUBLE_EQ when comparing doubles in tests. (#624 ) * Use EXPECT_DOUBLE_EQ when comparing doubles in tests. Fixes #623 * disable 'float-equal' warning	2018-06-27 12:11:30 +01:00
Roman Lebedev	7d03f2df49	[Tooling] Enable U Test by default, add tooltip about repetition count. (#617 ) As previously discussed, let's flip the switch ^^. This exposes the problem that it will now be run for everyone, even if one did not read the help about the recommended repetition count. This is not good. So i think we can do the smart thing: ``` $ ./compare.py benchmarks gbench/Inputs/test3_run{0,1}.json Comparing gbench/Inputs/test3_run0.json to gbench/Inputs/test3_run1.json Benchmark Time CPU Time Old Time New CPU Old CPU New -------------------------------------------------------------------------------------------------------- BM_One -0.1000 +0.1000 10 9 100 110 BM_Two +0.1111 -0.0111 9 10 90 89 BM_Two +0.2500 +0.1125 8 10 80 89 BM_Two_pvalue 0.2207 0.6831 U Test, Repetitions: 2. WARNING: Results unreliable! 9+ repetitions recommended. BM_Two_stat +0.0000 +0.0000 8 8 80 80 ``` (old screenshot) ![image](https://user-images.githubusercontent.com/88600/41502182-ea25d872-71bc-11e8-9842-8aa049509b14.png) Or, in the good case (noise omitted): ``` s$ ./compare.py benchmarks /tmp/run{0,1}.json Comparing /tmp/run0.json to /tmp/run1.json Benchmark Time CPU Time Old Time New CPU Old CPU New --------------------------------------------------------------------------------------------------------------------------------- <99 more rows like this> ./_T012014.RW2/threads:8/real_time +0.0160 +0.0596 46 47 10 10 ./_T012014.RW2/threads:8/real_time_pvalue 0.0000 0.0000 U Test, Repetitions: 100 ./_T012014.RW2/threads:8/real_time_mean +0.0094 +0.0609 46 47 10 10 ./_T012014.RW2/threads:8/real_time_median +0.0104 +0.0613 46 46 10 10 ./_T012014.RW2/threads:8/real_time_stddev -0.1160 -0.1807 1 1 0 0 ``` (old screenshot) ![image](https://user-images.githubusercontent.com/88600/41502185-fb8193f4-71bc-11e8-85fa-cbba83e39db4.png)	2018-06-18 12:58:16 +01:00
Dominic Hamon	151ead6242	Disable deprecation warnings when -Werror is enabled. (#609 ) Fixes #608	2018-06-07 12:54:14 +01:00
Marat Dukhan	505be96ab2	Avoid using CMake 3.6 feature list(FILTER ...) (#612 ) list(FILTER ...) is a CMake 3.6 feature, but benchmark targets CMake 2.8.12	2018-06-06 12:32:42 +01:00
Sergiu Deitsch	1301f53e31	cmake: use numeric version in package config (#611 )	2018-06-05 15:01:44 +01:00
Marat Dukhan	7fb3c564e5	Fix compilation on Android with GNU STL (#596 ) * Fix compilation on Android with GNU STL GNU STL in Android NDK lacks string conversion functions from C++11, including std::stoul, std::stoi, and std::stod. This patch reimplements these functions in benchmark:: namespace using C-style equivalents from C++03. * Avoid use of log2 which doesn't exist in Android GNU STL GNU STL in Android NDK lacks log2 function from C99/C++11. This patch replaces their use in the code with double log(double) function.	2018-06-05 11:36:26 +01:00
BaaMeow	4c2af07889	(clang-)format all the things (#610 ) * format all documents according to contributor guidelines and specifications use clang-format on/off to stop formatting when it makes excessively poor decisions * format all tests as well, and mark blocks which change too much	2018-06-01 11:14:19 +01:00
Dominic Hamon	4fbfa2f336	Some platforms and environments don't pass a valid argc/argv. (#607 ) Specifically some iOS targets.	2018-05-30 13:17:41 +01:00
Dominic Hamon	d07372e64b	clang-format run on the benchmark header (#606 )	2018-05-29 14:12:51 +01:00
Eric	7b8d0249d8	Deprecate CSVReporter - A first step to overhauling reporting. (#488 ) As @dominichamon and I have discussed, the current reporter interface is poor at best. And something should be done to fix it. I strongly suspect such a fix will require an entire reimagining of the API, and therefore breaking backwards compatibility fully. For that reason we should start deprecating and removing parts that we don't intend to replace. One of these parts, I argue, is the CSVReporter. I propose that the new reporter interface should choose a single output format (JSON) and traffic entirely in that. If somebody really wanted to replace the functionality of the CSVReporter they would do so as an external tool which transforms the JSON. For these reasons I propose deprecating the CSVReporter.	2018-05-29 13:25:32 +01:00
Dominic Hamon	16703ff83c	cleaner and slightly larger statistics tests (#604 )	2018-05-29 13:13:06 +01:00
Dominic Hamon	c8adf4531f	Add some 'travis_wait' commands to avoid gcc@7 installation timeouts. (#605 )	2018-05-29 13:12:48 +01:00
Roman Lebedev	a6a1b0d765	Benchmarking is hard. Making sense of the benchmarking results is even harder. (#593 ) The first problem you have to solve yourself. The second one can be aided. The benchmark library can compute some statistics over the repetitions, which helps with grasping the results somewhat. But that is only for the one set of results. It does not really help to compare the two benchmark results, which is the interesting bit. Thankfully, there are these bundled `tools/compare.py` and `tools/compare_bench.py` scripts. They can provide a diff between two benchmarking results. Yay! Except not really, it's just a diff, while it is very informative and better than nothing, it does not really help answer The Question - am i just looking at the noise? It's like not having these per-benchmark statistics... Roughly, we can formulate the question as: > Are these two benchmarks the same? > Did my change actually change anything, or is the difference below the noise level? Well, this really sounds like a [null hypothesis](https://en.wikipedia.org/wiki/Null_hypothesis), does it not? So maybe we can use statistics here, and solve all our problems? lol, no, it won't solve all the problems. But maybe it will act as a tool, to better understand the output, just like the usual statistics on the repetitions... I'm making an assumption here that most of the people care about the change of average value, not the standard deviation. Thus i believe we can use T-Test, be it either [Student's t-test](https://en.wikipedia.org/wiki/Student%27s_t-test), or [Welch's t-test](https://en.wikipedia.org/wiki/Welch%27s_t-test). EDIT: however, after @dominichamon review, it was decided that it is better to use more robust [Mann–Whitney U test](https://en.wikipedia.org/wiki/Mann–Whitney_U_test) I'm using [scipy.stats.mannwhitneyu](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.mannwhitneyu.html#scipy.stats.mannwhitneyu). 
There are two new user-facing knobs: ``` $ ./compare.py --help usage: compare.py [-h] [-u] [--alpha UTEST_ALPHA] {benchmarks,filters,benchmarksfiltered} ... versatile benchmark output compare tool <...> optional arguments: -h, --help show this help message and exit -u, --utest Do a two-tailed Mann-Whitney U test with the null hypothesis that it is equally likely that a randomly selected value from one sample will be less than or greater than a randomly selected value from a second sample. WARNING: requires LARGE (9 or more) number of repetitions to be meaningful! --alpha UTEST_ALPHA significance level alpha. if the calculated p-value is below this value, then the result is said to be statistically significant and the null hypothesis is rejected. (default: 0.0500) ``` Example output: ![screenshot_20180512_175517](https://user-images.githubusercontent.com/88600/39958581-ae897924-560d-11e8-81b9-806db6c3e691.png) As you can guess, the alpha does not affect anything but the coloring of the computed p-values. If it is green, then the change in the average values is statistically-significant. I'm detecting the repetitions by matching name. This way, no changes to the json are _needed_. Caveats: * This won't work if the json is not in the same order as outputted by the benchmark, or if the parsing does not retain the ordering. * This won't work if after the grouped repetitions there isn't at least one row with different name (e.g. statistic). Since there isn't a knob to disable printing of statistics (only the other way around), i'm not too worried about this. * The results will be wrong if the repetition count is different between the two benchmarks being compared. * Even though i have added (hopefully full) test coverage, the code of these python tools is starting to look a bit jumbled. * So far i have added this only to the `tools/compare.py`. Should i add it to `tools/compare_bench.py` too? Or should we deduplicate them (by removing the latter one)?	2018-05-29 11:13:28 +01:00
Dominic Hamon	ec0f69c28e	Update README.md	2018-05-29 10:36:54 +01:00
Alex Strelnikov	e776aa0275	Add benchmark_main target. (#601 ) * Add benchmark_main library with support for Bazel. * fix newline at end of file * Add CMake support for benchmark_main. * Mention optionally using benchmark_main in README.	2018-05-25 11:18:58 +01:00
mattreecebentley	d7aed73677	Corrections, additions to initial doc (#600 ) * Correct/clarify build/install instructions GTest is google test, don't obfuscate needlessly for newcomers. Adding google test into installation guide helps newcomers. Third option under this line: "Note that Google Benchmark requires Google Test to build and run the tests. This dependency can be provided three ways:" Was not true (did not occur). If there is a further option that needs to be specified in order for that functionality to work it needs to be specified. * Add prerequisite knowledge section A lot of assumptions are made about the reader in the documentation. This is unfortunate. * Removal of abbreviations for google test	2018-05-24 10:50:35 +01:00
Samuel Panzer	ce3fde16cb	Return 0 from State::iterations() when not yet started. (#598 ) * Return a reasonable value from State::iterations() even before starting a benchmark * Optimize State::iterations() for started case.	2018-05-24 10:33:19 +01:00
Deniz Evrenci	6d74c0625b	split_list is not defined for assembly tests (#595 ) * Update AUTHORS and CONTRIBUTORS * split_list is not defined for assembly tests	2018-05-14 15:02:49 +01:00
Nan Xiao	e90801ae47	Remove unnecessary memset functions. (#591 )	2018-05-09 10:31:24 +01:00
Roman Lebedev	718cc91d00	[Tools] Fix a few python3-compatibility issues (#585 )	2018-05-08 11:34:31 +01:00
Roman Lebedev	e8ddd907bb	There is no "FATAL" in message(), only "FATAL_ERROR" (#584 )	2018-05-08 11:33:37 +01:00
php1ic	16af64500f	Run git from the source directory (#589 ) (#590 ) Git was being executed in the current directory, so could not get the latest tag if cmake was run from a build directory. Force git to be run from within the source directory.	2018-05-08 11:29:09 +01:00
Sam Clegg	8986839e4a	Use __EMSCRIPTEN__ (rather than EMSCRIPTEN) to check for emscripten (#583 ) The old EMSCRIPTEN macro is deprecated and not enabled when EMCC_STRICT is set. Also fix a typo in EMSCRIPTN (not sure how this ever worked).	2018-05-03 09:34:26 +01:00
Nan Xiao	ea5551e7b3	Porting into OpenBSD (#582 )	2018-05-02 11:26:43 +01:00
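
Commit a6a1b0d765 above describes a two-tailed Mann-Whitney U test over benchmark repetitions and says `tools/compare.py` relies on `scipy.stats.mannwhitneyu` for it. The sketch below is not that implementation. It is a minimal standard-library illustration of the idea, assuming a plain normal approximation with no tie or continuity correction, so its p-values will generally differ from SciPy's. The function name `mann_whitney_u` and the sample timings are hypothetical.

```python
import math


def mann_whitney_u(old, new):
    """Return (U, two-sided p-value) for two samples of repetition timings.

    Assumption: the p-value uses a plain normal approximation without tie
    or continuity correction; this is an illustration only.
    """
    n1, n2 = len(old), len(new)
    # Count pairs where the old timing exceeds the new one; ties count half.
    u1 = sum(1.0 if a > b else 0.5 if a == b else 0.0
             for a in old for b in new)
    u2 = n1 * n2 - u1
    u = min(u1, u2)
    # Mean and standard deviation of U under the null hypothesis.
    mean = n1 * n2 / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean) / sd
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return u, p


# Hypothetical timings: nine repetitions each, matching the "9+ repetitions"
# recommendation quoted in the commit messages.
old_times = [10, 11, 12, 13, 14, 15, 16, 17, 18]
new_times = [20, 21, 22, 23, 24, 25, 26, 27, 28]
u_stat, p_value = mann_whitney_u(old_times, new_times)
alpha = 0.05  # same default significance level as the --alpha knob
print(u_stat, p_value, p_value < alpha)
```

A p-value below `alpha` rejects the null hypothesis that the two runs are the same, which is the case compare.py colors green according to the commit message.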


903 commits