benchmark

mirror of https://github.com/google/benchmark.git synced 2024-11-29 18:34:47 +00:00

Author	SHA1	Message	Date
Victor Costan	95a1435b81	Fix compilation error with GCC on OSX (issue #490 ). (#491 )	2017-11-30 08:05:38 -08:00
Roman Lebedev	c45f01866b	CMake: implement LTO for clang. Fixes #478 (#487 ) * CMake: implement LTO for clang. Fixes #478 * LTO: add basic docs about required executables.	2017-11-29 12:48:43 -08:00
Kishan Kumar	eae42212ce	Added the installation guide for Ubuntu (#489 ) * Initial CLA Requirement * Added Installation steps to the Readme.md * Fixed error in running benchmark of Installation * Remove unwanted commands Removed the lengthy install procedure with suggested install mechanism	2017-11-29 09:36:19 -08:00
Roman Lebedev	ec5684ed75	Console reporter: properly account for the length of custom counter names (#484 ) Old output example: ``` Benchmark Time CPU Iterations CPUTime,s Pixels/s ThreadingFactor ------------------------------------------------------------------------------------------------------------------------------ 20170525_0036TEST.RAF/threads:8/real_time 45 ms 45 ms 16 0.718738 79.6277M/s 0.999978 2.41419GB/s 22.2613 items/s FileSize,MB=111.050781; MPix=57.231360 ``` New output example: ``` Benchmark Time CPU Iterations CPUTime,s Pixels/s ThreadingFactor ------------------------------------------------------------------------------------------------------------------------------ 20170525_0036TEST.RAF/threads:8/real_time 45 ms 45 ms 16 0.713575 80.1713M/s 0.999571 2.43067GB/s 22.4133 items/s FileSize,MB=111.050781; MPix=57.231360 ```	2017-11-27 09:01:01 -08:00
Eric Fiselier	2ec7399cf1	Improve BENCHMARK_UNREACHABLE() implementation. This patch primarily changes the BENCHMARK_UNREACHABLE() implementation under MSVC to use __assume(false) instead of being a NORETURN function, which ironically caused unreachable code warnings. Second, since the NOTHROW function attempt generated the warnings we meant to avoid, it has been replaced with a dummy null statement.	2017-11-26 13:58:24 -07:00
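The idea in commit 2ec7399cf1 can be sketched roughly as below. This is an illustration only, not the library's definition: `SKETCH_UNREACHABLE`, `TimeUnit`, and `UnitSuffix` are hypothetical names, and the `std::abort()` fallback is an assumption of this sketch. Only `__assume(false)` (MSVC) and `__builtin_unreachable()` (GCC/Clang) are real compiler intrinsics.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical stand-in for an "unreachable" marker; not the library's macro.
#if defined(_MSC_VER)
#define SKETCH_UNREACHABLE() __assume(false)
#elif defined(__GNUC__) || defined(__clang__)
#define SKETCH_UNREACHABLE() __builtin_unreachable()
#else
// Assumed fallback for other compilers: terminate instead of hinting.
#define SKETCH_UNREACHABLE() std::abort()
#endif

// Hypothetical enum used only to give the switch something to cover.
enum class TimeUnit { kNanosecond, kMicrosecond, kMillisecond };

// Every enumerator returns inside the switch, so the code after it
// is documented (and hinted to the compiler) as unreachable.
inline std::string UnitSuffix(TimeUnit unit) {
  switch (unit) {
    case TimeUnit::kNanosecond:
      return "ns";
    case TimeUnit::kMicrosecond:
      return "us";
    case TimeUnit::kMillisecond:
      return "ms";
  }
  SKETCH_UNREACHABLE();
}
```

Using a compiler hint rather than a call to a no-return function avoids introducing a statement that the compiler itself may flag as unreachable.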
Eric	11dc36822b	Improve CPU Cache info reporting -- Add Windows support. (#486 ) * Improve CPU Cache info reporting -- Add Windows support. This patch does a couple of things regarding CPU Cache reporting. First, it adds an implementation on Windows. Second, it fixes the JSONReporter to correctly (and actually) output the CPU configuration information. And finally, third, it detects and reports the number of physical CPUs that share the same cache.	2017-11-26 13:33:01 -07:00
Eric	27e0b439cf	Refactor System information collection -- Add CPU Cache Info (#483 ) * Refactor System information collection. This patch refactors the system information collection, and in particular information about the target CPU. The motivation is to make it easier to access CPU information, and easier to add new information as need be. This patch additionally adds information about the cache sizes of the CPU. * Address review comments: Clean up integer types. This commit cleans up the integer types used in ValueUnion to follow the Google style guide. Additionally it adds a BENCHMARK_UNREACHABLE macro to assist in documenting/catching unreachable code paths. * Rename ValueUnion accessors.	2017-11-22 08:33:52 -08:00
Kamil Rytarowski	aad6a5fa76	Add NetBSD support (#482 ) Define BENCHMARK_OS_NETBSD for NetBSD. Add detection of cpuinfo_cycles_per_second and cpuinfo_num_cpus. This code shared detection of these properties with FreeBSD.	2017-11-17 08:46:08 -08:00
Steinar H. Gunderson	0c3ec998c4	Add a pkg-config file, for the benefit of projects not using CMake. (#480 )	2017-11-15 11:51:22 -08:00
Dominic Hamon	ed5764ea28	Add doc specifying the scope of the timing calculation Fixes #479	2017-11-13 09:20:12 -08:00
Roman Lebedev	5e66248b44	[Tools] A new, more versatile benchmark output compare tool (#474 ) * [Tools] A new, more versatile benchmark output compare tool Sometimes, there is more than one implementation of some functionality. And the obvious use-case is to benchmark them, which is better? Currently, there is no easy way to compare the benchmarking results in that case: The obvious solution is to have multiple binaries, each one containing/running one implementation. And each binary must use exactly the same benchmark family name, which is super bad, because now the binary name should contain all the info about benchmark family... What if i tell you that is not the solution? What if we could avoid producing one binary per benchmark family, with the same family name used in each binary, but instead could keep all the related families in one binary, with their proper names, AND still be able to compare them? There are three modes of operation: 1. Just compare two benchmarks, what `compare_bench.py` did: ``` $ ../tools/compare.py benchmarks ./a.out ./a.out RUNNING: ./a.out --benchmark_out=/tmp/tmprBT5nW Run on (8 X 4000 MHz CPU s) 2017-11-07 21:16:44 ------------------------------------------------------ Benchmark Time CPU Iterations ------------------------------------------------------ BM_memcpy/8 36 ns 36 ns 19101577 211.669MB/s BM_memcpy/64 76 ns 76 ns 9412571 800.199MB/s BM_memcpy/512 84 ns 84 ns 8249070 5.64771GB/s BM_memcpy/1024 116 ns 116 ns 6181763 8.19505GB/s BM_memcpy/8192 643 ns 643 ns 1062855 11.8636GB/s BM_copy/8 222 ns 222 ns 3137987 34.3772MB/s BM_copy/64 1608 ns 1608 ns 432758 37.9501MB/s BM_copy/512 12589 ns 12589 ns 54806 38.7867MB/s BM_copy/1024 25169 ns 25169 ns 27713 38.8003MB/s BM_copy/8192 201165 ns 201112 ns 3486 38.8466MB/s RUNNING: ./a.out --benchmark_out=/tmp/tmpt1wwG_ Run on (8 X 4000 MHz CPU s) 2017-11-07 21:16:53 ------------------------------------------------------ Benchmark Time CPU Iterations 
------------------------------------------------------ BM_memcpy/8 36 ns 36 ns 19397903 211.255MB/s BM_memcpy/64 73 ns 73 ns 9691174 839.635MB/s BM_memcpy/512 85 ns 85 ns 8312329 5.60101GB/s BM_memcpy/1024 118 ns 118 ns 6438774 8.11608GB/s BM_memcpy/8192 656 ns 656 ns 1068644 11.6277GB/s BM_copy/8 223 ns 223 ns 3146977 34.2338MB/s BM_copy/64 1611 ns 1611 ns 435340 37.8751MB/s BM_copy/512 12622 ns 12622 ns 54818 38.6844MB/s BM_copy/1024 25257 ns 25239 ns 27779 38.6927MB/s BM_copy/8192 205013 ns 205010 ns 3479 38.108MB/s Comparing ./a.out to ./a.out Benchmark Time CPU Time Old Time New CPU Old CPU New ------------------------------------------------------------------------------------------------------ BM_memcpy/8 +0.0020 +0.0020 36 36 36 36 BM_memcpy/64 -0.0468 -0.0470 76 73 76 73 BM_memcpy/512 +0.0081 +0.0083 84 85 84 85 BM_memcpy/1024 +0.0098 +0.0097 116 118 116 118 BM_memcpy/8192 +0.0200 +0.0203 643 656 643 656 BM_copy/8 +0.0046 +0.0042 222 223 222 223 BM_copy/64 +0.0020 +0.0020 1608 1611 1608 1611 BM_copy/512 +0.0027 +0.0026 12589 12622 12589 12622 BM_copy/1024 +0.0035 +0.0028 25169 25257 25169 25239 BM_copy/8192 +0.0191 +0.0194 201165 205013 201112 205010 ``` 2. 
Compare two different filters of one benchmark: (for simplicity, the benchmark is executed twice) ``` $ ../tools/compare.py filters ./a.out BM_memcpy BM_copy RUNNING: ./a.out --benchmark_filter=BM_memcpy --benchmark_out=/tmp/tmpBWKk0k Run on (8 X 4000 MHz CPU s) 2017-11-07 21:37:28 ------------------------------------------------------ Benchmark Time CPU Iterations ------------------------------------------------------ BM_memcpy/8 36 ns 36 ns 17891491 211.215MB/s BM_memcpy/64 74 ns 74 ns 9400999 825.646MB/s BM_memcpy/512 87 ns 87 ns 8027453 5.46126GB/s BM_memcpy/1024 111 ns 111 ns 6116853 8.5648GB/s BM_memcpy/8192 657 ns 656 ns 1064679 11.6247GB/s RUNNING: ./a.out --benchmark_filter=BM_copy --benchmark_out=/tmp/tmpAvWcOM Run on (8 X 4000 MHz CPU s) 2017-11-07 21:37:33 ---------------------------------------------------- Benchmark Time CPU Iterations ---------------------------------------------------- BM_copy/8 227 ns 227 ns 3038700 33.6264MB/s BM_copy/64 1640 ns 1640 ns 426893 37.2154MB/s BM_copy/512 12804 ns 12801 ns 55417 38.1444MB/s BM_copy/1024 25409 ns 25407 ns 27516 38.4365MB/s BM_copy/8192 202986 ns 202990 ns 3454 38.4871MB/s Comparing BM_memcpy to BM_copy (from ./a.out) Benchmark Time CPU Time Old Time New CPU Old CPU New -------------------------------------------------------------------------------------------------------------------- [BM_memcpy vs. BM_copy]/8 +5.2829 +5.2812 36 227 36 227 [BM_memcpy vs. BM_copy]/64 +21.1719 +21.1856 74 1640 74 1640 [BM_memcpy vs. BM_copy]/512 +145.6487 +145.6097 87 12804 87 12801 [BM_memcpy vs. BM_copy]/1024 +227.1860 +227.1776 111 25409 111 25407 [BM_memcpy vs. BM_copy]/8192 +308.1664 +308.2898 657 202986 656 202990 ``` 3. 
Compare filter one from benchmark one to filter two from benchmark two: (for simplicity, the benchmark is executed twice) ``` $ ../tools/compare.py benchmarksfiltered ./a.out BM_memcpy ./a.out BM_copy RUNNING: ./a.out --benchmark_filter=BM_memcpy --benchmark_out=/tmp/tmp_FvbYg Run on (8 X 4000 MHz CPU s) 2017-11-07 21:38:27 ------------------------------------------------------ Benchmark Time CPU Iterations ------------------------------------------------------ BM_memcpy/8 37 ns 37 ns 18953482 204.118MB/s BM_memcpy/64 74 ns 74 ns 9206578 828.245MB/s BM_memcpy/512 91 ns 91 ns 8086195 5.25476GB/s BM_memcpy/1024 120 ns 120 ns 5804513 7.95662GB/s BM_memcpy/8192 664 ns 664 ns 1028363 11.4948GB/s RUNNING: ./a.out --benchmark_filter=BM_copy --benchmark_out=/tmp/tmpDfL5iE Run on (8 X 4000 MHz CPU s) 2017-11-07 21:38:32 ---------------------------------------------------- Benchmark Time CPU Iterations ---------------------------------------------------- BM_copy/8 230 ns 230 ns 2985909 33.1161MB/s BM_copy/64 1654 ns 1653 ns 419408 36.9137MB/s BM_copy/512 13122 ns 13120 ns 53403 37.2156MB/s BM_copy/1024 26679 ns 26666 ns 26575 36.6218MB/s BM_copy/8192 215068 ns 215053 ns 3221 36.3283MB/s Comparing BM_memcpy (from ./a.out) to BM_copy (from ./a.out) Benchmark Time CPU Time Old Time New CPU Old CPU New -------------------------------------------------------------------------------------------------------------------- [BM_memcpy vs. BM_copy]/8 +5.1649 +5.1637 37 230 37 230 [BM_memcpy vs. BM_copy]/64 +21.4352 +21.4374 74 1654 74 1653 [BM_memcpy vs. BM_copy]/512 +143.6022 +143.5865 91 13122 91 13120 [BM_memcpy vs. BM_copy]/1024 +221.5903 +221.4790 120 26679 120 26666 [BM_memcpy vs. BM_copy]/8192 +322.9059 +323.0096 664 215068 664 215053 ``` * [Docs] Document tools/compare.py * [docs] Document how the change is calculated	2017-11-07 13:35:25 -08:00
Dominic Hamon	90aa8665b5	Reorder inline to avoid warning on MSVC (#469 ) Fixes #467	2017-11-07 10:33:07 -08:00
Dominic Hamon	f4009ef8e3	Fix #476 . Explicit coercion of size_t to boolean (#477 )	2017-11-07 10:30:17 -08:00
Eric	72a4581caf	Fix #382 - MinGW often reports negative CPU times. (#475 ) When stopping a timer, the start time is subtracted from the current time. However, when the times are identical, or sufficiently close together, the subtraction can result in a negative number. For some reason MinGW is the only platform where this problem manifests. I suspect it's due to MinGW-specific behavior in either the CPU timing code, floating point model, or printf formatting. Either way, the fix for MinGW should be correct across all platforms.	2017-11-07 09:44:39 -08:00
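One way to guard against the negative result described in commit 72a4581caf is to clamp the difference at zero. The commit message does not say how the patch itself does it, so the sketch below is an assumption, and `ElapsedSeconds` is a hypothetical helper name.

```cpp
#include <algorithm>

// Hypothetical helper: elapsed time between two clock readings, in seconds.
// Clamping at zero is an assumed strategy, not necessarily the patch's.
inline double ElapsedSeconds(double start, double stop) {
  // If the two readings are identical or out of order, report zero
  // instead of a (meaningless) negative duration.
  return std::max(stop - start, 0.0);
}
```

A clamp like this is harmless on platforms that never produce out-of-order readings, which fits the remark that the fix should be correct everywhere.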
Dominic Hamon	f65c6d9a2c	Remove deprecated headers (#473 )	2017-11-06 08:53:23 -08:00
Dominic Hamon	1e52560157	Add releasing doc (#472 )	2017-11-03 12:45:16 -07:00
Roman Lebedev	336bb8db98	Update AUTHORS/CONTRIBUTORS (#471 ) As requested, in a pr form :)	2017-11-03 10:00:29 -07:00
Stefan Sauer	4463a60efe	Mention how to disable CPU frequency scaling while running the benchmark. (#466 ) Describe how to use the cpupower command to disable CPU frequency scaling. Document this, since there are other ways that don't seem to have the same effect. See #325	2017-11-02 08:34:06 -07:00
Leo Koppel	fa341e51cb	Improve BM_SetInsert example (#465 ) * Fix BM_SetInsert example Move declaration of `std::set<int> data` outside the timing loop, so that the destructor is not timed. * Speed up BM_SetInsert test Since the time taken to ConstructRandomSet() is so large compared to the time to insert one element, but only the latter is used to determine number of iterations, this benchmark now takes an extremely long time to run in benchmark_test. Speed it up two ways: - Increase the Ranges() parameters - Cache ConstructRandomSet() result (it's not random anyway), and do only O(N) copy every iteration * Fix same issue in BM_MapLookup test * Make BM_SetInsert test consistent with README - Use the same Ranges everywhere, but increase the 2nd range - Change order of Args() calls in README to more closely match the result of Ranges - Don't cache ConstructRandomSet, since it doesn't make sense in README - Get a smaller optimization inside it, by giving a hint to insert()	2017-10-31 11:00:39 -07:00
Yangqing Jia	491360b833	Add option to install benchmark (#463 ) * Add option to install benchmark * Change to BENCHMARK_ENABLE_INSTALL per @dominichamon	2017-10-20 13:49:37 -07:00
Eric	25acf220a4	Refactor most usages of KeepRunning to use the preferred ranged-for. (#459 ) Recently the library added a new ranged-for variant of the KeepRunning loop that is much faster. For this reason it should be preferred in all new code. Because a library, its documentation, and its tests should all embody the best practices of using the library, this patch changes all but a few usages of KeepRunning() into for (auto _ : state). The remaining usages in the tests and documentation persist only to document and test behavior that is different between the two formulations. Also note that because the range-for loop requires C++11, the KeepRunning variant has not been deprecated at this time.	2017-10-17 12:17:02 -06:00
Eric Fiselier	22fd1a556e	Fix and document SkipWithError(...) using ranged-for loop.	2017-10-17 10:24:13 -06:00
Eric	a37fc0c48a	Improve KeepRunning loop performance to be similar to the range-based for. (#460 ) This patch improves the performance of the KeepRunning loop in two ways: (A) it removes the dependency on the max_iterations variable, preventing it from being loaded every iteration. (B) it loops to zero, instead of to an upper bound. This allows a single decrement instruction to be used instead of an arithmetic op followed by a comparison.	2017-10-17 08:40:44 -07:00
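Point (B) of commit a37fc0c48a, counting down to zero rather than up to a bound, can be illustrated with a toy loop. `CountdownState` and `CountIterations` are hypothetical names; this is not the library's `State` class and it omits timing entirely.

```cpp
#include <cstdint>

// Hypothetical miniature of a KeepRunning-style loop.
class CountdownState {
 public:
  explicit CountdownState(std::uint64_t max_iterations)
      : remaining_(max_iterations) {}

  // Returns true while iterations remain. Because the counter runs down
  // to zero, the check needs only the counter itself and no separately
  // stored upper bound.
  bool KeepRunning() {
    if (remaining_ == 0) return false;
    --remaining_;
    return true;
  }

 private:
  std::uint64_t remaining_;
};

// Runs an empty loop body and reports how many times it executed.
inline std::uint64_t CountIterations(std::uint64_t n) {
  CountdownState state(n);
  std::uint64_t executed = 0;
  while (state.KeepRunning()) ++executed;
  return executed;
}
```

Whether a compiler actually emits a single decrement-and-branch for this depends on the target and optimization level; the sketch only shows the loop shape.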
Fred Tingaud	2fc2ea0e45	Correct typo in sample code for range-based for loop. (#458 )	2017-10-16 09:17:17 -07:00
Raúl Marín	cacd321808	Avoid implicit float to double conversion (#457 ) Triggered by -Werror=double-promotion	2017-10-13 09:17:02 -07:00
Eric	0526755944	Add C++11 Ranged For loop alternative to KeepRunning (#454 ) * Add C++11 Ranged For loop alternative to KeepRunning As pointed out by @astrelni and @dominichamon, the KeepRunning loop requires a bunch of memory loads and stores every iteration, which affects the measurements. The main reason for these additional loads and stores is that the State object is passed in by reference, making its contents externally visible memory, and the compiler doesn't know it hasn't been changed by non-visible code. It's also possible the large size of the State struct is hindering optimizations. This patch allows the `State` object to be iterated over using a range-based for loop. Example: void BM_Foo(benchmark::State& state) { for (auto _ : state) { [...] } } This formulation is much more efficient, because the variable counting the loop index is stored in the iterator produced by `State::begin()`, which itself is stored in function-local memory and therefore not accessible by code outside of the function. Therefore the compiler knows the iterator hasn't been changed every iteration. This initial patch and idea was from Alex Strelnikov. * Fix null pointer initialization in C++03	2017-10-10 08:56:42 -07:00
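The mechanism described in commit 0526755944 (a loop counter living in a function-local iterator) can be sketched with a stand-alone toy type. `MiniState`, its nested `Iterator` and `Value`, and `RunLoop` are hypothetical names for illustration; the real `benchmark::State` also starts and stops timers, which is left out here.

```cpp
#include <cstdint>

// Hypothetical miniature of an iterable benchmark state.
class MiniState {
 public:
  struct Value {};  // what `auto _` binds to; carries no data

  class Iterator {
   public:
    explicit Iterator(std::uint64_t remaining) : remaining_(remaining) {}
    Value operator*() const { return Value(); }
    Iterator& operator++() {
      --remaining_;  // the counter lives in the iterator, not in MiniState
      return *this;
    }
    bool operator!=(const Iterator& other) const {
      return remaining_ != other.remaining_;
    }

   private:
    std::uint64_t remaining_;
  };

  explicit MiniState(std::uint64_t iterations) : iterations_(iterations) {}
  Iterator begin() const { return Iterator(iterations_); }
  Iterator end() const { return Iterator(0); }

 private:
  std::uint64_t iterations_;
};

// Mirrors the `for (auto _ : state)` shape from the commit message.
inline std::uint64_t RunLoop(std::uint64_t n) {
  MiniState state(n);
  std::uint64_t executed = 0;
  for (auto _ : state) {
    (void)_;  // silence unused-variable warnings
    ++executed;
  }
  return executed;
}
```

Because the iterator returned by `begin()` is a local object of the calling function, an optimizer can keep its counter in a register without assuming outside code modified it.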
mwinterb	f3cd636f18	Always use inline asm DoNotOptimize with clang. (#452 ) * Always use inline asm DoNotOptimize with clang. clang-cl masquerades as MSVC but not GCC, so it was using the MSVC-compatible definitions of DoNotOptimize and ClobberMemory. Presumably, it's better in general to use the targeted assembly for this functionality (the codegen is different), but the specific issue is that clang-cl deprecates the usage of _ReadWriteBarrier, and this gets rid of that warning. * triggering another AppVeyor run	2017-10-10 00:19:01 +02:00
Anton Lashkov	819adb4cd1	Add macros for create benchmark with templated fixture (#451 ) * Add macros for create benchmark with templated fixture * Add info about templated fixtures to README.md * Add tests for templated fixtures	2017-10-09 21:10:37 +02:00
Dominic Hamon	2409cb2eb1	Minor move of code to cleanup up namespace spaghetti a bit	2017-10-09 12:01:30 -07:00
Dominic Hamon	a96ff121b3	Alphabets are hard. AUTHORS version. #448	2017-09-27 11:53:16 -07:00
Dominic Hamon	5d47e9878f	Alphabets are hard. CONTRIBUTORS version. #448	2017-09-27 11:52:47 -07:00
Dominic Hamon	8792dff1c9	Remove myself from AUTHORS Covered by Google Inc here and i'm in CONTRIBUTORS	2017-09-27 20:01:49 +02:00
Dominic Hamon	359120be78	Order CONTRIBUTORS Fixes #448	2017-09-27 20:01:10 +02:00
Dominic Hamon	84a54ae9f4	Organize AUTHORS Part of #448	2017-09-27 20:00:12 +02:00
Eric	6d8339dd97	Fix #444 - Use BENCHMARK_HAS_CXX11 over __cplusplus. (#446 ) * Fix #444 - Use BENCHMARK_HAS_CXX11 over __cplusplus. MSVC incorrectly defines __cplusplus to report C++03, despite the compiler actually providing C++11 or greater. Therefore we have to detect C++11 differently for MSVC. This patch uses `_MSVC_LANG` which has been defined since Visual Studio 2015 Update 3; which should be sufficient for detecting C++11. Secondly this patch changes over most usages of __cplusplus >= 201103L to check BENCHMARK_HAS_CXX11 instead. * remove redundant comment	2017-09-14 15:50:33 -06:00
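The detection strategy from commit 6d8339dd97 might look roughly like the following. The macro names `SKETCH_CXX_VERSION` and `SKETCH_HAS_CXX11` and the function `HasCxx11` are hypothetical, and the library's own `BENCHMARK_HAS_CXX11` definition may be written differently; only `_MSVC_LANG` and `__cplusplus` are real predefined macros.

```cpp
// Prefer _MSVC_LANG where it exists, since MSVC's __cplusplus may
// report an old value regardless of the language level in use.
#if defined(_MSVC_LANG)
#define SKETCH_CXX_VERSION _MSVC_LANG
#else
#define SKETCH_CXX_VERSION __cplusplus
#endif

// Hypothetical feature macro, always defined to 0 or 1.
#if SKETCH_CXX_VERSION >= 201103L
#define SKETCH_HAS_CXX11 1
#else
#define SKETCH_HAS_CXX11 0
#endif

// Small runtime-visible wrapper so the result can be checked in a test.
inline bool HasCxx11() { return SKETCH_HAS_CXX11 != 0; }
```

Centralizing the check in one macro means call sites do not each need to know about the MSVC special case.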
Disconnect3d	2a05f248be	Improve README's basic usage example (#433 )	2017-09-14 09:31:35 +02:00
Andre Schroeder	24b8042733	Fix Markdown typos in readme. (#445 )	2017-09-13 15:42:45 -06:00
Roman Lebedev	886585a3b7	[RFC] Tools: compare-bench.py: print change% with two decimal digits (#440 ) * Tools: compare-bench.py: print change% with two decimal digits Here is a comparison of before vs. after: ```diff -Benchmark Time CPU Time Old Time New CPU Old CPU New ---------------------------------------------------------------------------------------------------------- -BM_SameTimes +0.00 +0.00 10 10 10 10 -BM_2xFaster -0.50 -0.50 50 25 50 25 -BM_2xSlower +1.00 +1.00 50 100 50 100 -BM_1PercentFaster -0.01 -0.01 100 99 100 99 -BM_1PercentSlower +0.01 +0.01 100 101 100 101 -BM_10PercentFaster -0.10 -0.10 100 90 100 90 -BM_10PercentSlower +0.10 +0.10 100 110 100 110 -BM_100xSlower +99.00 +99.00 100 10000 100 10000 -BM_100xFaster -0.99 -0.99 10000 100 10000 100 -BM_10PercentCPUToTime +0.10 -0.10 100 110 100 90 +Benchmark Time CPU Time Old Time New CPU Old CPU New +------------------------------------------------------------------------------------------------------------- +BM_SameTimes +0.0000 +0.0000 10 10 10 10 +BM_2xFaster -0.5000 -0.5000 50 25 50 25 +BM_2xSlower +1.0000 +1.0000 50 100 50 100 +BM_1PercentFaster -0.0100 -0.0100 100 99 100 99 +BM_1PercentSlower +0.0100 +0.0100 100 101 100 101 +BM_10PercentFaster -0.1000 -0.1000 100 90 100 90 +BM_10PercentSlower +0.1000 +0.1000 100 110 100 110 +BM_100xSlower +99.0000 +99.0000 100 10000 100 10000 +BM_100xFaster -0.9900 -0.9900 10000 100 10000 100 +BM_10PercentCPUToTime +0.1000 -0.1000 100 110 100 90 +BM_ThirdFaster -0.3333 -0.3333 100 67 100 67 ``` So the first ("Time") column is exactly where it was, but with two more decimal digits. The position of the '.' in the second ("CPU") column is shifted right by those two positions, and the rest is unmodified, but simply shifted right by those 4 positions. As for the reasoning, i guess it is more or less the same as with #426. In some sad times, microbenchmarking is not applicable. In those cases, the more precise the change report is, the better. 
The current formatting prints not so much the percentages, but the fraction i'd say. It is more useful for huge changes, much more than 100%. That is not always the case, especially if it is not a microbenchmark. Then, even though the change may be good/bad, the change is small (<0.5% or so), rounding happens, and it is no longer possible to tell. I do acknowledge that this change does not fix that problem. Of course, confidence intervals and such would be better, and they would probably fix the problem. But i think this is good as-is too, because now you see 2 fractional percentage digits! The obvious downside is that the output is now even wider. * Revisit tests, more closely document the current behavior.	2017-08-28 16:12:18 -07:00
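The "Time" and "CPU" columns in the tables above are consistent with a fractional change computed as (new - old) / old, printed with a sign and four decimals. The sketch below assumes that formula; `RelativeChange` and `FormatChange` are hypothetical names, the real tool is written in Python, and this version does not handle an old value of zero.

```cpp
#include <cmath>
#include <cstdio>
#include <string>

// Assumed formula: change of new_value relative to old_value, as a fraction
// (so -0.5 means "half the time", +1.0 means "twice the time").
// Precondition of this sketch: old_value != 0.
inline double RelativeChange(double old_value, double new_value) {
  return (new_value - old_value) / std::fabs(old_value);
}

// Formats with an explicit sign and four decimal digits, e.g. "+0.1000".
inline std::string FormatChange(double change) {
  char buf[32];
  std::snprintf(buf, sizeof(buf), "%+.4f", change);
  return std::string(buf);
}
```

For instance, the `BM_2xFaster` row (50 to 25) gives `-0.5000` and the `BM_100xSlower` row (100 to 10000) gives `+99.0000`, matching the table.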
Roman Lebedev	6e06648133	Attempting to resolve a submoduling issue... (#439 )	2017-08-28 16:10:19 -07:00
Roman Lebedev	a271c36af9	Drop Stat1, refactor statistics to be user-providable, add median. (#428 ) * Drop Stat1, refactor statistics to be user-providable, add median. My main goal was to add median statistic. Since Stat1 calculated the stats incrementally, and did not store the values themselves, it was not possible. Thus, i have replaced Stat1 with simple std::vector<double>, containing all the values. Then, i have refactored current mean/stdev to be a function that is provided with values vector, and returns the statistic. While there, it seemed to make sense to deduplicate the code by storing all the statistics functions in a map, and then simply iterate over it. And the interface to add new statistics is intentionally exposed, so they may be added easily. The notable change is that Iterations are no longer displayed as 0 for stdev. It could be changed, but i'm not sure how to nicely fit that into the API. Similarly, this dance about sometimes (for some fields, for some statistics) dividing by run.iterations, and then multiplying the calculated statistic back is also dropped, and if you do the math, i fail to see why it was needed there in the first place. Since that was the only use of stat.h, it is removed. * complexity.h: attempt to fix MSVC build * Update README.md * Store statistics to compute in a vector, ensures ordering. * Add a bit more tests for repetitions. * Partially address review notes. * Fix gcc build: drop extra ';' clang, why didn't you warn me? * Address review comments. * double() -> 0.0 * early return	2017-08-23 16:44:29 -07:00
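The design in commit a271c36af9 (statistics as plain functions over a vector of samples, kept in an ordered container) can be sketched as follows. All names here (`Mean`, `Median`, `Statistic`, `DefaultStatistics`) are hypothetical and the library's actual interface may differ; returning 0.0 for empty input is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Arithmetic mean of the samples; 0.0 for an empty input (assumed convention).
inline double Mean(const std::vector<double>& v) {
  if (v.empty()) return 0.0;
  return std::accumulate(v.begin(), v.end(), 0.0) /
         static_cast<double>(v.size());
}

// Median needs all samples at once, which an incremental accumulator
// cannot provide. Works on a sorted copy so the caller's data is untouched.
inline double Median(const std::vector<double>& v) {
  if (v.empty()) return 0.0;
  std::vector<double> sorted(v);
  std::sort(sorted.begin(), sorted.end());
  const std::size_t mid = sorted.size() / 2;
  if (sorted.size() % 2 == 1) return sorted[mid];
  return (sorted[mid - 1] + sorted[mid]) / 2.0;
}

// A named statistic; storing these in a vector keeps a stable report order.
struct Statistic {
  std::string name;
  double (*compute)(const std::vector<double>&);
};

inline std::vector<Statistic> DefaultStatistics() {
  return {{"mean", &Mean}, {"median", &Median}};
}
```

A caller would iterate over `DefaultStatistics()` and apply each `compute` to the per-repetition values, and could append its own entry to add a custom statistic.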
Dominic Hamon	d70417994a	Allow the definition of 1k to be flexible. (#438 ) When generating a human-readable number for user counters, we don't generally expect 1k to be 1024. This is the default due to the more general purpose string utility. Fixes #437	2017-08-21 16:05:24 -07:00
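What "allowing the definition of 1k to be flexible" means in commit d70417994a can be illustrated by passing the base as a parameter. `ToHumanReadable` is a hypothetical function, not the library's string utility, and its two-decimal output format is chosen only for this sketch; it is also only meant for non-negative values.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical formatter: scales `value` by `one_k` (e.g. 1000 or 1024)
// until it is below one_k, then appends the matching suffix.
inline std::string ToHumanReadable(double value, double one_k) {
  static const char* const kSuffixes[] = {"", "k", "M", "G", "T"};
  const std::size_t kCount = sizeof(kSuffixes) / sizeof(kSuffixes[0]);
  std::size_t index = 0;
  while (value >= one_k && index + 1 < kCount) {
    value /= one_k;
    ++index;
  }
  char buf[64];
  std::snprintf(buf, sizeof(buf), "%.2f%s", value, kSuffixes[index]);
  return std::string(buf);
}
```

With this sketch, 2048 formats as `2.00k` when 1k is 1024 but as `2.05k` when 1k is 1000, which is the kind of difference a user counter would want control over.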
Roman Lebedev	c7192c8a9a	compare_bench.py: fixup benchmark_options. (#435 ) `2373382284` reworked parsing, and introduced a regression in handling of the optional options that should be passed to both of the benchmarks. Now, unless the first optional argument starts with '-', it would just complain about that argument: Unrecognized positional argument arguments: '['q']' which is wrong. However if some dummy arg like '-q' was passed first, it would then happily passthrough them all... This commit fixes benchmark_options behavior, by restoring original passthrough behavior for all the optional positional arguments.	2017-08-18 10:55:27 -07:00
Victor Costan	902936033d	CMake: Fallback from try_run to try_compile when cross-compiling. (#436 )	2017-08-15 15:53:30 -07:00
Roman Lebedev	3347a20e0e	reporter_output_test: json: iterations is int, not float (#431 ) May be relevant for flakiness of win builds Noted by @KindDragon	2017-07-31 19:04:02 -06:00
Eric Fiselier	abafced990	Suppress -Wodr on C++03 tests when LTO is enabled. The benchmark library is compiled as C++11, but certain tests are compiled as C++03. When -flto is enabled GCC 5.4 and above will diagnose an ODR violation in libstdc++'s <map>. This ODR violation, although real, should likely be benign. For this reason it seems sensible to simply suppress -Wodr when building the C++03 test. This patch fixes #420 and supersedes PR #424.	2017-07-30 18:44:04 -06:00
Roman Lebedev	d474450b89	Tooling: generate_difference_report(): show old/new for both values (#427 ) While the percentages are displayed for both of the columns, the old/new values are only displayed for the second column, for the CPU time. And the column is not even spelled out. In cases where b->UseRealTime(); is used, this is at the very least highly confusing. So why don't we just display both the old/new for both the columns? Fixes #425	2017-07-25 09:09:26 -07:00
Roman Lebedev	b9be142d1e	Json reporter: don't cast floating-point to int; adjust tooling (#426 ) * Json reporter: passthrough fp, don't cast it to int; adjust tooling Json output format is generally meant for further processing using some automated tools. Thus, it makes sense not to intentionally limit the precision of the values contained in the report. As it can be seen, FormatKV() for doubles, used %.2f format, which was meant to preserve at least some of the precision. However, before that function is ever called, the doubles were already cast to the integer via RoundDouble()... This is also the case for console reporter, where it makes sense because the screen space is limited, and this reporter, however the CSV reporter does output some decimal digits. Thus i can only conclude that the loss of the precision was not really considered, so i have decided to adjust the code of the json reporter to output the full fp precision. There can be several reasons why that is the right thing to do, the bigger the time_unit used, the greater the precision loss, so i'd say any sort of further processing (like e.g. tools/compare_bench.py does) is best done on the values with most precision. Also, that cast skewed the data away from zero, which i think may or may not result in false positives/negatives in the output of tools/compare_bench.py * Json reporter: FormatKV(double): address review note * tools/gbench/report.py: skip benchmarks with different time units While it may be useful to teach it to operate on the measurements with different time units, which is now possible since floats are stored, and not the integers, but for now at least doing such a sanity-checking is better than providing misinformation.	2017-07-24 16:13:55 -07:00
Dominic Hamon	5b7683f49e	more clang tidy cleanups (#417 )	2017-07-15 00:21:20 +02:00
Dominic Hamon	e8fc2a2b8c	Google-style cleanups (#416 )	2017-07-13 18:33:43 +02:00
Tom Madams	ee3cfca651	Fix ThreadCPUUsage when running on RTEMS. (#414 ) Change ThreadCPUUsage to call ProcessCPUUsage if __rtems__ is defined. RTEMS real time OS doesn't support CLOCK_THREAD_CPUTIME_ID. See https://github.com/RTEMS/rtems/blob/master/cpukit/posix/src/clockgettime.c#L58-L59 Prior to this change, ThreadCPUUsage would fail when running on RTEMS with: ERROR: clock_gettime(CLOCK_THREAD_CPUTIME_ID, ...) failed	2017-07-06 15:59:13 -07:00

806 commits