* Implementation of random interleaving. See
http://github.com/google/benchmark/issues/1051 for the feature requests.
Committer: Hai Huang (http://github.com/haih-g)
On branch fr-1051
Changes to be committed:
modified: include/benchmark/benchmark.h
modified: src/benchmark.cc
new file: src/benchmark_adjust_repetitions.cc
new file: src/benchmark_adjust_repetitions.h
modified: src/benchmark_api_internal.cc
modified: src/benchmark_api_internal.h
modified: src/benchmark_register.cc
modified: src/benchmark_runner.cc
modified: src/benchmark_runner.h
modified: test/CMakeLists.txt
new file: test/benchmark_random_interleaving_gtest.cc
* Fix benchmark_random_interleaving_gtest.cc for fr-1051
Committer: Hai Huang <haih@google.com>
On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.
Changes to be committed:
modified: src/benchmark.cc
modified: src/benchmark_runner.cc
modified: test/benchmark_random_interleaving_gtest.cc
* Fix macos build for fr-1051
Committer: Hai Huang <haih@google.com>
On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.
Changes to be committed:
modified: src/benchmark_api_internal.cc
modified: src/benchmark_api_internal.h
modified: src/benchmark_runner.cc
* Fix macos and windows build for fr-1051.
Committer: Hai Huang <haih@google.com>
On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.
Changes to be committed:
modified: src/benchmark_runner.cc
* Fix benchmark_random_interleaving_test.cc for macos and windows in fr-1051
Committer: Hai Huang <haih@google.com>
On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.
Changes to be committed:
modified: test/benchmark_random_interleaving_gtest.cc
* Fix int type benchmark_random_interleaving_gtest for macos in fr-1051
Committer: Hai Huang <haih@google.com>
On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.
Changes to be committed:
modified: test/benchmark_random_interleaving_gtest.cc
* Address dominichamon's comments 03/29 for fr-1051
Committer: Hai Huang <haih@google.com>
On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.
Changes to be committed:
modified: src/benchmark.cc
modified: src/benchmark_api_internal.cc
modified: src/benchmark_api_internal.h
modified: test/benchmark_random_interleaving_gtest.cc
* Address dominichamon's comment on default min_time / repetitions for fr-1051.
Also change sentinel of random_interleaving_repetitions to -1. Hopefully it
fixes the failures on Windows.
Committer: Hai Huang <haih@google.com>
On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.
Changes to be committed:
modified: src/benchmark.cc
modified: src/benchmark_api_internal.cc
modified: src/benchmark_api_internal.h
* Fix windows test failures for fr-1051
Committer: Hai Huang <haih@google.com>
On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.
Changes to be committed:
modified: src/benchmark_api_internal.cc
modified: src/benchmark_runner.cc
* Add license blurb for fr-1051.
Committer: Hai Huang <haih@google.com>
On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.
Changes to be committed:
modified: src/benchmark_adjust_repetitions.cc
modified: src/benchmark_adjust_repetitions.h
* Switch to std::shuffle() for fr-1105.
Committer: Hai Huang <haih@google.com>
On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.
Changes to be committed:
modified: src/benchmark.cc
* Change to 1e-9 in fr-1105
Committer: Hai Huang <haih@google.com>
On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.
Changes to be committed:
modified: src/benchmark_adjust_repetitions.cc
* Fix broken build caused by bad merge for fr-1105.
Committer: Hai Huang <haih@google.com>
On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.
Changes to be committed:
modified: src/benchmark_api_internal.cc
modified: src/benchmark_runner.cc
* Fix build breakage for fr-1051.
Committer: Hai Huang <haih@google.com>
On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.
Changes to be committed:
modified: src/benchmark.cc
modified: src/benchmark_api_internal.cc
modified: src/benchmark_api_internal.h
modified: src/benchmark_register.cc
modified: src/benchmark_runner.cc
* Print out reports as they come in if random interleaving is disabled (fr-1051)
Committer: Hai Huang <haih@google.com>
On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.
Changes to be committed:
modified: src/benchmark.cc
* size_t, int64_t --> int in benchmark_runner for fr-1051.
Committer: Hai Huang <haih@google.com>
On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.
Changes to be committed:
modified: src/benchmark_runner.cc
modified: src/benchmark_runner.h
* Address comments from dominichamon for fr-1051
Committer: Hai Huang <haih@google.com>
On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.
Changes to be committed:
modified: src/benchmark.cc
modified: src/benchmark_adjust_repetitions.cc
modified: src/benchmark_adjust_repetitions.h
modified: src/benchmark_api_internal.cc
modified: src/benchmark_api_internal.h
modified: test/benchmark_random_interleaving_gtest.cc
* benchmar_indices --> size_t to make CI pass: fr-1051
Committer: Hai Huang <haih@google.com>
On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.
Changes to be committed:
modified: src/benchmark.cc
* Fix min_time not initialized issue for fr-1051.
Committer: Hai Huang <haih@google.com>
On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.
Changes to be committed:
modified: src/benchmark_api_internal.cc
modified: src/benchmark_api_internal.h
* min_time --> MinTime in fr-1051.
Committer: Hai Huang <haih@google.com>
On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.
Changes to be committed:
modified: src/benchmark_api_internal.cc
modified: src/benchmark_api_internal.h
modified: src/benchmark_runner.cc
* Add doc for random interleaving for fr-1051
Committer: Hai Huang <haih@google.com>
On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.
Changes to be committed:
modified: README.md
new file: docs/random_interleaving.md
Co-authored-by: Dominic Hamon <dominichamon@users.noreply.github.com>
* Support -Wsuggest-override
google/benchmark is C++11 compatible but doesn't use the `override` keyword.
Projects using google/benchmark with enabled `-Wsuggest-override` and `-Werror` will fail to compile.
* Add -Wsuggest-override cxx flag
* Revert unrelated formatting
* Revert unrelated formatting, take 2
* Revert unrelated formatting, take 3
* Disable -Wsuggest-override when compiling tests, gtest does not handle it yet
Co-authored-by: Dominic Hamon <dominichamon@users.noreply.github.com>
This also fixes#1135. Because StrSplit was returning a vector with an
empty string, it was treated by PerfCounters::Create as a legitimate ask
for setting up a counter with that name. The empty vector is understood
by PerfCounters as "just return NoCounters()".
* Add API to benchmark allowing for custom context to be added
Fixes#525
* add docs
* Add context flag output to JSON reporter
* Plumb everything into the global context.
* Add googletests for custom context
* update docs with duplicate key behaviour
* Add `benchmark_context` flag that allows per-run custom context.
Add support for key-value flags in general.
Added test for key-value flags.
Added `benchmark_context` flag.
Output content of `benchmark_context` to base reporter.
Solves the first part of #525.
* Docs and better help
* Support optional, user-directed collection of performance counters
The patch allows an engineer wishing to drill into the root causes
of a regression, for example. Currently, only single threaded runs
are supported. The feature is a build-time opt in, and then a runtime
opt in.
The engineer may run the benchmark executable, passing a list of
performance counter names (using libpfm's naming scheme) at the
command line. The counter values will then be collected and reported
back as UserCounters.
This is different from #240 in that it is a benchmark user opt-in, and
the counter collection is transparent to the benchmark.
Currently, this is only supported on platforms where libpfm is
supported.
libpfm: http://perfmon2.sourceforge.net/
* 'Use' values param in Snapshot when BENCHMARK_OS_WINDOWS
This is to avoid unused parameter warning-as-error
* Added missing include for <vector> in perf_counters.cc
* Moved doc to docs
* Added license blurbs
Use the benchmark's reported iteration count when estimating
iterations for the next repetition, rather than the requested
iteration count. When the benchmark uses KeepRunningBatch the actual
iteration count can be larger than the one the runner requested.
Prior to this fix the runner was underestimating the next iteration
count, sometimes significantly so. Consider the case of a benchmark
using a batch size of 1024. Prior to this change, the benchmark
runner would attempt iteration counts 1, 10, 100 and 1000, yet the
benchmark itself would do the same amount of work each time: a single
batch of 1024 iterations. The discrepancy could also contribute to
estimation errors once the benchmark time reached 10% of the target.
For example, if the very first batch of 1024 iterations reached 10% of
benchmark_min_min time, the runner would attempt to scale that to 100%
from a basis of one iteration rather than 1024.
This bug was particularly noticeable in benchmarks with large batch
sizes, especially when the benchmark also had slow set up or tear down
phases.
With this fix in place it is possible to use KeepRunningBatch to
achieve a kind of "minimum iteration count" feature by using a larger
fixed batch size. For example, a benchmark may build a map of 500K
elements and test a "find" operation. There is no point in running
"find" just 1, 10, 100, etc., times. The benchmark can now pick a
batch size of something like 10K, and the runner will arrive at the
final max iteration count with in noticeably fewer repetitions.
* Implement custom benchmark name
The benchmark's name can be changed using the Name() function
which internally uses SetName().
* Update AUTHORS and CONTRIBUTORS
* Describe new feature in README
* Move new name function up
Fixes#1106
The existing behavior results in the `0` value being added twice. Since
`lo` is always added to `dst`, we never want to explicitly add `0` if
`lo` is equal to `0`.
* Add CartesianProduct with associated test
* Use CartesianProduct in Ranges to avoid code duplication
* Add new cartesian_product_test to CMakeLists.txt
* Update AUTHORS & CONTRIBUTORS
* Rename CartesianProduct to ArgsProduct
* Rename test & fixture accordingly
* Add example for ArgsProduct to README
As noted in #995, this causes issues when the command line flag already
starts with "benchmark_", which they all do.
Not caught by tests as the test flags didn't start with "benchmark".
Fixes#995
* JSONReporter: don't report on scaling if we didn't get it (#1005)
* JSONReporter: fix due to review (std::pair<bool, bool> -> enum)
* JSONReporter: scaling: fix the algo (due to review discussion)
* benchmark.h: revert to old-fashioned enum's (C++03 compatibility); rreporter_output_test: let's skip scaling
* timestamp: use rfc3339-formatted timestamps in output
Replace localized timestamps with machine-readable IETF RFC 3339 format
timestamps. This is an attempt to make the output timestamps easily
machine-readable. ISO8601 specifies standards for time interchange
formats. IETF RFC 3339: https://tools.ietf.org/html/rfc3339 defines a
subset of these for use in the internet. The general form for these
timestamps is:
YYYY-MM-DDTHH:mm:SS[+-]hhmm
This replaces the localized time formats that are currently being used
in the benchmark output to prioritize interchangeability and
machine-readability.
This might break existing programs that rely on the particular date-time
format. This might also may make times less human readable. RFC3339 was
intended to balance human readability and simplicity for machine
readability, but it is primarily intended as an internal representation.
* timers: remove utc string formatting
We only ever need local time printing. Remove the UTC printing
and cosnolidate the logic slightly.
* timers: manually create rfc3339 string
The C++ standard library does not output the time offset in RFC3339
format, it is missing the : between hours and minutes. VS does not
appear to support timezone information by default. To avoid adding too
much complexity to benchmark around timezone handling e.g. a full
date library like https://github.com/HowardHinnant/date, we fall back
to outputting GMT time with a -00:00 offset for those cases.
* timers: use reentrant form for localtime_r & tmtime_r
For non-windows, use the reentrant form for the time conversion
functions.
* timers: cleanup
Use strtol instead of brittle moving characters around.
* timers: only call strftime twice.
Also size buffers to known maximum necessary size and name constants
more appropriately.
* timers: fix unused variable warning
In a previous commit[1], diagnostic pragmas were used to avoid this
warning. However, the incorrect warning flag was indicated, leaving the
warning in place. -Wdeprecated is for deprecated features while
-Wdeprecated-declarations for deprecated functions, variables, and
types[2].
[1] c408461983
[2] https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html
* Fix type conversion warnings.
Fixes#949
Tested locally (Linux/clang), but warnings are on MSVC so may differ.
* Drop the ULP so the double test passes
* Add State::error_occurred()
* Relax CHECK condition in benchmark_runner.cc
If the benchmark state contains an error, do not expect any iterations has been run.
This allows using SkipWithError() and return early from the benchmark function.
* README.md: document new possible usage of SkipWithError()
* add Jordan Williams to both CONTRIBUTORS and AUTHORS
* alias benchmark libraries
Provide aliased CMake targets for the benchmark and benchmark_main targets.
The alias targets are namespaced under benchmark::, which is the namespace when they are exported.
I chose not to use either the PROJECT_NAME or the namespace variable but to hard-code the namespace.
This is because the benchmark and benchmark_main targets are hard-coded by name themselves.
Hard-coding the namespace is also much cleaner and easier to read.
* link to aliased benchmark targets
It is safer to link against namespaced targets because of how CMake interprets the double colon.
Typo's will be caught by CMake at configuration-time instead of during compile / link time.
* document the provided alias targets
* add "Usage with CMake" section in documentation
This section covers linking against the alias/import CMake targets and including them using either find_package or add_subdirectory.
* format the "Usage with CMake" README section
Added a newline after the "Usage with CMake" section header.
Dropped the header level of the section by one to make it a direct subsection of the "Usage" section.
Wrapped lines to be no longer than 80 characters in length.
* CTest must use proper paths to executables
With the following syntax:
```
add_test(NAME <name> COMMAND <command> [<arg>...])
```
if `<command>` specifies an executable target it will automatically
be replaced by the location of the executable created at build time.
This is important if a `<Configuration>_POSTFIX` like `_d` is used.
* Fix typo in ctest invocation
Instead of `-c` the uppercase `-C` must be used to select a config.
But better use the longopt.
Initialize option flags from environment variables values if they are defined, eg. `BENCHMARK_OUT=<filename>` for `--benchmark_out=<filename>`. Command line flag value always prevails.
Fixes https://github.com/google/benchmark/issues/881.
* Guard ASSERT_THROWS checks with BENCHMARK_HAS_NO_EXCEPTIONS
This allows the test be run with exceptions turned off
* Add myself to CONTRIBUTORS
I don't need to be added to AUTHORS, as I am a Google employee
While current counters can e.g. answer the question
"how many items is processed per second", it is impossible to get
it to tell "how many seconds it takes to process a single item".
The solution is to add a yet another modifier `kInvert`,
that is *always* considered last, which simply inverts the answer.
Fixes#781, #830, #848.
The CSVReporter is deprecated, but we still need to reference it in
a few places. To avoid breaking the build when warnings are errors,
we need to disable the warning when we do so.
* Update AUTHORS and CONTRIBUTORS
* Fix WSL self-test failures
Some of the benchmark self-tests expect and check for a particular
output format from the benchmark library. The numerical values must
not be infinity or not-a-number, or the test will report an error.
Some of the values are computed bytes-per-second or items-per-second
values, so these require that the measured CPU time for the test to be
non-zero. But the loop that is being measured was empty, so the
measured CPU time for the loop was extremely small. On systems like
Windows Subsystem for Linux (WSL) the timer doesn't have enough
resolution to measure this, so the measured CPU time was zero.
This fix just makes sure that these tests have something within the
timing loop, so that the benchmark library will not decide that the
loop takes zero CPU time. This makes these tests more robust, and in
particular makes them pass on WSL.
This is a shameless rip-off of https://github.com/google/benchmark/pull/646
I did promise to look into why that proposed PR was producing
so much worse assembly, and so i finally did.
The reason is - that diff changes `size_t` (unsigned) to `int64_t` (signed).
There is this nice little `assert`:
7a1c370283/include/benchmark/benchmark.h (L744)
It ensures that we didn't magically decide to advance our iterator
when we should have finished benchmarking.
When `cached_` was unsigned, the `assert` was `cached_ UGT 0`.
But we only ever get to that `assert` if `cached_ NE 0`,
and naturally if `cached_` is not `0`, then it is bigger than `0`,
so the `assert` is tautological, and gets folded away.
But now that `cached_` became signed, the assert became `cached_ SGT 0`.
And we still only know that `cached_ NE 0`, so the assert can't be
optimized out, or at least it doesn't currently.
Regardless of whether or not that is a bug in itself,
that particular diff would have regressed the normal 64-bit systems,
by halving the maximal iteration space (since we go from unsigned counter
to signed one, of the same bit-width), which seems like a bug.
And just so it happens, fixing *this* bug, fixes the other bug.
This produces fully (bit-by-bit) identical state_assembly_test.s
The filecheck change is actually needed regardless of this patch,
else this test does not pass for me even without this diff.
https://github.com/google/benchmark/pull/801 is stuck with some cryptic cmake failure due to
some linking issue between googletest and threading libraries.
I suspect that is mostly happening because of the, uhm,
intentionally extremely twisted-in-the-brains approach that is being used to
actually build the library as part of the buiild,
except without actually building it as part of the build.
If we do actually build it as part of the build,
then all the transitive dependencies should magically be in order,
and maybe everything will just work.
This new version of cmake magic was written by me in
0e22f085c5/cmake/Modules/GoogleTest.cmake.in0e22f085c5/cmake/Modules/GoogleTest.cmake, based on the official googletest docs and LOTS of experimentation.
* escape special chars in csv and json output.
- escape \b,\f,\n,\r,\t,\," from strings before dumping
them to json or csv.
- also faithfully reproduce the sign of nan in json.
this fixes github issue #745.
* functionalize.
* split string escape functions between csv and json
* Update src/csv_reporter.cc
Co-Authored-By: tesch1 <tesch1@gmail.com>
* Update src/json_reporter.cc
Co-Authored-By: tesch1 <tesch1@gmail.com>
* Add FIXME in multiple_ranges_test.cc
* Improve handling of large bounds in AddRange.
Due to breaking the loop too early, AddRange
would miss a final multplier of 'mult' that
was within the numeric range of T.
* Enable negative values for Range argument
Fixes#762.
* Try to fix build of benchmark_gtest
* Try some more to fix build
* Attempt to fix format macros
* Attempt to resolve format errors for mingw32
* Review feedback
Put unit tests in benchmark::internal namespace
Fix error reporting in multiple_ranges_test.cc
* [JSON] add threads and repetitions to the json output, for better ide…
[Tests] explicitly check for thread == 1
[Tests] specifically mark all repetition checks
[JSON] add repetition_index reporting, but only for non-aggregates (i…
* [Formatting] Be very, very explicit about pointer alignment so clang-format can not put pointers/references on the wrong side of arguments.
[Benchmark::Run] Make sure to use explanatory sentinel variable rather than a magic number.
* Do not pass redundant information
Created BenchmarkName class which holds the full benchmark
name and allows specifying and retrieving different components
of the name (e.g. ARGS, THREADS etc.)
Fixes#730.
Some benchmarks are particularly sensitive and they run in less than
a nanosecond. In order for the console reporter to provide meaningful
output for such benchmarks it needs to be able to display the times
using more resolution than a single nanosecond.
This patch changes the console reporter to print at least three
significant digits for all results.
Unlike the initial attempt, this patch does not align the decimal point.
* Adding Host Name and test
* Addressing Review Comments
* Adding Test for JSON Reporter
* Adding HOST_NAME_MAX for MacOS systems
* Adding Explaination for MacOS HOST_NAME_MAX Addition
* Addressing Peer Review Comments
* Adding codecvt in windows header guard
* Changing name SystemInfo and adding empty message incase host name fetch fails
* Adding Comment on Struct SystemInfo
Unit-tests fail to build due to the following errors:
/home/cfx/Dev/google-benchmark/benchmark.git/test/string_util_gtest.cc:12:5: required from here
/home/cfx/Applications/googletest-1.8.1/include/gtest/gtest.h:1444:11: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare]
if (lhs == rhs) {
~~~~^~~~~~
Fixes#741
std::tmpnam is deprecated and its use is discouraged. For our purposes
in the tests, we really just need a file name which is unlikely to
exist.
This patch converts the tests to using a dummy random file name
generator, which should hopefully avoid name conflicts.
It is incorrect to say that an aggregate is computed over
run's iterations, because those iterations already got averaged.
Similarly, if there are N repetitions with 1 iterations each,
an aggregate will be computed over N measurements, not 1.
Thus it is best to simply use the count of separate reports.
Fixes#586.
As discussed with @dominichamon and @dbabokin, sugar is nice.
Well, maybe not for the health, but it's sweet.
Alright, enough puns.
A special care needs to be applied not to break csv reporter. UGH.
We end up shedding some code over this.
We no longer specially pretty-print them, they are printed just like the rest of custom counters.
Fixes#627.
This is related to @BaaMeow's work in https://github.com/google/benchmark/pull/616 but is not based on it.
Two new fields are tracked, and dumped into JSON:
* If the run is an aggregate, the aggregate's name is stored.
It can be RMS, BigO, mean, median, stddev, or any custom stat name.
* The aggregate-name-less run name is additionally stored.
I.e. not some name of the benchmark function, but the actual
name, but without the 'aggregate name' suffix.
This way one can group/filter all the runs,
and filter by the particular aggregate type.
I *might* need this for further tooling improvement.
Or maybe not.
But this is certainly worthwhile for custom tooling.
* Counter(): add 'one thousand' param.
Needed for https://github.com/google/benchmark/pull/654
Custom user counters are quite custom. It is not guaranteed
that the user *always* expects for these to have 1k == 1000.
If the counter represents bytes/memory/etc, 1k should be 1024.
Some bikeshedding points:
1. Is this sufficient, or do we really want to go full on
into custom types with names?
I think just the '1000' is sufficient for now.
2. Should there be a helper benchmark::Counter::Counter{1000,1024}()
static 'constructor' functions, since these two, by far,
will be the most used?
3. In the future, we should be somehow encoding this info into JSON.
* Counter(): use std::pair<> to represent 'one thousand'
* Counter(): just use a new enum with two values 1000 vs 1024.
Simpler is better. If someone comes up with a real reason
to need something more advanced, it can be added later on.
* Counter: just store the 1000 or 1024 in the One_K values directly
* Counter: s/One_K/OneK/
This is *only* exposed in the JSON. Not in CSV, which is deprecated.
This *only* supposed to track these two states.
An additional field could later track which aggregate this is,
specifically (statistic name, rms, bigo, ...)
The motivation is that we already have ReportAggregatesOnly,
but it affects the entire reports, both the display,
and the reporters (json files), which isn't ideal.
It would be very useful to have a 'display aggregates only' option,
both in the library's console reporter, and the python tooling,
This will be especially needed for the 'store separate iterations'.
found while working on reproducible builds for openSUSE
To reproduce there
osc checkout openSUSE:Factory/benchmark && cd $_
osc build -j1 --vm-type=kvm
High system load can skew benchmark results. By including system load averages
in the library's output, we help users identify a potential issue in the
quality of their measurements, and thus assist them in producing better (more
reproducible) results.
I got the idea for this from Brendan Gregg's checklist for benchmark accuracy
(http://www.brendangregg.com/blog/2018-06-30/benchmarking-checklist.html).
* Set -Wno-deprecated-declarations for Intel
Intel compiler silently ignores -Wno-deprecated-declarations
so warning no. 1786 must be explicitly suppressed.
* Make std::int64_t → double casts explicit
While std::int64_t → double is a perfectly conformant
implicit conversion, Intel compiler warns about it.
Make them explicit via static_cast<double>.
* Make std::int64_t → int casts explicit
Intel compiler warns about emplacing an std::int64_t
into an int container. Just make the conversion explicit
via static_cast<int>.
* Cleanup Intel -Wno-deprecated-declarations workaround logic
Inspired by these [two](a1ebe07bea) [bugs](0891555be5) in my code due to the lack of those i have found fixed in my code:
* `kIsIterationInvariant` - `* state.iterations()`
The value is constant for every iteration, and needs to be **multiplied** by the iteration count.
* `kAvgIterations` - `/ state.iterations()`
The is global over all the iterations, and needs to be **divided** by the iteration count.
They play nice with `kIsRate`:
* `kIsIterationInvariantRate`
* `kAvgIterationsRate`.
I'm not sure how meaningful they are when combined with `kAvgThreads`.
I guess the `kIsThreadInvariant` can be added, too, for symmetry with `kAvgThreads`.
* Fix compilation on Android with GNU STL
GNU STL in Android NDK lacks string conversion functions from C++11, including std::stoul, std::stoi, and std::stod.
This patch reimplements these functions in benchmark:: namespace using C-style equivalents from C++03.
* Avoid use of log2 which doesn't exist in Android GNU STL
GNU STL in Android NDK lacks log2 function from C99/C++11.
This patch replaces their use in the code with double log(double) function.
* format all documents according to contributor guidelines and specifications
use clang-format on/off to stop formatting when it makes excessively poor decisions
* format all tests as well, and mark blocks which change too much
* Add benchmark_main library with support for Bazel.
* fix newline at end of file
* Add CMake support for benchmark_main.
* Mention optionally using benchmark_main in README.
* Allow support for negative regex filtering
This patch allows one to apply a negation to the entire regex filter
by appending it with a '-' character, much in the same style as
GoogleTest uses.
* Address issues in PR
* Add unit tests for negative filtering
* Ensure 64-bit truncation doesn't happen for complexity results
* One more complexity_n 64-bit fix
* Missed another vector of int
* Piping through the int64_t
* Allow AddRange to work with int64_t.
Fixes#516
Also, tweak how we manage per-test build needs, and create a standard
_gtest suffix for googletest to differentiate from non-googletest tests.
I also ran clang-format on the files that I changed (but not the
benchmark include or main src as they have too many clang-format
issues).
* Add benchmark_gtest to cmake
* Set(Items|Bytes)Processed now take int64_t
* Add tests to verify assembler output -- Fix DoNotOptimize.
For things like `DoNotOptimize`, `ClobberMemory`, and even `KeepRunning()`,
it is important exactly what assembly they generate. However, we currently
have no way to test this. Instead it must be manually validated every
time a change occurs -- including a change in compiler version.
This patch attempts to introduce a way to test the assembled output automatically.
It's mirrors how LLVM verifies compiler output, and it uses LLVM FileCheck to run
the tests in a similar way.
The tests function by generating the assembly for a test in CMake, and then
using FileCheck to verify the // CHECK lines in the source file are found
in the generated assembly.
Currently, the tests only run on 64-bit x86 systems under GCC and Clang,
and when FileCheck is found on the system.
Additionally, this patch tries to improve the code gen from DoNotOptimize.
This should probably be a separate change, but I needed something to test.
* Disable assembly tests on Bazel for now
* Link FIXME to github issue
* Fix Tests on OS X
* fix strip_asm.py to work on both Linux and OS X like targets
Having the copts set on a per-target level can lead to ODR violations
in some cases. Avoid this by ensuring the regex engine is picked
through compiler intrinsics in the header directly.
Older CMake versions, in particular 2.8, don't seem to correctly handle
interface include directories. This causes failures when building the
tests. Additionally, older CMake versions use a different library install
directory than expected (i.e. they use lib/<target-triple>). This caused
certain tests to fail to link.
This patch fixes both those issues. The first by manually adding the
correct include directory when building the tests. The second by specifying
the library output directory when configuring the GTest build.
* Add myself to CONTRIBUTORS under the corp CLA for Stripe, Inc.
* Add support for building with Bazel.
Limitations compared to existing CMake rules:
* Defaults to using C++11 `<regex>`, with an override via Bazel flag
`--define` of `google_benchmark.have_regex`. The TravisCI config sets
the regex implementation to `posix` because it uses ancient compilers.
* Debug vs Opt mode can't be set per test. TravisCI runs all the tests
in debug mode to satisfy `diagnostics_test`, which depends on `CHECK`
being live.
* Set Bazel workspace name so other repos can refer to it by stable name.
This is recommended by the Bazel style guide to avoid each dependent
workspace defining its own name for the dependency.
* Print the executable name as part of the context.
A common use case of the library is to run two different
versions of a benchmark to compare them. In my experience
this often means compiling a benchmark twice, renaming
one of the executables, and then running the executables
back-to-back. In this case the name of the executable
is important contextually information. Unfortunately the
benchmark does not report this information.
This patch adds the executable name to the context reported
by the benchmark.
* attempt to fix tests on Windows
* attempt to fix tests on Windows
Due to ADL lookup performed on the begin and end functions
of `for (auto _ : State)`, std::iterator_traits may get
incidentally instantiated. This patch ensures the library
can tolerate that.
The appveyor bot sometimes fails because the time it
outputs is 6 digits long, but the output test regex expects at most
5 digits. This patch increases the size to 6 digits to placate the
test. This should not *really* affect the correctness of the test.
* Support State::KeepRunningBatch().
State::KeepRunning() can take large amounts of time relative to quick
operations (on the order of 1ns, depending on hardware). For such
sensitive operations, it is recommended to run batches of repeated
operations.
This commit simplifies handling of total_iterations_. Rather than
predecrementing such that total_iterations_ == 1 signals that
KeepRunning() should exit, total_iterations_ == 0 now signals the
intention for the benchmark to exit.
* Create better fast path in State::KeepRunningBatch()
* Replace int parameter with size_t to fix signed mismatch warnings
* Ensure benchmark State has been started even on error.
* Simplify KeepRunningBatch()
* Add support for GTest based unit tests.
As Dominic and I have previously discussed, there is some
need/desire to improve the testing situation in Google Benchmark.
One step to fixing this problem is to make it easier to write
unit tests by adding support for GTest, which is what this patch does.
By default it looks for an installed version of GTest. However the
user can specify -DBENCHMARK_BUILD_EXTERNAL_GTEST=ON to instead
download, build, and use copy of gtest from source. This is
quite useful when Benchmark is being built in non-standard configurations,
such as against libc++ or in 32 bit mode.
* Improve CPU Cache info reporting -- Add Windows support.
This patch does a couple of thing regarding CPU Cache reporting.
First, it adds an implementation on Windows. Second it fixes
the JSONReporter to correctly (and actually) output the CPU
configuration information.
And finally, third, it detects and reports the number of
physical CPU's that share the same cache.
* Fix BM_SetInsert example
Move declaration of `std::set<int> data` outside the timing loop, so that the
destructor is not timed.
* Speed up BM_SetInsert test
Since the time taken to ConstructRandomSet() is so large compared to the time
to insert one element, but only the latter is used to determine number of
iterations, this benchmark now takes an extremely long time to run in
benchmark_test.
Speed it up two ways:
- Increase the Ranges() parameters
- Cache ConstructRandomSet() result (it's not random anyway), and do only
O(N) copy every iteration
* Fix same issue in BM_MapLookup test
* Make BM_SetInsert test consistent with README
- Use the same Ranges everywhere, but increase the 2nd range
- Change order of Args() calls in README to more closely match the result of Ranges
- Don't cache ConstructRandomSet, since it doesn't make sense in README
- Get a smaller optimization inside it, by givint a hint to insert()
Recently the library added a new ranged-for variant of the KeepRunning
loop that is much faster. For this reason it should be preferred in all
new code.
Because a library, its documentation, and its tests should all embody
the best practices of using the library, this patch changes all but a
few usages of KeepRunning() into for (auto _ : state).
The remaining usages in the tests and documentation persist only
to document and test behavior that is different between the two formulations.
Also note that because the range-for loop requires C++11, the KeepRunning
variant has not been deprecated at this time.
* Add C++11 Ranged For loop alternative to KeepRunning
As pointed out by @astrelni and @dominichamon, the KeepRunning
loop requires a bunch of memory loads and stores every iterations,
which affects the measurements.
The main reason for these additional loads and stores is that the
State object is passed in by reference, making its contents externally
visible memory, and the compiler doesn't know it hasn't been changed
by non-visible code.
It's also possible the large size of the State struct is hindering
optimizations.
This patch allows the `State` object to be iterated over using
a range-based for loop. Example:
void BM_Foo(benchmark::State& state) {
for (auto _ : state) {
[...]
}
}
This formulation is much more efficient, because the variable counting
the loop index is stored in the iterator produced by `State::begin()`,
which itself is stored in function-local memory and therefore not accessible
by code outside of the function. Therefore the compiler knows the iterator
hasn't been changed every iteration.
This initial patch and idea was from Alex Strelnikov.
* Fix null pointer initialization in C++03
* Fix#444 - Use BENCHMARK_HAS_CXX11 over __cplusplus.
MSVC incorrectly defines __cplusplus to report C++03, despite the compiler
actually providing C++11 or greater. Therefore we have to detect C++11 differently
for MSVC. This patch uses `_MSVC_LANG` which has been defined since
Visual Studio 2015 Update 3; which should be sufficient for detecting C++11.
Secondly this patch changes over most usages of __cplusplus >= 201103L to
check BENCHMARK_HAS_CXX11 instead.
* remove redunant comment
* Drop Stat1, refactor statistics to be user-providable, add median.
My main goal was to add median statistic. Since Stat1
calculated the stats incrementally, and did not store
the values themselves, it is was not possible. Thus,
i have replaced Stat1 with simple std::vector<double>,
containing all the values.
Then, i have refactored current mean/stdev to be a
function that is provided with values vector, and
returns the statistic. While there, it seemed to make
sense to deduplicate the code by storing all the
statistics functions in a map, and then simply iterate
over it. And the interface to add new statistics is
intentionally exposed, so they may be added easily.
The notable change is that Iterations are no longer
displayed as 0 for stdev. Is could be changed, but
i'm not sure how to nicely fit that into the API.
Similarly, this dance about sometimes (for some fields,
for some statistics) dividing by run.iterations, and
then multiplying the calculated stastic back is also
dropped, and if you do the math, i fail to see why
it was needed there in the first place.
Since that was the only use of stat.h, it is removed.
* complexity.h: attempt to fix MSVC build
* Update README.md
* Store statistics to compute in a vector, ensures ordering.
* Add a bit more tests for repetitions.
* Partially address review notes.
* Fix gcc build: drop extra ';'
clang, why didn't you warn me?
* Address review comments.
* double() -> 0.0
* early return
The benchmark library is compiled as C++11, but certain
tests are compiled as C++03. When -flto is enabled GCC 5.4
and above will diagnose an ODR violation in libstdc++'s <map>.
This ODR violation, although real, should likely be benign. For
this reason it seems sensible to simply suppress -Wodr when building
the C++03 test.
This patch fixes#420 and supersede's PR #424.
* Json reporter: passthrough fp, don't cast it to int; adjust tooling
Json output format is generally meant for further processing
using some automated tools. Thus, it makes sense not to
intentionally limit the precision of the values contained
in the report.
As it can be seen, FormatKV() for doubles, used %.2f format,
which was meant to preserve at least some of the precision.
However, before that function is ever called, the doubles
were already cast to the integer via RoundDouble()...
This is also the case for console reporter, where it makes
sense because the screen space is limited, and this reporter,
however the CSV reporter does output some( decimal digits.
Thus i can only conclude that the loss of the precision
was not really considered, so i have decided to adjust the
code of the json reporter to output the full fp precision.
There can be several reasons why that is the right thing
to do, the bigger the time_unit used, the greater the
precision loss, so i'd say any sort of further processing
(like e.g. tools/compare_bench.py does) is best done
on the values with most precision.
Also, that cast skewed the data away from zero, which
i think may or may not result in false- positives/negatives
in the output of tools/compare_bench.py
* Json reporter: FormatKV(double): address review note
* tools/gbench/report.py: skip benchmarks with different time units
While it may be useful to teach it to operate on the
measurements with different time units, which is now
possible since floats are stored, and not the integers,
but for now at least doing such a sanity-checking
is better than providing misinformation.
* Make Benchmark a single header library (but not header-only)
This patch refactors benchmark into a single header, to allow
for slightly easier usage.
The initial reason for the header split was to keep C++ library
components from being included by benchmark_api.h, making that
part of the library STL agnostic. However this has since changed
and there seems to be little reason to separate the reporters from
the rest of the library.
* Fix internal_macros.h
* Remove more references to macros.h
* Add ClearRegisteredBenchmark() function.
Since benchmarks can be registered at runtime using the RegisterBenchmark(...)
functions, it makes sense to have a ClearRegisteredBenchmarks() function too,
that can be used at runtime to clear the currently registered benchmark and
re-register an entirely new set.
This allows users to run a set of registered benchmarks, get the output using
a custom reporter, and then clear and re-register new benchmarks based on the
previous results.
This fixes issue #400, at least partially.
* Remove unused change
* Fix#342: DoNotOptimize causes compile errors on older GCC versions.
DoNotOptimize uses inline assembly contraints to tell
the compiler what the type of the input variable. The 'g'
operand allows the input to be any register, memory, or
immediate integer operand. However this constraint seems
to be too weak on older GCC versions, and certain inputs
will cause compile errors.
This patch changes the constraint to 'X', which is documented
as "any operand whatsoever is allowed". This appears to fix
the issues with older GCC versions.
However Clang doesn't seem to like "X", and will attempt
to put the input into a register even when it can't/shouldn't;
causing a compile error. However using "g" seems to work like
"X" with GCC, so for this reason Clang still uses "g".
* Try alternative formulation to placate GCC
This thing with the pragma ignore was getting out of hand: now
MinGW (and probably GCC) was erroring too. So I chose to move
the definition of IsZero() out of the anonymous namespace into
benchmark.cc.
This is needed for examining the values of user counters (needed
for #348). It is also needed for checking the values of standard
benchmark results like items_processed or complexities (for example,
checking the standard deviation is needed for unit testing #357
as discussed in #362).
The tests are still missing a way to check actual validity of
numerical results; this will be done next. As they currently are,
the tests pass, but the problem detected with #378 is still
standing and the results with non-standard counters are wrong.
* fix android compilation
* checking __GLIBCXX__ and __GLIBCPP__ macro in addition to __ANDROID__
* using vsnprintf instead of std::vsnprintf to compile on Android
* removed __GLIBCPP__ check on Android
* StringPrintF instead of std::to_string for Android
* Add Benchmark::Iterations for explicitly specifying the number of iterations to use.
* Document that benchmark::Iterations should not be used to limit benchmark runtimes
In non-debug builds CMake automatically adds -DNDEBUG, this means
that uses of `assert` in the tests are disabled for non-debug builds.
Obviously we want these tests to run, regardless of configuration.
This patch strips -DNDEBUG during non-debug builds and adds
-UNDEBUG just to be sure.
fixes#354
The build fails with ICC17 because of warnings and Werror. What is the correct solution to fix it?
Should a patch
disable Werror for ICC (or maybe all non known compilers)
disable the false postive warnings for all files. This could be done using:
add_cxx_compiler_flag(-wd2102) #ICC17u2: Many false positives for Wstrict-aliasing
add_cxx_compiler_flag(-wd2259) #ICC17u2: non-pointer conversion from "long" to "int" may lose significant bits (even for explicit static cast, sleep.cc(44))
add_cxx_compiler_flag(-wd654) #ICC17u2: overloaded virtual function "benchmark::Fixture::SetUp" is only partially overridden (because of deprecated overload)
disable warnings at file level or some other granularity
* Added user counters, and move use of bytes_processed and items_processed to user counter logic.
Each counter is a string-value pair. The counters were
made available through the State class. Two helper virtual
methods were added to the Fixture class to allow convenient
initialization and termination of the counters: InitState()
and TerminateState(). The reporting of the counters is buggy
and is still a work in progress, to be completed in the next commits.
* fix bad removal of BenchmarkCounters code during the merge
* add myself to AUTHORS/CONTRIBUTORS
* fix printing to std::cout in csv_reporter
* bytes_per_second and items_per_second are now in the UserCounters class
* add user counters to json reporter
* moving bytes_per_second and items_per_second to their old state
* console reporter dealing ok with user counters.
* update unit tests for user counters
* CSVReporter now prints user counters too.
* cleanup user counters
* reverted changes to cmake files which should have gone into later commits
* fixture_test: fix gcc 4.6 compilation
* remove ctor with default argument
see https://github.com/google/benchmark/pull/262#discussion_r72298055
* use (auto-defined) BENCHMARK_HAS_CXX11 instead of BENCHMARK_INITLIST.
https://github.com/google/benchmark/pull/262#discussion_r72298310
* leanify counters API
Discussions:
API complexity: https://github.com/google/benchmark/pull/262#discussion_r72298731
remove std::string dependency (WIP): https://github.com/google/benchmark/pull/262#discussion_r72298142
spacing & alignment: https://github.com/google/benchmark/pull/262#discussion_r72298422
* remove std::string dependency on public API - changed counter name storage to char*
* Counter ctor: use overloads instead of default arguments
discussion:
https://github.com/google/benchmark/pull/262#discussion_r72298055
* Use raw pointers to remove dependency on std::vector from public API .
For more info, see discussion at https://github.com/google/benchmark/pull/262#discussion_r72319678 .
* Move counter implementation from benchmark.cc to counter.cc.
See discussion: https://github.com/google/benchmark/pull/262#discussion_r72298980 .
* Remove unused (commented-out) code.
* Moved thread counters to ThreadStats.
* Counters: fixed copy and move constructors.
* Counter: use an inplace buffer for small names.
* benchmark_test: move counters test out of CXX11 preprocessor conditional.
* Counter: fix VS2013 compilation error in char[] initialization.
* Fix typo.
* Expose counters from State.
See discussion: https://github.com/google/benchmark/pull/262#issuecomment-237156951
* Changed counters interface to map-like.
* Fix printing of user counters in ConsoleReporter.
* Applied clang-format to counter.cc and console_reporter.cc.
Command was `clang-format -style=Google -i counter.cc console_reporter.cc`
I also applied to all other files, but the changes were very
far-reaching so I rolled those back.
* Rename Counter::Flags_e to Counter::Flags
* Fix use of reserved names in Counter and BenchmarkCounters.
* Counter: Fix move ctor bug + change order of members.
* Fixture: remove tentative methods InitState() and TerminateState().
* Update fixture_test to the new Fixture interface.
* BenchmarkCounters: fixed a bug in the move ctor. Remove call to CHECK_LT().
CHECK_LT() was making the size_t lookup take ~double the time of a string lookup!
* BenchmarkCounters: add option to not print zero counters (defaults to false).
* Add test to compare counter storage and access with std::map.
* README: clarify cost of counter access modes.
* move counter access test to an own test.
* BenchmarkCounters: add move Insert()
* Counters access test: add accelerated lookup by name.
* Fix old range syntax.
* Fix missing include of cstdio
* Fix Visual Studio warning
* VS2013 and lower: fix use of snprintf()
* VS2013: fix use of char[] as a member of std::pair<>.
* change counter storage to std::map
* Remove skipZeroCounters logic
* Fix VS compilation error.
* Implemented request changes to PR #262.
* PR #262: More requested changes.
* README: cleanup counter text.
* PR #262: remove clang-format changes for preexisting code
* Complexity+Counters: fix counter flags which were being ignored.
* Document all Counter::Flag members
* fixed loss of counter values
* ConsoleReporter: remove tabular printing of user counters.
* ConsoleReporter: header printing should not be contingent on user counter names.
* Minor white space and alignment fixes.
* cxx03_test + counters: reuse the BM_empty() function.
* user counters: add note to README on how counters are gathered across threads
* Test bytes_per_second and items_per_second.
* Test SetLabel.
* Reformat.
* Make State::error_occurred_ private.
* Fix tests with floats.
* Merge private blocks