Commit graph

21 commits

Matt Armstrong 69054ae50e
Use fewer ramp up repetitions when KeepRunningBatch is used (#1113)
Use the benchmark's reported iteration count when estimating
iterations for the next repetition, rather than the requested
iteration count.  When the benchmark uses KeepRunningBatch the actual
iteration count can be larger than the one the runner requested.

Prior to this fix the runner was underestimating the next iteration
count, sometimes significantly so.  Consider the case of a benchmark
using a batch size of 1024.  Prior to this change, the benchmark
runner would attempt iteration counts 1, 10, 100 and 1000, yet the
benchmark itself would do the same amount of work each time: a single
batch of 1024 iterations.  The discrepancy could also contribute to
estimation errors once the benchmark time reached 10% of the target.
For example, if the very first batch of 1024 iterations reached 10% of
benchmark_min_time, the runner would attempt to scale that to 100%
from a basis of one iteration rather than 1024.

This bug was particularly noticeable in benchmarks with large batch
sizes, especially when the benchmark also had slow set up or tear down
phases.

With this fix in place it is possible to use KeepRunningBatch to
achieve a kind of "minimum iteration count" feature by using a larger
fixed batch size.  For example, a benchmark may build a map of 500K
elements and test a "find" operation.  There is no point in running
"find" just 1, 10, 100, etc., times.  The benchmark can now pick a
batch size of something like 10K, and the runner will arrive at the
final max iteration count with in noticeably fewer repetitions.
2021-04-20 07:16:05 +01:00
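
A sketch of the usage described above, assuming a hypothetical
BM_MapFind benchmark; the names, sizes, and batch value are
illustrative, not from the commit:

#include <cstddef>
#include <map>
#include <benchmark/benchmark.h>

// Hypothetical benchmark with slow setup and a large fixed batch size.
// With the fix, the runner scales from the reported 10K iterations
// instead of requesting 1, 10, 100, ... while each run actually
// executes a full batch.
static void BM_MapFind(benchmark::State& state) {
  std::map<int, int> m;
  for (int i = 0; i < 500000; ++i) m[i] = i;  // expensive setup

  constexpr std::size_t kBatch = 10000;  // acts as a minimum iteration count
  while (state.KeepRunningBatch(kBatch)) {
    for (std::size_t i = 0; i < kBatch; ++i) {
      benchmark::DoNotOptimize(m.find(static_cast<int>(i)));
    }
  }
}
BENCHMARK(BM_MapFind);
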
Roman Lebedev f92903cc53
Iteration counts should be uint64_t globally. (#817)
This is a shameless rip-off of https://github.com/google/benchmark/pull/646
I did promise to look into why that proposed PR was producing
so much worse assembly, and so I finally did.

The reason is - that diff changes `size_t` (unsigned) to `int64_t` (signed).

There is this nice little `assert`:
7a1c370283/include/benchmark/benchmark.h (L744)
It ensures that we didn't magically decide to advance our iterator
when we should have finished benchmarking.

When `cached_` was unsigned, the `assert` was `cached_ UGT 0`
(an unsigned greater-than comparison). But we only ever get to that
`assert` if `cached_ NE 0` (not equal), and naturally if `cached_` is
not `0`, then it is bigger than `0`, so the `assert` is tautological
and gets folded away.

But now that `cached_` has become signed, the assert became
`cached_ SGT 0` (signed greater-than). And we still only know that
`cached_ NE 0`, so the assert can't be optimized out, or at least it
currently isn't.

Regardless of whether or not that is a bug in itself,
that particular diff would have regressed normal 64-bit systems
by halving the maximal iteration space (since we go from an unsigned
counter to a signed one of the same bit-width), which seems like a bug.
And as it happens, fixing *this* bug fixes the other bug.

This produces a fully (bit-by-bit) identical state_assembly_test.s.
The FileCheck change is actually needed regardless of this patch;
without it, this test does not pass for me even without this diff.
2019-05-13 12:33:11 +03:00
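
A minimal sketch of the assert-folding pattern described in the commit
above, with hypothetical names standing in for the StateIterator
internals:

#include <cassert>
#include <cstdint>

// With an unsigned counter, `cached_ != 0` (already checked before the
// increment) implies `cached_ > 0`, so the assert folds away. With a
// signed counter, `cached_ != 0` does not imply `cached_ > 0`, so the
// check survives.
struct Iter {
  uint64_t cached_;  // the fix: uint64_t, not a signed int64_t

  Iter& operator++() {
    assert(cached_ > 0);  // tautological given the unsigned type
    --cached_;
    return *this;
  }
  bool NotDone() const { return cached_ != 0; }
};
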
Samuel Panzer ce3fde16cb Return 0 from State::iterations() when not yet started. (#598)
* Return a reasonable value from State::iterations() even before starting a benchmark

* Optimize State::iterations() for started case.
2018-05-24 10:33:19 +01:00
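
A small illustration of the new contract (hypothetical benchmark):

#include <cassert>
#include <benchmark/benchmark.h>

static void BM_Iterations(benchmark::State& state) {
  assert(state.iterations() == 0);  // well-defined even before the loop
  for (auto _ : state) {
    benchmark::DoNotOptimize(state.iterations());  // running count
  }
}
BENCHMARK(BM_Iterations);
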
Eric Fiselier 858688b845 Ensure std::iterator_traits<StateIterator> instantiates.
Due to ADL lookup performed on the begin and end functions
of `for (auto _ : State)`, std::iterator_traits may get
incidentally instantiated. This patch ensures the library
can tolerate that.
2018-02-21 00:54:19 -07:00
Samuel Panzer 296ec5693e Support State::KeepRunningBatch(). (#521)
* Support State::KeepRunningBatch().

State::KeepRunning() can take large amounts of time relative to quick
operations (on the order of 1ns, depending on hardware). For such
sensitive operations, it is recommended to run batches of repeated
operations.

This commit simplifies handling of total_iterations_. Rather than
predecrementing such that total_iterations_ == 1 signals that
KeepRunning() should exit, total_iterations_ == 0 now signals the
intention for the benchmark to exit.

* Create better fast path in State::KeepRunningBatch()

* Replace int parameter with size_t to fix signed mismatch warnings

* Ensure benchmark State has been started even on error.

* Simplify KeepRunningBatch()
2018-02-09 21:57:04 -07:00
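
A sketch of the new API; the batch size and loop body are illustrative:

#include <benchmark/benchmark.h>

static void BM_FastOp(benchmark::State& state) {
  // Amortize the bookkeeping of the KeepRunning check across a batch
  // of 1000 repetitions of a roughly 1ns operation.
  while (state.KeepRunningBatch(1000)) {
    for (int i = 0; i < 1000; ++i) {
      benchmark::DoNotOptimize(i);
    }
  }
}
BENCHMARK(BM_FastOp);
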
Louis Dionne 5b2c08668c Enforce using a semicolon after BENCHMARK_MAIN to remove compiler warnings (#495) 2017-12-03 18:45:07 -07:00
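
After this change the macro is written with a trailing semicolon:

BENCHMARK_MAIN();  // the macro now expects this terminating semicolon
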
Eric 25acf220a4 Refactor most usages of KeepRunning to use the preferred ranged-for. (#459)
Recently the library added a new ranged-for variant of the KeepRunning
loop that is much faster. For this reason it should be preferred in all
new code.

Because a library, its documentation, and its tests should all embody
the best practices of using the library, this patch changes all but a
few usages of KeepRunning() into for (auto _ : state).

The remaining usages in the tests and documentation persist only
to document and test behavior that is different between the two formulations.

Also note that because the range-for loop requires C++11, the KeepRunning
variant has not been deprecated at this time.
2017-10-17 12:17:02 -06:00
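
A hypothetical before/after pair showing the refactoring described
above:

#include <benchmark/benchmark.h>

static void BM_Old(benchmark::State& state) {
  while (state.KeepRunning()) {  // original formulation, still supported
    benchmark::DoNotOptimize(0);
  }
}
BENCHMARK(BM_Old);

static void BM_New(benchmark::State& state) {
  for (auto _ : state) {  // preferred range-for formulation (C++11)
    benchmark::DoNotOptimize(0);
  }
}
BENCHMARK(BM_New);
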
Eric 0526755944 Add C++11 Ranged For loop alternative to KeepRunning (#454)
* Add C++11 Ranged For loop alternative to KeepRunning

As pointed out by @astrelni and @dominichamon, the KeepRunning
loop requires a bunch of memory loads and stores every iteration,
which affects the measurements.

The main reason for these additional loads and stores is that the
State object is passed in by reference, making its contents externally
visible memory, and the compiler doesn't know it hasn't been changed
by non-visible code.

It's also possible the large size of the State struct is hindering
optimizations.

This patch allows the `State` object to be iterated over using
a range-based for loop. Example:

void BM_Foo(benchmark::State& state) {
	for (auto _ : state) {
		[...]
	}
}

This formulation is much more efficient, because the variable counting
the loop index is stored in the iterator produced by `State::begin()`,
which itself is stored in function-local memory and therefore not accessible
by code outside of the function. Therefore the compiler knows the iterator
cannot have been changed between iterations.

This initial patch and idea was from Alex Strelnikov.

* Fix null pointer initialization in C++03
2017-10-10 08:56:42 -07:00
Eric 9d4b719dae Make Benchmark a single header library (but not header-only) (#407)
* Make Benchmark a single header library (but not header-only)

This patch refactors benchmark into a single header, to allow
for slightly easier usage.

The initial reason for the header split was to keep C++ library
components from being included by benchmark_api.h, making that
part of the library STL agnostic. However this has since changed
and there seems to be little reason to separate the reporters from
the rest of the library.

* Fix internal_macros.h

* Remove more references to macros.h
2017-07-04 16:31:47 -06:00
Dominic Hamon 1100e91907 Simplify clang-format and apply to tests (#302) 2016-10-07 11:04:50 -07:00
Marcin Kolny dfe0260754 Support multiple ranges in the benchmark (#257)
* Support multiple ranges in the benchmark

The google-benchmark library allows providing up to two ranges to the
benchmark method (range_x and range_y). However, in many cases this is
not sufficient. This patch introduces a multi-range feature, so users
can easily define multiple ranges by passing a vector of integers, and
access the values through the method range(i).

* Remove redundant API

Functions State::range_x() and State::range_y() have been removed. They should
be replaced by State::range(0) and State::range(1).
Functions Benchmark::ArgPair() and Benchmark::RangePair() have been removed.
They should be replaced by Benchmark::Args() and Benchmark::Ranges().
2016-08-04 12:30:14 -07:00
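
A sketch of the replacement API named above; the argument values are
illustrative:

#include <benchmark/benchmark.h>

static void BM_MultiRange(benchmark::State& state) {
  const auto x = state.range(0);  // replaces range_x()
  const auto y = state.range(1);  // replaces range_y()
  const auto z = state.range(2);  // a third argument, newly possible
  while (state.KeepRunning()) {
    benchmark::DoNotOptimize(x + y + z);
  }
}
// Ranges() takes one {min, max} pair per argument (replaces RangePair()):
BENCHMARK(BM_MultiRange)->Ranges({{1, 8}, {16, 64}, {2, 4}});
// Args() supplies one exact value per argument (replaces ArgPair()):
BENCHMARK(BM_MultiRange)->Args({3, 42, 7});
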
izaid 14c4ce5ebf Removed unneeded #include <cstddef> 2015-10-06 15:10:15 +01:00
Eric Fiselier 9d47552020 merge new changes in basic_test.cc 2015-03-27 16:37:53 -04:00
Eric Fiselier e428b9eec3 Add 'benchmark::DoNotOptimize(...)' to help users prevent optimizations 2015-03-27 16:35:46 -04:00
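
A typical use, assuming a trivial computation the optimizer would
otherwise delete:

#include <benchmark/benchmark.h>

static void BM_Sum(benchmark::State& state) {
  while (state.KeepRunning()) {
    int sum = 0;
    for (int i = 0; i < 64; ++i) sum += i;
    benchmark::DoNotOptimize(sum);  // forces the compiler to keep `sum`
  }
}
BENCHMARK(BM_Sum);
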
Eric Fiselier 725f1f066d add walltime benchmark and fix unused variable. 2015-03-26 14:56:52 -04:00
Eric Fiselier be993acbb3 remove unneeded test repetitions 2015-03-26 14:33:48 -04:00
Eric Fiselier d0df7ef2de cleanup whitespace and includes 2015-03-26 14:30:16 -04:00
Eric Fiselier a3308c6d86 Introduce use of std::chrono and remove Walltime printing 2015-03-26 14:26:07 -04:00
Eric Fiselier d71660227b fix minimal_benchmark.h include 2015-03-17 14:02:02 -04:00
Eric Fiselier ef79d0af9f move basic_test over to minimal benchmark 2015-03-17 13:48:33 -04:00
Eric Fiselier 7a767012f1 Adopt new benchmark timing internals.
This patch adopts a new internal structure for how timings are performed.
Currently every iteration of a benchmark checks to see if it has been running
for an appropriate amount of time. Checking the clock introduces noise into
the timings and this can cause inconsistent output from each benchmark.

Now every iteration of a benchmark only checks an iteration count to see if it
should stop running. The iteration count is determined beforehand by testing
the benchmark on a series of increasing iteration counts until a suitable count
is found. This increases the amount of time it takes to run the actual benchmarks,
but it also greatly increases the accuracy of the results.

This patch introduces some breaking changes. The notable breaking changes are:
1. Benchmarks run on multiple threads no longer generate a report per thread.
   Instead only a single report is generated.
2. ::benchmark::UseRealTime() was removed and replaced with State::UseRealTime().
2015-03-12 18:03:33 -04:00
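
A rough sketch of the predetermined-iteration-count strategy described
above, with hypothetical names; the real runner also scales by how far
short of the target a trial run fell:

#include <cstdint>

// Try increasing iteration counts until a single trial runs long
// enough, then use that count for the reported measurement.
uint64_t FindSuitableIterationCount(double min_time_sec,
                                    double (*time_trial)(uint64_t iters)) {
  uint64_t iters = 1;
  while (time_trial(iters) < min_time_sec) {
    iters *= 10;  // next candidate in the increasing series
  }
  return iters;
}
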