* Build libpfm as a dependency to allow collection of perf counters
This commit builds libpfm using rules_foreign_cc and lets the default
build of the benchmark library support perf counter collection without
needing additional work from users.
Tested with a custom target:
```
bazel run \
--override_repository=com_github_google_benchmark=/home/raghu/benchmark \
-c opt :test-bench -- "--benchmark_perf_counters=INSTRUCTIONS,CYCLES"
Using profile: local
<snip>
----------------------------------------------------------------------
Benchmark Time CPU Iterations UserCounters...
----------------------------------------------------------------------
BM_Test 0.279 ns 0.279 ns 1000000000 CYCLES=1.00888 INSTRUCTIONS=2
```
Signed-off-by: Raghu Raja <raghu@enfabrica.net>
* Adding myself to the CONTRIBUTORS file per CLA guidance
Enfabrica has already signed a corporate CLA.
Signed-off-by: Raghu Raja <raghu@enfabrica.net>
Adds qualifiers for Python 3.9-3.11 to indicate that the Python bindings build supports them.
Support for Python 3.6 was removed, so the qualifier for Python 3.6 was removed as well.
Some Linux distributions no longer provide a `python` binary and require
`python3` instead. This changes the scripts here and uses
CMake's `find_package(Python3)` when running Python.
Co-authored-by: Dominic Hamon <dominichamon@users.noreply.github.com>
This patch fixes compilation on Solaris, addressing the problems reported
in Issue #1499:
* Provide `HOST_NAME_MAX` definition.
* Match `sysconf(3C)` return type.
* Avoid `-Wcast-qual` warnings with `libkstat(3KSTAT)` functions.
* Avoid clash with `<floatingpoint.h>` `single` typedef.
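A minimal sketch of the first item, assuming a POSIX system: when `<climits>` does not provide `HOST_NAME_MAX`, a fallback is defined before the buffer is sized. The function name `GetHostNameSketch` and the fallback value of 255 (the POSIX minimum `_POSIX_HOST_NAME_MAX`) are assumptions for illustration; the value the patch actually uses on Solaris is not stated here.

```cpp
#include <climits>
#include <string>
#include <unistd.h>

// Assumption: fall back to the POSIX minimum when the platform headers do
// not define HOST_NAME_MAX. The real patch may pick a different value.
#ifndef HOST_NAME_MAX
#define HOST_NAME_MAX 255
#endif

// Hypothetical helper, not the library's function: returns the host name,
// or an empty string if gethostname() fails.
inline std::string GetHostNameSketch() {
  char buf[HOST_NAME_MAX + 1] = {};
  if (gethostname(buf, sizeof(buf)) != 0) return std::string();
  buf[HOST_NAME_MAX] = '\0';  // gethostname may not terminate on truncation
  return std::string(buf);
}
```

Sizing the buffer as `HOST_NAME_MAX + 1` leaves room for the terminating null byte even when the name has the maximum length.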
This commit bumps the pybind11 version to 2.10.0, which is the first
pybind version coming with Python 3.11 support. This change is necessary
to facilitate wheel builds for Python 3.11 and upward, as changes to
Python internals in 3.11 broke compatibility with older pybind11
versions.
Co-authored-by: Dominic Hamon <dominichamon@users.noreply.github.com>
* Discuss sources of variance in the user guide
* Mention cpufreq/boost
* Pull variance material into a new document
Add reducing_variance.md as a place to discuss things related to variance
and, in the future, statistical interpretation of benchmark results.
Co-authored-by: Dominic Hamon <dominichamon@users.noreply.github.com>
The pandas.Timedelta class truncates to integral nanoseconds, which throws
away sub-nanosecond precision present in benchmark JSON. Switch to
floating point multiplication, which preserves it.
Fixes #1482
Tentatively fixes #1477.
* Eliminate usage of deprecated API in sysinfo.cc
`std::wstring_convert` is deprecated in C++17.
Since this code is in the Windows-only branch, we can use the Win32 API (MultiByteToWideChar) instead.
* ran clang-format
This commit enables arm64 Linux wheel builds for Python.
It also changes the build procedure on Linux using
cibuildwheel in GitHub Actions. Instead of the more granular, verbose
approach that was used until now, we opt for the GitHub Action released
by cibuildwheel directly.
We also change the Bazel install procedure in the manylinux Docker
container image. Previously, Bazel was installed from an added RHEL repo, since that is
the recommended official way of installing Bazel on CentOS platforms.
However, the last successful build available for manylinux2014 has been Bazel 4,
which is showing its age with the release of Bazel 6 coming up as of this commit.
After this change, prebuilt Bazel binaries are downloaded using
wget directly from the Bazel GitHub release page. Since Bazel is built
for both x86 and arm64 on Linux, we immediately gain wheel build
support for these architectures. However, since the manylinux image
reports its architecture as aarch64 rather than arm64,
a shell script was added that normalizes aarch64 to arm64
and installs the correct arm64 Bazel binary if necessary.
* Ensure we don't need benchmark installed to pass c++ feature checks
Requires removal of some dependencies on benchmark.h from internal
low-level headers, which is a good thing.
Also added better logging to the feature check cmake module.
* Explicitly cast int literals to int8_t in tests to silence implicit-conversion warnings
Error came from:
```
: error: implicit conversion loses integer precision: 'const int' to 'const signed char' [-Werror,-Wimplicit-int-conversion]
```
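A minimal sketch of the kind of call site this affects, assuming a function template whose type parameter is deduced from its arguments. `PowersInRangeSketch` and `SmallPowersSketch` are hypothetical names for illustration and are not taken from the test suite. Casting the literals makes the narrowing explicit, so the compiler no longer has to convert a `const int` to `const signed char` implicitly.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper for illustration only. Collects lo, lo*mult,
// lo*mult*mult, ... while the value is <= hi.
template <typename T>
std::vector<T> PowersInRangeSketch(T lo, T hi, int mult) {
  assert(lo > 0 && mult > 1);  // otherwise the loop would not terminate
  std::vector<T> out;
  for (int v = lo; v <= hi; v *= mult) {
    out.push_back(static_cast<T>(v));  // explicit narrowing back to T
  }
  return out;
}

// T is deduced from the first two arguments, so plain literals (1, 8)
// would deduce T = int. The casts select T = int8_t explicitly.
inline std::vector<int8_t> SmallPowersSketch() {
  return PowersInRangeSketch(static_cast<int8_t>(1), static_cast<int8_t>(8), 2);
}
```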
* clang format
* undo deleted line
* Fixed build issues on Windows
- Added missing dllimport/dllexport attributes in function definitions. (They are needed in both declarations and definitions.)
- Removed the dllimport/dllexport attribute from a private field. (global_context is not exported anywhere.)
* fixed incorrect include path
* undo changes w.r.t HelperPrintf
* removed forward decl of private variable - instead, introduce a getter and use it.
* Removed forward decl from benchmark_gtest too
Co-authored-by: Dominic Hamon <dominichamon@users.noreply.github.com>
* Stop generating the export header and just check it in
* format the new header
* support windows
* format the header again
* avoid depending on internal macro
* ensure we define the right thing for windows static builds
* support older cmake
* and for tests
* attempt to fix sanitizer builds by moving away from llvm head
* extra verbosity
* try clang 13 and add extra logging
* get latest clang and try again
The non-const DoNotOptimize() fails to compile when used with some types.
Example of code that fails to compile:
char buffer3[3] = "";
benchmark::DoNotOptimize(buffer3);
Error message:
error: impossible constraint in 'asm'
asm volatile("" : "+r"(value) : : "memory");
Introduced in 8545dfb (Fix DoNotOptimize() GCC copy overhead (#1340) (#1410))
The cause is that the compiler cannot satisfy the +r constraint for types that
do not fit exactly in a register. For example, char array[3] does not fit
exactly in a register on x86_64, so it has to be placed in memory, but the
constraint does not allow that.
Solution
- Use the +m,r constraint for small objects so the compiler can decide to use
a register and/or memory
- For big objects, use the +m constraint, which avoids the extra-copy
bug (see #1340)
- Use the same approach for the const version of DoNotOptimize(),
although the const version works fine with the "r" constraint alone.
A mixed r,m constraint looks like the more general solution.
See
- Issue #1340 ([BUG] DoNotOptimize() adds overhead with extra copy of argument(gcc))
- Pull request #1410 (Fix DoNotOptimize() GCC copy overhead (#1340) #1410)
- Commit 8545dfb (Fix DoNotOptimize() GCC copy overhead (#1340) (#1410))
* Fix DoNotOptimize() GCC copy overhead (#1340)
The issue is that, under GCC, DoNotOptimize() makes a full copy of its argument
if it's not a pointer, which slows down the benchmark. If the argument is big
enough, the copy becomes a memcpy() call. Because the argument can be an
arbitrarily big object, DoNotOptimize() could add significant
overhead and affect benchmark results.
The cause is GCC's behavior with asm volatile constraints. It looks like GCC
tries to use the r (register) constraint in all cases regardless of object size.
See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105519
The solution is to split DoNotOptimize() into two cases, one where the value
fits in a register and one where it doesn't, and to use a case-specific
asm constraint. The std::is_trivially_copyable trait is needed because the
"+r" constraint doesn't work with objects that are not trivially copyable.
- The fix requires compiler support for the C++11 feature
std::is_trivially_copyable, which GCC has supported since GCC 5
- The fallback for GCC versions < 5 still exists, but it uses the "m" constraint,
which means a little more overhead in some cases
- Add assembly tests for the affected cases
Fixes #1340
* Add supported compiler versions info for assembly tests
- Assembly tests are inherently non-portable, so explicitly state the GCC
and Clang versions required for the tests to pass reliably
- Print a warning message if the current compiler version isn't supported
* Add possibility to ask for libbenchmark version number (#1004)
Add a header which holds the current major, minor, and
patch number of the library. The header is auto generated
by CMake.
* Do not generate unused functions (#1004)
* Add support for version number in bazel (#1004)
* Fix clang format #1004
* Fix more clang format problems (#1004)
* Use git version feature of cmake to determine current lib version
* Rename version_config header to version
* Bake git version into bazel build
* Use same input config header as in cmake for version.h
* Adapt the releasing.md to include versioning in bazel
* add multiple OSes to bazel workflow
* correct indent
* only set copts when they're supported by the OS
* os check should work
* pull out cxx03_test for per-platform stuff
* attempt to fix windows test output
Report all time numbers longer than 10 digits in scientific notation with
4 decimal places. This is necessary since only 10 digits
are currently reserved for the time columns (Time and CPU).
If a value exceeds 10 digits, the output is no longer properly aligned.
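A minimal sketch of the rule described above, assuming a 10-character time column. `FormatTimeSketch` is a hypothetical name, and the handling of values below the threshold is simplified to whole numbers here. The reporter's actual formatting of smaller values is not covered by this message.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper for illustration: values above the largest 10-digit
// number switch to scientific notation with 4 decimal places, which takes
// 1 + 1 + 4 + 4 = 10 characters for a two-digit exponent (e.g. 1.2346e+10).
inline std::string FormatTimeSketch(double time) {
  char buf[64];
  if (time > 9999999999.0) {
    std::snprintf(buf, sizeof(buf), "%1.4e", time);
  } else {
    // Simplification: right-align whole numbers in a 10-character field.
    std::snprintf(buf, sizeof(buf), "%10.0f", time);
  }
  return std::string(buf);
}
```

With this rule, `FormatTimeSketch(12345678901.0)` produces `1.2346e+10`, which fits the 10-character column that the unformatted 11-digit value would overflow.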