* Add enum value from newest Windows SDK
Windows SDK version 10.0.26100.0 adds a cache type value, `CacheUnknown`. This adds a case for that type to `sysinfo.cc`, which will otherwise complain about the switch statement being non-exhaustive when building with the new SDK.
Since the value doesn't exist in prior SDK versions, we only add the case conditionally. The condition can be removed if we ever decide to bump up the required SDK version.
* Fix SDK version macro
Make sure the version macro we're using for the SDK is properly indicative of version 10.0.26100.0. Also fix formatting complains from the linter.
* Add space to satisfy formatter
Formatter insists on two space before a comment after a macro...
* Change preprocessor condition
Try detecting the current SDK version in a slightly different way.
* Replace NTDDI_WIN11_GE with its value
Undefined constants are treated as 0 by the preprocessor, which causes the check to trivially return true for previous SDK versions. Replace the constant with its value (from the newest SDK version) instead,
* Added benchmark_dry_run boolean flag to command line options
* Dry run logic to exit early and override iterations, repetitions, min time, min warmup time
* Changeddry run override logic structure and added dry run to context
---------
Co-authored-by: Shaan <shaanmistry03@gmail.com>
Co-authored-by: Shaan Mistry <49106143+Shaan-Mistry@users.noreply.github.com>
* Supply MacOS deployment target to delocate, use build+uv frontend
This shaves off multiple minutes from the wheel builds alone.
Also revert to trusted publishing for wheel uploads as it is now set up.
* Bump oldest supported Python to 3.10, eliminate setuptools-scm
The version is now a string again, under the same attribute as it was
before. This is a pragmatic decision in order to be able to upload wheels
again, possibly directly from main.
We could in the future also set the Python version to a development version
if we want to avoid accidental uploads of `main`.
* Add a note on supported Python versions in the docs
Also fixes the setuptools failure observed in the latest CI by pinning
to the last version before v73 until the problem is identified and resolved.
* Fix C4459: Rename a function parameter `profiler_manager` to avoid hiding the global declaration.
* Treat warnings as errors for MSVC
* disable one warning for MSVC
* Align benchmark::State to a cacheline.
This can avoid interference with neighboring objects and stabilize
benchmark results.
* separate cachline definition from alignment attribute macro
Co-authored-by: Roman Lebedev <lebedev.ri@gmail.com>
---------
Co-authored-by: dominic <510002+dmah42@users.noreply.github.com>
Co-authored-by: Roman Lebedev <lebedev.ri@gmail.com>
According to the user guide, when manual timing, it is necessary to explicit it by using the `UseManualTime` function. Its equivalent in Python is use_manual_time(). This function was not called in the example.
It is possible to verify that the use of this function has an impact on the measure by adding another `time.sleep(0.01)` at the end of the iteration. There is a x2 difference depending on whether `use_manual_time()` is used or not.
Co-authored-by: dominic <510002+dmah42@users.noreply.github.com>
Previously, the Start/Stop routines were called before the benchmark function
was called and after it returned. However, what we really want is for them
to be called within the core of the benchmark:
for (auto _ : state) {
// This is what we want traced, not the entire BM_foo function.
}
Adds a stub file for the `google_benchmark._benchmark` submodule,
generated with the new `nanobind_stubgen` rule released in nanobind_bazel
v2.1.0.
Tweaks the setup.py logic a little bit to package stub files with the
rest of the build artifacts. Also explicitly adds the generated stub and
marker files to the list of package data artifacts.
Co-authored-by: dominic <510002+dmah42@users.noreply.github.com>
For several compilers, `benchmark.cc` suppresses a warning regarding its use of `offsetof`. This merely extends that suppression to cover clang as well.
Until the PyPI account is recovered, it should be possible to upload
wheels with the GitHub secrets that were previously used.
Changes the PyPI upload action sourcing to point to the v1 stable release
branch, which receives rolling updates and is the canonical way of
including the wheel publishing action.
Uploading will probably need another release, because setuptools_scm
needs to produce a clean tag that the PyPI API allows as an upload.
Incorporates the nanobind_bazel change from https://github.com/google/benchmark/pull/1795.
nanobind 2.0 reworked the nanobind::enum_ class so it uses a real Python enum or intenum rather than its previous hand-rolled implementation.
https://nanobind.readthedocs.io/en/latest/changelog.html#version-2-0-0-may-23-2024
As a consequence of that change, nanobind now checks when casting an integer to a enum value that the integer corresponds to a valid enum. Counter::Flags is a bitmask, and many combinations are not valid enum members.
This change:
a) sets nb::is_arithmetic(), which means Counter::Flags becomes an IntEnum that can be freely cast to an integer.
b) defines the | operator for flags to return an integer, not an enum, avoiding the error.
c) changes Counter's constructor to accept an int, not a Counter::Flags enum. Since Counter::Flags is an IntEnum now, it can be freely coerced to an int.
If https://github.com/wjakob/nanobind/pull/599 is merged into nanobind, then we can perhaps use a flag enum here instead.
This API is akin to the MemoryManager API and lets tools provide
their own profiler which is wrapped in the same way MemoryManager is
wrapped. Namely, the profiler provides Start/Stop methods that are called
at the start/end of running the benchmark in a separate pass.
Co-authored-by: dominic <510002+dmah42@users.noreply.github.com>
* Clarify BENCHMARK_REGISTER_F
Add comments highlighting the difference between `BENCHMARK_TEMPLATE_F` and `BENCHMARK_TEMPLATE_DEFINE_F`, mirroring those of `BENCHMARK_F ` and `BENCHMARK_DEFINE_F`.
* More informative comments.
* Update user_guide.md
* Add -lkstat to the .pc for Solaris
This fixes linking for projects that rely on pkg-config to generate the
link line on Solaris.
Test plan: Built the project locally on Solaris and verified -kstat
appears in the .pc file
```
$ cat lib/pkgconfig/benchmark.pc | grep Libs.private
Libs.private: -lpthread -lkstat
```
* Use BENCHMARK_PRIVATE_LINK_LIBRARIES
Currently, Google Benchmark can be built and installed with support for
libpfm, but this can cause a problem if that installation is later
called upon by another CMake project. Indeed, while the installed CMake
configuration script correctly identifies that it needs to link against
libpfm, it doesn't try to find libpfm, meaning that the target will be
unavailable. This commit fixes this potential configuration-time error
by ensuring that an installation of Google Benchmark will correctly try
to find the libpfm dependency when it is used elsewhere.
#1742 changed the placeholder version from `0.0.0` to `v0.0.0`,
but this line which was further dealing with it, was not updated.
Fixes https://github.com/google/benchmark/issues/1792
Co-authored-by: dominic <510002+dmah42@users.noreply.github.com>
The customization done via BENCHMARK_OS_QURT works just fine with the Hexagon simulator, but on at least some Hexagon hardware, both `qurt_timer_get_ticks()` and `std::chrono::now()` are broken and always return 0. This fixes the former by using the better-supported (and essentially identical `qurt_sysclock_get_hw_ticks()` call, and the latter by reading a 19.2MHz hardware counter (per suggestion from Qualcomm). Local testing seems to indicate these changes are just as robust under the simulator as before.
It is now possible to build Mac wheels on native machines in Github
Actions, so ARM64 Mac wheels are now built and tested on M1 machines.
Also, the artifact up-/download was migrated to v4, which made it
necessary to upload wheels to unique artifact names, and then later
stitch them together again in a subsequent job.
The cross-platform Mac build injection in setup.py was removed,
since it is no longer necessary.
I relanded a monkey-patching of Bazel build files, this time for
MODULE.bazel. This is because `rules_python` does not allow running
as the root user, which is the case in cibuildwheel+Linux (happens
in a Docker container). Since I did not see a quick way of switching
to rootless containers, and did not want to hardcode the config change
(it can apparently cause cache misses and build failures), I inject the
"ignore_root_user_error" flag into the MODULE.bazel file when running
in cibuildwheel on Linux.
This comes following the first BCR release of nanobind_bazel.
Feature-wise, nothing substantial has changed, except that the extensions
are stripped of debug info when built in release mode, which reduces
clutter in the symbol tables.
No stubgen yet, since nanobind v2 has not been released yet.
Adjusted the GetSysctl call in sysinfo.cc to ensure the frequency
value is returned as a double rather than an integer. This helps
maintain consistency and clarity in the codebase.