Metabench

A simple framework for compile-time microbenchmarks

Overview

Metabench is a single, self-contained CMake module making it easy to create compile-time microbenchmarks. Compile-time benchmarks measure the performance of compiling a piece of code instead of measuring the performance of running it, as regular benchmarks do. The micro part in microbenchmark means that Metabench can be used to benchmark precise parts of a C++ file, such as the instantiation of a single function. Writing benchmarks of this kind is very useful for C++ programmers writing metaprogramming-heavy libraries, which are known to cause long compilation times. Metabench was designed to be very simple to use, while still allowing fairly complex benchmarks to be written.

Metabench is also a collection of compile-time microbenchmarks written using the metabench.cmake module. The benchmarks measure the compile-time performance of various algorithms provided by different metaprogramming libraries. The benchmarks are updated nightly with the latest version of each library, and the results are published at http://metaben.ch.

Requirements

Metabench requires CMake 3.1 or higher and Ruby 2.1 or higher. Metabench is known to work with CMake's Unix Makefiles and Ninja generators.
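
As a rough illustration (a sketch only, not something Metabench asks you to write; FindRuby is CMake's stock module), a project could assert these requirements in its own CMake file before including the module:

# Enforce the minimum CMake and Ruby versions Metabench needs
cmake_minimum_required(VERSION 3.1)
find_package(Ruby 2.1 REQUIRED)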

Usage

To use Metabench, make sure you have the dependencies listed above and simply drop the metabench.cmake file somewhere in your CMake search path for modules. Then, use include(metabench) to include the module in your CMake file, add individual datasets to be benchmarked using metabench_add_dataset, and finally specify which datasets should be put together into a chart via metabench_add_chart. For example, a minimal CMake file using Metabench would look like:

# Make sure Metabench can be found when writing include(metabench)
list(APPEND CMAKE_MODULE_PATH "path/to/metabench/directory")

# Actually include the module
include(metabench)

# Add new datasets
metabench_add_dataset(dataset1 "path/to/dataset1.cpp.erb" "[1, 5, 10]")
metabench_add_dataset(dataset2 "path/to/dataset2.cpp.erb" "(1...15)")
metabench_add_dataset(dataset3 "path/to/dataset3.cpp.erb" "(1...20).step(5)")

# Add a new chart
metabench_add_chart(chart DATASETS dataset1 dataset2 dataset3)

This will create a target named chart, which, when run, will gather benchmark data from each dataset and output JSON files for easy integration with other tools. An HTML file is also generated for easy visualization of the datasets as an NVD3 chart. To understand what the path/to/datasetN.cpp.erb files are, read what follows.
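
Since datasets and charts are just names passed to these two commands, it should also be possible to group the same datasets into several charts; the following call is an illustrative guess rather than something taken from the Metabench documentation:

# Hypothetical second chart reusing two of the datasets declared above
metabench_add_chart(small_chart DATASETS dataset1 dataset2)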

The principle

Benchmarking the compilation time of a single .cpp file is rather useless, because one could simply run the compiler and time that single execution instead. What is really useful is to have a means of running variations of the same .cpp file automatically. For example, we might be interested in benchmarking the compilation time for creating a std::tuple with many elements in it. To do so, we could write the following test case:

#include <tuple>

int main() {
    auto tuple = std::make_tuple(1, 2, 3, 4, 5);
}

We would run the compiler and time the compilation, and then change the test case by increasing the number of elements in the tuple:

#include <tuple>

int main() {
    auto tuple = std::make_tuple(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
}

We would measure the compilation time for this file, and repeat the process until satisfactory data has been gathered. This tedious task of generating different (but obviously related) .cpp files and running the compiler to gather timings is what Metabench automates. It does this by taking a .cpp.erb file written using the ERB template system, and generating a family of .cpp files from that template. It then compiles these .cpp files and gathers benchmark data from these compilations.

Concretely, you start by writing a .cpp.erb file (say std_tuple.cpp.erb) that may contain ERB markup:

#include <tuple>

int main() {
    auto tuple = std::make_tuple(<%= (1..n).to_a.join(', ') %>);
}

Code contained inside <%= ... %> is just normal Ruby code. When the file is rendered, the contents of <%= ... %> are replaced with the result of evaluating that Ruby code, which will look like:

#include <tuple>

int main() {
    auto tuple = std::make_tuple(1, 2, 3, ..., n);
}

The ERB markup language has many other features; we encourage readers to take a look at the Wikipedia page. What happens is that Metabench generates a .cpp file for different values of n, and gathers benchmark data for each of these values.

Now, this isn't the whole story. More often than not, we're only interested in benchmarking part of a C++ file. Indeed, if we benchmark the whole file in our example above, we end up measuring the time required to #include the <tuple> header in addition to the time required to create the std::tuple. While this might be negligible in our example, in nontrivial cases the time spent outside the code of interest can be significant and would make the resulting data nearly worthless. Hence, we have to tell Metabench which part(s) of the file it should measure. This is done by guarding the relevant part(s) of the code with a preprocessor #if:

#include <tuple>

int main() {
#if defined(METABENCH)
    auto tuple = std::make_tuple(<%= (1..n).to_a.join(', ') %>);
#endif
}

What Metabench actually does is compile the file twice: once with the METABENCH macro defined (and hence with the content of the block), and once without it. It then subtracts the time for compiling the file without the content of the block from the time for compiling the whole file, which gives a good approximation of the time required to compile what's inside the block.

On the C++ side of things, the .cpp file will be compiled (to benchmark it) as if it were located in the directory containing the .cpp.erb file, so that relative include paths can be used. Furthermore, it will be compiled as if the .cpp file were part of a CMake executable added in the same directory as the call to metabench_add_dataset. This way, any variable or property set in CMake will also apply when benchmarking the file. In other words, Metabench tries to create the illusion that the code is actually compiled as if it were written in the .cpp.erb file.
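
For example (a sketch based on the paragraph above, with made-up paths and macro names), ordinary directory-level settings placed next to the metabench_add_dataset call would also affect the benchmarked code:

# These hypothetical settings also apply to the generated .cpp files,
# because they are compiled as if they belonged to a target in this directory
include_directories(${CMAKE_CURRENT_SOURCE_DIR}/include)
add_definitions(-DMY_LIBRARY_SOME_SETTING)
metabench_add_dataset(my_dataset "my_dataset.cpp.erb" "(1..10)")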

This is it for the basic usage of the module! The example/ directory contains a fully working example of using Metabench to create benchmarks. For a more involved example, you can take a look at the benchmark suite in the benchmark/ directory. Note that only the most basic usage of Metabench was covered here. To know all the features provided by the module, you should read the reference documentation provided as comments inside the CMake module.

A note on benchmark resolution

Like any measurement tool, Metabench has a limited resolution. For example, when the code being measured (inside the #if defined(METABENCH)/#endif guard) takes only a few milliseconds to compile, the timings reported by Metabench may be entirely lost in the noise. Typically, the resolution of the timings taken by Metabench is similar to that of the time command. A good technique to make sure the results of a benchmark are not drowned in noise is to reduce the relative uncertainty of the measurement. This can be done by increasing the total compilation time of the measured block, for example by repeating the same (or a similar) computation multiple times:

#include <tuple>

int main() {
#if defined(METABENCH)
    auto tuple1 = std::make_tuple(<%= (1..n).to_a.join(', ') %>);
    auto tuple2 = std::make_tuple(<%= (1..n).to_a.join(', ') %>);
    auto tuple3 = std::make_tuple(<%= (1..n).to_a.join(', ') %>);
    auto tuple4 = std::make_tuple(<%= (1..n).to_a.join(', ') %>);
#endif
}

History

Metabench was initially developed inside the Boost.Hana library as a means of benchmarking compile-time algorithms. After it became clear that a standalone framework would be useful to the wider C++ community, it was extracted into its own project.

License

Please see LICENSE.md.

Comments
  • Initial draft of a benchmark suite

    This is a rough example of how I would imagine a shared benchmark suite as discussed in #35. Hana has a lot of these benchmarks already written, some for external libraries, so they could just be copied from there.

    feature 
    opened by ldionne 24
  • Add support for microbenchmarks

    This PR is a WIP presenting an alternative implementation to #108. I prefer this implementation because it is simpler to use and seems much more useful. The only downside is that it might cause the benchmark suite to take longer to execute, but that probably isn't a deal breaker.

    feature 
    opened by ldionne 20
  • Metabench is measuring the wrong thing

    Very few people are manipulating type lists of > a few dozen elements. I find the current tests to be largely meaningless for real world metaprograms. It would be more interesting to run the tests up to, say, 100, but repeatedly and with lots of different types to force the number of instantiations up.

    In addition, benchmarks are meaningless if the total CPU time is less than about 5s. So, for each N in [0,100] each test should keep adding unique instantiations of the algorithm being measured until the CPU time is high enough and then divide by the number of instantiations to get an accurate idea of the performance profile of that algorithm at that N.

    That way, we can really start the work of optimizing our libs for the real world.

    opened by ericniebler 18
  • Improve the resolution of benchmark measurements

    This is a work in progress towards #148.

    The idea is to impose a common bias on all benchmarks in order to increase compilation times, thus allowing for better measurements.

    opened by brunocodutra 16
  • Allow specifying a title in add_benchmark

    While this may not be useful for the benchmark suite because of the way http://ldionne.com/metabench is set up, I need this capability for some uses of Metabench in presentations.

    feature 
    opened by ldionne 13
  • Use the median-of-3 when gathering timings

    With this PR, the timings are done 3 times for each n in the range, and the median of the 3 timings is taken. This is an experiment to check whether the timings appear to be more reliable this way, as discussed in #69. Note that running the benchmarks now takes 3 times as long, which could be a deal breaker.

    feature 
    opened by ldionne 12
  • Made metabench use the system + user time for its measurements instead of just the raw time

    This means that the time will be less influenced by other processes running on the system. The idea for this commit came from @chieltbest.

    opened by CvRXX 11
  • [gh-pages] Should we smooth benchmark results?

    Right now charts are displayed exactly as measured and are thus subject to variations related to thread scheduling and other OS affairs that result in undesirable spikes.

    Activating basis or bundle interpolation on NVD3 should smooth out most spikes.

    feature gh-pages 
    opened by brunocodutra 11
  • the download of highcharts.js shouldn't fail silently

    if the download fails we could either:

    1. error out
    2. emit some warning and do not generate the html file
    3. fall back to loading highcharts at the time of visualization
    4. fall back to some hardcoded version of highcharts

    I implemented number 3, so at least we try to provide the visualization, but perhaps we should go for number 4 to make it completely independent of an internet connection?

    opened by brunocodutra 11
  • [Appveyor] Properly find Ruby

    Right now, the FindRuby module is unable to find the Ruby library. To work around this, we currently set the RUBY_LIBRARY CMake variable explicitly in .appveyor.yml. This PR is an attempt to fix that.

    bug 
    opened by ldionne 10
  • Consider adding high-precision microbenchmarks for small sequences

    Forking #124 to discuss high precision microbenchmarks for small inputs more precisely. The idea is that we'd like to have a precise view of the behavior of algorithms on small sequences, because this is how they are mostly (but not only) used in the wild. I see two different approaches:

    1. For each sequence length N, generate K different sequences of length N and call the algorithm once on each sequence. By having a sufficiently large K, the total compilation time is increased and the relative error is reduced.
    2. Same as (1), but then divide the result by K to find an approximation of the absolute time taken for a single algorithm. I have reservations with this approach, because the compilation time as we increase K (the number of small sequences) is not necessarily linear.
    feature 
    opened by ldionne 7
  • Provide more `aspects` based on compiler statistics (-Xclang -print-stats)

    This is just an idea I would like to share with you.

    Clang prints several compilation statistics when supplied with the flags -Xclang -print-stats. See this sample output for an empty cpp file. This could be used to provide more aspects for the benchmarks (number of types, more reliable memory footprints(?), ...). I hacked together a clang-only proof of concept for measuring the type count, but I am not planning to continue with it myself. My first measurements of the type count show low-noise data, and there is some correlation with the compilation times.

    There should be a way (dump and analyze AST?) to obtain similar statistics from other compilers, too.

    opened by ecrypa 0
  • implement older compilers (clang 3.[0..5] and gcc 4.7)

    Looking at the libs on here, there seems to be a lot of support for compilers which are older than those currently offered. I am looking to add a legacy mode to boost.tmp (same front end, different back end) using this pattern https://godbolt.org/g/W62iut and I would like to bench it against the others on here (initially it's probably going to suck). Would it be feasible to add a few more clang versions and g++-4.7?

    mp11: g++ 4.7 or later, clang++ 3.3 or later, Visual Studio 2013/2015/2017

    metal: GCC 4.7, Clang 3.4, Clang 3.5 (all on Ubuntu 14.04 LTS)

    meta: clang >= 3.4

    brigand: clang >= 3.4

    opened by odinthenerd 2
  • MapAsTuple faster than hana.map, but significant differences in graph over time

    The two attached screen shots show:

    https://github.com/cppljevans/composite_storage/blob/master/benchmark/src/CMakeLists.txt

    but without the large time of the hana.make_find dataset (its largeness largely obscured details of the other datasets). The two screen shots show the result of the same test run a few minutes apart. They show that the MapAsTuple method invariably and significantly runs faster than the hana.map method, and that there is also significant variation in the shape of the curves, indicating noise, I guess.

    IIRC, one of your pages lamented how hana map was slow, and hopefully this will give some clue about how to fix that.

    -regards, Larry

    [Screenshots: runs at 18:56 and 19:06 on 2018-05-26]

    opened by cppljevans 0
  • For debugging purposes, preserve generated .cpp files

    It helps the user to see what the templates do if the generated files are preserved, so that they can look at them and verify that the template is doing what they expect.

    The attached change accomplishes that. I suggest it as an enhancement. metabench.cmake.diff.txt

    opened by cppljevans 0
  • Hover info doesn't increment past the shortest dataset's max X value

    [Screenshot: chart_point]

    The hover info in this screenshot shows 280, even though the point is at the 300 vertical. This might be a bug in a different project, I don't know. Either way, it's really minor. live example

    Edit: love the tool, by the way.

    opened by badair 2