Simple C++ code to benchmark fast division algorithms

Overview

fast_division

Simple C++ code to benchmark fast division algorithms relying on constant divisors.

The code is a companion to the paper Integer Division by Constants: Optimal Bounds in the sense that it illustrates the results. It should be viewed as a research artefact.

Usage

The code is made of a single header file (fast_division.h).

Copy fast_division.h in your C++ project. You can then use it as follows.

#include "fast_division/fast_division.h"

// for any compile-time constant 'divisor' we have:
assert(fast_division::divide32<divisor>::quotient(n) == n / divisor);
assert(fast_division::divide32<divisor>::remainder(n) == n % divisor)
assert(fast_division::divide32<divisor>::is_divisible(n) == (n % divisor == 0));

Benchmarks

❯ cmake -B build && cmake --build build
./build/benchmark/benchmark

On an Apple M1 processor with clang 12, we get:

❯ ./build/benchmark/benchmark
divisor = 19
std division                            :     0.91 ns/ops (+/- 3.8 %)
fast division                           :     0.31 ns/ops (+/- 1.0 %)
std remainder                           :     0.76 ns/ops (+/- 0.7 %)
fast remainder                          :     0.62 ns/ops (+/- 4.7 %)
divisor = 123456
std division                            :     0.84 ns/ops (+/- 0.7 %)
fast division                           :     0.31 ns/ops (+/- 1.1 %)
std remainder                           :     0.62 ns/ops (+/- 0.6 %)
fast remainder                          :     0.31 ns/ops (+/- 16.0 %)
divisor = 4096
std division                            :     0.58 ns/ops (+/- 1.8 %)
fast division                           :     0.58 ns/ops (+/- 1.8 %)
std remainder                           :     0.72 ns/ops (+/- 0.7 %)
fast remainder                          :     0.72 ns/ops (+/- 0.6 %)

On an AMD Zen 2 processor with GCC 10, we get:

$ ./build/benchmark/benchmark
divisor = 19
std division                            :     0.87 ns/ops (+/- 1.0 %)
fast division                           :     0.44 ns/ops (+/- 3.4 %)
std remainder                           :     1.15 ns/ops (+/- 0.1 %)
fast remainder                          :     0.82 ns/ops (+/- 0.3 %)
divisor = 123456
std division                            :     0.59 ns/ops (+/- 2.9 %)
fast division                           :     0.43 ns/ops (+/- 4.3 %)
std remainder                           :     0.76 ns/ops (+/- 10.5 %)
fast remainder                          :     0.64 ns/ops (+/- 1.0 %)
divisor = 4096
std division                            :     0.29 ns/ops (+/- 0.4 %)
fast division                           :     0.29 ns/ops (+/- 0.3 %)
std remainder                           :     0.29 ns/ops (+/- 0.4 %)
fast remainder                          :     0.29 ns/ops (+/- 0.4 %)

Results will vary depending on your compiler and processor. They also depend on the divisor.

The benchmark measures throughput, not latency.

Further reading and code

For a practical library with performance claims, see fastmod and the manuscript:

Limitations and requirements

  • We currently assume that the divisor is a compile-time constant. Extending to runtime constants is possible.
  • We assume a recent compiler (C++11 or better). If you are interested in supporting legacy compilers, pull requests are invited.
  • Currently, only 32-bit unsigned integers are supported. Adding other data types is merely a matter of extra programming effort.
  • We assume a GCC-like compiler like GCC/LLVM clang. Pull requests to support Visual Studio and other compilers are invited.
You might also like...
🔗 Common Data Structures and Algorithms

🔗 Data Structures and Algorithms This library provides common data structures. It will also provide some data structures which needed in render or ga

A collection of various algorithms to produce length-limited prefix codes

Introduction This is a collection of various algorithms to produce length-limited prefix codes. My library is written in plain C with tons of comments

Data Structure and Algorithms problems across various platforms.
Data Structure and Algorithms problems across various platforms.

Covering various practice problems of Arrays, Stacks, queue, tree, graph and different Algorithms. A collection of solutions for Data Structure and Algorithms problems across various platforms in different languages.

Repository of problems and solutions of labsheets used for Data Structures and Algorithms (CS F211) in Semester 2, 2020-21 at BITS Pilani - Hyderabad Campus.

CS F211 Data Structures and Algorithms (BITS Pilani - Hyderabad Campus) This repository contains the problems, solution approaches & explanations and

Solutions for problems given in ETH course Algorithms Lab in Fall 2020

Algolab2020 Solutions for problems given in ETH course Algorithms Lab in Fall 2020. The code for these problems is written with the following in mind:

An assortment of commonly used algorithms and data structures implemented with C++.

Algorithms-And-Data-Structures This repo contains C++ implementations of common algorithms and data structures, grouped by topics. The list will be sp

The ultimate guide for data structure and algorithms

Quick Notes: For Arrays: (methods that can be applied) sorting and then doing something, hashtable, two pointers in a loop are some of the operations

Wonderful library with lots of useful functions, algorithms and data structures in C, link it with -l9wada

Lib9wada Wonderful library with lots of useful functions, algorithms and data structures in C, link it with -l9wada Usage Compile the library with mak

Wonderful library with lots of useful functions, algorithms and data structures in C, link it with -l9wada

LibC+ Wonderful library with lots of useful functions, algorithms and data structures in C, link it with -lC+ Better than C, not as much as c++ Usage

Owner
Daniel Lemire
Daniel Lemire is a computer science professor. His research is focused on software performance and indexing.
Daniel Lemire
R :package: and header-only C++ library for geospatial space-division based compression and encoding

spress spress provides utilities for encoding and compressing geospatial objects, such as sf objects. Installation This package requires C++11 for com

UF-OKN 5 Dec 9, 2021
R :package: and header-only C++ library for geospatial space-division based compression and encoding

spress spress provides utilities for encoding and compressing geospatial objects, such as sf objects. Installation This package requires C++11 for com

UF-OKN 5 Dec 9, 2021
HashTableBenchmark - A simple cross-platform speed & memory-efficiency benchmark for the most common hash-table implementations in the C++ world

Hash-Tables Benchmarks This repository contains a bunch of extendable benchmarks mostly for following containers: std:unordered_map from STL. google::

Unum 9 Nov 20, 2022
A benchmark for hash tables and hash functions in C++, evaluate on different data as comprehensively as possible

Hash Table Benchmark This is yet another benchmark for hash tables(hash maps) with different hash functions in C++, attempting to evaluate the perform

Ren Zibei 3 Oct 18, 2022
Algo-Tree is a collection of Algorithms and data structures which are fundamentals to efficient code and good software design

Algo-Tree is a collection of Algorithms and data structures which are fundamentals to efficient code and good software design. Creating and designing excellent algorithms is required for being an exemplary programmer. It contains solutions in various languages such as C++, Python and Java.

DSC-Banasthali 53 Oct 4, 2022
Redacted source code for exercises proposed in the Data Structures and Algorithms laboratory.

fsega_ie2_dsa Redacted source code for exercises proposed in the Data Structures and Algorithms laboratory. Usage The src/ directory contains a direct

Cezar Mathe 1 Dec 5, 2021
This repository contains the Assignment code of Data Structures and Algorithms Assignments of SPPU, Second Year IT Syllabus (2019 pattern)

DSAL This repository contains the Assignment code of Data Structures and Algorithms Assignments of SPPU, Second Year IT Syllabus (2019 pattern) Assign

Mujeeb Khan 1 Apr 6, 2022
libsrt is a C library for writing fast and safe C code, faster.

libsrt is a C library for writing fast and safe C code, faster. It provides string, vector, bit set, set, map, hash set, and hash map handling. Suitable for soft and hard real-time. Allows both heap and stack allocation. *BETA* (API still can change: suggestions are welcome)

null 515 Nov 30, 2022
Simple and fast configuration file library (written in C99)

Features Configuration file reading Supported operating systems Ubuntu MacOS Windows Build requirements C99 compiler CMake 3.10+ Cloning git clone htt

Nikita Fediuchin 3 May 26, 2022