simd_fastinvsqrt

SIMD (SSE) implementation of the infamous Fast Inverse Square Root algorithm from Quake III Arena.

Why

Why not.

How

This video explains it well.
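
For readers who want the gist in code, here is a minimal sketch of the scalar trick as it is commonly described. It is not copied from this repository: the function name rsqrt_sketch is hypothetical, and memcpy is used in place of the original pointer cast so the bit reinterpretation is well defined in modern C.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch of the scalar fast inverse square root.
 * Assumes 32-bit IEEE 754 floats and a positive, finite input. */
static float rsqrt_sketch(float x)
{
    float half = 0.5f * x;
    uint32_t bits;
    float y;

    memcpy(&bits, &x, sizeof bits);      /* view the float's bits as an integer */
    bits = 0x5f3759df - (bits >> 1);     /* magic constant gives a rough first guess */
    memcpy(&y, &bits, sizeof y);         /* view the integer's bits as a float again */

    y = y * (1.5f - half * y * y);       /* one Newton-Raphson step refines the guess */
    return y;
}
```

With a single refinement step the result is only approximate: rsqrt_sketch(4.0f) comes out close to, but not exactly, 0.5.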

Speed test

Here are the results of running benchmark.c, compiled with -O2, on my hardware:

Q_rsqrt(x) took 110.617000ms
1.0f/sqrtf(x) took 343.441000ms
Q_rsqrt_sse(x) took 31.145000ms
_mm_div_ps(_mm_set1_ps(1.0f), _mm_sqrt_ps(x)) took 85.942000ms
_mm_rsqrt_ps(x) took 13.855000ms

We can clearly see that Q_rsqrt_sse is significantly faster than the scalar Q_rsqrt (about 3.5x; the theoretical maximum is 4x, since one SSE register holds four floats). The fastest of all is SSE's native approximate inverse square root, _mm_rsqrt_ps.
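
To show where the up-to-4x figure comes from, here is a rough sketch of how the scalar trick might be applied to four floats at once. It is an illustration only and not necessarily what simd_fastinvsqrt.h does: the name rsqrt_sse_sketch is hypothetical, and the integer shift and subtract intrinsics assume SSE2 is available (it always is on x86-64).

```c
#include <emmintrin.h>  /* SSE2: integer intrinsics and float/int casts */

/* Hypothetical sketch: fast inverse square root on four lanes at once.
 * Assumes every lane holds a positive, finite float. */
static __m128 rsqrt_sse_sketch(__m128 x)
{
    __m128 half = _mm_mul_ps(_mm_set1_ps(0.5f), x);

    /* Reinterpret the four floats as four 32-bit integers (no conversion). */
    __m128i bits = _mm_castps_si128(x);

    /* Per lane: magic constant minus (bits >> 1) gives the first guess. */
    bits = _mm_sub_epi32(_mm_set1_epi32(0x5f3759df), _mm_srli_epi32(bits, 1));

    /* Reinterpret the integers as floats again. */
    __m128 y = _mm_castsi128_ps(bits);

    /* One Newton-Raphson step per lane: y = y * (1.5 - half * y * y). */
    __m128 yy = _mm_mul_ps(y, y);
    return _mm_mul_ps(y, _mm_sub_ps(_mm_set1_ps(1.5f), _mm_mul_ps(half, yy)));
}
```

Each intrinsic works on all four lanes in one instruction, which is why the ideal speedup over the scalar version is 4x; loading and storing the data accounts for part of the gap to the measured 3.5x.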

Using it

If for some godforsaken reason you want to use this, just include the simd_fastinvsqrt.h header in your program. Define INCLUDE_ORIGINAL before including it to bring in the original Q_rsqrt as well.

Owner
Liam