Artistic creativity, accelerated with SIMD.

Related tags

Math sfml avx simd cpp17
Overview

Link the YouTube video demonstration: https://www.youtube.com/watch?v=Bjwml32dxhU

The compression algorithm does not work well on this colorful video, so I can not upload the full version as GIF.

The what

This is one of my latest projects with more of a artistic orientation, but also with a very technical nature. It helped me to cope with university's exam stress and gives me a small mind space just for myself.

animated

The why

Currently I have been working a lots with Neon Register on ARM architecture and found myself falling in love with SIMD (Single Instruction Multiple Data) and generally parallel programming. Normally, almost every modern CPU supports SIMD and a smart compiler will try its best to vectorize our code. But even the best compiler is not smart enough to understand the business logic, which is fundamental for enhancing code performance. Therefore, when we find ourself need a bit more speed push, but can not afford to run software on a graphics card, explicitly programming in SIMD could be a way out.

The how

The working principle of the core algorithm is simple. We update each particle for each frame in according to its position and distance to the mouse.

The standard software engineering would want us to solve the problem object orientedly, with each object having its own attributes. But to parallelize the rendering efficiently on hardware, I had to break the objects into matrices, with each vector having only data of an attribute.

With an AMD's Thread Ripper, I am able to process 8 particles on one run instead of just one particle. Combining this technique with thread level parallelization results a pretty neat visual software.

References

Owner
Long Nguyen
Studying computer science at Uni Tübingen, Germany
Long Nguyen
SIMD Vector Classes for C++

You may be interested in switching to std-simd. Features present in Vc 1.4 and not present in std-simd will eventually turn into Vc 2.0, which then de

null 1.2k Jun 23, 2022
SIMD (SSE) implementation of the infamous Fast Inverse Square Root algorithm from Quake III Arena.

simd_fastinvsqrt SIMD (SSE) implementation of the infamous Fast Inverse Square Root algorithm from Quake III Arena. Why Why not. How This video explai

Liam 7 Jan 28, 2022
C++ image processing and machine learning library with using of SIMD: SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, AVX-512, VMX(Altivec) and VSX(Power7), NEON for ARM.

Introduction The Simd Library is a free open source image processing and machine learning library, designed for C and C++ programmers. It provides man

Ihar Yermalayeu 1.6k Jun 24, 2022
P(R*_{3, 0, 1}) specialized SIMD Geometric Algebra Library

Klein ?? ?? Project Site ?? ?? Description Do you need to do any of the following? Quickly? Really quickly even? Projecting points onto lines, lines t

Jeremy Ong 599 Jun 17, 2022
SIMD Vector Classes for C++

You may be interested in switching to std-simd. Features present in Vc 1.4 and not present in std-simd will eventually turn into Vc 2.0, which then de

null 1.2k Jun 23, 2022
C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, NEON, AVX512)

C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, NEON, AVX512)

Xtensor Stack 1.4k Jun 24, 2022
SIMD (SSE) implementation of the infamous Fast Inverse Square Root algorithm from Quake III Arena.

simd_fastinvsqrt SIMD (SSE) implementation of the infamous Fast Inverse Square Root algorithm from Quake III Arena. Why Why not. How This video explai

Liam 7 Jan 28, 2022
Portable header-only C++ low level SIMD library

libsimdpp libsimdpp is a portable header-only zero-overhead C++ low level SIMD library. The library presents a single interface over SIMD instruction

Povilas Kanapickas 1k Jun 20, 2022
Proyecto de Enmascadaro de Imagenes con SIMD

TP 2 - Organización del Computador II Proyecto de Enmascarado de Imágenes con SIMD Objetivo Se debe implementan 2 funciones de Enmascaramiento de imág

null 0 May 6, 2022
std::find simd version

std::find simd version std::find doesn't use simd intrinsics. ( check https://gms.tf/stdfind-and-memchr-optimizations.html ) So i thought simd can mak

SungJinKang 19 Jan 5, 2022
A small data-oriented and SIMD-optimized 3D rigid body physics library.

nudge Nudge is a small data-oriented and SIMD-optimized 3D rigid body physics library. For more information, see: http://rasmusbarr.github.io/blog/dod

null 231 Jun 10, 2022
OpenCL based GPU accelerated SPH fluid simulation library

libclsph An OpenCL based GPU accelerated SPH fluid simulation library Can I see it in action? Demo #1 Demo #2 Why? Libclsph was created to explore the

null 46 Jun 15, 2022
GPU Accelerated C++ User Interface, with WYSIWYG developing tools, XML supports, built-in data binding and MVVM features.

GacUI GPU Accelerated C++ User Interface, with WYSIWYG developing tools, XML supports, built-in data binding and MVVM features. Read the LICENSE first

Vczh Libraries 2.1k Jun 29, 2022
Accelerated Container Image

OverlayBD Accelerated Container Image Accelerated Container Image is an open-source implementation of paper "DADI: Block-Level Image Service for Agile

Alibaba 119 Jun 21, 2022
YOLOv4 accelerated wtih TensorRT and multi-stream input using Deepstream

Deepstream 5.1 YOLOv4 App This Deepstream application showcases YOLOv4 running at high FPS throughput! P.S - Click the gif to watch the entire video!

Akash James 31 Apr 21, 2022
A easy-to-use image processing library accelerated with CUDA on GPU.

gpucv Have you used OpenCV on your CPU, and wanted to run it on GPU. Did you try installing OpenCV and get frustrated with its installation. Fret not

shrikumaran pb 4 Aug 14, 2021
Isaac ROS image_pipeline package for hardware-accelerated image processing in ROS2.

isaac_ros_image_pipeline Overview This metapackage offers similar functionality as the standard, CPU-based image_pipeline metapackage, but does so by

NVIDIA AI IOT 30 Apr 1, 2022
Cross-platform version of Heboris C7EX using a hardware-accelerated SDL 2.0 renderer

Heboris C7EX - unofficial version (YGS2K EX) This version contains the source code for Heboris C7EX. It requires a C compiler, SDL 2.0, SDL 2.0 mixer,

Brandon McGriff 9 May 27, 2022
ROS2 packages based on NVIDIA libArgus library for hardware-accelerated CSI camera support.

Isaac ROS Argus Camera This repository provides monocular and stereo nodes that enable ROS developers to use cameras connected to Jetson platforms ove

NVIDIA Isaac ROS 26 Jun 16, 2022
CUDA-accelerated Apriltag detection and pose estimation.

Isaac ROS Apriltag Overview This ROS2 node uses the NVIDIA GPU-accelerated AprilTags library to detect AprilTags in images and publishes their poses,

NVIDIA Isaac ROS 36 Jun 24, 2022
Hardware-accelerated DNN model inference ROS2 packages using NVIDIA Triton/TensorRT for both Jetson and x86_64 with CUDA-capable GPU.

Isaac ROS DNN Inference Overview This repository provides two NVIDIA GPU-accelerated ROS2 nodes that perform deep learning inference using custom mode

NVIDIA Isaac ROS 36 Jun 20, 2022
Visual odometry package based on hardware-accelerated NVIDIA Elbrus library with world class quality and performance.

Isaac ROS Visual Odometry This repository provides a ROS2 package that estimates stereo visual inertial odometry using the Isaac Elbrus GPU-accelerate

NVIDIA Isaac ROS 164 Jun 19, 2022
Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing

Apache Arrow Powering In-Memory Analytics Apache Arrow is a development platform for in-memory analytics. It contains a set of technologies that enabl

The Apache Software Foundation 9.7k Jun 25, 2022
HugeCTR is a GPU-accelerated recommender framework designed to distribute training across multiple GPUs and nodes and estimate Click-Through Rates (CTRs)

Merlin: HugeCTR HugeCTR is a GPU-accelerated recommender framework designed to distribute training across multiple GPUs and nodes and estimate Click-T

null 668 Jun 24, 2022
A CUDA-accelerated cloth simulation engine based on Extended Position Based Dynamics (XPBD).

Velvet Velvet is a CUDA-accelerated cloth simulation engine based on Extended Position Based Dynamics (XPBD). Why another cloth simulator? There are a

Vital Chen 10 Jun 9, 2022