Dissecting the M1's GPU for 3D acceleration

Related tags

Graphics gpu
Overview

Asahi GPU

Research for an open source graphics stack for Apple M1.

wrap

Build with the included makefile make wrap.dylib, and insert in any Metal application by setting the environment variable DYLD_INSERT_LIBRARY=/Users/bloom/gpu/wrap.dylib.

Contributors

  • Alyssa Rosenzweig (bloom) on IRC, working on the command stream and ISA
  • marcan, working on kernel side

Contributing

All contributors are expected to abide by our Code of Conduct and our Copyright and Reverse Engineering Policy.

For more information, please see our Contributing page.

You might also like...
SMAA is a very efficient GPU-based MLAA implementation (DX9, DX10, DX11 and OpenGL)

SMAA is a very efficient GPU-based MLAA implementation (DX9, DX10, DX11 and OpenGL), capable of handling subpixel features seamlessly, and featuring an improved and advanced pattern detection & handling mechanism.

✔️The smallest header-only GUI library(4 KLOC) for all platforms
✔️The smallest header-only GUI library(4 KLOC) for all platforms

Welcome to GUI-lite The smallest header-only GUI library (4 KLOC) for all platforms. 中文 Lightweight ✂️ Small: 4,000+ lines of C++ code, zero dependenc

Tensors and Dynamic neural networks in Python with strong GPU acceleration
Tensors and Dynamic neural networks in Python with strong GPU acceleration

PyTorch is a Python package that provides two high-level features: Tensor computation (like NumPy) with strong GPU acceleration Deep neural networks b

A lightweight 2D Pose model  can be deployed on Linux/Window/Android, supports CPU/GPU inference acceleration, and can be detected in real time on ordinary mobile phones.
A lightweight 2D Pose model can be deployed on Linux/Window/Android, supports CPU/GPU inference acceleration, and can be detected in real time on ordinary mobile phones.

A lightweight 2D Pose model can be deployed on Linux/Window/Android, supports CPU/GPU inference acceleration, and can be detected in real time on ordinary mobile phones.

Radeon Rays is ray intersection acceleration library for hardware and software multiplatforms using CPU and GPU

RadeonRays 4.1 Summary RadeonRays is a ray intersection acceleration library. AMD developed RadeonRays to help developers make the most of GPU and to

Minerva: a fast and flexible tool for deep learning on multi-GPU. It provides ndarray programming interface, just like Numpy. Python bindings and C++ bindings are both available. The resulting code can be run on CPU or GPU. Multi-GPU support is very easy. A lightweight, portable pure C99 onnx inference engine for embedded devices with hardware acceleration support.
A lightweight, portable pure C99 onnx inference engine for embedded devices with hardware acceleration support.

Libonnx A lightweight, portable pure C99 onnx inference engine for embedded devices with hardware acceleration support. Getting Started The library's

Intel:registered: Homomorphic Encryption Acceleration Library accelerates modular arithmetic operations used in homomorphic encryption

Intel Homomorphic Encryption Acceleration Library (HEXL) Intel ®️ HEXL is an open-source library which provides efficient implementations of integer a

OpenEmbedding is an open source framework for Tensorflow distributed training acceleration.
OpenEmbedding is an open source framework for Tensorflow distributed training acceleration.

OpenEmbedding English version | 中文版 About OpenEmbedding is an open-source framework for TensorFlow distributed training acceleration. Nowadays, many m

The movements of your RC vehicles are jerky and not smooth? This Arduino device will solve this issue by adding acceleration and deceleration ramps to the PWM signals!
The movements of your RC vehicles are jerky and not smooth? This Arduino device will solve this issue by adding acceleration and deceleration ramps to the PWM signals!

This is an Arduino Pro Mini 3.3V / 8MHz based RC servo ramp / delay generator Features: 4 RC servo PWM inputs and outputs (can be enhanced) Reads the

Velox is a new C++ vectorized database acceleration library aimed to optimizing query engines and data processing systems.
Velox is a new C++ vectorized database acceleration library aimed to optimizing query engines and data processing systems.

Velox is a C++ database acceleration library which provides reusable, extensible, and high-performance data processing components

Codebase for "SLIDE : In Defense of Smart Algorithms over Hardware Acceleration for Large-Scale Deep Learning Systems"

Codebase for "SLIDE : In Defense of Smart Algorithms over Hardware Acceleration for Large-Scale Deep Learning Systems"

Intel Homomorphic Encryption Acceleration Library for FPGAs

main: development: Intel Homomorphic Encryption Acceleration Library for FPGAs (Intel HEXL for FPGA) Intel ®️ HEXL for FPGA is an open-source library

TengineFactory - Algorithm acceleration landing framework, let you complete the development of algorithm at low cost.eg: Facedetect, FaceLandmark..
TengineFactory - Algorithm acceleration landing framework, let you complete the development of algorithm at low cost.eg: Facedetect, FaceLandmark..

简介 随着人工智能的普及,深度学习算法的越来越规整,一套可以低代码并且快速落地并且有定制化解决方案的框架就是一种趋势。为了缩短算法落地周期,降低算法落地门槛是一个必然的方向。 TengineFactory 是由 OPEN AI LAB 自主研发的一套快速,低代码的算法落地框架。我们致力于打造一个完全

Nvvl - A library that uses hardware acceleration to load sequences of video frames to facilitate machine learning training

NVVL is part of DALI! DALI (Nvidia Data Loading Library) incorporates NVVL functionality and offers much more than that, so it is recommended to switc

The dgSPARSE Library (Deep Graph Sparse Library) is a high performance library for sparse kernel acceleration on GPUs based on CUDA.

dgSPARSE Library Introdution The dgSPARSE Library (Deep Graph Sparse Library) is a high performance library for sparse kernel acceleration on GPUs bas

4eisa40 GPU computing : exploiting the GPU to execute advanced simulations

GPU-computing 4eisa40 GPU computing : exploiting the GPU to execute advanced simulations Activities Parallel programming Algorithms Image processing O

A GPU (CUDA) based Artificial Neural Network library
A GPU (CUDA) based Artificial Neural Network library

Updates - 05/10/2017: Added a new example The program "image_generator" is located in the "/src/examples" subdirectory and was submitted by Ben Bogart

ArrayFire: a general purpose GPU library.
ArrayFire: a general purpose GPU library.

ArrayFire is a general-purpose library that simplifies the process of developing software that targets parallel and massively-parallel architectures i

Comments
  • MacOS>=12 fix

    MacOS>=12 fix

    Signed-off-by: Adeeb Abbas [email protected]

    The current master yields the following error. Using deprecated naming, this PR aims to fix it -

    demo/iokit.c:46:31: error: 'kIOMasterPortDefault' is deprecated: first deprecated in macOS 12.0 [-Werror,-Wdeprecated-declarations]

    opened by adeeb10abbas 3
  • Makefile tweaks

    Makefile tweaks

    • Introduce new default all target, which builds the wrapper. Previously the default target unfortunately ended up being clean.
    • Declare dependencies of the wrap.dylib target so it rebuilds on source change
    • Add all the build products so far to .gitignore

    Signed-off-by: Chris Forbes [email protected]

    opened by chrisforbes 3
  • MacOS >=12.0 make fix

    MacOS >=12.0 make fix

    The current master yields the following error. Using deprecated naming, this PR aims to fix it -

    demo/iokit.c:46:31: error: 'kIOMasterPortDefault' is deprecated: first deprecated in macOS 12.0 [-Werror,-Wdeprecated-declarations]

    opened by adeeb10abbas 1
Owner
Asahi Linux
Porting Linux to Apple Silicon macs
Asahi Linux
Software ray tracer written from scratch in C that can run on CPU or GPU with emphasis on ease of use and trivial setup

A minimalist and platform-agnostic interactive/real-time raytracer. Strong emphasis on simplicity, ease of use and almost no setup to get started with

Arnon Marcus 46 Oct 5, 2022
2D GPU renderer for dynamic UIs

vger vger is a vector graphics renderer which renders a limited set of primitives, but does so almost entirely on the GPU. Works on iOS and macOS. API

Audulus LLC 159 Nov 26, 2022
This is a openGL cube demo program. It was made as a tech demo using PVR_PSP2 Driver layer GPU libraries.

OpenGL Cube Demo using PVR_PSP2 Driver layer GPU libraries This is a openGL cube demo program. It was made as a tech demo using PVR_PSP2 Driver layer

David Cantu 5 Oct 31, 2021
What I'm doing here is insane GPU driver prototype for @GreenteaOS

NjRAA Work-in-progress Driver Foundation [nee-jee-ray] What I'm doing here is a GPU driver for Linux as a prototype for future graphics stack of the @

Miraculous Ladybugreport 4 Jan 22, 2022
A Hydra-enabled GPU path tracer that supports MaterialX.

A Hydra-enabled GPU path tracer that supports MaterialX.

Pablo Delgado 172 Nov 21, 2022
Legion Low Level Rendering Interface provides a graphics API agnostic rendering interface with minimal CPU overhead and low level access to verbose GPU operations.

Legion-LLRI Legion-LLRI, or “Legion Low Level Rendering Interface” is a rendering API that aims to provide a graphics API agnostic approach to graphic

Rythe Interactive 26 Aug 13, 2022
A low-level, cross-platform GPU library

vgpu is cross-platform low-level GPU library. Features Support for Windows, Linux, macOS. Modern rendering using Vulkan and Direct3D12. Dependencies U

Amer Koleci 9 Jul 28, 2022
GPU cloth with OpenGL Compute Shaders

GPU cloth with OpenGL Compute Shaders This project in progress is a PBD cloth simulation accelerated and parallelized using OpenGL compute shaders. Fo

null 35 Jul 27, 2022
GPU Texture Generator

Imogen GPU/CPU Texture Generator GPU Texture generator using dear imgui for UI. Not production ready and a bit messy but really fun to code. This is a

Cedric Guillemet 709 Nov 21, 2022
Optimized GPU noise functions and utilities

Optimized GPU noise functions and utilities

Brian Sharpe 336 Nov 20, 2022