A tool which profiles Vulkan devices to find their peak capacities


vkpeak

A synthetic benchmarking tool that measures the peak capabilities of Vulkan devices. It measures only the peak throughput achievable with vector operations and does not represent a real-world use case.

Download

Download prebuilt Windows, Linux, and macOS executables for Intel, AMD, and Nvidia GPUs:

https://github.com/nihui/vkpeak/releases

Usage

vkpeak.exe 0

The only parameter (0 in this example) is the ID of the device to benchmark.

If you encounter a crash or an error, try upgrading your GPU driver.

Build from Source

  1. Download and set up the Vulkan SDK from https://vulkan.lunarg.com/
  • On Linux distributions, you can instead install the essential build requirements from your package manager (use the command that matches your distribution)
dnf install vulkan-headers vulkan-loader-devel
apt-get install libvulkan-dev
pacman -S vulkan-headers vulkan-icd-loader
  2. Clone this project with all submodules
git clone https://github.com/nihui/vkpeak.git
cd vkpeak
git submodule update --init --recursive
  3. Build with CMake
  • You can pass the -DUSE_STATIC_MOLTENVK=ON option to avoid linking the Vulkan loader library on macOS
mkdir build
cd build
cmake ..
cmake --build . -j 4

Sample

[nihui@nihui-pc build]$ ./vkpeak 0
device       = GeForce RTX 2070

fp32-scalar  = 8536.18 GFLOPS
fp32-vec4    = 8473.82 GFLOPS

fp16-scalar  = 8405.30 GFLOPS
fp16-vec4    = 16261.30 GFLOPS

fp64-scalar  = 262.86 GFLOPS
fp64-vec4    = 262.86 GFLOPS

int32-scalar = 8363.63 GIOPS
int32-vec4   = 8313.07 GIOPS

int16-scalar = 5518.05 GIOPS
int16-vec4   = 7138.91 GIOPS

nihui@nihui-macbook-air vkpeak-20210424-macos % ./vkpeak 0
device       = Apple M1

fp32-scalar  = 2093.55 GFLOPS
fp32-vec4    = 2369.02 GFLOPS

fp16-scalar  = 2195.79 GFLOPS
fp16-vec4    = 2513.04 GFLOPS

fp64-scalar  = 0.00 GFLOPS
fp64-vec4    = 0.00 GFLOPS

int32-scalar = 653.38 GIOPS
int32-vec4   = 649.56 GIOPS

int16-scalar = 653.42 GIOPS
int16-vec4   = 652.94 GIOPS
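
The sample output above follows a regular "name = value UNIT" layout, so it is easy to post-process. The following is a minimal Python sketch, not part of vkpeak itself, that parses such output and computes the vec4-to-scalar throughput ratio for a data type. The function names parse_vkpeak and vec4_speedup are hypothetical, and the regular expression assumes the line format shown in the samples.

```python
import re

# Matches metric lines such as "fp32-vec4    = 8473.82 GFLOPS".
# The "device = ..." line has no unit, so it is deliberately not matched.
LINE_RE = re.compile(r"^\s*([\w-]+)\s*=\s*([\d.]+)\s+(GFLOPS|GIOPS)\s*$")


def parse_vkpeak(text):
    """Return {metric_name: (value, unit)} for every metric line in text."""
    results = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            name, value, unit = match.groups()
            results[name] = (float(value), unit)
    return results


def vec4_speedup(results, dtype):
    """Return vec4 / scalar throughput for dtype, or None if it cannot be computed."""
    scalar = results.get(f"{dtype}-scalar")
    vec4 = results.get(f"{dtype}-vec4")
    # A missing metric or a 0.00 scalar reading gives no meaningful ratio.
    if scalar is None or vec4 is None or scalar[0] == 0.0:
        return None
    return vec4[0] / scalar[0]


# Excerpt copied from the GeForce RTX 2070 sample above.
SAMPLE = """\
device       = GeForce RTX 2070

fp32-scalar  = 8536.18 GFLOPS
fp32-vec4    = 8473.82 GFLOPS

fp16-scalar  = 8405.30 GFLOPS
fp16-vec4    = 16261.30 GFLOPS
"""

if __name__ == "__main__":
    parsed = parse_vkpeak(SAMPLE)
    for dtype in ("fp32", "fp16"):
        print(dtype, vec4_speedup(parsed, dtype))
```

On the RTX 2070 excerpt, the fp16 ratio comes out at roughly 1.93 and the fp32 ratio just under 1. A ratio of None is returned for readings such as the 0.00 GFLOPS fp64 lines in the Apple M1 sample, where no ratio is meaningful.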

Other Open-Source Code Used

Comments
  • Req: Do not run tests on unsupported hardware

    The Arc DG2 A380 does not support float64, so this test should not run (and should probably be reported as unsupported). Attached are the related Mesa report and the Vulkan info output for DG2.

    https://gitlab.freedesktop.org/mesa/mesa/-/issues/7580

    vkinfo.txt

    bug 
    opened by Quackdoc 2
  • Test results on RTX 3060 are lower than expected

    .\vkpeak.exe 0
    device       = NVIDIA GeForce RTX 3060

    fp32-scalar  = 6884.30 GFLOPS
    fp32-vec4    = 9102.33 GFLOPS

    fp16-scalar  = 6834.72 GFLOPS
    fp16-vec4    = 13487.34 GFLOPS

    fp64-scalar  = 214.61 GFLOPS
    fp64-vec4    = 215.09 GFLOPS

    int32-scalar = 6843.09 GIOPS
    int32-vec4   = 6814.33 GIOPS

    int16-scalar = 4517.13 GIOPS
    int16-vec4   = 6017.45 GIOPS

    This is an RTX 3060 with driver 527.56 on Windows 11; it should reach about 13 TFLOPS with fp32 FMA.

    opened by edisonchan 0
  • Test fp16 without 16-bit storage

    It would be nice to be able to look at the performance of fp16 ALU ops even when 16-bit storage is not available. This is the case for the Qualcomm A618, for example: it cannot load or store 16-bit values, but it can do 16-bit math.

    Similarly, it would be nice to test RelaxedPrecision ALU ops on 32-bit values, which seems to be a common case for GLSL translated by glslang.

    opened by anholt 0
  • Crashes on M1 Max

    When I try to compile it myself, I get an error. When I run the program from the releases page, it works. Any ideas?

    
    fish: Job 1, './vkpeak 0' terminated by signal SIGSEGV (Address boundary error)
    

    Console says:

    
    Exception Type:        EXC_BAD_ACCESS (SIGSEGV)
    Exception Codes:       KERN_INVALID_ADDRESS at 0x0000000000000010
    Exception Codes:       0x0000000000000001, 0x0000000000000010
    Exception Note:        EXC_CORPSE_NOTIFY
    
    Termination Reason:    Namespace SIGNAL, Code 11 Segmentation fault: 11
    Terminating Process:   exc handler [99795]
    
    VM Region Info: 0x10 is not in any region.  Bytes before following region: 4332306416
    
    opened by lbibass 2
  • Int8 & Int64 support?

    Hi, nice benchmark! Below are my Titan V and RX Vega results on Windows. AFAIK the Vulkan spec also supports int8 (via VK_KHR_shader_float16_int8, shaderInt8) and int64 (shaderInt64). Are there any plans to support benchmarking int8/int64 throughput? Thanks.

    Results:

    device       = NVIDIA TITAN V

    fp32-scalar  = 17230.91 GFLOPS
    fp32-vec4    = 16898.01 GFLOPS

    fp16-scalar  = 16781.96 GFLOPS
    fp16-vec4    = 32568.21 GFLOPS

    fp64-scalar  = 7664.02 GFLOPS
    fp64-vec4    = 7677.14 GFLOPS

    int32-scalar = 14464.71 GIOPS
    int32-vec4   = 14755.26 GIOPS

    int16-scalar = 9727.97 GIOPS
    int16-vec4   = 11768.93 GIOPS

    device       = Radeon RX Vega

    fp32-scalar  = 11453.46 GFLOPS
    fp32-vec4    = 11010.15 GFLOPS

    fp16-scalar  = 10388.36 GFLOPS
    fp16-vec4    = 17744.94 GFLOPS

    fp64-scalar  = 686.59 GFLOPS
    fp64-vec4    = 686.31 GFLOPS

    int32-scalar = 2188.62 GIOPS
    int32-vec4   = 2170.05 GIOPS

    int16-scalar = 10013.59 GIOPS
    int16-vec4   = 9885.89 GIOPS

    opened by oscarbg 1
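
The RTX 3060 report above expects about 13 TFLOPS with fp32 FMA. One common back-of-the-envelope estimate for such a figure is shader cores × clock × 2, counting each FMA as two floating-point operations. The Python sketch below works through that arithmetic. The core count (3584) and boost clock (1.777 GHz) are assumed published specifications, not values from the report, and real clocks vary with power and thermal limits, so treat the result as a rough reference rather than an explanation of the measured gap.

```python
def theoretical_peak_gflops(shader_cores, clock_ghz, flops_per_core_per_cycle=2):
    """Estimate peak GFLOPS, assuming each core retires one FMA (2 FLOPs) per cycle."""
    return shader_cores * clock_ghz * flops_per_core_per_cycle


def efficiency(measured_gflops, peak_gflops):
    """Fraction of the estimated peak that a measurement reaches."""
    return measured_gflops / peak_gflops


# Assumed RTX 3060 specifications (not taken from the report above).
RTX3060_CORES = 3584
RTX3060_BOOST_GHZ = 1.777

# fp32-vec4 result quoted in the RTX 3060 report above.
MEASURED_FP32_VEC4 = 9102.33

peak = theoretical_peak_gflops(RTX3060_CORES, RTX3060_BOOST_GHZ)

if __name__ == "__main__":
    print(f"estimated peak: {peak:.1f} GFLOPS")
    print(f"measured / peak: {efficiency(MEASURED_FP32_VEC4, peak):.2f}")
```

Under these assumptions the estimate lands near 12.7 TFLOPS, which is consistent with the "about 13 TFLOPS" expectation, and the reported fp32-vec4 result is a little over 70% of it.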