Overview

Libonnx

A lightweight, portable, pure C99 ONNX inference engine for embedded devices, with hardware acceleration support.

Getting Started

The library's .c and .h files can be dropped into a project and compiled along with it. Before use, allocate a struct onnx_context_t *; you can optionally pass an array of struct resolver_t * to enable hardware acceleration.

The filename argument is the path to the ONNX model file.

struct onnx_context_t * ctx = onnx_context_alloc_from_file(filename, NULL, 0);
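
The model can also be linked into the program as a byte array and loaded with onnx_context_alloc, which is how the hello example embeds its mnist_onnx[] array and is useful on targets without a filesystem. A minimal sketch, where model_data is a placeholder name and the trailing arguments are assumed to mirror onnx_context_alloc_from_file:

static const unsigned char model_data[] = { /* contents of the .onnx file as bytes */ };

struct onnx_context_t * ctx = onnx_context_alloc(model_data, sizeof(model_data), NULL, 0);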

Then you can get the input and output tensors using the onnx_tensor_search function.

struct onnx_tensor_t * input = onnx_tensor_search(ctx, "input-tensor-name");
struct onnx_tensor_t * output = onnx_tensor_search(ctx, "output-tensor-name");
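
Before running, fill the input tensor's data buffer with your preprocessed values. A minimal sketch, assuming a float32 input and that the datas and ndata fields of struct onnx_tensor_t hold the element buffer and element count as declared in onnx.h (the zero fill is just a placeholder):

float * px = (float *)input->datas;
for(size_t i = 0; i < input->ndata; i++)
	px[i] = 0.0f; /* replace with real, preprocessed input data */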

Once the input tensor has been set, you can run the inference engine using the onnx_run function; the result will be placed in the output tensor.

onnx_run(ctx);
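
When onnx_run returns, the result can be read back from the output tensor's data buffer. Again a minimal sketch, assuming a float32 output (printf requires stdio.h):

float * py = (float *)output->datas;
for(size_t i = 0; i < output->ndata; i++)
	printf("output[%zu] = %f\n", i, py[i]);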

Finally, you must free the struct onnx_context_t * using the onnx_context_free function.

onnx_context_free(ctx);

Examples

Just type make in the root directory; you will get a static library and some example and test binaries demonstrating usage.

cd libonnx
make

Screenshots

Notes

This library is based on ONNX version 1.8.0, with support for the newest opset 13. The supported operator table is in the documents directory.

Links

License

This library is free software; you can redistribute it and/or modify it under the terms of the MIT license. See MIT License for details.

Issues
  • Hello model RAM size required

    Hi,

    I'm trying to run the hello example on a small embedded system, but I'm unsure of the memory required to allocate this model (when running onnx_context_alloc).

    I have roughly 2 MB; is that enough? Is there a smaller model that I can test with, with the model defined as a const char array, like static const unsigned char mnist_onnx[] = { ... }?

    opened by noomio 11
  • Tensorflow model with opset 12 seems to crash when loaded

    I have a model converted from TensorFlow (using tf2onnx.convert) that uses opset 12. The model opens fine in Netron and elsewhere, but it crashes somewhere in Concat_reshape when I try to load it with onnx_context_alloc_from_file. I tried compiling for both x86 and x64 with the same result.

    Here are the model properties as viewed through Netron (image omitted).

    Opening the models supplied in the libonnx test directory seemed to work fine. Do you have any suggestions for how to get this working? Thanks.

    opened by Planet-Patrick 5
  • Question: Does libonnx support dynamic shape/CUDA ?

    Dear friends, thanks for the good job. We have two questions: 1) Does libonnx support dynamic shape input? If not, how could it be implemented? (Just some hints.) 2) How can CUDA inference be supported in libonnx?

    opened by delldu 4
  • How to convert an `onnx_context` to an `unsigned char` array as in the hello world example?

    In the main.c file in the examples/hello folder, how did you convert the MNIST model to an unsigned char array and use it?

    #include <onnx.h>
    
    static const unsigned char mnist_onnx[] = {
    	0x08, 0x03, 0x12, 0x04, 0x43, 0x4e, 0x54, 0x4b, 0x1a, 0x05, 0x32, 0x2e,
    	0x35, 0x2e, 0x31, 0x22, 0x07, 0x61, 0x69, 0x2e, 0x63, 0x6e, 0x74, 0x6b,
    	0x28, 0x01, 0x3a, 0xb2, 0xce, 0x01, 0x0a, 0x62, 0x0a, 0x0c, 0x50, 0x61,
    	0x72, 0x61, 0x6d, 0x65, 0x74, 0x65, 0x72, 0x31, 0x39, 0x33, 0x0a, 0x1b,
    	0x50, 0x61, 0x72, 0x61, 0x6d, 0x65, 0x74, 0x65, 0x72, 0x31, 0x39, 0x33,
    

    If you have any code to do it can you share it?

    opened by abhinandanudupa 3
  • Can this software run on MacOS

    I came across an error when compiling it on my Mac.

    [CC] helper.c
    In file included from helper.c:28:
    ./helper.h:13:10: fatal error: 'malloc.h' file not found
    #include <malloc.h>
             ^~~~~~~~~~
    1 error generated.
    make[1]: *** [helper.o] Error 1
    make: *** [all] Error 2

    opened by feifeibear 3
  • Why does every test in the model, node and simple folders fail?

    I have compiled libonnx on a fresh installation of Ubuntu, installed all the prerequisites, ran just make, and tried running the tests one by one. But I find that every test I run fails, and I am not able to figure out why.

    Here is what I did:

    • Installed the latest LTS release of Ubuntu
    • Installed make, build-essential, git and libsdl2-gfx (Did do other stuff but those would not mess with this)
    • Ran make all to compile
    • Ran the tests on many examples: ./tests ./model/mnist_8/
    • But every test that I have tried has just failed!
    $ ./TESTING/libonnx/tests/tests ./TESTING/libonnx/tests/model/mnist_8/
    [test_data_set_0]                                                                       [FAIL]
    [test_data_set_1]                                                                       [FAIL]
    [test_data_set_2]                                                                       [FAIL]
    
    • All those in the simple and model folders fail but those in pytorch-* succeed partially
    • Is this because of a missing operator? (no Unsupported opset message has been displayed as in the pytorch tests)
    • Nonetheless the example for handwriting recognition has identified the number correctly most of the time
    opened by abhinandanudupa 2
  • Which header files are necessary?

    Hello! I am working on a project where I have to deploy an inference engine in a very minimal environment where even the math library is not present. While compiling (onnxconf.h) for the platform I found out that the math.h header file was missing in the platform libraries. Here are some of the headers being used in that header file - onnxconf.h:

    #include <stdio.h>
    #include <stdlib.h>  
    #include <stdint.h>  
    #include <stddef.h>  
    #include <string.h>  
    #include <malloc.h>  
    #include <float.h>  
    #include <math.h>  
    #include <list.h>  
    #include <hmap.h>  
    

    Also, I have found that no math function is used in this header file - I hope I am right about this! Correct me if I am wrong. So I was wondering if I could just remove this inclusion of math.h and still have a functional inference engine?

    Just to keep things simple, compiling a math library would not be desirable, but it is possible.

    opened by abhinandanudupa 2
  • This project needs SDL2.

    At first, I got an error when compiling this project:

    main.c:1:10: fatal error: SDL2/SDL.h: No such file or directory
        1 | #include <SDL2/SDL.h>
          |          ^~~~~~~~~~~~

    Then sudo apt-get install libsdl2-gfx-dev fixed this error.

    I think this project should document this dependency.

    THANKS!

    opened by wuzy361 2
  • Running test/model/test_mnist_8 issue

    Hi,

    When I run test/model/test_mnist_8 once, it works and I get an OKAY result. When I re-run it, it FAILS.

    Any suggestion why this might be and what to look for?

    opened by noomio 2
  • isnan and isinf issue

    Hi,

    When I compile, I get the following error with clang. I then try to link against the library and it complains. I'm not quite sure why; I added isnan and isinf in main.c and it compiles OK. -lm is added to the linker.

    Library compilation:

    default/IsNaN.c:34:11: warning: implicit declaration of function 'isnanf' is invalid in C99 [-Wimplicit-function-declaration]
        py[i] = isnanf(v) ? 1 : 0;

    Linker:

    libonnx.a(.text+0x598): undefined reference to `isnanf'

    opened by noomio 2
  • [documentation] Improved the instructions in README.md

    • Added instructions to install SDL2 and SDL2 GFX on Ubuntu
    • Added instructions on how to cross compile using arm64 as an example
    • Added instructions on running a test (and also added the output for the model folder)
    • Added instructions to convert an ONNX model to a char array

    While working with the library as an inexperienced student, I had a hard time figuring out some of the instructions. I hope that these instructions will help those like me.

    opened by abhinandanudupa 1
  • Valgrind output for Yolo v2 model

    I downloaded the tiny YOLO v2 model from https://github.com/onnx/models/tree/main/vision/object_detection_segmentation/tiny-yolov2 and when running inference on it, I got the following output from Valgrind:

    ==178736== Invalid read of size 1
    ==178736==    at 0x162DF9: shash (onnxconf.h:146)
    ==178736==    by 0x162F11: MaxPool_init (MaxPool.c:38)
    ==178736==    by 0x113FF0: onnx_graph_alloc (onnx.c:1238)
    ==178736==    by 0x10FCFA: onnx_context_alloc (onnx.c:102)
    ==178736==    by 0x10FF35: onnx_context_alloc_from_file (onnx.c:145)

    ==178736== Invalid write of size 1
    ==178736==    at 0x1154F1: onnx_attribute_read_string (onnx.c:1747)
    ==178736==    by 0x162F09: MaxPool_init (MaxPool.c:38)
    ==178736==    by 0x113FF0: onnx_graph_alloc (onnx.c:1238)
    ==178736==    by 0x10FCFA: onnx_context_alloc (onnx.c:102)
    ==178736==    by 0x10FF35: onnx_context_alloc_from_file (onnx.c:145)

    ==178736== Invalid read of size 1
    ==178736==    at 0x13BEB8: shash (onnxconf.h:146)
    ==178736==    by 0x13BFD1: Conv_init (Conv.c:43)
    ==178736==    by 0x113FF0: onnx_graph_alloc (onnx.c:1238)
    ==178736==    by 0x10FCFA: onnx_context_alloc (onnx.c:102)
    ==178736==    by 0x10FF35: onnx_context_alloc_from_file (onnx.c:145)

    ==178736== Invalid write of size 1
    ==178736==    at 0x1154F1: onnx_attribute_read_string (onnx.c:1747)
    ==178736==    by 0x13BFC9: Conv_init (Conv.c:43)
    ==178736==    by 0x113FF0: onnx_graph_alloc (onnx.c:1238)
    ==178736==    by 0x10FCFA: onnx_context_alloc (onnx.c:102)
    ==178736==    by 0x10FF35: onnx_context_alloc_from_file (onnx.c:145)

    ==178736== ERROR SUMMARY: 30 errors from 4 contexts (suppressed: 0 from 0)

    opened by erdem-kose 0
  • Maxpool + dilation

    This is really a question, I don't think there is a bug here, just something I'm not understanding.

    I'm looking at the code for maxpool and how it handles dilations. The spec has this example:

    """
    input_shape: [1, 1, 4, 4]
    output_shape: [1, 1, 2, 2]
    """
    node = onnx.helper.make_node(
        'MaxPool',
        inputs=['x'],
        outputs=['y'],
        kernel_shape=[2, 2],
        strides=[1, 1],
        dilations=[2, 2]
    )
    x = np.array([[[
        [1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16],
    ]]]).astype(np.float32)
    y = np.array([[[
        [11, 12],
        [15, 16]]]]).astype(np.float32)
    
    expect(node, inputs=[x], outputs=[y], name='test_maxpool_2d_dilations')
    

    This should implicitly use AUTO_PAD_NOTSET. Now, what I tried is getting MaxPool_float32 to give the [ 11, 12, 15, 16 ] result by hardcoding the inputs; for the full code + output see this godbolt:

    int strides[] = { 1, 1 };
    int kernels[] = { 2, 2 };
    int cpads[] = { 0, 0, 0, 0 };
    
    int x_ndim = 4;
    int x_dims[] = { 1, 1, 4, 4 };
    int y_dims[] = { 1, 1, 2, 2 };
    

    From my code reading, the dilation is only used to determine the output dimensions, which I've hardcoded here.

    But with these inputs I get the incorrect output:

    6.000000 7.000000 10.000000 11.000000
    

    So, what is the way that dilations influence the end result that I am missing?

    opened by folkertdev 0
  • Failed to load 'yolov5n.onnx'.

    Hi, I tried to load 'yolov5n.onnx' like this:

    #include "onnx.h"
    
    int main(void)
    {
        struct onnx_context_t *sess = onnx_context_alloc_from_file("yolov5n.onnx", NULL, 0);
        onnx_context_dump(sess, 1);
        return 0;
    }
    

    but nothing was output, not even warnings or errors.

    opened by miaowrx 1
Owner

xboot.org