Helper Class for Deep Learning Inference Frameworks: TensorFlow Lite, TensorRT, OpenCV, ncnn, MNN, SNPE, Arm NN, NNabla

Overview

InferenceHelper

  • This is a helper class for deep learning frameworks, focused on inference
  • This class provides a common interface to various deep learning frameworks, so that you can use the same application code regardless of the framework (see the usage sketch below)
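
A typical call sequence looks like the sketch below (the framework type, model filename and tensor names are the ones used in the API examples later in this document; error checking is omitted):

std::unique_ptr<InferenceHelper> inference_helper(InferenceHelper::Create(InferenceHelper::kTensorflowLite));

std::vector<InputTensorInfo> input_tensor_list;      /* filled in as shown in the Initialize section */
std::vector<OutputTensorInfo> output_tensor_list;    /* filled in as shown in the Initialize section */

inference_helper->Initialize("mobilenet_v2_1.0_224.tflite", input_tensor_list, output_tensor_list);
inference_helper->PreProcess(input_tensor_list);
inference_helper->Process(output_tensor_list);
const float* scores = output_tensor_list[0].GetDataAsFloat();
inference_helper->Finalize();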


Class Diagram

Supported frameworks

  • TensorFlow Lite
  • TensorFlow Lite with delegate (XNNPACK, GPU, EdgeTPU, NNAPI)
  • TensorRT (GPU, DLA)
  • OpenCV(dnn)
  • OpenCV(dnn) with GPU
  • ncnn
  • MNN
  • SNPE (Snapdragon Neural Processing Engine SDK (Qualcomm Neural Processing SDK for AI v1.51.0))
  • Arm NN
  • NNabla
  • NNabla with CUDA

Supported targets

  • Windows 10 (Visual Studio 2017 x64, Visual Studio 2019 x64)
  • Linux (x64, armv7, aarch64)
  • Android (armv7, aarch64)

Tested Environment

| Framework | Windows (x64) | Linux (x64) | Linux (armv7) | Linux (aarch64) | Android (aarch64) |
| --- | --- | --- | --- | --- | --- |
| OpenCV(dnn) | OK | OK | OK | OK | not tested |
| TensorFlow Lite | OK | OK | OK | OK | OK |
| TensorFlow Lite + XNNPACK | OK | OK | OK | OK | OK |
| TensorFlow Lite + GPU | not supported | OK | OK | OK | OK |
| TensorFlow Lite + EdgeTPU | OK | not tested | OK | OK | not supported |
| TensorFlow Lite + NNAPI | not supported | not supported | not supported | not supported | OK |
| TensorRT | not tested | not tested | not tested | OK | not supported |
| ncnn | OK | OK | OK | OK | OK |
| MNN | OK | OK | OK | OK | OK |
| SNPE | not supported | not supported | not tested | OK | OK |
| Arm NN | not supported | OK | not supported | OK | not supported |
| NNabla | OK | not tested | not supported | OK | not tested |
| NNabla + CUDA | OK | not tested | not supported | OK | not tested |
| Note | Visual Studio 2017, 2019 | Xubuntu 18.04 | Raspberry Pi | Jetson Xavier NX | Pixel 4a |

Sample project

https://github.com/iwatake2222/InferenceHelper_Sample

Related projects

Usage

Installation

Extra steps

You need some extra steps if you use the frameworks listed below

Extra steps: Tensorflow Lite

  • Download header files
    git submodule init
    git submodule update
    cd third_party/tensorflow
    chmod +x tensorflow/lite/tools/make/download_dependencies.sh
    tensorflow/lite/tools/make/download_dependencies.sh
    

Extra steps: SNPE

Extra steps: NNabla

Project settings in CMake

  • Add InferenceHelper to your project
    set(INFERENCE_HELPER_DIR ${CMAKE_CURRENT_LIST_DIR}/../../InferenceHelper/)
    add_subdirectory(${INFERENCE_HELPER_DIR}/inference_helper inference_helper)
    target_include_directories(${LibraryName} PUBLIC ${INFERENCE_HELPER_DIR}/inference_helper)
    target_link_libraries(${LibraryName} InferenceHelper)

CMake options

  • Deep learning framework:

    • You can enable multiple options, although each of the following examples enables just one option (a combined example is shown at the end of this section)
    # OpenCV (dnn)
    cmake .. -DINFERENCE_HELPER_ENABLE_OPENCV=on
    # Tensorflow Lite
    cmake .. -DINFERENCE_HELPER_ENABLE_TFLITE=on
    # Tensorflow Lite (XNNPACK)
    cmake .. -DINFERENCE_HELPER_ENABLE_TFLITE_DELEGATE_XNNPACK=on
    # Tensorflow Lite (GPU)
    cmake .. -DINFERENCE_HELPER_ENABLE_TFLITE_DELEGATE_GPU=on
    # Tensorflow Lite (EdgeTPU)
    cmake .. -DINFERENCE_HELPER_ENABLE_TFLITE_DELEGATE_EDGETPU=on
    # Tensorflow Lite (NNAPI)
    cmake .. -DINFERENCE_HELPER_ENABLE_TFLITE_DELEGATE_NNAPI=on
    # TensorRT
    cmake .. -DINFERENCE_HELPER_ENABLE_TENSORRT=on
    # ncnn
    cmake .. -DINFERENCE_HELPER_ENABLE_NCNN=on
    # MNN
    cmake .. -DINFERENCE_HELPER_ENABLE_MNN=on
    # SNPE
    cmake .. -DINFERENCE_HELPER_ENABLE_SNPE=on
    # Arm NN
    cmake .. -DINFERENCE_HELPER_ENABLE_ARMNN=on
    # NNabla
    cmake .. -DINFERENCE_HELPER_ENABLE_NNABLA=on
    # NNabla with CUDA
    cmake .. -DINFERENCE_HELPER_ENABLE_NNABLA_CUDA=on
  • Enable/Disable preprocess using OpenCV:

    • If you disable this option, InferenceHelper does not depend on OpenCV
    cmake .. -DINFERENCE_HELPER_ENABLE_PRE_PROCESS_BY_OPENCV=off
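
As noted above, the framework options are not mutually exclusive; for example, OpenCV(dnn) and TensorFlow Lite can be enabled in the same build (a sketch combining two of the flags listed above):

    # OpenCV (dnn) and Tensorflow Lite at the same time
    cmake .. -DINFERENCE_HELPER_ENABLE_OPENCV=on -DINFERENCE_HELPER_ENABLE_TFLITE=on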

APIs

InferenceHelper

Enumeration

typedef enum {
    kOpencv,
    kOpencvGpu,
    kTensorflowLite,
    kTensorflowLiteXnnpack,
    kTensorflowLiteGpu,
    kTensorflowLiteEdgetpu,
    kTensorflowLiteNnapi,
    kTensorrt,
    kNcnn,
    kMnn,
    kSnpe,
    kArmnn,
    kNnabla,
    kNnablaCuda,
} HelperType;

static InferenceHelper* Create(const HelperType helper_type)

  • Create an InferenceHelper instance for the selected framework
std::unique_ptr<InferenceHelper> inference_helper(InferenceHelper::Create(InferenceHelper::kTensorflowLite));

static void PreProcessByOpenCV(const InputTensorInfo& input_tensor_info, bool is_nchw, cv::Mat& img_blob)

  • Run preprocess (convert an image to a blob (NCHW or NHWC))
  • This is just a helper function; you don't have to use it
    • Available when INFERENCE_HELPER_ENABLE_PRE_PROCESS_BY_OPENCV=on
InferenceHelper::PreProcessByOpenCV(input_tensor_info, false, img_blob);
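
A slightly fuller sketch, assuming input_tensor_info has already been filled in as in the Initialize example below; wiring the resulting blob back in as kDataTypeBlobNhwc input is shown purely for illustration:

cv::Mat img_blob;
InferenceHelper::PreProcessByOpenCV(input_tensor_info, false, img_blob);    /* false: create an NHWC blob */

/* Illustration only: treat the blob as already pre-processed input from here on */
input_tensor_info.data_type = InputTensorInfo::kDataTypeBlobNhwc;
input_tensor_info.data = img_blob.data;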

int32_t SetNumThreads(const int32_t num_threads)

  • Set the number of threads to be used
  • This function needs to be called before Initialize
inference_helper->SetNumThreads(4);

int32_t SetCustomOps(const std::vector<std::pair<const char*, const void*>>& custom_ops)

  • Set custom ops
  • This function needs to be called before Initialize
std::vector<std::pair<const char*, const void*>> custom_ops;
custom_ops.push_back(std::pair<const char*, const void*>("Convolution2DTransposeBias", (const void*)mediapipe::tflite_operations::RegisterConvolution2DTransposeBias()));
inference_helper->SetCustomOps(custom_ops);

int32_t Initialize(const std::string& model_filename, std::vector<InputTensorInfo>& input_tensor_info_list, std::vector<OutputTensorInfo>& output_tensor_info_list)

  • Initialize inference helper
    • Load model
    • Set tensor information
std::vector<InputTensorInfo> input_tensor_list;
InputTensorInfo input_tensor_info("input", TensorInfo::TENSOR_TYPE_FP32, false);    /* name, data_type, NCHW or NHWC */
input_tensor_info.tensor_dims = { 1, 224, 224, 3 };
input_tensor_info.data_type = InputTensorInfo::kDataTypeImage;
input_tensor_info.data = img_src.data;
input_tensor_info.image_info.width = img_src.cols;
input_tensor_info.image_info.height = img_src.rows;
input_tensor_info.image_info.channel = img_src.channels();
input_tensor_info.image_info.crop_x = 0;
input_tensor_info.image_info.crop_y = 0;
input_tensor_info.image_info.crop_width = img_src.cols;
input_tensor_info.image_info.crop_height = img_src.rows;
input_tensor_info.image_info.is_bgr = false;
input_tensor_info.image_info.swap_color = false;
input_tensor_info.normalize.mean[0] = 0.485f;   /* https://github.com/onnx/models/tree/master/vision/classification/mobilenet#preprocessing */
input_tensor_info.normalize.mean[1] = 0.456f;
input_tensor_info.normalize.mean[2] = 0.406f;
input_tensor_info.normalize.norm[0] = 0.229f;
input_tensor_info.normalize.norm[1] = 0.224f;
input_tensor_info.normalize.norm[2] = 0.225f;
input_tensor_list.push_back(input_tensor_info);

std::vector<OutputTensorInfo> output_tensor_list;
output_tensor_list.push_back(OutputTensorInfo("MobilenetV2/Predictions/Reshape_1", TensorInfo::TENSOR_TYPE_FP32));

inference_helper->Initialize("mobilenet_v2_1.0_224.tflite", input_tensor_list, output_tensor_list);

int32_t Finalize(void)

  • Finalize inference helper
inference_helper->Finalize();

int32_t PreProcess(const std::vector<InputTensorInfo>& input_tensor_info_list)

  • Run preprocess
  • Call this function before Process
  • Call this function even if the input data is already pre-processed, in order to copy the data to memory
  • Note: some frameworks don't support crop or resize, so it's better to resize the image before calling PreProcess, as in the sketch below
inference_helper->PreProcess(input_tensor_list);
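
For example, a pre-resize with OpenCV could look like the sketch below (224x224 is the input size of the MobileNet example above; image_info is updated so that no further crop or resize is requested):

cv::Mat img_resized;
cv::resize(img_src, img_resized, cv::Size(224, 224));    /* resize in advance for frameworks without crop/resize support */
input_tensor_list[0].data = img_resized.data;
input_tensor_list[0].image_info.width = img_resized.cols;
input_tensor_list[0].image_info.height = img_resized.rows;
input_tensor_list[0].image_info.crop_x = 0;
input_tensor_list[0].image_info.crop_y = 0;
input_tensor_list[0].image_info.crop_width = img_resized.cols;
input_tensor_list[0].image_info.crop_height = img_resized.rows;
inference_helper->PreProcess(input_tensor_list);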

int32_t Process(std::vector<OutputTensorInfo>& output_tensor_info_list)

  • Run inference
inference_helper->Process(output_tensor_info_list);

TensorInfo (InputTensorInfo, OutputTensorInfo)

Enumeration

enum {
    kTensorTypeNone,
    kTensorTypeUint8,
    kTensorTypeInt8,
    kTensorTypeFp32,
    kTensorTypeInt32,
    kTensorTypeInt64,
};

Properties

std::string name;           // [In] Set the name of the tensor
int32_t     id;             // [Out] Do not modify (used internally by InferenceHelper)
int32_t     tensor_type;    // [In] The type of the tensor (e.g. kTensorTypeFp32)
std::vector<int32_t> tensor_dims;    // InputTensorInfo:   [In] The dimensions of the tensor (if empty at Initialize, the size is updated from model information)
                                     // OutputTensorInfo: [Out] The dimensions of the tensor are set from model information
bool        is_nchw;        // [In] NCHW or NHWC
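
For example, after Initialize the output shape can be read back from tensor_dims to size a loop over the output buffer (a sketch using only the members listed above):

int32_t element_num = 1;
for (const auto& dim : output_tensor_list[0].tensor_dims) {
    element_num *= dim;    /* e.g. 1 x 1001 for the MobileNet example */
}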

InputTensorInfo

Enumeration

enum {
    kDataTypeImage,
    kDataTypeBlobNhwc,  // data that has already been preprocessed (color conversion, resize, normalization, etc.)
    kDataTypeBlobNchw,
};

Properties

void*   data;      // [In] Set the pointer to image/blob
int32_t data_type; // [In] Set the type of data (e.g. kDataTypeImage)

struct {
    int32_t width;
    int32_t height;
    int32_t channel;
    int32_t crop_x;
    int32_t crop_y;
    int32_t crop_width;
    int32_t crop_height;
    bool    is_bgr;        // used when channel == 3 (true: BGR, false: RGB)
    bool    swap_color;
} image_info;              // [In] used when data_type == kDataTypeImage

struct {
    float mean[3];
    float norm[3];
} normalize;              // [In] used when data_type == kDataTypeImage

OutputTensorInfo

Properties

void* data;     // [Out] Pointer to the output data
struct {
    float   scale;
    uint8_t zero_point;
} quant;        // [Out] Parameters for dequantization (convert uint8 to float)
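
Assuming the usual affine scheme (float_value = scale * (quantized_value - zero_point)), a manual dequantization would look like the sketch below; in practice GetDataAsFloat() does this for you:

/* Sketch only: assumes the standard affine scheme float = scale * (q - zero_point) */
const uint8_t* q = static_cast<const uint8_t*>(output_tensor_list[0].data);
float value0 = output_tensor_list[0].quant.scale * (static_cast<int32_t>(q[0]) - output_tensor_list[0].quant.zero_point);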

float* GetDataAsFloat()

  • Get output data in the form of FP32
  • When tensor type is INT8 (quantized), the data is converted to FP32 (dequantized)
const float* val_float = output_tensor_list[0].GetDataAsFloat();
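
Continuing from the example above, the top-1 class index for the MobileNet classification model can then be picked from the dequantized scores (a sketch; the element count is derived from tensor_dims, which is filled at Initialize):

int32_t element_num = 1;
for (const auto& dim : output_tensor_list[0].tensor_dims) element_num *= dim;
int32_t top1_index = 0;
for (int32_t i = 1; i < element_num; i++) {
    if (val_float[i] > val_float[top1_index]) top1_index = i;    /* keep the index of the highest score */
}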

License

Acknowledgements

  • This project utilizes OSS (Open Source Software)