ONNX Runtime Inference Examples

Examples for using ONNX Runtime for machine learning inferencing.

Overview

This repo has examples that demonstrate the use of ONNX Runtime (ORT) for inference.

Examples

The table below outlines the examples in the repository.

Example                 | Description                                                                       | Pipeline Status
C/C++ examples          | Examples for ONNX Runtime C/C++ APIs                                              | Linux-CPU, Windows-CPU
Mobile examples         | Examples that demonstrate how to use ONNX Runtime Mobile in mobile applications.  | Build Status
JavaScript API examples | Examples that demonstrate how to use JavaScript API for ONNX Runtime.             |
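
Across languages, the samples follow the same basic flow: create an inference session from a .onnx model file, then run it with named inputs and read back the outputs. A minimal sketch of that flow using the ONNX Runtime Python API; the model path, dummy input shape, and CPU provider below are illustrative assumptions, not taken from a specific sample:

    import numpy as np
    import onnxruntime as ort

    # Create a session for a model file (placeholder path) on the CPU provider.
    session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])

    # Query the input metadata declared by the model.
    input_meta = session.get_inputs()[0]
    print(input_meta.name, input_meta.shape, input_meta.type)

    # Run inference with a dummy input; the shape (1, 3, 224, 224) is only an example
    # and should match the model's declared input shape.
    dummy = np.zeros((1, 3, 224, 224), dtype=np.float32)
    outputs = session.run(None, {input_meta.name: dummy})
    print(outputs[0].shape)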

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.

Issues
  • Failing to get inference result in C++


    I have an ONNX model that I trained using XGBoost and onnxmltools, and successfully tested in Python using onnxruntime. I am now trying to load the model from C++, but I am unable to get viable inferences back. For this purpose, I built a facilitator class, inspired by this example and this example. The class looks like this:

    #include <onnxruntime_cxx_api.h>
    #include <string>
    #include <vector>
    
    class Predictor{
      public:
        Ort::Session session{nullptr};
        Ort::MemoryInfo memory_info{nullptr};
        char* input_name = nullptr;
        char* output_name = nullptr;
        std::vector<int64_t> *dims;
    
        Predictor(std::string path) {
          Ort::Env env(OrtLoggingLevel::ORT_LOGGING_LEVEL_WARNING, "fast-inference");
          Ort::SessionOptions sessionOptions;
          sessionOptions.SetIntraOpNumThreads(1);
          sessionOptions.SetGraphOptimizationLevel(GraphOptimizationLevel::ORT_ENABLE_EXTENDED);
    
          this->session = Ort::Session(env, path.c_str(), sessionOptions);
          this->memory_info = Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeCPU);
    
          Ort::AllocatorWithDefaultOptions ort_alloc;
          input_name = this->session.GetInputName(0, ort_alloc);
          output_name = this->session.GetOutputName(1, ort_alloc); // 0 returns the labels, 1 returns the probabilities
    
          Ort::TypeInfo info = this->session.GetInputTypeInfo(0);
          auto tensor_info = info.GetTensorTypeAndShapeInfo();
          size_t dim_count = tensor_info.GetDimensionsCount();
          dims = new std::vector<int64_t>(dim_count);
          tensor_info.GetDimensions((*dims).data(), (*dims).size());
        };
    
        float* get_prediction(std::vector<float> f){
          const char* i[] = {this->input_name};
          const char* o[] = {this->output_name};
    
          (*this->dims)[0] = 0; // Apparently, dims are reported as (-1, 6), which generates an error. This changes the -1 to 0.
    
          auto input_tensor = Ort::Value::CreateTensor<float>(this->memory_info, f.data(), f.size(), (*this->dims).data(), (*this->dims).size());
          Ort::Value output_tensor{nullptr};
    
          // Ort::Value output_tensor{nullptr};
          // std::vector<Value> p this->session.Run(Ort::RunOptions{nullptr}, i, &input_tensor, 1, o, &output_tensor, 1);
          auto h = this->session.Run(Ort::RunOptions{nullptr}, i, &input_tensor, 1, o, 1);
          float* probs = h[0].GetTensorMutableData<float>();
          
          return probs; // <- Problem is here: probs is 0x00 and appears to hold no value
        };
    };
    
    int main(){
      Predictor pred("./model.onnx");
      std::vector<float> features{0.0268593, 0.20130636, 0.18040432, 0.11453316, 0.17075175, 0.0};
      auto r = pred.get_prediction(features);
    }
    

    However, the vector where the probabilities should be stored appears to be empty. Any clue on what the problem may be?
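
    A debugging sketch, not from the thread: before digging further into the C++ side, it can help to confirm what the model itself declares and returns via the onnxruntime Python API, using the same feature vector as above. Note also that the C++ code forces dims[0] to 0, which describes a tensor with zero elements; for a dynamic batch dimension reported as -1, the value would normally be the actual batch size (1 here), and that alone could explain an empty-looking output.

    import numpy as np
    import onnxruntime as ort

    sess = ort.InferenceSession("./model.onnx", providers=["CPUExecutionProvider"])

    # Inspect the declared inputs and outputs; for a converted XGBoost classifier the
    # outputs are typically the predicted label and the probabilities.
    for i in sess.get_inputs():
        print("input:", i.name, i.shape, i.type)
    for o in sess.get_outputs():
        print("output:", o.name, o.shape, o.type)

    # Run the same feature vector used in the C++ snippet, with batch size 1.
    features = np.array([[0.0268593, 0.20130636, 0.18040432, 0.11453316, 0.17075175, 0.0]],
                        dtype=np.float32)
    print(sess.run(None, {sess.get_inputs()[0].name: features}))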

    opened by JNSFilipe 11
  • Problem onnxruntime node - windows server 2019


    I tried to run the quick-start_onnxruntime-node sample, but on Windows Server 2019 Standard it does not work; it shows the following error. On Windows 10 there is no problem. According to the documentation it should work on Windows Server 2019.

    Could you help me what is happening?

    [screenshot of the error]

    Thank you very much!

    opened by mbravo1 4
  • Cannot read properties of null - SKL RandomForestClassifier


    Every time I run my model in the browser I get the following error:

    util.ts:452 Uncaught (in promise) TypeError: Cannot read properties of null (reading 'elemType')
        at Function.t.tensorValueTypeFromProto (util.ts:452)
        at new t (graph.ts:92)
        at t.buildGraphFromOnnxFormat (graph.ts:256)
        at t.buildGraph (graph.ts:186)
        at new t (graph.ts:150)
        at Object.from (graph.ts:81)
        at t.loadFromOnnxFormat (model.ts:43)
        at t.load (model.ts:21)
        at session.ts:93
        at t.event (instrument.ts:337)
    

    I can't get past creating the inference session and loading the model:

    async function runClassExampleSkl() {
        console.log('runClassExampleSkl start');

        // Won't get past this point:
        const session_skl = await ort.InferenceSession.create('model-skl.onnx');
    }
    

    And this is how I've converted my model:

    initial_type = [('float_input', FloatTensorType([None, 13]))]
    sklonnx = to_onnx(rfc, initial_types=initial_type, target_opset=9)
    with open("model-skl.onnx", "wb") as f:
        f.write(sklonnx.SerializeToString())
    

    And I can run it fine in python:

    sess = rt.InferenceSession("model-skl.onnx")
    input_name = sess.get_inputs()[0].name
    label_name = sess.get_outputs()[0].name
    pred_onx = sess.run([label_name], {input_name: X_test.astype(numpy.float32)})[0]
    

    I'm using: Python 3.8.8, scikit-learn 0.22.1, skl2onnx 1.9.3, Windows 10, Chrome, onnxruntime 1.8.
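
    A possible avenue to check, offered as an assumption rather than a confirmed fix: scikit-learn classifiers are converted with a ZipMap node by default, so the probability output is a sequence of maps instead of a plain tensor, and non-tensor value types are a common source of graph-loading errors in the web runtime. A sketch of the conversion with ZipMap disabled, reusing the rfc model and initial_type from above:

    from skl2onnx import to_onnx
    from skl2onnx.common.data_types import FloatTensorType

    initial_type = [('float_input', FloatTensorType([None, 13]))]

    # options={id(rfc): {'zipmap': False}} makes the probability output a plain
    # float tensor instead of a sequence of maps.
    sklonnx = to_onnx(rfc, initial_types=initial_type,
                      options={id(rfc): {'zipmap': False}},
                      target_opset=9)

    with open("model-skl.onnx", "wb") as f:
        f.write(sklonnx.SerializeToString())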

    opened by jayson-gomba 4
  • cannot open file 'atls.lib'


    Hello,

    I am unable to compile the examples because I get the following message: LINK : fatal error LNK1104: cannot open file 'atls.lib'

    I use Windows 10 with Visual Studio 2019. I have CMake installed and onnxruntime installed for x86. When running CMake, I used the flag -DONNXRUNTIME_ROOTDIR=c:\hrl\ort_install

    Could you help?

    Thanks

    opened by blinnassime 2
  • [OpenVINO-EP] V3.4 update samples


    This PR updates the OpenVINO-EP samples for V3.4:

    • Updated the C# sample
    • Added an I/O buffer C++ sample and its documentation
    • Added a Requirements.txt file for the tinyYolov2 Python sample

    opened by MaajidKhan 2
  • Add general Xamarin sample for vision models.


    Currently demonstrates image classification and face recognition. More models, such as SSD, can be added in the future. Set up to be able to get ORT from the INT NuGet repo for pre-release testing, and from the main NuGet repo for official release testing.

    opened by skottmckay 2
  • [OpenVINO EP] Docx update for OpenVINO EP Samples


    This PR:

    • Updates documentation that went missing while porting the samples from https://github.com/microsoft/onnxruntime/
    • Adds new yolov4 OpenVINO EP sample code files and its README.md

    opened by MaajidKhan 2
  • [iOS Object Detection] Add object detection e2e ios sample application


    • A basic application that works as an object detector using the ssd_mobilenet_v1 (quantized) model (from TF)

    Several todos: ✅

    1. Complete the README to add documentation for converting the model from TF -> ONNX -> ORT format
    2. Add a hosted link for the prepared ORT ssd_mobilenet model for users to download
    3. The quantized model is not supported for this sample app
    opened by YUNQIUGUO 2
  • [JS] [onnxruntime-web] WebAssembly files were not added to dist folder


    I'm very excited to try ONNX Runtime Web. I had some issues with the quick-start_onnxruntime-web-bundler sample:

    1. The sample didn't build for me because it uses "onnxruntime-web": "^1.8.0", which isn't released yet. So I did npm install [email protected] and used "1.7.0-dev.4.2". The provided model.onnx worked fine with "1.7.0-dev.4.2".
    2. WebAssembly files were not added to the dist folder (see screenshot). I was able to resolve that and another error by doing cp node_modules/onnxruntime-web/dist/ort-wasm-threaded.wasm node_modules/onnxruntime-web/dist/ort-wasm-threaded.worker.js ./dist/ after I built. Should I need to do that? Maybe this will be corrected in version 1.8.0?

    UPDATE: I'm using "onnxruntime-web": "^1.7.0-dev.5" and when I build I notice:

    WARNING in ./node_modules/onnxruntime-web/lib/wasm/binding/ort-wasm-threaded.worker.js 1:2882-2889    
    Critical dependency: require function is used in a way in which dependencies cannot be statically extracted
     @ ./node_modules/onnxruntime-web/lib/wasm/wasm-factory.js 96:28-76
     @ ./node_modules/onnxruntime-web/lib/backend-wasm.js 11:23-53
     @ ./node_modules/onnxruntime-web/lib/index.js 18:23-48
     @ ./app.js 2:12-38
    

    The provided sample worked fine with "onnxruntime-web": "1.7.0-dev.4.2" even though the Wasm files weren't copied. I had some issues trying to use my own model file, but I'll open a separate issue about that.

    opened by juharris 2
  • Why does my onnx model in python just have 1 input but the onnx model in nodejs has 3 inputs?


    My actual input: I'm using a matrix with shape [74, 1, 256] for my model (see screenshot).

    The ONNX model in Python has 1 input, but the ONNX model in Node.js has 3 inputs.

    The ONNX Python model input is a single tensor (see screenshot).

    But the Node.js model has 3 inputs (see screenshot).

    How do I apply my actual input to the Node.js model?
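
    One way to settle what the model actually declares, independently of either runtime, is to inspect the graph with the onnx Python package. A small sketch, assuming the exported file is model.onnx (a placeholder path); note that in older exports the weights (initializers) can also be listed under graph.input, which can make input counts differ between tools:

    import onnx

    model = onnx.load("model.onnx")

    # Filter out initializers so only the true runtime inputs are listed.
    initializer_names = {init.name for init in model.graph.initializer}
    for inp in model.graph.input:
        if inp.name not in initializer_names:
            dims = [d.dim_value or d.dim_param for d in inp.type.tensor_type.shape.dim]
            print("input:", inp.name, dims)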

    opened by whoisltd 1
  • Bump tensorflow from 2.6.3 to 2.6.4 in /mobile/examples/object_detection/ios/ORTObjectDetection


    Bumps tensorflow from 2.6.3 to 2.6.4.

    Release notes

    Sourced from tensorflow's releases.

    TensorFlow 2.6.4

    Release 2.6.4

    This release introduces several vulnerability fixes:

    Changelog

    Sourced from tensorflow's changelog.

    Release 2.6.4

    This release introduces several vulnerability fixes:

    Release 2.8.0

    Major Features and Improvements

    • tf.lite:

      • Added TFLite builtin op support for the following TF ops:
        • tf.raw_ops.Bucketize op on CPU.
        • tf.where op for data types tf.int32/tf.uint32/tf.int8/tf.uint8/tf.int64.
        • tf.random.normal op for output data type tf.float32 on CPU.
        • tf.random.uniform op for output data type tf.float32 on CPU.
        • tf.random.categorical op for output data type tf.int64 on CPU.
    • tensorflow.experimental.tensorrt:

      • conversion_params is now deprecated inside TrtGraphConverterV2 in favor of direct arguments: max_workspace_size_bytes, precision_mode, minimum_segment_size, maximum_cached_engines, use_calibration and

    ... (truncated)

    Commits
    • 33ed2b1 Merge pull request #56102 from tensorflow/mihaimaruseac-patch-1
    • e1ec480 Fix build due to importlib-metadata/setuptools
    • 63f211c Merge pull request #56033 from tensorflow-jenkins/relnotes-2.6.4-6677
    • 22b8fe4 Update RELEASE.md
    • ec30684 Merge pull request #56070 from tensorflow/mm-cp-adafb45c781-on-r2.6
    • 38774ed Merge pull request #56060 from yongtang:curl-7.83.1
    • 9ef1604 Merge pull request #56036 from tensorflow-jenkins/version-numbers-2.6.4-9925
    • a6526a3 Update version numbers to 2.6.4
    • cb1a481 Update RELEASE.md
    • 4da550f Insert release notes place-fill
    • Additional commits viewable in compare view

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.



    dependencies 
    opened by dependabot[bot] 1
  • openvino-EP error


    OpenVINO device type is set to: CPU_FP32
    2022-06-09 15:48:19.6454397 [E:onnxruntime:image-classification-inference, provider_bridge_ort.cc:1022 onnxruntime::ProviderLibrary::Get] LoadLibrary failed with error 126 "" when trying to load "E:\onnxruntime-inference-examples\c_cxx\OpenVINO_EP\Windows\build\squeezenet_classification\Debug\onnxruntime_providers_openvino.dll"

    opened by quanzhang2020 1
  • execution provider set in quantization/image_classification/cpu/run.py


    Set the execution provider explicitly, as required since ORT 1.9:

    ValueError: This ORT build has ['CUDAExecutionProvider', 'DnnlExecutionProvider', 'CPUExecutionProvider'] enabled. Since ORT 1.9, you are required to explicitly set the providers parameter when instantiating InferenceSession. For example, onnxruntime.InferenceSession(..., providers=['CUDAExecutionProvider', 'DnnlExecutionProvider', 'CPUExecutionProvider'], ...)
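
    A minimal sketch of the change the error message asks for, with a placeholder model path; the provider list is taken from the error above and can be trimmed to whatever the target build actually needs:

    import onnxruntime

    # Since ORT 1.9 the providers argument must be set explicitly when creating a session.
    session = onnxruntime.InferenceSession(
        "model.onnx",  # placeholder path
        providers=["CUDAExecutionProvider", "DnnlExecutionProvider", "CPUExecutionProvider"],
    )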

    opened by swenkel 0
  • [docs] add link to doc explaining how to replace use of the pre-built package in samples with a custom build


    opened by YUNQIUGUO 1
  • Running repo with yolox model


    Hello, how can I run this repo with a YOLOX ONNX model? I have tried, and this is the error I get:

    onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Got invalid dimensions for input: images for the following indices
     index: 1 Got: 416 Expected: 3
     index: 3 Got: 3 Expected: 416
     Please fix either the inputs or the model.
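
    The swapped dimensions in the error (416 where 3 is expected at index 1, and the reverse at index 3) suggest the image is being fed in NHWC layout while the model expects NCHW. A hedged sketch of the kind of reordering that usually addresses this, assuming the preprocessed image is a numpy array of shape (1, 416, 416, 3) and the model file is yolox.onnx (placeholder):

    import numpy as np
    import onnxruntime as ort

    session = ort.InferenceSession("yolox.onnx", providers=["CPUExecutionProvider"])

    # img stands in for the preprocessed image in NHWC layout, e.g. (1, 416, 416, 3).
    img = np.zeros((1, 416, 416, 3), dtype=np.float32)

    # Reorder to NCHW, i.e. (1, 3, 416, 416), which is what the error says the model expects.
    nchw = np.transpose(img, (0, 3, 1, 2))

    outputs = session.run(None, {"images": nchw})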

    opened by MfonobongIme 1
  • quantized model only forward faster than float32, but if include get output, slower


    Hi, I have a quantized model. When measuring raw inference time, the quantized model is faster:

    import time

    import numpy as np
    import onnxruntime as ort

    def run_time(model_p):
        session = ort.InferenceSession(model_p)
        input_name = session.get_inputs()[0].name
        total = 0.0
        runs = 10
        input_data = np.zeros((1, 3, 224, 224), np.float32)
        _ = session.run([], {input_name: input_data})
        for i in range(runs):
            start = time.perf_counter()
            _ = session.run([], {input_name: input_data})
            end = (time.perf_counter() - start) * 1000
            total += end
            print(f"{end:.2f}ms")
        total /= runs
        print(f"Avg: {total:.2f}ms")
    

    Output:

    7.57ms
    7.45ms
    7.44ms
    7.37ms
    7.42ms
    7.48ms
    7.65ms
    7.46ms
    7.39ms
    7.39ms
    Avg: 7.46ms
    5.01ms
    5.27ms
    5.06ms
    5.01ms
    5.00ms
    4.98ms
    5.03ms
    4.99ms
    5.00ms
    5.05ms
    Avg: 5.04ms
    

    The int8 model is faster (5.04 ms vs. 7.46 ms on average).

    But when I evaluate it, i.e. get the output and calculate the argmax, it becomes slower:

    def evaluate_onnx_model(model_p, test_loader, criterion=None):
        running_loss = 0
        running_corrects = 0
    
        session = ort.InferenceSession(model_p)
        input_name = session.get_inputs()[0].name
    
        total = 0.
        for inputs, labels in test_loader:
            inputs = inputs.cpu().numpy()
            labels = labels.cpu().numpy()
    
            start = time.perf_counter()
            outputs = session.run([], {input_name: inputs})[0]
            end = (time.perf_counter() - start)
            total += end
    
            preds = np.argmax(outputs, 1)
            if criterion is not None:
                loss = criterion(outputs, labels).item()
            else:
                loss = 0
            # statistics
            running_corrects += np.sum(preds == labels)
    
        # eval_loss = running_loss / len(test_loader.dataset)
        eval_accuracy = running_corrects / len(test_loader.dataset)
        total /= len(test_loader)
        print(f"eval loss: {0}, eval acc: {eval_accuracy}, cost: {total}")
        return 0, eval_accuracy
    
    eval loss: 0, eval acc: 0.8477, cost: 0.9931462904438376
    eval loss: 0, eval acc: 0.8345, cost: 1.501858500018716
    
    

    The cost is higher than for the float32 model. How could this be?
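
    Not a confirmed explanation, but one difference between the two measurements is the input: the standalone benchmark feeds a fixed (1, 3, 224, 224) array, while the eval loop times whatever batches the DataLoader yields, so batch size and any conversion overhead can differ. A sketch for timing both models on identical batches, so that only session.run is compared; the function name and warm-up count are illustrative:

    import time

    import onnxruntime as ort

    def compare_on_same_batches(float_model_p, int8_model_p, test_loader, warmup=3):
        # Collect a fixed set of batches once so both models see identical inputs.
        batches = [inputs.cpu().numpy() for inputs, _ in test_loader]

        for name, model_p in [("float32", float_model_p), ("int8", int8_model_p)]:
            session = ort.InferenceSession(model_p, providers=["CPUExecutionProvider"])
            input_name = session.get_inputs()[0].name

            # Warm up so one-time initialization costs are not counted.
            for batch in batches[:warmup]:
                session.run([], {input_name: batch})

            total = 0.0
            for batch in batches:
                start = time.perf_counter()
                session.run([], {input_name: batch})
                total += time.perf_counter() - start
            print(f"{name}: {total / len(batches) * 1000:.2f} ms per batch")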

    opened by jinfagang 0