Simple samples for TensorRT programming

Overview

Introduction

This is a collection of simplified TensorRT samples to get you started with TensorRT programming. Most of the samples are written in C++, and some are in Python to show the basics.

Build and Run

For C++ samples, please install TensorRT properly and modify the Makefile according to your setup, then run make. For Python samples, please install TensorRT with Python wheel, and install PyTorch and onnx_graphsurgeon with pip before running the scripts prefixed with "app".

C++ Samples

AppBasic

This is a basic sample which shows how to build and run an engine with static-shaped input (which we'll call "static-shape engine" for short) and save the engine to disk. This sample introduces a reusable class TrtLite, which will be used throughout these samples. The class is concise (~300 lines of code) yet covers most functions of TensorRT and simplifies its programming.

AppDynamicShape

This sample shows how to build and run an engine with dynamic-shaped input ("dynamic-shape engine" for short), and how to copy data and run the engine asynchronously. You may use Nsight Systems to see the timeline of GPU events.

It's important to overlap copying data, including copying from host memory to device memory for input and vice versa for output, and running the engine. This technique makes it possible to run inference tasks consecutively on GPU and maximizes the throughput. This technique can also be applied to other samples.

AppLoadRefit

This sample shows how to load an engine from disk and run it.

It also shows how to refit an engine. To refit a FP16 engine efficiently you need to save the build logs to a file. See the inlined comments in the source file for details.

AppInt8

This sample shows 3 use cases:

  1. Build a static-shape engine and run it in int8.
  2. Build a dynamic-shape engine and run it in int8.
  3. Build a quantization-aware trained (QAT) networks and run it in int8.

AppPlugin

This sample shows how to write a simple static-shape plugin. This plugin supports fp32/fp16/int8 precision. Please note that the plugin interface IPluginV2IOExt is for and only for static shape.

AppPluginDynamicShape

This sample is similar to AppPlugin but in dynamic shape. Please be noted that the plugin interface IPluginV2DynamicExt is for and only for dynamic shape.

AppMultiContext

This sample shows how to create multiple contexts from the same engine thus saving device memory space for network weights, and how to run the engine contexts on their own stream.

Like AppDynamicShape, it also runs asynchronously and you may use Nsight Systems to see the timeline of GPU events.

AppThroughput

This sample loads an engine from disk and get a benchmark on how many QPS can be achieved for max throughput. By default the sample loads an engine created from a Python sample and the command line utility trtexec (see below).

AppOnnx

This sample shows how to build an engine from an ONNX file with the ONNX parser. Please note trtexec, the command line utility shipped with the TensorRT offical release, also has this functionality.

Python Samples

app_basic.py

This sample shows how to build, save and run static-shape and dynamic-shape engines. TensorRT can be programmed with C++ or Python; C++ program can load and run the engine saved by Python program and vice versa.

app_onnx_resnet50.py

This sample shows how to export a PyTorch model into onnx. With the trtexec command in the script, you can convert onnx into TensorRT engine file by the utility trtexec (so the engine can be loaded and run such as by AppThroughput).

app_onnx_custom.py

This sample shows how to export a PyTorch model containing unsupported operator into onnx, and how to modify the onnx with graph surgeon so it can be converted into TensorRT engine smoothly.

Comments
  • Bump tensorflow-gpu from 1.15.5 to 2.7.2 in /cookbook/04-Parser/TensorFlow-Caffe-TensorRT

    Bump tensorflow-gpu from 1.15.5 to 2.7.2 in /cookbook/04-Parser/TensorFlow-Caffe-TensorRT

    Bumps tensorflow-gpu from 1.15.5 to 2.7.2.

    Release notes

    Sourced from tensorflow-gpu's releases.

    TensorFlow 2.7.2

    Release 2.7.2

    This releases introduces several vulnerability fixes:

    TensorFlow 2.7.1

    Release 2.7.1

    This releases introduces several vulnerability fixes:

    • Fixes a floating point division by 0 when executing convolution operators (CVE-2022-21725)
    • Fixes a heap OOB read in shape inference for ReverseSequence (CVE-2022-21728)
    • Fixes a heap OOB access in Dequantize (CVE-2022-21726)
    • Fixes an integer overflow in shape inference for Dequantize (CVE-2022-21727)
    • Fixes a heap OOB access in FractionalAvgPoolGrad (CVE-2022-21730)
    • Fixes an overflow and divide by zero in UnravelIndex (CVE-2022-21729)
    • Fixes a type confusion in shape inference for ConcatV2 (CVE-2022-21731)
    • Fixes an OOM in ThreadPoolHandle (CVE-2022-21732)
    • Fixes an OOM due to integer overflow in StringNGrams (CVE-2022-21733)
    • Fixes more issues caused by incomplete validation in boosted trees code (CVE-2021-41208)
    • Fixes an integer overflows in most sparse component-wise ops (CVE-2022-23567)
    • Fixes an integer overflows in AddManySparseToTensorsMap (CVE-2022-23568)

    ... (truncated)

    Changelog

    Sourced from tensorflow-gpu's changelog.

    Release 2.7.2

    This releases introduces several vulnerability fixes:

    Release 2.6.4

    This releases introduces several vulnerability fixes:

    • Fixes a code injection in saved_model_cli (CVE-2022-29216)
    • Fixes a missing validation which causes TensorSummaryV2 to crash (CVE-2022-29193)
    • Fixes a missing validation which crashes QuantizeAndDequantizeV4Grad (CVE-2022-29192)
    • Fixes a missing validation which causes denial of service via DeleteSessionTensor (CVE-2022-29194)
    • Fixes a missing validation which causes denial of service via GetSessionTensor (CVE-2022-29191)
    • Fixes a missing validation which causes denial of service via StagePeek (CVE-2022-29195)
    • Fixes a missing validation which causes denial of service via UnsortedSegmentJoin (CVE-2022-29197)
    • Fixes a missing validation which causes denial of service via LoadAndRemapMatrix (CVE-2022-29199)
    • Fixes a missing validation which causes denial of service via SparseTensorToCSRSparseMatrix (CVE-2022-29198)
    • Fixes a missing validation which causes denial of service via LSTMBlockCell (CVE-2022-29200)
    • Fixes a missing validation which causes denial of service via Conv3DBackpropFilterV2 (CVE-2022-29196)
    • Fixes a CHECK failure in depthwise ops via overflows (CVE-2021-41197)
    • Fixes issues arising from undefined behavior stemming from users supplying invalid resource handles (CVE-2022-29207)
    • Fixes a segfault due to missing support for quantized types (CVE-2022-29205)
    • Fixes a missing validation which results in undefined behavior in SparseTensorDenseAdd (CVE-2022-29206)

    ... (truncated)

    Commits
    • dd7b8a3 Merge pull request #56034 from tensorflow-jenkins/relnotes-2.7.2-15779
    • 1e7d6ea Update RELEASE.md
    • 5085135 Merge pull request #56069 from tensorflow/mm-cp-52488e5072f6fe44411d70c6af09e...
    • adafb45 Merge pull request #56060 from yongtang:curl-7.83.1
    • 01cb1b8 Merge pull request #56038 from tensorflow-jenkins/version-numbers-2.7.2-4733
    • 8c90c2f Update version numbers to 2.7.2
    • 43f3cdc Update RELEASE.md
    • 98b0a48 Insert release notes place-fill
    • dfa5cf3 Merge pull request #56028 from tensorflow/disable-tests-on-r2.7
    • 501a65c Disable timing out tests
    • Additional commits viewable in compare view

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 6
  • Bump tensorflow from 1.15.5 to 2.7.2 in /cookbook

    Bump tensorflow from 1.15.5 to 2.7.2 in /cookbook

    Bumps tensorflow from 1.15.5 to 2.7.2.

    Release notes

    Sourced from tensorflow's releases.

    TensorFlow 2.7.2

    Release 2.7.2

    This releases introduces several vulnerability fixes:

    TensorFlow 2.7.1

    Release 2.7.1

    This releases introduces several vulnerability fixes:

    • Fixes a floating point division by 0 when executing convolution operators (CVE-2022-21725)
    • Fixes a heap OOB read in shape inference for ReverseSequence (CVE-2022-21728)
    • Fixes a heap OOB access in Dequantize (CVE-2022-21726)
    • Fixes an integer overflow in shape inference for Dequantize (CVE-2022-21727)
    • Fixes a heap OOB access in FractionalAvgPoolGrad (CVE-2022-21730)
    • Fixes an overflow and divide by zero in UnravelIndex (CVE-2022-21729)
    • Fixes a type confusion in shape inference for ConcatV2 (CVE-2022-21731)
    • Fixes an OOM in ThreadPoolHandle (CVE-2022-21732)
    • Fixes an OOM due to integer overflow in StringNGrams (CVE-2022-21733)
    • Fixes more issues caused by incomplete validation in boosted trees code (CVE-2021-41208)
    • Fixes an integer overflows in most sparse component-wise ops (CVE-2022-23567)
    • Fixes an integer overflows in AddManySparseToTensorsMap (CVE-2022-23568)

    ... (truncated)

    Changelog

    Sourced from tensorflow's changelog.

    Release 2.7.2

    This releases introduces several vulnerability fixes:

    Release 2.6.4

    This releases introduces several vulnerability fixes:

    • Fixes a code injection in saved_model_cli (CVE-2022-29216)
    • Fixes a missing validation which causes TensorSummaryV2 to crash (CVE-2022-29193)
    • Fixes a missing validation which crashes QuantizeAndDequantizeV4Grad (CVE-2022-29192)
    • Fixes a missing validation which causes denial of service via DeleteSessionTensor (CVE-2022-29194)
    • Fixes a missing validation which causes denial of service via GetSessionTensor (CVE-2022-29191)
    • Fixes a missing validation which causes denial of service via StagePeek (CVE-2022-29195)
    • Fixes a missing validation which causes denial of service via UnsortedSegmentJoin (CVE-2022-29197)
    • Fixes a missing validation which causes denial of service via LoadAndRemapMatrix (CVE-2022-29199)
    • Fixes a missing validation which causes denial of service via SparseTensorToCSRSparseMatrix (CVE-2022-29198)
    • Fixes a missing validation which causes denial of service via LSTMBlockCell (CVE-2022-29200)
    • Fixes a missing validation which causes denial of service via Conv3DBackpropFilterV2 (CVE-2022-29196)
    • Fixes a CHECK failure in depthwise ops via overflows (CVE-2021-41197)
    • Fixes issues arising from undefined behavior stemming from users supplying invalid resource handles (CVE-2022-29207)
    • Fixes a segfault due to missing support for quantized types (CVE-2022-29205)
    • Fixes a missing validation which results in undefined behavior in SparseTensorDenseAdd (CVE-2022-29206)

    ... (truncated)

    Commits
    • dd7b8a3 Merge pull request #56034 from tensorflow-jenkins/relnotes-2.7.2-15779
    • 1e7d6ea Update RELEASE.md
    • 5085135 Merge pull request #56069 from tensorflow/mm-cp-52488e5072f6fe44411d70c6af09e...
    • adafb45 Merge pull request #56060 from yongtang:curl-7.83.1
    • 01cb1b8 Merge pull request #56038 from tensorflow-jenkins/version-numbers-2.7.2-4733
    • 8c90c2f Update version numbers to 2.7.2
    • 43f3cdc Update RELEASE.md
    • 98b0a48 Insert release notes place-fill
    • dfa5cf3 Merge pull request #56028 from tensorflow/disable-tests-on-r2.7
    • 501a65c Disable timing out tests
    • Additional commits viewable in compare view

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 5
  • AddPlugin examples need to set plugin field in the constructor of PluginCreator

    AddPlugin examples need to set plugin field in the constructor of PluginCreator

    • Environment TensorRT 8.4 GA

    • Reproduction Steps

    ## Code file 1
    ## use AddScalarPlugin example
    import onnx_graphsurgeon as gs
    import onnx
    from collections import OrderedDict
    import numpy as np
    
    soFile = "./AddScalarPlugin.so"
    onnxFile = "./model.onnx"
    
    ctypes.cdll.LoadLibrary(soFilePath)
    
    inputs = gs.Variable(
        name="inputs", dtype=np.float32, shape=["batch", "seq", 256])
    outputs = gs.Variable(
        name="outputs", dtype=np.float32, shape=["batch", "seq", 256])
    nodes = [
        gs.Node(
            op="AddScalar",
            name="AddScalar_1",
            attrs=OrderedDict(scalar=np.array([2.0], dtype=np.float32)),
            inputs=[inputs],
            outputs=[outputs]
        )
    ]
    
    graph = gs.Graph(
        nodes=nodes, inputs=[inputs], outputs=[outputs], opset=13,
        name="onnx")
    model = gs.export_onnx(graph=graph)
    onnx.save(model, onnxFile)
    
    ## Code file 2
    ## use AddScalarPlugin example
    import tensorrt as trt
    import numpy as np
    import ctypes
    
    soFile = "./AddScalarPlugin.so"
    onnxFile = "./model.onnx"
    logger = trt.Logger(trt.Logger.VERBOSE)
    trt.init_libnvinfer_plugins(logger, '')
    ctypes.cdll.LoadLibrary(soFile)
    builder = trt.Builder(logger)
    network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    profile = builder.create_optimization_profile()
    config = builder.create_builder_config()
    config.max_workspace_size = 6 << 30
    parser = trt.OnnxParser(network, logger)
    parser.parse_from_file(onnxFile)
    
    • Backgroud When using OnnxParser or polygraphy to load plugin with attribute, it would be no plugin field when plugin::createPlugin is called, which means that it fails to load the attribute of plugin from onnx model.

    • How to fix Take AddScalarPlugin::createPlugin as an example, these two lines are necessary before getting the size and data for plugin fields:

    attr_.clear();  
    attr_.emplace_back(PluginField("scalar", nullptr, PluginFieldType::kFLOAT32, 1));
    
    opened by georgeliu95 2
  • Convert3DMMTo2DMM例子中 reshape维度问题?

    Convert3DMMTo2DMM例子中 reshape维度问题?

    https://github.com/NVIDIA/trt-samples-for-hackathon-cn/blob/32bb0afc33241f4553c643656fbf65290c0bc4db/cookbook/10-BestPractice/Convert3DMMTo2DMM/main.py#L128 这里输入的是B*T*1的tensor,不应该reshape成(B*T)*1吗? 为什么reshape成了256?

    opened by BaofengZan 2
  • Bump numpy from 1.20.2 to 1.22.0 in /cookbook/old-cookbook/python/yolov3_onnx

    Bump numpy from 1.20.2 to 1.22.0 in /cookbook/old-cookbook/python/yolov3_onnx

    Bumps numpy from 1.20.2 to 1.22.0.

    Release notes

    Sourced from numpy's releases.

    v1.22.0

    NumPy 1.22.0 Release Notes

    NumPy 1.22.0 is a big release featuring the work of 153 contributors spread over 609 pull requests. There have been many improvements, highlights are:

    • Annotations of the main namespace are essentially complete. Upstream is a moving target, so there will likely be further improvements, but the major work is done. This is probably the most user visible enhancement in this release.
    • A preliminary version of the proposed Array-API is provided. This is a step in creating a standard collection of functions that can be used across application such as CuPy and JAX.
    • NumPy now has a DLPack backend. DLPack provides a common interchange format for array (tensor) data.
    • New methods for quantile, percentile, and related functions. The new methods provide a complete set of the methods commonly found in the literature.
    • A new configurable allocator for use by downstream projects.

    These are in addition to the ongoing work to provide SIMD support for commonly used functions, improvements to F2PY, and better documentation.

    The Python versions supported in this release are 3.8-3.10, Python 3.7 has been dropped. Note that 32 bit wheels are only provided for Python 3.8 and 3.9 on Windows, all other wheels are 64 bits on account of Ubuntu, Fedora, and other Linux distributions dropping 32 bit support. All 64 bit wheels are also linked with 64 bit integer OpenBLAS, which should fix the occasional problems encountered by folks using truly huge arrays.

    Expired deprecations

    Deprecated numeric style dtype strings have been removed

    Using the strings "Bytes0", "Datetime64", "Str0", "Uint32", and "Uint64" as a dtype will now raise a TypeError.

    (gh-19539)

    Expired deprecations for loads, ndfromtxt, and mafromtxt in npyio

    numpy.loads was deprecated in v1.15, with the recommendation that users use pickle.loads instead. ndfromtxt and mafromtxt were both deprecated in v1.17 - users should use numpy.genfromtxt instead with the appropriate value for the usemask parameter.

    (gh-19615)

    ... (truncated)

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 1
  • Bump numpy from 1.20.2 to 1.22.0 in /cookbook/old-cookbook/python/uff_ssd

    Bump numpy from 1.20.2 to 1.22.0 in /cookbook/old-cookbook/python/uff_ssd

    Bumps numpy from 1.20.2 to 1.22.0.

    Release notes

    Sourced from numpy's releases.

    v1.22.0

    NumPy 1.22.0 Release Notes

    NumPy 1.22.0 is a big release featuring the work of 153 contributors spread over 609 pull requests. There have been many improvements, highlights are:

    • Annotations of the main namespace are essentially complete. Upstream is a moving target, so there will likely be further improvements, but the major work is done. This is probably the most user visible enhancement in this release.
    • A preliminary version of the proposed Array-API is provided. This is a step in creating a standard collection of functions that can be used across application such as CuPy and JAX.
    • NumPy now has a DLPack backend. DLPack provides a common interchange format for array (tensor) data.
    • New methods for quantile, percentile, and related functions. The new methods provide a complete set of the methods commonly found in the literature.
    • A new configurable allocator for use by downstream projects.

    These are in addition to the ongoing work to provide SIMD support for commonly used functions, improvements to F2PY, and better documentation.

    The Python versions supported in this release are 3.8-3.10, Python 3.7 has been dropped. Note that 32 bit wheels are only provided for Python 3.8 and 3.9 on Windows, all other wheels are 64 bits on account of Ubuntu, Fedora, and other Linux distributions dropping 32 bit support. All 64 bit wheels are also linked with 64 bit integer OpenBLAS, which should fix the occasional problems encountered by folks using truly huge arrays.

    Expired deprecations

    Deprecated numeric style dtype strings have been removed

    Using the strings "Bytes0", "Datetime64", "Str0", "Uint32", and "Uint64" as a dtype will now raise a TypeError.

    (gh-19539)

    Expired deprecations for loads, ndfromtxt, and mafromtxt in npyio

    numpy.loads was deprecated in v1.15, with the recommendation that users use pickle.loads instead. ndfromtxt and mafromtxt were both deprecated in v1.17 - users should use numpy.genfromtxt instead with the appropriate value for the usemask parameter.

    (gh-19615)

    ... (truncated)

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 1
  • Bump numpy from 1.20.2 to 1.22.0 in /cookbook/old-cookbook/python/uff_custom_plugin

    Bump numpy from 1.20.2 to 1.22.0 in /cookbook/old-cookbook/python/uff_custom_plugin

    Bumps numpy from 1.20.2 to 1.22.0.

    Release notes

    Sourced from numpy's releases.

    v1.22.0

    NumPy 1.22.0 Release Notes

    NumPy 1.22.0 is a big release featuring the work of 153 contributors spread over 609 pull requests. There have been many improvements, highlights are:

    • Annotations of the main namespace are essentially complete. Upstream is a moving target, so there will likely be further improvements, but the major work is done. This is probably the most user visible enhancement in this release.
    • A preliminary version of the proposed Array-API is provided. This is a step in creating a standard collection of functions that can be used across application such as CuPy and JAX.
    • NumPy now has a DLPack backend. DLPack provides a common interchange format for array (tensor) data.
    • New methods for quantile, percentile, and related functions. The new methods provide a complete set of the methods commonly found in the literature.
    • A new configurable allocator for use by downstream projects.

    These are in addition to the ongoing work to provide SIMD support for commonly used functions, improvements to F2PY, and better documentation.

    The Python versions supported in this release are 3.8-3.10, Python 3.7 has been dropped. Note that 32 bit wheels are only provided for Python 3.8 and 3.9 on Windows, all other wheels are 64 bits on account of Ubuntu, Fedora, and other Linux distributions dropping 32 bit support. All 64 bit wheels are also linked with 64 bit integer OpenBLAS, which should fix the occasional problems encountered by folks using truly huge arrays.

    Expired deprecations

    Deprecated numeric style dtype strings have been removed

    Using the strings "Bytes0", "Datetime64", "Str0", "Uint32", and "Uint64" as a dtype will now raise a TypeError.

    (gh-19539)

    Expired deprecations for loads, ndfromtxt, and mafromtxt in npyio

    numpy.loads was deprecated in v1.15, with the recommendation that users use pickle.loads instead. ndfromtxt and mafromtxt were both deprecated in v1.17 - users should use numpy.genfromtxt instead with the appropriate value for the usemask parameter.

    (gh-19615)

    ... (truncated)

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 1
  • LayerNorm Demo貌似跑不通,并且有一些错误。

    LayerNorm Demo貌似跑不通,并且有一些错误。

    文件地址:https://github.com/NVIDIA/trt-samples-for-hackathon-cn/blob/master/cookbook/06-PluginAndParser/pyTorch-LayerNorm/main.py

    错误1:用onnx-graphsurgeon修改的时候报out of index, 原因是因为div后面没有新节点了,原因是因为torch.mul(x,1)被onnx忽略了。 解决方法:将61行与63行的 x = t.mul(x, 1)改成 x = t.mul(x, 1.0) 即可。

    错误2:125行与129行的onnxFile应该写错了吧,这个demo是修改onnx节点改成LayerNorm节点,按理应该是运行修改后的onnx文件onnxSurgeonFile才对。改完后check结果为False。

    错误3:58行的LayerNorm应该只要第三个维度就行了,作者写了三个维度,t.nn.LayerNorm([nBS, nSL, nEmbedding], elementwise_affine=False, eps=epsilon)改成t.nn.LayerNorm(nEmbedding, elementwise_affine=False, eps=epsilon)即可

    opened by Tlntin 1
  • Update MNISTExample.py

    Update MNISTExample.py

    bug fix in cookbook/03, where we should state stride_nd in TensorRT 8.2.3(the docker image provided in the repo), but not in TensorRT 8.0.3.4(It seem that the preset values are different). If not, the shape of pooling output is wrong, resulting in build engine failure.

    opened by jedibobo 1
  • Problems in CookBook Demo 08 polygraphy

    Problems in CookBook Demo 08 polygraphy

    I wonder what tensorflow version are this repo using? For the containner you give, the version of CUDA is 11.6, so I install tf 's newest version. I tried the newest tf2.8.0-gpu, and a lot of problems arises from the tf version., like: tfgraph_util.convert_variables_to_constants, initializer=tf.truncated_normal_initializer,tf.graph_util.convert_variables_to_constants and tf.gfile.FastGFile, all these are not recognized by tf2.8.0. So I change all of those to tf.compat.v1.***, run and found no errors. Maybe I could commit my changes and create a pr.

    opened by jedibobo 1
  • cookbook-04Parser-pytorch-onnx-tensorrt issues

    cookbook-04Parser-pytorch-onnx-tensorrt issues

    my env : wsl-20.04 tensorrt 8.4 cuda 11.3 torch 1.10 onnx 1.12.0 test code master branch main.py when i test the example code the tensorrt output is all "0 0 0 0 0 0 0 0" and i have no method to solve the problem the c++ code is same

    https://github.com/NVIDIA/trt-samples-for-hackathon-cn/issues/59#tasklist-block-da5de874-6d3a-4b8d-aab9-703667626d94

    opened by F0xZz 0
  • failed reading calibration cache via c++ api

    failed reading calibration cache via c++ api

    https://github.com/NVIDIA/trt-samples-for-hackathon-cn/blob/88fe8012cd8cc5c09a63b18ecb91be5bd89a6a9f/cookbook/04-Parser/pyTorch-ONNX-TensorRT/C%2B%2B/calibrator.cpp#L58-L73

    operator >> in line 70 only reads the first word before white space or \n

    opened by KexianShen 0
  • Bump opencv-python-headless from 3.4.16.59 to 4.2.0.32 in /cookbook

    Bump opencv-python-headless from 3.4.16.59 to 4.2.0.32 in /cookbook

    Bumps opencv-python-headless from 3.4.16.59 to 4.2.0.32.

    Release notes

    Sourced from opencv-python-headless's releases.

    3.4.18.65

    opencv-python: https://pypi.org/project/opencv-python/ opencv-contrib-python: https://pypi.org/project/opencv-contrib-python/ opencv-python-headless: https://pypi.org/project/opencv-python-headless/ opencv-contrib-python-headless: https://pypi.org/project/opencv-contrib-python-headless/

    OpenCV 3.4.18

    Changes:

    • Updated third-party libraries to fix potential vulnerabilities. #666
    • Added support for building Windows ARM64 Python package. #644
    • The repository has been synchronized with scikit-build 0.14.0 release. #637
    • This release produced with libpng 1.6.37 and supports eXIf orientation tag. #662

    3.4.17.63

    opencv-python: https://pypi.org/project/opencv-python/ opencv-contrib-python: https://pypi.org/project/opencv-contrib-python/ opencv-python-headless: https://pypi.org/project/opencv-python-headless/ opencv-contrib-python-headless: https://pypi.org/project/opencv-contrib-python-headless/

    OpenCV 3.4.17

    Changes:

    • Updated third-party libraries to fix potential vulnerabilities. #617

    3.4.17.61

    opencv-python: https://pypi.org/project/opencv-python/ opencv-contrib-python: https://pypi.org/project/opencv-contrib-python/ opencv-python-headless: https://pypi.org/project/opencv-python-headless/ opencv-contrib-python-headless: https://pypi.org/project/opencv-contrib-python-headless/

    OpenCV 3.4.17

    Changes:

    • Switched to a single binary release with Python Limited API to cover all Python versions since 3.6. #595
    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 0
  • Bump protobuf from 3.16.0 to 3.18.3 in /cookbook/08-Tool/trex

    Bump protobuf from 3.16.0 to 3.18.3 in /cookbook/08-Tool/trex

    Bumps protobuf from 3.16.0 to 3.18.3.

    Release notes

    Sourced from protobuf's releases.

    Protocol Buffers v3.18.3

    C++

    Protocol Buffers v3.16.1

    Java

    • Improve performance characteristics of UnknownFieldSet parsing (#9371)

    Protocol Buffers v3.18.2

    Java

    • Improve performance characteristics of UnknownFieldSet parsing (#9371)

    Protocol Buffers v3.18.1

    Python

    • Update setup.py to reflect that we now require at least Python 3.5 (#8989)
    • Performance fix for DynamicMessage: force GetRaw() to be inlined (#9023)

    Ruby

    • Update ruby_generator.cc to allow proto2 imports in proto3 (#9003)

    Protocol Buffers v3.18.0

    C++

    • Fix warnings raised by clang 11 (#8664)
    • Make StringPiece constructible from std::string_view (#8707)
    • Add missing capability attributes for LLVM 12 (#8714)
    • Stop using std::iterator (deprecated in C++17). (#8741)
    • Move field_access_listener from libprotobuf-lite to libprotobuf (#8775)
    • Fix #7047 Safely handle setlocale (#8735)
    • Remove deprecated version of SetTotalBytesLimit() (#8794)
    • Support arena allocation of google::protobuf::AnyMetadata (#8758)
    • Fix undefined symbol error around SharedCtor() (#8827)
    • Fix default value of enum(int) in json_util with proto2 (#8835)
    • Better Smaller ByteSizeLong
    • Introduce event filters for inject_field_listener_events
    • Reduce memory usage of DescriptorPool
    • For lazy fields copy serialized form when allowed.
    • Re-introduce the InlinedStringField class
    • v2 access listener
    • Reduce padding in the proto's ExtensionRegistry map.
    • GetExtension performance optimizations
    • Make tracker a static variable rather than call static functions
    • Support extensions in field access listener
    • Annotate MergeFrom for field access listener
    • Fix incomplete types for field access listener
    • Add map_entry/new_map_entry to SpecificField in MessageDifferencer. They record the map items which are different in MessageDifferencer's reporter.
    • Reduce binary size due to fieldless proto messages
    • TextFormat: ParseInfoTree supports getting field end location in addition to start.

    ... (truncated)

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 0
  • Bump protobuf from 3.16.0 to 3.18.3 in /cookbook

    Bump protobuf from 3.16.0 to 3.18.3 in /cookbook

    Bumps protobuf from 3.16.0 to 3.18.3.

    Release notes

    Sourced from protobuf's releases.

    Protocol Buffers v3.18.3

    C++

    Protocol Buffers v3.16.1

    Java

    • Improve performance characteristics of UnknownFieldSet parsing (#9371)

    Protocol Buffers v3.18.2

    Java

    • Improve performance characteristics of UnknownFieldSet parsing (#9371)

    Protocol Buffers v3.18.1

    Python

    • Update setup.py to reflect that we now require at least Python 3.5 (#8989)
    • Performance fix for DynamicMessage: force GetRaw() to be inlined (#9023)

    Ruby

    • Update ruby_generator.cc to allow proto2 imports in proto3 (#9003)

    Protocol Buffers v3.18.0

    C++

    • Fix warnings raised by clang 11 (#8664)
    • Make StringPiece constructible from std::string_view (#8707)
    • Add missing capability attributes for LLVM 12 (#8714)
    • Stop using std::iterator (deprecated in C++17). (#8741)
    • Move field_access_listener from libprotobuf-lite to libprotobuf (#8775)
    • Fix #7047 Safely handle setlocale (#8735)
    • Remove deprecated version of SetTotalBytesLimit() (#8794)
    • Support arena allocation of google::protobuf::AnyMetadata (#8758)
    • Fix undefined symbol error around SharedCtor() (#8827)
    • Fix default value of enum(int) in json_util with proto2 (#8835)
    • Better Smaller ByteSizeLong
    • Introduce event filters for inject_field_listener_events
    • Reduce memory usage of DescriptorPool
    • For lazy fields copy serialized form when allowed.
    • Re-introduce the InlinedStringField class
    • v2 access listener
    • Reduce padding in the proto's ExtensionRegistry map.
    • GetExtension performance optimizations
    • Make tracker a static variable rather than call static functions
    • Support extensions in field access listener
    • Annotate MergeFrom for field access listener
    • Fix incomplete types for field access listener
    • Add map_entry/new_map_entry to SpecificField in MessageDifferencer. They record the map items which are different in MessageDifferencer's reporter.
    • Reduce binary size due to fieldless proto messages
    • TextFormat: ParseInfoTree supports getting field end location in addition to start.

    ... (truncated)

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 0
Owner
NVIDIA Corporation
NVIDIA Corporation
TensorFlow Lite, Coral Edge TPU samples (Python/C++, Raspberry Pi/Windows/Linux).

TensorFlow Lite, Coral Edge TPU samples (Python/C++, Raspberry Pi/Windows/Linux).

Nobuo Tsukamoto 87 Nov 16, 2022
AI-related samples made available by the DevTech ProViz team

ProViz-AI Samples This repository is a collection of AI-related samples, developed and provided by the DevTech ProViz team. Each folder in the reposit

NVIDIA Corporation 7 Nov 18, 2022
NVIDIA Texture Tools samples for compression, image processing, and decompression.

NVTT 3 Samples This repository contains a number of samples showing how to use NVTT 3, a GPU-accelerated texture compression and image processing libr

NVIDIA DesignWorks Samples 33 Nov 16, 2022
Implement yolov5 with Tensorrt C++ api, and integrate batchedNMSPlugin. A Python wrapper is also provided.

yolov5 Original codes from tensorrtx. I modified the yololayer and integrated batchedNMSPlugin. A yolov5s.wts is provided for fast demo. How to genera

weiwei zhou 45 Oct 21, 2022
TensorRT int8 量化部署 yolov5s 4.0 模型,实测3.3ms一帧!

tensorrt模型推理 git clone https://github.com/Wulingtian/yolov5_tensorrt_int8.git(求star) cd yolov5_tensorrt_int8 vim CMakeLists.txt 修改USER_DIR参数为自己的用户根目录

null 116 Nov 19, 2022
TensorRT implementation of RepVGG models from RepVGG: Making VGG-style ConvNets Great Again

RepVGG RepVGG models from "RepVGG: Making VGG-style ConvNets Great Again" https://arxiv.org/pdf/2101.03697.pdf For the Pytorch implementation, you can

weiwei zhou 69 Sep 10, 2022
Deep Learning API and Server in C++11 support for Caffe, Caffe2, PyTorch,TensorRT, Dlib, NCNN, Tensorflow, XGBoost and TSNE

Open Source Deep Learning Server & API DeepDetect (https://www.deepdetect.com/) is a machine learning API and server written in C++11. It makes state

JoliBrain 2.4k Nov 28, 2022
Support Yolov4/Yolov3/Centernet/Classify/Unet. use darknet/libtorch/pytorch to onnx to tensorrt

ONNX-TensorRT Yolov4/Yolov3/CenterNet/Classify/Unet Implementation Yolov4/Yolov3 centernet INTRODUCTION you have the trained model file from the darkn

null 170 Nov 29, 2022
vs2015上使用tensorRT加速yolov5推理(Using tensorrt to accelerate yolov5 reasoning on vs2015)

1、安装环境 CUDA10.2 TensorRT7.2 OpenCV3.4(工程中已给出,不需安装) vs2015 下载相关工程:https://github.com/wang-xinyu/tensorrtx.git 2、生成yolov5s.wts文件 在生成yolov5s.wts前,首先需要下载模

null 16 Apr 19, 2022
Inference framework for MoE layers based on TensorRT with Python binding

InfMoE Inference framework for MoE-based models, based on a TensorRT custom plugin named MoELayerPlugin (including Python binding) that can run infere

Shengqi Chen 34 Nov 25, 2022
TensorRT for Scaled YOLOv4(yolov4-csp.cfg)

TensoRT Scaled YOLOv4 TensorRT for Scaled YOLOv4(yolov4-csp.cfg) 很多人都写过TensorRT版本的yolo了,我也来写一个。 测试环境 ubuntu 18.04 pytorch 1.7.1 jetpack 4.4 CUDA 11.0

Bolano 10 Jul 30, 2021
YOLOv4 accelerated wtih TensorRT and multi-stream input using Deepstream

Deepstream 5.1 YOLOv4 App This Deepstream application showcases YOLOv4 running at high FPS throughput! P.S - Click the gif to watch the entire video!

Akash James 35 Nov 10, 2022
C++ library based on tensorrt integration

3行代码实现YoloV5推理,TensorRT C++库 支持最新版tensorRT8.0,具有最新的解析器算子支持 支持静态显性batch size,和动态非显性batch size,这是官方所不支持的 支持自定义插件,简化插件的实现过程 支持fp32、fp16、int8的编译 优化代码结构,打印

手写AI 1.4k Nov 23, 2022
A multi object tracking Library Based on tensorrt

YoloV5_JDE_TensorRT_for_Track Introduction A multi object detect and track Library Based on tensorrt 一个基于TensorRT的多目标检测和跟踪融合算法库,可以同时支持行人的多目标检测和跟踪,当然也可

zwg_cv 48 Nov 25, 2022
(ROS) YOLO detection with TensorRT, utilizing tkDNN

tkDNN-ROS YOLO object detection with ROS and TensorRT using tkDNN Currently, only YOLO is supported. Comparison of performance and other YOLO implemen

EungChang-Mason-Lee 5 Sep 4, 2022
An Out-of-the-Box TensorRT-based Framework for High Performance Inference with C++/Python Support

An Out-of-the-Box TensorRT-based Framework for High Performance Inference with C++/Python Support

手写AI 1.4k Nov 22, 2022
Real-time object detection with YOLOv5 and TensorRT

YOLOv5-TensorRT The goal of this library is to provide an accessible and robust method for performing efficient, real-time inference with YOLOv5 using

Noah van der Meer 37 Nov 19, 2022
Hardware-accelerated DNN model inference ROS2 packages using NVIDIA Triton/TensorRT for both Jetson and x86_64 with CUDA-capable GPU.

Isaac ROS DNN Inference Overview This repository provides two NVIDIA GPU-accelerated ROS2 nodes that perform deep learning inference using custom mode

NVIDIA Isaac ROS 52 Nov 24, 2022
An R3D network implemented with TensorRT

r3d_TensorRT An r3d network implemented with TensorRT8.x, The weight of the model comes from PyTorch. A description of the models in Pytroch can be fo

null 2 Nov 7, 2021