Optimized & Generic ML Filter Runtimes for VapourSynth (with builtin support for waifu2x, RealESRGANv2 & DPIR)

Overview

vs-mlrt

VapourSynth ML filter runtimes.

Please see the wiki for supported models.

vsov: OpenVINO-based Pure CPU Runtime

OpenVINO is an AI inference runtime developed by Intel, mainly targeting x86 CPUs and Intel GPUs.

The vs-openvino plugin provides an optimized pure CPU runtime for some popular AI filters, with Intel GPU support planned for the future.

To install, download the latest release and extract it into your VS plugins directory.

Please visit the vsov directory for details.

vsort: ONNX Runtime-based CPU/GPU Runtime

ONNX Runtime is an AI inference runtime with many backends.

The vs-onnxruntime plugin provides optimized CPU and CUDA GPU runtimes for some popular AI filters.

To install, download the latest release and extract it into your VS plugins directory.

Please visit the vsort directory for details.

vstrt: TensorRT-based GPU Runtime

TensorRT is a highly optimized AI inference runtime for NVidia GPUs. It uses benchmarking to find the optimal kernel to use for your specific GPU, so there is an extra step: building an engine from the ONNX network on the machine where you will run the vstrt filter. This extra step makes deploying models a little harder than with the other runtimes. However, the resulting performance is also typically much better than that of the CUDA backend of vsort.
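
A minimal sketch using the vsmlrt.py wrapper, which drives the engine build transparently (the model and clip here are placeholders):

    import vapoursynth as vs
    from vsmlrt import DPIR, DPIRModel, Backend

    core = vs.core
    src = core.std.BlankClip(format=vs.RGBS)
    # the first run builds and caches a TensorRT engine for this exact
    # configuration; later runs reuse the cached engine
    flt = DPIR(src, strength=5, model=DPIRModel.drunet_color, backend=Backend.TRT())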

To install, download the latest release and extract it into your VS plugins directory.

Please visit the vstrt directory for details.

Issues
  • VRAM and color issues

    Calling the function multiple times makes VRAM usage stack up until it overflows (out of memory), while upcunet_v3_vs has no such problem.

Example script:

    import vapoursynth as vs
    import mvsfunc as mvf
    import upcunet_v3_vs as realcugan
    from vsmlrt import *

    realcugan = realcugan.RealWaifuUpScaler()

    core = vs.core
    src = r"123.jpg"
    src = core.ffms2.Source(src)

    def upscale(clip):
        # clip = realcugan(clip)  # the upcunet_v3_vs path, which does not leak
        clip = CUGAN(clip, noise=0, scale=2, backend=Backend.ORT_CUDA())
        return clip

    src = mvf.ToRGB(src, depth=32, matrix="709")
    src = core.std.Expr(src, "x 0 max 1 min")
    src2 = src
    src = upscale(src)
    src2 = upscale(src2)

    res = core.std.Splice([src, src2])
    res.set_output()
    

Issue 2: color problem (alpha=1, model=conservative)

Original image: 156495643-ab912d3a-533a-4868-ab73-431583f01067 · Output: Snipaste_2022-04-26_18-51-24

    def upscale(clip):
        clip = realcugan(clip)
        return clip

    def upscale2(clip):
        clip = CUGAN(clip, noise=0, scale=2, tilesize=300, backend=Backend.ORT_CUDA())
        return clip

    # ...
    res = core.std.StackHorizontal([src, src2])
    res.set_output()
    
opened by ueyome · 8 comments
  • Waifu2x-cunet brightness bug

    The current cunet-model version makes the output darker than it should be.

It is a known bug in VS's Waifu2x ports: it happens in the caffe port and was introduced to ncnn-vulkan in R4; ncnn-vulkan R3.2 is the only one that doesn't have the bug.

Here is a comparison showing the input and the difference between a spline resize, vs-mlrt v2 and vs-mlrt v8; apparently the bug was introduced after v2: https://slow.pics/c/myFP5fXB

opened by dnjulek · 9 comments
Releases(v9.2)
  • v9.2(Aug 7, 2022)

    Fixed issues

    • In vs-mlrt v9 and v9.1 on Windows, the ORT_CUDA backend may fail with an out-of-memory error when processing a non-initial frame. This has been fixed, and performance should be improved.
    • Parameter use_cuda_graph of the ORT_CUDA backend now works properly on Windows. However, using it is currently not recommended.

    Full Changelog: https://github.com/AmusementClub/vs-mlrt/compare/v9.1...v9.2

    Source code(tar.gz)
    Source code(zip)
    models.v9.2.7z(633.40 MB)
    scripts.v9.2.7z(6.59 KB)
    vsmlrt-cuda.v9.2.7z(834.21 MB)
    vsmlrt-windows-x64-cpu.v9.2.7z(651.70 MB)
    vsmlrt-windows-x64-cuda.v9.2.7z(1707.95 MB)
    VSORT-Windows-x64.v9.2.7z(12.63 MB)
    VSOV-Windows-x64.v9.2.7z(11.47 MB)
    VSTRT-Windows-x64.v9.2.7z(373.97 KB)
  • v9.1(Jul 28, 2022)

Bugfix release for v9 and a recommended update for v9 users. Please see the v9 release notes for all the major new features.

• Fixed ort_cuda fp16 inference for the CUGAN(version=2) model.

A new parameter fp16_blacklist_ops has been introduced in the ort and ov backends for other issues possibly related to reduced precision.

      Please still carefully review the output of fp16 accelerated CUGAN(version=2).

• Conform to CUGAN(version=2)'s dynamic range compression. This feature is enabled by setting conformance=True (the default) in the CUGAN wrapper in vsmlrt.py, and it is implemented as:

      clip = clip.std.Expr("x 0.7 * 0.15 +")
      clip = CUGAN(clip, version=2)
      clip = clip.std.Expr("x 0.15 - 0.7 /")
      

    Known issues

    • These two issues are fixed in the v9.2 release.
      • The ORT_CUDA backend allocates memory during inference. This degrades performance and may result in an out-of-memory error.
      • Parameter use_cuda_graph of the ORT_CUDA backend is broken on Windows.

    Full Changelog: https://github.com/AmusementClub/vs-mlrt/compare/v9...v9.1

    Source code(tar.gz)
    Source code(zip)
    models.v9.1.7z(633.40 MB)
    scripts.v9.1.7z(6.59 KB)
    vsmlrt-cuda.v9.1.7z(834.21 MB)
    vsmlrt-windows-x64-cpu.v9.1.7z(651.70 MB)
    vsmlrt-windows-x64-cuda.v9.1.7z(1707.95 MB)
    VSORT-Windows-x64.v9.1.7z(12.63 MB)
    VSOV-Windows-x64.v9.1.7z(11.47 MB)
    VSTRT-Windows-x64.v9.1.7z(374.34 KB)
  • v9(Mar 25, 2022)

    This is a major release.

    • Added support for Intel GPUs (both discrete [Xe Arc series] and integrated [Gen 8+ on Broadwell+])

      • In vsmlrt.py, this corresponds to the OV_GPU backend.
      • The openvino library is now dynamically linked because of the integration of oneDNN for GPU.
    • Added support for RealESRGANv3 and cugan-pro models.

    • Upgraded CUDA toolkit to 11.7.0, TensorRT to 8.4.1 and cuDNN to 8.4.1. It is now possible to build TRT engines for CUGAN, waifu2x cunet and upresnet10 models on RTX 2000 and RTX 3000 series GPUs.

    • The trt backend in the vsmlrt.py wrapper now creates a log file for trtexec output in the TEMP directory (this only works when using the bundled trtexec.exe). The log file is only retained if trtexec fails (and the vsmlrt exception message will include the full path of the log file). If you want the log to go to a specific file, set the environment variable TRTEXEC_LOG_FILE to the absolute path of the log file. If you don't want this behavior, set log=False when creating the backend, as sketched below.
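
      For example (the log path here is hypothetical):

        import os
        import vsmlrt

        # send trtexec output to a specific file (hypothetical path)
        os.environ["TRTEXEC_LOG_FILE"] = r"C:\Temp\trtexec.log"
        # or disable the log file behavior entirely
        backend = vsmlrt.Backend.TRT(log=False)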

    • The cuda bundles now include VC runtime DLLs as well, so trtexec.exe should run even on systems without proper VC runtime redistributable packages installed (e.g. freshly installed Windows).

    • The ov backend can now configure model compilation via config. Available configurations can be found here.

      • Example:

        core.ov.Model(..., config = lambda: dict(CPU_THROUGHPUT_STREAMS=core.num_threads, CPU_BIND_THREAD="NO"))
        

        This configuration may be useful in improving processor utilization at the expense of significantly increased memory consumption (only try this if you have a huge number of cores underutilized by the default settings.)

        The equivalent form for the python wrapper is

        backend = vsmlrt.Backend.OV_CPU(num_streams=core.num_threads, bind_thread=False)
        
    • When using the vsmlrt.py wrapper, it will no longer create temporary onnx files (e.g. when using non-default alpha CUGAN parameters). Instead, the modified ONNX network is passed directly into the various ML runtime filters. Those filters now support (network_path=b'raw onnx protobuf serialization', path_is_serialization=True) for this, as sketched below. This feature also opens the door to generating ONNX on the fly (e.g. ever dreamed of GPU-accelerated 2d-convolution or std.Expr?)
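
      A minimal sketch of this interface, assuming the Python onnx package and the vsov filter (the same keywords apply to the other runtimes):

        import onnx
        import vapoursynth as vs

        core = vs.core
        src = core.std.BlankClip(format=vs.RGBS)
        model = onnx.load("model.onnx")  # hypothetical model file
        # ... modify the in-memory ONNX graph here ...
        flt = core.ov.Model(
            src,
            network_path=model.SerializeToString(),  # raw onnx protobuf serialization
            path_is_serialization=True,
        )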

    Update Instructions

    1. Delete the previous vsmlrt-cuda, vsov, vsort and vstrt directories and vsov.dll, vsort.dll and vstrt.dll from your VS plugins directory, and then extract the newly released files. (Specifically, do not leave files from the previous version around and just overwrite them with the new release, as the new release might have removed some files in those four directories.)
    2. Replace vsmlrt.py in your Python package directory.
    3. Update the models directories by overwriting them with the new release. (Models are generally append-only. We will make special notices and bump the model release tag if we change any previously released models.)

    Compatibility Notes

vsmlrt.py in this release is not compatible with binaries from previous releases; only script-level compatibility is maintained. Generally, please make sure to upgrade the filters and vsmlrt.py as a whole.

We strive to maintain script source-level compatibility as much as possible (i.e. there won't be a great api4-style breakage), which means scripts written for v7 (for example) will continue to function for the foreseeable future. Minor issues (like the non-monotonic denoise setting of cugan) will be documented instead of fixed with a breaking change.

    Known issue

CUGAN(version=2) (a.k.a. cugan-pro) may produce a blank clip when using the ORT_CUDA(fp16) backend. This is fixed in the v10 release.

    Full Changelog: https://github.com/AmusementClub/vs-mlrt/compare/v8...v9

    Source code(tar.gz)
    Source code(zip)
    models.v9.7z(633.40 MB)
    scripts.v9.7z(6.39 KB)
    vsmlrt-cuda.v9.7z(834.21 MB)
    vsmlrt-windows-x64-cpu.v9.7z(651.69 MB)
    vsmlrt-windows-x64-cuda.v9.7z(1707.95 MB)
    VSORT-Windows-x64.v9.7z(12.63 MB)
    VSOV-Windows-x64.v9.7z(11.46 MB)
    VSTRT-Windows-x64.v9.7z(374.23 KB)
  • v8(Mar 12, 2022)

    • This release upgrades the cuda libraries to their latest version. Models are observed to be accelerated by ~1.1x.
    • vsmlrt.CUGAN() now accepts a new parameter alpha, which controls the strength of filtering. Setting alpha to non-default values requires the Python onnx package (but this might change in the future).
    • Added a tf32 parameter to the trt backend in vsmlrt.py. TF32 acceleration is enabled by default on Ampere GPUs, mostly for fp32 inference; it has no effect on other architectures. Both new parameters appear in the sketch below.
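
    A hedged sketch of the two new parameters (the values here are illustrative only):

      import vapoursynth as vs
      import vsmlrt

      core = vs.core
      src = core.std.BlankClip(format=vs.RGBS)
      # alpha adjusts filtering strength; non-default values need the onnx package
      flt = vsmlrt.CUGAN(src, noise=0, scale=2, alpha=0.5,
                         backend=vsmlrt.Backend.TRT(tf32=True))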
    Source code(tar.gz)
    Source code(zip)
    models.v8.7z(603.70 MB)
    scripts.v8.7z(4.87 KB)
    vsmlrt-cuda.v8.7z(847.24 MB)
    vsmlrt-windows-x64-cpu.v8.7z(613.77 MB)
    vsmlrt-windows-x64-cuda.v8.7z(1694.63 MB)
    VSORT-Windows-x64.v8.7z(12.68 MB)
    VSOV-Windows-x64.v8.7z(6.15 MB)
    VSTRT-Windows-x64.v8.7z(355.66 KB)
  • v7(Jan 27, 2022)

    This release adds support for bilibili's Real-CUGAN, please refer to the wiki for details.

    Special notes for CUGAN:

    1. Make sure the RGBS input to CUGAN is within the [0,1] range (if in doubt, it's better to use core.std.Expr(input, "x 0 max 1 min") to condition the input before feeding the NN; fmtconv YUV2RGB might generate out-of-range RGB values). Out-of-range values will trip the NN into producing large negative values.
    2. Do not use tiling (i.e. you must set tiles=1), as CUGAN requires access to the entire input frame for its depth detection mechanism to work. Both notes are combined in the sketch below.
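
    A minimal sketch combining both notes (the source clip is a placeholder):

      import vapoursynth as vs
      from vsmlrt import CUGAN, Backend

      core = vs.core
      src = core.std.BlankClip(format=vs.RGBS)
      # clamp the RGBS input to [0, 1] before feeding the network
      src = core.std.Expr(src, "x 0 max 1 min")
      # tiles=1: CUGAN needs the entire frame for its depth detection
      flt = CUGAN(src, noise=0, scale=2, tiles=1, backend=Backend.ORT_CUDA())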

    Compared to v6, only scripts.v7.7z, models.v7.7z, vsmlrt-windows-x64-cpu.v7.7z and vsmlrt-windows-x64-cuda.v7.7z files are updated.

    Source code(tar.gz)
    Source code(zip)
    models.v7.7z(603.70 MB)
    scripts.v7.7z(4.38 KB)
    vsmlrt-cuda.v7.7z(906.05 MB)
    vsmlrt-windows-x64-cpu.v7.7z(613.76 MB)
    vsmlrt-windows-x64-cuda.v7.7z(1736.62 MB)
    VSORT-Windows-x64.v7.7z(12.74 MB)
    VSOV-Windows-x64.v7.7z(6.15 MB)
    VSTRT-Windows-x64.v7.7z(351.28 KB)
  • v6(Jan 20, 2022)

This release contains some performance optimizations for the vs-trt plugin. The general takeaway is that vs-trt can beat all benchmarked solutions on the DPIR, waifu2x and RealESRGANv2 models. Specific highlights are as follows:

    • waifu2x: when using CPU, vs-ov beats waifu2x-w2xc by 2.7x (Intel 32C64T); when using GPU, vs-ort/vs-trt beat vulkan-ncnn by ~4x.
    • DPIR: vs-trt beats existing implementations on both Volta (Tesla V100) and Ampere (A10) platforms (by up to 1.5x), and vs-ort saves a significant amount of GPU memory (by as much as 3.7x) compared to its counterpart.
    • RealESRGANv2: vs-trt, the only backend that utilizes TensorRT, is up to 3.3x faster than the reference implementation.

    Please see detailed benchmark results in the wiki:

    • waifu2x: https://github.com/AmusementClub/vs-mlrt/wiki/waifu2x#benchmarking
    • DPIR: https://github.com/AmusementClub/vs-mlrt/wiki/DPIR#benchmarking
    • RealESRGANv2: https://github.com/AmusementClub/vs-mlrt/wiki/RealESRGANv2#benchmarking

    This release also fixed the following two bugs:

    • vs-ov: some error messages from openvino were sent to stdout, affecting vspipe | x265 usage.
    • vs-ort/vs-ov: error in converting the RealESRGANv2 model to fp16 format.
    Source code(tar.gz)
    Source code(zip)
    models.v6.7z(552.03 MB)
    scripts.v6.7z(4.19 KB)
    vsmlrt-cuda.v6.7z(906.05 MB)
    vsmlrt-windows-x64-cpu.v6.7z(562.10 MB)
    vsmlrt-windows-x64-cuda.v6.7z(1684.95 MB)
    VSORT-Windows-x64.v6.7z(12.74 MB)
    VSOV-Windows-x64.v6.7z(6.15 MB)
    VSTRT-Windows-x64.v6.7z(351.50 KB)
  • v5(Dec 30, 2021)

    Changelog:

    1. added fp16 support to vs-ov and vs-ort (the input model is still fp32; these filters will convert it to fp16 on the fly). Now all three backends support inference with fp16 (though fp16 mainly benefits vs-ort's CUDA backend).
    2. ~~fixed vs-ov spurious logging messages to stdout which interfere with the vspipe | x265 pipeline (requires patched openvino)~~ Turns out the fix was not picked up by the CI. Please use v6 for vs-ov.
    3. changes to the vs-trt backend (vsmlrt.Backend.TRT()) of the vsmlrt.py wrapper:
      • max_shapes now defaults to the tile size (as tensorrt GPU memory usage is related to max_shapes rather than the actual shape used in inference, this should help save GPU memory);
      • the default opt_shapes is now None, which means it will be set to the actual tile size in use: this is especially beneficial for large models like DPIR. If you prefer faster engine build times, set opt_shapes=(64, 64) to restore the previous behavior (see the sketch after this list). This change also makes it easier to use the tiles parameter (as in that case you generally don't know the exact inference shape);
      • changed the default cache & engine directory: the engine and cache file are first saved to the same directory as the onnx model; if that is not writable, the system temporary directory (on the same drive as the onnx model files) is used;
      • fixed a bug when reusing the same backend variable for different filters
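
      For example, to restore the previous behavior:

        import vsmlrt

        # fixed opt shapes: faster engine builds, possibly slower inference
        backend = vsmlrt.Backend.TRT(opt_shapes=(64, 64))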

    vsmlrt-cuda and model packages are identical to v4.

PS: we have successfully used both ~~vs-ov and~~ vs-trt in production anime encodings, so this release should be ready for production. As always, issues and suggestions are welcome. Update: it turns out vs-ov is broken; the fix to openvino was not correctly picked up by the CI pipeline. Please use v6 for vs-ov.

    Source code(tar.gz)
    Source code(zip)
    models.v5.7z(552.03 MB)
    scripts.v5.7z(4.10 KB)
    vsmlrt-cuda.v5.7z(906.05 MB)
    vsmlrt-windows-x64-cpu.v5.7z(562.10 MB)
    vsmlrt-windows-x64-cuda.v5.7z(1684.95 MB)
    VSORT-Windows-x64.v5.7z(12.57 MB)
    VSOV-Windows-x64.v5.7z(6.14 MB)
    VSTRT-Windows-x64.v5.7z(351.59 KB)
  • v4(Dec 17, 2021)

    This release introduces the following features:

    • vsmlrt.py: added support for vs-trt (including transparent engine compilation)
    • added RealESRGANv2 models, see https://github.com/AmusementClub/vs-mlrt/releases/download/model-20211209/RealESRGANv2_v1.7z
    • full binary releases for Windows, which include the full set of models (waifu2x, RealESRGANv2 and DPIR) and all required DLLs. To simplify installation, we provide two variants:
      • CPU only: vsmlrt-windows-x64-cpu.v4.7z
      • CPU+CUDA: vsmlrt-windows-x64-cuda.v4.7z

      To install, just extract one of them into your VS plugins directory (preserving the existing directory structure within the 7z archive), then move vsmlrt.py into your VS python site-packages directory and you're done.

    Component Downloads

Besides the full releases, each individual component also has its own release, so that users can upgrade only what has changed:

    • models: full fp32 model release 20211209, includes waifu2x, RealESRGANv2 and DPIR.
    • scripts: vsmlrt.py wrapper script, extract to VS python site-packages directory
    • vsmlrt-cuda: shared CUDA DLLs for vs-ort and vs-trt
    • VSOV-Windows-x64: vs-ov plugin (pure CPU backend)
    • VSORT-Windows-x64: vs-ort plugin, includes both CPU and CUDA backend; CUDA backend requires vsmlrt-cuda package.
    • VSTRT-Windows-x64: vs-trt plugin, requires vsmlrt-cuda package.

    All component packages should be extracted to your VS plugins directory, except for scripts.v4.7z, which needs to be extracted to VS python site-packages directory.

    Known Issues

    1. Building a TRT engine for the waifu2x cunet and upresnet10 models will fail on RTX 2000 and RTX 3000 series GPUs; please use vsort if you have an affected GPU.
    2. Due to the way NVidia DLLs are named, there might be DLL conflicts if you also have other AI filters (e.g. waifu2x caffe) in your plugins directory. Due to licensing and Windows technical restrictions, there is no easy way to solve this DLL conflict problem: you will have to remove the conflicting plugins. Fortunately, the only affected plugin seems to be waifu2x caffe, and since the vsmlrt.py script already provides full functionality coverage with better performance, there is no reason to use the caffe plugin anymore.

    Installation Notes

    1. It is recommended to update to the latest GPU driver (e.g. >= v472.50) if you intend to use the CUDA backend of vsort or vstrt, for best performance and compatibility; however, GeForce GPU users with GPU driver >= v452.39 should also be able to use the CUDA backend.
    2. There are no changes to vsmlrt-cuda.7z from v3, so no need to re-download it if you already have it from v3.
    Source code(tar.gz)
    Source code(zip)
    models.v4.7z(552.03 MB)
    scripts.v4.7z(3.80 KB)
    vsmlrt-cuda.v4.7z(906.05 MB)
    vsmlrt-windows-x64-cpu.v4.7z(561.87 MB)
    vsmlrt-windows-x64-cuda.v4.7z(1684.26 MB)
    VSORT-Windows-x64.v4.7z(12.03 MB)
    VSOV-Windows-x64.v4.7z(6.12 MB)
    VSTRT-Windows-x64.v4.7z(350.99 KB)
  • v3(Dec 16, 2021)

    This release improves the interface of wrapper and plugins:

    • The argument pad is renamed to overlap, and it is now possible to specify different overlap values in each direction.
    • The arguments block_w and block_h are merged into a single argument tilesize.
    • vsmlrt.py now supports DPIR models. The type of the argument backend has changed to a typed data class. To use DPIR, you need to extract the v3 DPIR model files into your VS plugins\models directory (please keep the directory structure inside the 7z archive intact while extracting).

    Built-in models can be found at model-20211209.

    Example waifu2x wrapper usage:

    from vsmlrt import Waifu2x, Waifu2xModel, Backend
    
    src = core.std.BlankClip(format=vs.RGBS)
    
    # backend could be:
    #  - CPU Backend.OV_CPU(): the recommended CPU backend; generally faster than ORT-CPU.
    #  - CPU Backend.ORT_CPU(num_streams=1, verbosity=2): vs-ort's cpu backend.
    #  - GPU Backend.ORT_CUDA(device_id=0, cudnn_benchmark=True, num_streams=1, verbosity=2)
    #     - use device_id to select device
    #     - set cudnn_benchmark=False to reduce script reload latency when debugging, but with slight throughput performance penalty.
    flt = Waifu2x(src, noise=-1, scale=2, model=Waifu2xModel.upconv_7_anime_style_art_rgb, backend=Backend.ORT_CUDA())
    

    Example DPIR wrapper usage:

    from vsmlrt import DPIR, DPIRModel, Backend
    src = core.std.BlankClip(format=vs.RGBS) # or vs.GRAYS for gray only models
    # DPIR is a huge model and GPU backend is highly recommended.
    # If the model runs out of GPU memory, increase the tiles parameter.
    flt = DPIR(src, strength=5, model=DPIRModel.drunet_color, tiles=2, backend=Backend.ORT_CUDA())
    

    Known Issues

    1. Building a TRT engine for the waifu2x cunet and upresnet10 models will fail on RTX 2000 and RTX 3000 series GPUs; please use vsort if you have an affected GPU.
    2. Due to the way NVidia DLLs are named, there might be DLL conflicts if you also have other AI filters (e.g. waifu2x caffe) in your plugins directory. Due to licensing and Windows technical restrictions, there is no easy way to solve this DLL conflict problem: you will have to remove the conflicting plugins. Fortunately, the only affected plugin seems to be waifu2x caffe, and since the vsmlrt.py script already provides full functionality coverage with better performance, there is no reason to use the caffe plugin anymore.

    Installation Notes

    1. It is recommended to update to the latest GPU driver (e.g. >= v472.50) if you intend to use the CUDA backend of vsort or vstrt, for best performance and compatibility; however, GeForce GPU users with GPU driver >= v452.39 should also be able to use the CUDA backend.
    2. There are no changes to vsmlrt-cuda.7z from v2, so no need to re-download it if you already have it from v2.
    Source code(tar.gz)
    Source code(zip)
    vsmlrt-cuda.7z(906.51 MB)
    VSORT-Windows-x64.v3.zip(31.36 MB)
    VSOV-Windows-x64.v3.zip(11.41 MB)
    VSTRT-Windows-x64.v3.zip(613.29 KB)
  • v2(Dec 10, 2021)

This release introduces the vs-trt plugin, which should provide the best possible performance on NVidia GPUs at the expense of requiring an extra, tedious engine-building step. vs-trt is only recommended for large AI models, e.g. DPIR; smaller models like waifu2x won't see much performance benefit. Please refer to its docs for further usage instructions (and be forewarned: it's very hard to use unless you are prepared to spend some time understanding the process and doing some trial-and-error experiments).

If you use the GPU support of vsort or vstrt, you also need to download and extract vsmlrt-cuda.7z into your VS plugins directory (while keeping the directory structure inside the 7z file). The DLLs there are shared by vsort and vstrt. Please also note that vstrt requires the new models released in model-20211209.

This release also introduces builtin model support for vsov and vsort (as vstrt requires building engines separately, builtin model support is moot there). You can place the model onnx files under the VS plugins\models directory and set builtin=True for the vsov and vsort filters, so that the network_path argument is interpreted as a path relative to plugins\models. This mode makes it easier to build a portable VS release with integrated models. For example, after extracting waifu2x-v3.7z into your VS plugins\models directory (while keeping the directory structure inside the 7z file), you can do this to use the waifu2x models with vsmlrt.py without worrying about their absolute paths:

    from vsmlrt import Waifu2x, Waifu2xModel
    
    src = core.std.BlankClip(format=vs.RGBS)
    # backend could be: "ort-cpu", "ort-cuda", "ov-cpu"; suggested choice is "ov-cpu" for pure CPU and "ort-cuda" for GPU.
    flt = Waifu2x(src, noise=-1, scale=2, model=Waifu2xModel.upconv_7_anime_style_art_rgb, backend="ort-cuda")
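
    For comparison, a raw-filter sketch of the builtin mode itself (the relative model path and the provider argument are assumptions, not confirmed by this release):

      import vapoursynth as vs

      core = vs.core
      src = core.std.BlankClip(format=vs.RGBS)
      # with builtin=True, network_path is resolved relative to plugins\models
      flt = core.ort.Model(
          src,
          network_path="waifu2x/upconv_7_anime_style_art_rgb/scale2.0x_model.onnx",  # hypothetical path
          builtin=True,
          provider="CUDA",  # assumption: selects the CUDA backend
      )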
    

    vsmlrt-cuda.7z Changelog

    1. added nvrtc for vstrt dynamic layer fusion support; this is only necessary if you use vstrt. If you only intend to use vsort, you can download the smaller package vsmlrt-cuda-no-nvrtc.7z instead.

    Known Issues

    1. Building a TRT engine for the waifu2x cunet and upresnet10 models will fail on RTX 2000 and RTX 3000 series GPUs; please use vsort if you have an affected GPU.
    2. Due to the way NVidia DLLs are named, there might be DLL conflicts if you also have other AI filters (e.g. waifu2x caffe) in your plugins directory. Due to licensing and Windows technical restrictions, there is no easy way to solve this DLL conflict problem: you will have to remove the conflicting plugins. Fortunately, the only affected plugin seems to be waifu2x caffe, and since the vsmlrt.py script already provides full functionality coverage with better performance, there is no reason to use the caffe plugin anymore.

    Installation Notes

    1. please update to the latest GPU driver (e.g. >= v472.50) if you intend to use the CUDA backend of vsort or vstrt for best performance and compatibility.
    2. GeForce GPU users may use the v2b version of vsort which supports GPU driver >= v452.39.
    Source code(tar.gz)
    Source code(zip)
    vsmlrt-cuda-no-nvrtc.7z(894.86 MB)
    vsmlrt-cuda.7z(906.51 MB)
    VSORT-Windows-x64.v2.zip(11.48 MB)
    VSORT-Windows-x64.v2b.zip(31.36 MB)
    VSOV-Windows-x64.v2.zip(11.41 MB)
    VSTRT-Windows-x64.v2.zip(610.32 KB)
  • model-20211209(Dec 9, 2021)

    Model release 20211209

This requires plugin release v2 or above. Users of the v1 or v0 plugin releases should continue to use the previous model release.

    In general, we strive to keep previous model releases usable with newer plugin releases, but new model releases generally require newer plugin releases.

    Changelog

    1. Modified the input dimension to -1 to better support dynamic shapes and the upcoming vstrt plugin. vsov and vsort users can continue to use the last release (though upgrading is highly recommended).
    2. Added Real-CUGAN models
    3. Added cugan-pro and RealESRGANv3 models.
    Source code(tar.gz)
    Source code(zip)
    cugan-pro_v1.7z(26.75 MB)
    cugan_v2.7z(51.28 MB)
    dpir_v3.7z(459.91 MB)
    RealESRGANv2_v1.7z(4.32 MB)
    RealESRGANv3_v1.7z(2.19 MB)
    waifu2x_v3.7z(83.12 MB)
  • v1(Dec 4, 2021)

    Initial public preview vs-ort release.

    Changelog

    1. VSOV: moved tbb.dll into its own directory, so that we don't put any non-VS-plugin DLLs at the top level of the plugins directory.
    2. VSORT: initial release.

    Installation Notes

    • VSORT: ONNX Runtime
      • CPU only: extract VSORT-Windows-x64.zip into vapoursynth/plugins directory. You can additionally remove vsort/onnxruntime_providers_cuda.dll and vsort/onnxruntime_providers_shared.dll to save some disk space.
      • CUDA: extract both VSORT-Windows-x64.zip and vsmlrt-cuda.7z into vapoursynth/plugins directory.
    • VSOV: just extract VSOV-Windows-x64.zip into vapoursynth/plugins directory.

Please note that the CUDA libraries are huge (they require ~1.9 GiB of space after extraction).

    Please refer to the wiki for details.

    Source code(tar.gz)
    Source code(zip)
    vsmlrt-cuda.7z(728.05 MB)
    VSORT-Windows-x64.zip(10.99 MB)
    VSOV-Windows-x64.zip(11.41 MB)
  • v0(Dec 3, 2021)

  • model-20211203(Dec 3, 2021)

    Initial ONNX Model Release

    Waifu2x

Waifu2x is a well-known image super-resolution model for anime-style art.

Link: https://github.com/AmusementClub/vs-mlrt/releases/download/model-20211203/waifu2x_v2.7z
Docs: https://github.com/AmusementClub/vs-mlrt/wiki/waifu2x

    Includes all known publicly available waifu2x models:

    • anime_style_art: noise1 noise2 noise3 scale2.0x
    • anime_style_art_rgb: noise0 noise1 noise2 noise3 scale2.0x
    • upconv_7_anime_style_art_rgb: scale2.0x noise3_scale2.0x noise2_scale2.0x noise1_scale2.0x noise0_scale2.0x
    • photo: noise0 noise1 noise2 noise3 scale2.0x
    • ukbench: scale2.0x
    • upconv_7_photo: scale2.0x noise0_scale2.0x noise1_scale2.0x noise2_scale2.0x noise3_scale2.0x
    • cunet: noise0 noise1 noise2 noise3 scale2.0x noise0_scale2.0x noise1_scale2.0x noise2_scale2.0x noise3_scale2.0x
    • upresnet10: scale2.0x noise0_scale2.0x noise1_scale2.0x noise2_scale2.0x noise3_scale2.0x

    DPIR

DPIR, or Plug-and-Play Image Restoration with Deep Denoiser Prior, is a denoising and deblocking neural network. See also https://github.com/HolyWu/vs-dpir.

DPIR requires a strength parameter, which you need to pass in the form of a GRAYS clip (see the sketch below).
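
A hedged sketch of this calling convention (the strength scaling, clip layout and model path are assumptions based on common DPIR ports, not confirmed by this release):

    import vapoursynth as vs

    core = vs.core
    src = core.std.BlankClip(format=vs.RGBS)
    # assumption: strength is a constant GRAYS clip, scaled down by 255
    sigma = core.std.BlankClip(src, format=vs.GRAYS, color=5.0 / 255.0)
    flt = core.ov.Model([src, sigma], network_path="dpir/drunet_color.onnx")  # hypothetical path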

Link: https://github.com/AmusementClub/vs-mlrt/releases/download/model-20211203/dpir_v2.7z
Docs: https://github.com/AmusementClub/vs-mlrt/wiki/DPIR

    Includes these models:

    • drunet_gray: GRAY denoise
    • drunet_deblocking_grayscale: GRAY deblocking
    • drunet_color: RGB denoise
    • drunet_deblocking_color: RGB deblocking
    Source code(tar.gz)
    Source code(zip)
    dpir_v2.7z(460.14 MB)
    waifu2x_v2.7z(83.21 MB)
Owner
私立七森中ごらく部
#VapourSynth-Classic & YuruYuri: Bleeding Edge Anime Encoding