ORB_SLAM3_Monodepth

Introduction

This repository was forked from [ORB-SLAM3] (https://github.com/UZ-SLAMLab/ORB_SLAM3). ORB-SLAM3-Monodepth is an extended version of ORB-SLAM3 that uses a deep monocular depth estimation network, based on pre-trained [Monodepth2] (https://github.com/nianticlabs/monodepth2) models. The network is deployed with LibTorch and runs in an asynchronous thread, in parallel with ORB feature detection, to reduce runtime. The estimated metric depth is used to initialize map points and in the cost function, similar to the stereo/RGB-D case, and can significantly reduce scale drift in the monocular case. The approach is based on DVSO and CNN-SVO, which extended DSO and SVO, respectively, with a monocular depth network.
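The parallel execution described above can be illustrated with a minimal Python sketch. It is not the repository's C++/LibTorch implementation: estimate_depth and extract_features are hypothetical stand-ins that only simulate latency, and the constant depth value is made up for illustration. The sketch shows only the scheduling pattern, in which depth inference is started on a worker thread, features are extracted on the calling thread in the meantime, and the two results are joined once both are needed.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def estimate_depth(image):
    """Hypothetical stand-in for the depth network forward pass."""
    time.sleep(0.05)  # simulate inference latency
    # One (made-up) depth value per pixel of the input image.
    return [[10.0 for _ in row] for row in image]


def extract_features(image):
    """Hypothetical stand-in for ORB feature extraction.

    Returns a fixed list of (row, col) keypoints.
    """
    time.sleep(0.05)  # simulate extraction latency
    return [(0, 0), (1, 2)]


def process_frame(image, executor):
    # Start depth inference on a worker thread first ...
    depth_future = executor.submit(estimate_depth, image)
    # ... then extract features on the calling thread while inference runs.
    keypoints = extract_features(image)
    # Block only at the point where both results are needed.
    depth_map = depth_future.result()
    # Attach a depth value to every keypoint.
    return [(r, c, depth_map[r][c]) for r, c in keypoints]


if __name__ == "__main__":
    frame = [[0] * 4 for _ in range(3)]  # dummy 3x4 image
    with ThreadPoolExecutor(max_workers=1) as pool:
        print(process_frame(frame, pool))
```

With this pattern the per-frame latency is roughly the longer of the two stages rather than their sum, provided the two stages can actually run concurrently.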

Example

Comparison between the monocular case and monocular case with depth estimation network (KITTI Sequence 01).

Monocular:

Monocular with depth estimation network:

Related Publications

[ORB-SLAM3] Carlos Campos, Richard Elvira, Juan J. Gómez Rodríguez, José M. M. Montiel and Juan D. Tardós, ORB-SLAM3: An Accurate Open-Source Library for Visual, Visual-Inertial and Multi-Map SLAM, IEEE Transactions on Robotics 37(6):1874-1890, Dec. 2021.

[Monodepth2] Clément Godard, Oisin Mac Aodha, Michael Firman and Gabriel J. Brostow, Digging Into Self-Supervised Monocular Depth Estimation, ICCV 2019.

[DVSO] Nan Yang, Rui Wang, Jörg Stückler and Daniel Cremers, Deep Virtual Stereo Odometry: Leveraging Deep Depth Prediction for Monocular Direct Sparse Odometry, ECCV 2018.

[CNN-SVO] Shing Yan Loo, Ali Jahani Amiri, Syamsiah Mashohor, Sai Hong Tang and Hong Zhang, CNN-SVO: Improving the Mapping in Semi-Direct Visual Odometry Using Single-Image Depth Prediction, ICRA 2019.

1. License (from ORB-SLAM3)

See LICENSE file.

2. Prerequisites

The library is tested on Ubuntu 16.04 and 18.04, but it should be easy to compile on other platforms. A powerful computer (e.g. i7) will ensure real-time performance and provide more stable and accurate results.

C++11 or C++0x Compiler

We use the new thread and chrono functionalities of C++11.

Pangolin

We use Pangolin for visualization and user interface. Download and install instructions can be found at: https://github.com/stevenlovegrove/Pangolin.

OpenCV

We use OpenCV to manipulate images and features. Download and install instructions can be found at: http://opencv.org. At least version 3.0 is required. Tested with OpenCV 3.2.0 and 4.4.0.

Eigen3

Required by g2o (see below). Download and install instructions can be found at: http://eigen.tuxfamily.org. At least version 3.1.0 is required.

DBoW2 and g2o (Included in Thirdparty folder)

We use modified versions of the DBoW2 library to perform place recognition and g2o library to perform non-linear optimizations. Both modified libraries (which are BSD) are included in the Thirdparty folder.

Python

Required to calculate the alignment of the trajectory with the ground truth. The NumPy module is required.

ROS (optional)

We provide some examples to process input of a monocular, monocular-inertial, stereo, stereo-inertial or RGB-D camera using ROS. Building these examples is optional. These have been tested with ROS Melodic under Ubuntu 18.04.

PyTorch/LibTorch

The PyTorch C++ API (LibTorch) is used for deployment. Download the pre-built version from https://pytorch.org/ (important: select the cxx11 ABI version).

3. Building ORB-SLAM3 library and examples

Clone the repository:

git clone https://github.com/jan9419/ORB_SLAM3_Monodepth.git ORB_SLAM3_Monodepth

We provide a script build.sh to build the Thirdparty libraries and ORB-SLAM3-Monodepth. Please make sure you have installed all required dependencies (see section 2) and the correct LIBTORCH_PATH is set in build.sh. Execute:

cd ORB_SLAM3_Monodepth
chmod +x build.sh
./build.sh

This will create libORB_SLAM3_Monodepth.so in the lib folder and the executables in the Examples folder.

4. KITTI RGB-MonoDepth Examples

  1. Download the dataset (color images) from http://www.cvlibs.net/datasets/kitti/eval_odometry.php

  2. Export the pre-trained Monodepth2 models (trained on the KITTI dataset) to TorchScript models. To do this, add the Monodepth2 repository to the PYTHONPATH environment variable. In addition, the decoder in the Monodepth2 repository needs to be modified to return the last tuple element (self.outputs[("disp", 0)]). Note that the device selected when exporting the TorchScript models (cpu or cuda) must be the same as the one used for deployment (DepthEstimator.device (cpu or gpu)).

python tools/export_models.py --input_encoder_path PATH_TO_MONODEPTH_PRETRAINED_MODEL/encoder.pth --input_decoder_path PATH_TO_MONODEPTH_PRETRAINED_MODEL/decoder.pth --output_encoder_path tools/encoder.pt --output_decoder_path tools/decoder.pt --device cuda
  3. Set the correct paths to the exported TorchScript models in KITTIX.yaml (DepthEstimator.encoderPath and DepthEstimator.decoderPath).

  4. Execute the following command. Replace KITTIX.yaml with KITTI00-02.yaml, KITTI03.yaml or KITTI04-12.yaml for sequences 0 to 2, 3, and 4 to 12, respectively. Replace PATH_TO_DATASET_FOLDER with the path to the uncompressed dataset folder. Replace SEQUENCE_NUMBER with 00, 01, 02, ..., 11.

./Examples/RGBMonoDepth/rgb_monodepth Vocabulary/ORBvoc.txt Examples/RGBMonoDepth/KITTIX.yaml PATH_TO_DATASET_FOLDER/data_odometry_color/dataset/sequences/SEQUENCE_NUMBER
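
The substitutions in the last step can be sketched as a small Python helper. This script is not part of the repository: settings_for_sequence and build_command are hypothetical names, and /data/kitti is a placeholder dataset location. The mapping from sequence number to settings file follows the step above. The helper only assembles the command and does not run it.

```python
import shlex


def settings_for_sequence(sequence):
    """Return the KITTI settings file name for an odometry sequence number."""
    if 0 <= sequence <= 2:
        return "KITTI00-02.yaml"
    if sequence == 3:
        return "KITTI03.yaml"
    if 4 <= sequence <= 12:
        return "KITTI04-12.yaml"
    raise ValueError(f"no settings file listed for sequence {sequence}")


def build_command(dataset_folder, sequence):
    """Assemble the rgb_monodepth argument list for one sequence."""
    settings = settings_for_sequence(sequence)
    # Sequence folders are named with two digits (00, 01, ...).
    sequence_dir = (
        f"{dataset_folder}/data_odometry_color/dataset/sequences/{sequence:02d}"
    )
    return [
        "./Examples/RGBMonoDepth/rgb_monodepth",
        "Vocabulary/ORBvoc.txt",
        f"Examples/RGBMonoDepth/{settings}",
        sequence_dir,
    ]


if __name__ == "__main__":
    # "/data/kitti" is a placeholder for PATH_TO_DATASET_FOLDER.
    print(shlex.join(build_command("/data/kitti", 1)))
```

The returned list can be passed to subprocess.run from the repository root if the script should also launch the executable.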