DeepI2P - Image-to-Point Cloud Registration via Deep Classification. CVPR 2021

PyTorch implementation for our CVPR 2021 paper DeepI2P. DeepI2P solves the problem of cross-modality registration, i.e., it estimates the relative rotation R and translation t between the camera and the lidar.
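
To make the notation concrete, here is a minimal NumPy sketch of how a rotation R and translation t relate lidar points to image pixels. It is not taken from this repository: the function name `project_points`, the convention that (R, t) maps lidar coordinates into the camera frame, and the pinhole intrinsics `K` are assumptions for illustration only.

```python
import numpy as np

def project_points(points_lidar, R, t, K):
    """Project Nx3 lidar points into the image (hypothetical helper).

    Assumes (R, t) maps lidar coordinates into the camera frame and K is a
    3x3 pinhole intrinsic matrix. Returns (pixels, depth), where pixels is
    Nx2 and depth is the z coordinate in the camera frame.
    """
    points_cam = points_lidar @ R.T + t      # lidar frame -> camera frame
    depth = points_cam[:, 2]
    uvw = points_cam @ K.T                   # apply the intrinsics
    pixels = uvw[:, :2] / uvw[:, 2:3]        # perspective division
    return pixels, depth

if __name__ == "__main__":
    K = np.array([[100.0, 0.0, 50.0],
                  [0.0, 100.0, 50.0],
                  [0.0, 0.0, 1.0]])
    pts = np.array([[0.0, 0.0, 10.0], [1.0, 0.0, 10.0]])
    pixels, depth = project_points(pts, np.eye(3), np.zeros(3), K)
    print(pixels, depth)
```

Only points with positive depth lie in front of the camera, so a caller would normally filter on `depth > 0` before using the pixel coordinates.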

DeepI2P: Image-to-Point Cloud Registration via Deep Classification
Jiaxin Li¹, Gim Hee Lee²
¹ByteDance, ²National University of Singapore

Method

The intuition is to perform the Inverse Camera Projection, as shown in the overview figures.

[Figures: overview_1, overview_2]
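
One way to read this intuition: if a network predicts, for each lidar point, whether it falls inside the camera frustum, then a candidate pose can be scored by how well its geometric frustum agrees with those predictions. The sketch below illustrates only that scoring idea. The names `in_frustum` and `label_disagreement`, the zero-skew intrinsics, and the disagreement score itself are assumptions for illustration; the repository's actual solver is the Gauss-Newton optimizer in frustum_reg.

```python
import numpy as np

def in_frustum(points_lidar, R, t, K, width, height):
    """Boolean mask of points that project inside a width x height image.

    Hypothetical helper: assumes (R, t) maps lidar to camera coordinates
    and K is a zero-skew pinhole intrinsic matrix.
    """
    points_cam = points_lidar @ R.T + t
    z = points_cam[:, 2]
    safe_z = np.where(z > 0, z, 1.0)         # avoid dividing by z <= 0
    u = K[0, 0] * points_cam[:, 0] / safe_z + K[0, 2]
    v = K[1, 1] * points_cam[:, 1] / safe_z + K[1, 2]
    return (z > 0) & (u >= 0) & (u < width) & (v >= 0) & (v < height)

def label_disagreement(pred_labels, points_lidar, R, t, K, width, height):
    """Fraction of points whose predicted in-frustum label contradicts
    the frustum implied by the candidate pose (R, t)."""
    geometric = in_frustum(points_lidar, R, t, K, width, height)
    return float(np.mean(geometric != pred_labels))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    points = rng.uniform(-20.0, 20.0, size=(2000, 3))
    K = np.array([[100.0, 0.0, 50.0],
                  [0.0, 100.0, 50.0],
                  [0.0, 0.0, 1.0]])
    # Stand-in for network predictions: labels generated from a known pose.
    labels = in_frustum(points, np.eye(3), np.zeros(3), K, 100, 100)
    print(label_disagreement(labels, points, np.eye(3), np.zeros(3), K, 100, 100))
```

A pose solver would search for the (R, t) that minimizes such a score; a count of mismatches is not differentiable, so a practical optimizer would need a smooth surrogate.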

Repo Structure

  • data: Generate and process datasets
  • evaluation: Registration code, including Inverse Camera Projection, ICP, and PnP
    • frustum_reg: C++ code for the Inverse Camera Projection, using Gauss-Newton optimization. It requires the Ceres Solver. Install it with:
    python evaluation/frustum_reg/setup.py install
    
    • icp: Code for ICP (Iterative Closest Point)
    • registration_lsq.py: Python code for Inverse Camera Projection, which uses the per-point coarse classification prediction and the frustum_reg solver.
    • registration_pnp.py: Python code for the PnP solver, which uses the per-point fine classification prediction.
  • kitti: Training code for KITTI
  • nuscenes: Training code for nuScenes
  • oxford: Training code for the Oxford RobotCar dataset
  • models: Networks and layers
    • index_max_ext: A custom operation from SO-Net, which is the backbone of our network. Installation:
    python models/index_max_ext/setup.py install
    
    • networks_img.py: Network to process images. It has a ResNet-like structure.
    • networks_pc.py: Network to process point clouds. It is taken from SO-Net.
    • network_united.py: Network to fuse information between point clouds and images.
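
For readers unfamiliar with ICP, mentioned in the evaluation folder above, the following is a generic textbook sketch of point-to-point ICP. It is not the code in evaluation/icp: the function names `kabsch` and `icp`, the brute-force nearest-neighbor search, and the fixed iteration count are assumptions chosen for brevity.

```python
import numpy as np

def kabsch(src, dst):
    """Least-squares rigid transform (R, t) with dst ~= src @ R.T + t,
    for matched Nx3 point sets (closed-form SVD solution)."""
    src_mean = src.mean(axis=0)
    dst_mean = dst.mean(axis=0)
    H = (src - src_mean).T @ (dst - dst_mean)    # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a rotation, not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_mean - R @ src_mean
    return R, t

def icp(src, dst, iters=20):
    """Point-to-point ICP with brute-force nearest neighbors.

    Returns the accumulated (R, t) that moves src toward dst. It only
    converges to the right answer from a reasonably good initial alignment.
    """
    R_total, t_total = np.eye(3), np.zeros(3)
    cur = src.copy()
    for _ in range(iters):
        # Squared distances between every current point and every target point.
        d2 = ((cur[:, None, :] - dst[None, :, :]) ** 2).sum(axis=-1)
        matched = dst[d2.argmin(axis=1)]
        R, t = kabsch(cur, matched)
        cur = cur @ R.T + t
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total
```

The brute-force search costs O(N·M) memory and time per iteration, so real lidar scans would call for a spatial index such as a k-d tree instead.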

Dataset and Models

Comments
  • How long is the training time?

    Hello, thanks for sharing your code. I want to follow up on your work but may not have enough equipment. Which GPU are you using, and how long does training take on the two datasets?

    opened by Ayanami2019 3
  • Enquiry about the KITTI dataloader file, line 296

    In kitti_pc_img_pose_loader.py, line 296, the transformation matrix is written as: Pc = np.dot(self.calib_helper.get_matrix(seq, img_key), self.calib_helper.get_matrix(seq, 'Tr')). In my view, this matrix only works in the function "search_for_accumulation". To transform point clouds from timestamp j to timestamp i, it is a little strange to:

    1. transform the point clouds into camera 0
    2. translate from camera 0 to camera i
    3. apply the pose transformation
    4. ...

    Camera i is parallel to camera 0, so there is no problem in your code, but why not just drop the translation step (2)? It is confusing.
    opened by Martin-Liao 0
  • How to process my data

    Thank you very much for your outstanding work. I want to use your code, but I don't quite understand how to use evaluation/*.py. Do the scripts need to be run in a particular order? I don't need to retrain the model; I just want to use your pretrained model to process my data. I would be grateful for any help.

    opened by githubhupan 0
  • Different versions of open3d used in data preprocessing for KITTI

    In kitti_pc_bin_to_npz_in_img_frame.py and kitti_pc_bin_to_npy_with_downsample_sn.py, open3d.geometry.voxel_down_sample(), open3d.geometry.estimate_normals() and open3d.geometry.orient_normals_to_align_with_direction() are used, but these APIs have been replaced by pcd.voxel_down_sample(), pcd.estimate_normals() and pcd.orient_normals_to_align_with_direction(). This may cause errors if you install open3d with pip install open3d. However, frame_accumulation.py already uses the newest API.

    opened by Yuli-yx 0
  • Range of scene

    Dear Jiaxin, I recently read DeepI2P, and I wonder what scene range you used in the experiments. For example, for KITTI, you wrote the range as -1 to 80 m in 'option.py', but the paper seems to use -80 to 80 m. Looking forward to your reply.

    opened by rsy6318 0
  • About the experimental results in Table 1

    Sorry to bother you. I'm very interested in this excellent work, DeepI2P: Image-to-Point Cloud Registration via Deep Classification. I ran the released code on the Oxford dataset. However, I found that the point clouds are not randomly rotated and translated during the test phase. Is the released code inconsistent with the experimental settings in the paper? Or were the results in Table 1 obtained on point clouds without random rotation and translation? Thank you very much!

    opened by wangyujiewj 1
Owner
Li Jiaxin
PhD in Computer Vision, Deep Learning, Robotics.