Signal Processing, Sensor/Information Fusion, and Target Recognition XXX 2021
DOI: 10.1117/12.2587994
CalibDNN: multimodal sensor calibration for perception using deep neural networks

Abstract: Current perception systems often carry multimodal imagers and sensors such as 2D cameras and 3D LiDAR sensors. To fuse and utilize the data for downstream perception tasks, robust and accurate calibration of the multimodal sensor data is essential. We propose a novel deep learning-driven technique (CalibDNN) for accurate calibration among multimodal sensors, specifically LiDAR-camera pairs. The key innovation of the proposed work is that it does not require any specific calibration targets or hardware assistant…

Cited by 26 publications (26 citation statements)
References 23 publications
“…The calibration results of the model were evaluated using the KITTI raw dataset, as shown in Tables 2 and 3. These tables compare performance in rotation and translation errors against RegNet [10], CalibNet [12], CalibDNN [29], and CFNet [30], where CFNet [30] denotes the 'state-of-the-art' deep learning method for the target-less LiDAR-camera calibration problem. To estimate extrinsic calibration parameters, these methods utilize different networks that are trained with varying deviations of miscalibration.…”
Section: Results and Evaluations (mentioning)
confidence: 99%
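For reference, below is a minimal sketch of how rotation and translation errors between a predicted and a ground-truth extrinsic are commonly computed; the cited papers may instead report per-axis errors, so the exact convention is an assumption.

```python
import numpy as np

def calibration_errors(T_pred, T_gt):
    """Rotation (deg) and translation (m) error between two 4x4 extrinsics."""
    # Relative transform; this would be the identity if the prediction were perfect.
    T_err = np.linalg.inv(T_gt) @ T_pred
    R_err, t_err = T_err[:3, :3], T_err[:3, 3]
    # Geodesic rotation angle from the trace of the relative rotation matrix.
    cos_angle = np.clip((np.trace(R_err) - 1.0) / 2.0, -1.0, 1.0)
    rotation_deg = np.degrees(np.arccos(cos_angle))
    translation_m = np.linalg.norm(t_err)
    return rotation_deg, translation_m
```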
“…Yuan et al [13] presented RGGNet, which leverages Riemannian geometry and a tolerance-aware loss function to obtain calibration parameters. Zhao et al [29] proposed a lightweight DNN architecture that obtains a calibration transformation in a single iteration by maximizing the consistency of multimodal data. Its performance can be refined by training models over multiple iterations with different miscalibration ranges.…”
Section: B. Target-less Approach (mentioning)
confidence: 99%
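A rough sketch of the multi-iteration refinement idea described above, assuming each network in a cascade was trained on a progressively smaller miscalibration range; the model interface and names are hypothetical, not the published implementation.

```python
import numpy as np

def refine_calibration(models, T_init, image, lidar_points):
    """Cascade of calibration networks, each trained on a smaller
    miscalibration range, refining the previous estimate."""
    T_est = T_init
    for model in models:  # e.g. trained on +/-20 deg, +/-10 deg, +/-5 deg, ...
        # The network sees the image and the point cloud projected with the
        # current estimate and predicts a small corrective transform.
        delta_T = model.predict(image, lidar_points, T_est)  # hypothetical API
        T_est = delta_T @ T_est
    return T_est
```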
“…CMRNet [20] is an approach for locating a camera in a LiDAR map, and it is the first method to use the correlation layer of PWC-Net [21] to match features acquired from the two sensors to achieve 6-DoF extrinsic calibration. CalibRCNN [12] used the constraint relationship between successive frames for calibration, which improved accuracy, and CalibDNN [13] added geometric and transformation supervisions to solve the calibration problem and applied the method to a challenging dataset.…”
Section: Deep Learning Methods (mentioning)
confidence: 99%
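As a rough illustration of combining a transformation supervision (on the predicted extrinsic itself) with a geometric supervision (on the re-projected point cloud): the sketch below is not the published CalibDNN loss, and the weights and names are assumptions.

```python
import torch

def combined_loss(T_pred, T_gt, points, w_tf=1.0, w_geo=1.0):
    """Transformation + geometric supervision (illustrative only)."""
    # Transformation supervision: discrepancy between predicted and GT extrinsics.
    loss_tf = torch.linalg.norm(T_pred - T_gt)
    # Geometric supervision: distance between the point cloud transformed by the
    # predicted extrinsic and by the ground-truth extrinsic.
    ones = torch.ones(points.shape[0], 1, dtype=points.dtype, device=points.device)
    pts_h = torch.cat([points, ones], dim=1)        # N x 4 homogeneous points
    pts_pred = (T_pred @ pts_h.T).T[:, :3]
    pts_gt = (T_gt @ pts_h.T).T[:, :3]
    loss_geo = torch.linalg.norm(pts_pred - pts_gt, dim=1).mean()
    return w_tf * loss_tf + w_geo * loss_geo
```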
“…3. The proposed method has a key advantage, namely, the multiresolution representations extracted from the backbone are semantically stronger than in prior methods [11]-[13]. Furthermore, the high-resolution representations are spatially accurate.…”
Section: B. Network Architecture, 1) Feature Extraction Network (mentioning)
confidence: 99%
“…We consider the case where the autonomous sensing platform is composed of a set of RGB cameras and LiDAR sensors moving without constraint in urban/terrain environments. Under this condition, the data streams of RGB video and LiDAR point clouds are collected in real time and then fed into a downstream system for data preprocessing, calibration, and conversion of the 3D point clouds to 2D depth images [30], which results in a pair of RGB-D data streams registered in a common coordinate frame. Given the pair of RGB-D data, our segmentation module performs feature detection, data fusion, segmentation, and pixel-level annotation automatically in an end-to-end fashion.…”
Section: Introduction (mentioning)
confidence: 99%
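A minimal sketch of the point-cloud-to-depth-image conversion mentioned above, assuming a pinhole camera with intrinsics K and a LiDAR-to-camera extrinsic T; variable names are illustrative, not taken from the cited work.

```python
import numpy as np

def lidar_to_depth_image(points, T_lidar_to_cam, K, img_h, img_w):
    """Project an N x 3 LiDAR point cloud into a depth image registered
    with the camera frame."""
    pts_h = np.hstack([points, np.ones((points.shape[0], 1))])  # N x 4 homogeneous
    pts_cam = (T_lidar_to_cam @ pts_h.T).T[:, :3]               # camera coordinates
    pts_cam = pts_cam[pts_cam[:, 2] > 0]                        # keep points in front of the camera
    uvw = (K @ pts_cam.T).T                                     # pinhole projection
    u = np.round(uvw[:, 0] / uvw[:, 2]).astype(int)
    v = np.round(uvw[:, 1] / uvw[:, 2]).astype(int)
    valid = (u >= 0) & (u < img_w) & (v >= 0) & (v < img_h)
    depth = np.zeros((img_h, img_w), dtype=np.float32)
    depth[v[valid], u[valid]] = pts_cam[valid, 2]               # depth = z in camera frame
    return depth
```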