Initial position estimation in global maps, a prerequisite for accurate localization, plays a critical role in mobile robot navigation tasks. Global positioning system signals often become unreliable in disaster sites or indoor areas, requiring other localization methods to support robots in search-and-rescue tasks. Many vision-based approaches focus on estimating a robot's position within prior maps acquired with cameras. In contrast to conventional methods, which need a coarse initial position estimate to precisely localize a camera in a given map, we propose a novel approach that estimates the initial position of a monocular camera within a given 3D light detection and ranging map using a convolutional neural network, with no retraining required. This enables a mobile robot to estimate its coarse position in 3D maps using only a monocular camera. The key idea of our work is to use depth information as intermediate data to retrieve a camera image within large-scale point clouds. We employ an unsupervised learning framework to predict depth from a single image. We then use a pretrained convolutional neural network to generate depth image descriptors that represent places. The position is retrieved by computing similarity scores between the current depth image and depth images projected from the 3D map. Experiments on the publicly available KITTI data sets demonstrate the efficiency and feasibility of the proposed algorithm.
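To make the retrieval step concrete, the sketch below shows one plausible realization of descriptor-based matching between a query depth image and depth images projected from the map. The abstract does not name the descriptor network or the similarity measure, so this example assumes a generic pretrained ResNet-18 from torchvision as the fixed feature extractor and cosine similarity as the score; the functions `extract_descriptor` and `retrieve_position` are hypothetical helpers introduced here for illustration only.

```python
# Minimal sketch of descriptor-based place retrieval (illustrative only).
# Assumptions not stated in the abstract: descriptors come from a generic
# pretrained ResNet-18 (torchvision), depth images are single-channel
# float arrays normalized to [0, 1], and similarity is cosine similarity.
import numpy as np
import torch
import torchvision.models as models
import torchvision.transforms as T

# Pretrained CNN with its classification head removed, used as a fixed
# feature extractor (no retraining, matching the proposed setting).
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()
backbone.eval()

preprocess = T.Compose([
    T.ToTensor(),                           # HxW array -> 1xHxW tensor
    T.Lambda(lambda x: x.repeat(3, 1, 1)),  # replicate depth to 3 channels
    T.Resize((224, 224), antialias=True),
])

@torch.no_grad()
def extract_descriptor(depth_image: np.ndarray) -> np.ndarray:
    """Map a single-channel depth image to an L2-normalized descriptor."""
    x = preprocess(depth_image.astype(np.float32)).unsqueeze(0)
    d = backbone(x).squeeze(0).numpy()
    return d / np.linalg.norm(d)

def retrieve_position(query_depth, map_depths):
    """Return the index of the map view whose projected depth image is
    most similar to the query, plus all cosine similarity scores."""
    q = extract_descriptor(query_depth)
    db = np.stack([extract_descriptor(m) for m in map_depths])
    scores = db @ q  # dot products of unit vectors = cosine similarities
    return int(np.argmax(scores)), scores
```

Because the descriptors are L2-normalized, a plain dot product over the database suffices for scoring; the pose associated with the best-matching projected view would then serve as the coarse initial position estimate.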