Deep Architecture With Cross Guidance Between Single Image and Sparse LiDAR Data for Depth Completion

Lee, Sihaeng; Lee, Janghyeon; Kim, Doyeon; Kim, Junmo

doi:10.1109/access.2020.2990212

Cited by 40 publications

(19 citation statements)

References 33 publications

“…CFCNet [8] learns to capture the semantically correlated features between RGB and depth information. CG-Net [9] proposes a cross guidance module to fuse the multi-modal feature from RGB and LiDAR. Some methods rely on iterative Spatial Propagation Network (SPN) to better treat the difficulties made by the sparsity and irregular distribution of the input [10][11][12][13].…”

Section: Related Work (mentioning)

confidence: 99%

DenseLiDAR: A Real-Time Pseudo Dense Depth Guided Depth Completion Network

Xiang et al. 2021

IEEE Robot. Autom. Lett.

Depth Completion can produce a dense depth map from a sparse input and provide a more complete 3D description of the environment. Despite great progress made in depth completion, the sparsity of the input and low density of the ground truth still make this problem challenging. In this work, we propose DenseLiDAR, a novel real-time pseudo-depth guided depth completion neural network. We exploit dense pseudo-depth map obtained from simple morphological operations to guide the network in three aspects: (1) Constructing a residual structure for the output; (2) Rectifying the sparse input data; (3) Providing dense structural loss for training the network. Thanks to these novel designs, higher performance of the output could be achieved. In addition, two new metrics for better evaluating the quality of the predicted depth map are also presented. Extensive experiments on KITTI depth completion benchmark suggest that our model is able to achieve the state-of-the-art performance at the highest frame rate of 50Hz. The predicted dense depth is further evaluated by several downstream robotic perception or positioning tasks. For the task of 3D object detection, 3~5 percent performance gains on small objects categories are achieved on KITTI 3D object detection dataset. For RGB-D SLAM, higher accuracy on vehicle's trajectory is also obtained in KITTI Odometry dataset. These promising results not only verify the high quality of our depth prediction, but also demonstrate the potential of improving the related downstream tasks by using depth completion results.

“…Some methods, e.g. [45], fuse the sparse depth and RGB image via early fusion while others [18,26,29,37,44,64] utilize a late fusion scheme, or jointly utilize both the early and late fusion [36,65,68]. Another line of research focuses on utilizing affinity or geometric information of the scene via surface normal, occlusion boundaries, and the geometric convolutional layer [11,12,25,34,52,54,77,86].…”

Section: Related Work (mentioning)

confidence: 99%

BIPS: Bi-modal Indoor Panorama Synthesis via Residual Depth-aided Adversarial Learning

Changgyoon, Cho, Park et al. 2021

Preprint

Providing omnidirectional depth along with RGB information is important for numerous applications, e.g., VR/AR. However, as omnidirectional RGB-D data is not always available, synthesizing RGB-D panorama data from limited information of a scene can be useful. Therefore, some prior works tried to synthesize RGB panorama images from perspective RGB images; however, they suffer from limited image quality and can not be directly extended for RGB-D panorama synthesis. In this paper, we study a new problem: RGB-D panorama synthesis under the arbitrary configurations of cameras and depth sensors. Accordingly, we propose a novel bi-modal (RGB-D) panorama synthesis (BIPS) framework. Especially, we focus on indoor environments where the RGB-D panorama can provide a complete 3D model for many applications. We design a generator that fuses the bi-modal information and train it with residual-aided adversarial learning (RDAL). RDAL allows to synthesize realistic indoor layout structures and interiors by jointly inferring RGB panorama, layout depth, and residual depth. In addition, as there is no tailored evaluation metric for RGB-D panorama synthesis, we propose a novel metric to effectively evaluate its perceptual quality. Extensive experiments show that our method synthesizes high-quality indoor RGB-D panoramas and provides more realistic 3D indoor models than prior methods. Code will be released upon acceptance.

“…These solutions train their normal prediction with synthetic data [10] or based on principal component analysis [11]. Others use confidence to fuse image and depth features, giving more weight to the modality with less uncertainty [12], [13], [14].…”

Section: A State Of the Art - Depth Completion (mentioning)

confidence: 99%

“…Various networks integrate Spatial Pyramid Pooling (SPP) [22] for depth completion [5], [16], [18], [23]. Atrous Spatial Pyramid Pooling (ASPP) [24] has been studied at the end of an encoder [16] or within residual blocks [13]. Li et al. [25] combine multiple networks, each using different resolutions of sparse input.…”

Section: A State Of the Art - Depth Completion (mentioning)

confidence: 99%

DVMN: Dense Validity Mask Network for Depth Completion

Reichardt, Mangat, Wasenmüller 2021

Preprint

LiDAR depth maps provide environmental guidance in a variety of applications. However, such depth maps are typically sparse and insufficient for complex tasks such as autonomous navigation. State of the art methods use image guided neural networks for dense depth completion. We develop a guided convolutional neural network focusing on gathering dense and valid information from sparse depth maps. To this end, we introduce a novel layer with spatially variant and content-dependent dilation to include additional data from sparse input. Furthermore, we propose a sparsity invariant residual bottleneck block. We evaluate our Dense Validity Mask Network (DVMN) on the KITTI depth completion benchmark and achieve state of the art results. At the time of submission, our network is the leading method using sparsity invariant convolution.
