Self-supervised Sparse-to-Dense: Self-supervised Depth Completion from LiDAR and Monocular Camera
Preprint, 2018
DOI: 10.48550/arxiv.1807.00275

Cited by 21 publications (102 citation statements, 2019–2023); references 4 publications.
“…However, the assistance from other modalities, e.g., color images, can significantly improve the completion accuracy. Ma et al. concatenated the sparse depth and color image as the inputs of an off-the-shelf network [26] and further explored the feasibility of self-supervised LiDAR completion [23]. Moreover, [14,16,33,4] proposed different network architectures to better exploit the potential of the encoder-decoder framework.…”
Section: Related Work (mentioning)
confidence: 99%
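As an aside, the early-fusion scheme these statements describe can be sketched in a few lines. The following PyTorch-style toy model (layer sizes and names are our own assumptions, not the architecture of [26]) simply concatenates the RGB image and the sparse depth map along the channel axis and regresses a dense depth map:

```python
import torch
import torch.nn as nn

class EarlyFusionDepthNet(nn.Module):
    """Toy encoder-decoder: concatenate RGB (3 ch) and sparse depth (1 ch)
    at the input and regress a dense depth map. Illustrative only."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(4, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1), nn.ReLU(inplace=True),  # keeps depth >= 0
        )

    def forward(self, rgb, sparse_depth):
        x = torch.cat([rgb, sparse_depth], dim=1)  # early fusion on the channel axis
        return self.decoder(self.encoder(x))

# usage: rgb is (B,3,H,W); sparse depth is (B,1,H,W) with zeros at missing pixels
net = EarlyFusionDepthNet()
pred = net(torch.rand(1, 3, 64, 64), torch.rand(1, 1, 64, 64))
```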
“…With the advances of deep learning methods, many depth completion approaches based on convolutional neural networks (CNNs) have been proposed. The mainstream of these methods directly inputs the sparse depth maps (with/without color images) into an encoder-decoder network and predicts dense depth maps [26,16,36,15,10,23,2]. These black-box methods force the CNN to learn a mapping from sparse depth measurements to dense maps, which is generally a challenging task and leads to unsatisfactory completion results, as shown in Fig.…”
Section: Introduction (mentioning)
confidence: 99%
“…The depth estimation problem can be further divided into different categories according to different sensor inputs. These include depth inpainting for RGB-D cameras and depth completion for planar LiDAR [3], [4].…”
Section: Related Work (mentioning)
confidence: 99%
“…However, this method is insufficient for the perception system of autonomous vehicles, as multimodal sensing data need to be fused together. To better fuse the input from the RGB image and sparse LiDAR, [3], [9] proposed a self-supervised training pipeline that takes the sequential information of the RGB images as a geometric constraint for the optimization. However, the performance of this architecture relies heavily on the accuracy of the relative transformation between nearby frames, which can be influenced by moving objects in the scene.…”
Section: Related Work (mentioning)
confidence: 99%
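The geometry constraint mentioned above is typically a photometric consistency loss: warp a nearby frame into the current view using the predicted depth and the relative pose, then penalize appearance differences. Below is a minimal sketch under assumed tensor shapes; the function name and warp details are illustrative, not the cited pipeline's exact code:

```python
import torch
import torch.nn.functional as F

def photometric_loss(img_t, img_s, depth_t, K, T_t2s):
    """Warp source frame img_s into the target view using depth_t and the
    relative pose T_t2s, then compare with img_t (L1 penalty).
    Assumes img_* (B,3,H,W), depth_t (B,1,H,W), K (B,3,3), T_t2s (B,4,4)."""
    B, _, H, W = depth_t.shape
    # pixel grid in homogeneous coordinates
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    pix = torch.stack([xs, ys, torch.ones_like(xs)], dim=0).float()   # (3,H,W)
    pix = pix.view(1, 3, -1).expand(B, -1, -1)                        # (B,3,HW)
    # back-project to 3D in the target camera, then move to the source camera
    cam = torch.inverse(K) @ pix * depth_t.view(B, 1, -1)             # (B,3,HW)
    cam_h = torch.cat([cam, torch.ones(B, 1, cam.shape[-1])], dim=1)  # (B,4,HW)
    cam_s = (T_t2s @ cam_h)[:, :3]                                    # (B,3,HW)
    # project into the source image and normalize to [-1,1] for grid_sample
    uv = K @ cam_s
    uv = uv[:, :2] / uv[:, 2:].clamp(min=1e-6)
    u = uv[:, 0] / (W - 1) * 2 - 1
    v = uv[:, 1] / (H - 1) * 2 - 1
    grid = torch.stack([u, v], dim=-1).view(B, H, W, 2)
    warped = F.grid_sample(img_s, grid, align_corners=True)
    return (warped - img_t).abs().mean()
```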
“…Teed and Deng [48] proposed an iterative method to regress dense correspondences from pairs of depth frames and compute the 6-DoF estimate using a PnP [29] algorithm. More recently, the authors of [35] use a model-based pose estimation solution via Perspective-n-Point to recover 6-DoF pose estimates from monocular videos and use the estimate as a form of supervision to enable semi-supervised depth learning from unlabeled videos and LiDAR. Our work borrows a similar concept; however, we take advantage of the model-based PnP solution and the inliers established to outfit a fully differentiable pose estimation module within the 3D keypoint learning framework.…”
[Figure: overall architecture of the proposed method, which uses two consecutive images (target It and source Is) as input to self-supervise 3D keypoint learning for monocular ego-motion estimation.]
Section: Related Work (mentioning)
confidence: 99%
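To make the PnP-as-supervision idea concrete, here is a minimal sketch using OpenCV's solvePnPRansac. The correspondences below are synthetic placeholders; the cited works describe the concept, not this exact call:

```python
import cv2
import numpy as np

# Hypothetical correspondences: 3D points lifted from depth in one frame
# and their matched 2D pixel locations in another frame.
pts3d = np.random.rand(50, 3).astype(np.float32) * 5.0
pts2d = np.random.rand(50, 2).astype(np.float32) * 400.0
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])

# RANSAC-PnP recovers the 6-DoF pose (rotation + translation) and an
# inlier set, which can then serve as supervision for depth/keypoint learning.
ok, rvec, tvec, inliers = cv2.solvePnPRansac(pts3d, pts2d, K, distCoeffs=None)
R, _ = cv2.Rodrigues(rvec)  # rotation vector -> 3x3 rotation matrix
```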