2022
DOI: 10.3390/s22145353

Monocular Depth Estimation Using Deep Learning: A Review

Abstract: In current decades, significant advancements in robotics engineering and autonomous vehicles have improved the requirement for precise depth measurements. Depth estimation (DE) is a traditional task in computer vision that can be appropriately predicted by applying numerous procedures. This task is vital in disparate applications such as augmented reality and target tracking. Conventional monocular DE (MDE) procedures are based on depth cues for depth prediction. Various deep learning techniques have demonstra…

Cited by 66 publications (31 citation statements)
References 128 publications
“…Recent surveys [9, 10, 11, 12] present summarized descriptions and comparisons of single image depth estimation methods. The main observation made in these surveys is that estimating depth from a single image remains difficult, especially for autonomous vehicle applications.…”
Section: Related Work (mentioning)
confidence: 99%
“…As in previous works [50][51][52], we use a multi-scale loss function. For each frame and each level, we compute the L1 distance on the logarithm of the depths resulting from the conversion of parallax estimates using Equation (9). The logarithm leads to a scale invariant loss function [43] and the use of an L1 distance is motivated by its good convergence properties [53].…”
Section: Loss Function Definition (mentioning)
confidence: 99%
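
The loss described in the statement above can be sketched roughly as follows. This is a minimal NumPy illustration, not the citing authors' implementation: the function names, the `eps` clamp, and the unweighted sum over levels are all assumptions, and the parallax-to-depth conversion (their Equation (9)) is not reproduced here because the quoted excerpt does not give it.

```python
import numpy as np

def log_l1_loss(pred_depth, true_depth, eps=1e-6):
    """Mean absolute difference between log-depths at a single scale.

    eps is an assumed clamp that keeps the logarithm finite.
    """
    pred = np.log(np.maximum(pred_depth, eps))
    true = np.log(np.maximum(true_depth, eps))
    return float(np.mean(np.abs(pred - true)))

def multi_scale_log_l1(preds, targets):
    """Sum the per-level losses (assumed unweighted).

    preds and targets are lists of depth arrays, one pair per level.
    """
    return sum(log_l1_loss(p, t) for p, t in zip(preds, targets))
```

Because log(s·a) − log(s·b) = log(a) − log(b), multiplying both the predicted and reference depths by the same positive factor leaves this loss unchanged. That is one sense in which a log-depth loss is scale invariant; whether it is exactly the sense intended in the quoted passage is an assumption.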
“…The network model architecture consists of customized building blocks, such as convolution layers, pooling functions, activation layers, and expansion layers [6, 7, 8]. The state-of-the-art CNN use supervised or self-supervised training strategies [6, 9, 10, 11, 12, 13]. Supervised training [5] requires a labeled depth image for the network to converge.…”
Section: Introduction (mentioning)
confidence: 99%
“…Binocular depth estimation, also known as stereo vision, is an alternate approach that relies on triangulating two cameras with overlapping fields of view. Pixels in each image are matched to each other using a wide range of methods from classical [14] to deep learning-based [15]. These methods are less reliant on large training data but can still be computationally expensive and large baselines are typically required for most distances encountered in a maritime context.…”
Section: Introduction (mentioning)
confidence: 99%
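
The remark about baselines in the statement above follows from the standard triangulation relation for a rectified stereo pair, Z = f·B/d. The sketch below is illustrative only and is not taken from the citing paper: the function names and the one-pixel default disparity error are assumptions.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Rectified-stereo triangulation: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

def depth_error(focal_px, baseline_m, depth_m, disparity_error_px=1.0):
    """First-order depth uncertainty: dZ ~ Z**2 / (f * B) * dd."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_error_px
```

Under these assumptions, with a 1000-pixel focal length and a one-pixel matching error, a target at 100 m has a depth uncertainty of about 100 m for a 0.1 m baseline and about 10 m for a 1 m baseline. The error grows with the square of the distance and shrinks linearly with the baseline, which is why long ranges call for wide baselines.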