2021
DOI: 10.1016/j.neucom.2020.10.025

Self-supervised monocular depth estimation with direct methods

citations
Cited by 13 publications
(6 citation statements)
references
References 25 publications
“…However, learning strategies utilize relatively larger and highly varied datasets of images. In "Self-supervised monocular depth estimation with direct methods" [120], the use of ground-truth or binocular stereo depth is replaced with unlabeled data of monocular video sequences. No assumptions about scene geometry or pre-trained information are required.…”
Section: Self-supervised Monocular Models (mentioning)
confidence: 99%
“…Then, the angle α between the camera coordinate system and the world coordinate system can be obtained by using the two vectors, as shown in Formula (17).…”
Section: Visual Localization Process (mentioning)
confidence: 99%
“…Visual guidance is typically based on monocular vision models or multi-vision models for the positioning of the target, and on using the obtained position and posture of the target to guide its movement [15]. Multi-vision requires multiple cameras for shooting, resulting in higher costs, and also requires addressing the feature matching issue from different cameras, which leads to complex operations [16,17]. In contrast, monocular vision only requires a single camera, enabling implementation through the pinhole imaging principle, resulting in lower costs and convenient operation [18,19].…”
Section: Introduction (mentioning)
confidence: 99%
“…Research has shown that the pixels of the moving DC objects that are visible in the target frame but not visible in the adjacent frames ultimately affect pose prediction. Inspired by [48], we not only combine the dynamic mask with per-pixel minimum reprojection loss but also employ the dynamic semantic masking to the DDVO method to improve the accuracy of pose prediction.…”
Section: Joint Learning (mentioning)
confidence: 99%