2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr.2019.01210

CAM-Convs: Camera-Aware Multi-Scale Convolutions for Single-View Depth

Abstract: Single-view depth estimation suffers from the problem that a network trained on images from one camera does not generalize to images taken with a different camera model. Thus, changing the camera model requires collecting an entirely new training dataset. In this work, we propose a new type of convolution that can take the camera parameters into account, thus allowing neural networks to learn calibration-aware patterns. Experiments confirm that this improves the generalization capabilities of depth prediction …
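The abstract's central idea, convolutions that take the camera calibration into account, can be illustrated with a short sketch. The snippet below is a minimal, hypothetical PyTorch illustration, not the authors' released implementation: it assumes the calibration is injected by concatenating per-pixel maps derived from the intrinsics (principal-point-centered coordinates and field-of-view angles) to the features before an ordinary convolution. The class name CameraAwareConv2d and the exact choice and normalization of the extra channels are assumptions made for illustration.

import torch
import torch.nn as nn

class CameraAwareConv2d(nn.Module):
    # Hypothetical sketch: concatenate calibration maps computed from the
    # camera intrinsics (fx, fy, cx, cy) to the input features, then apply
    # a standard convolution. The channel choice and normalization are assumptions.
    def __init__(self, in_channels, out_channels, kernel_size=3, padding=1):
        super().__init__()
        # +4 channels: centered x/y coordinate maps and horizontal/vertical
        # field-of-view (viewing angle) maps
        self.conv = nn.Conv2d(in_channels + 4, out_channels, kernel_size,
                              padding=padding)

    @staticmethod
    def calibration_maps(fx, fy, cx, cy, height, width, device):
        # Pixel coordinates centered on the principal point
        xs = torch.arange(width, device=device, dtype=torch.float32)
        ys = torch.arange(height, device=device, dtype=torch.float32)
        xs = xs.expand(height, width) - cx                    # (H, W)
        ys = ys.unsqueeze(1).expand(height, width) - cy       # (H, W)
        cc_x = xs / width           # centered coordinates, normalized by image size
        cc_y = ys / height
        fov_x = torch.atan(xs / fx) # per-pixel viewing angle in radians
        fov_y = torch.atan(ys / fy)
        return torch.stack([cc_x, cc_y, fov_x, fov_y], dim=0)  # (4, H, W)

    def forward(self, x, intrinsics):
        # intrinsics: (fx, fy, cx, cy) for the camera that produced this batch
        fx, fy, cx, cy = intrinsics
        b, _, h, w = x.shape
        maps = self.calibration_maps(fx, fy, cx, cy, h, w, x.device)
        maps = maps.unsqueeze(0).expand(b, -1, -1, -1)          # (B, 4, H, W)
        return self.conv(torch.cat([x, maps], dim=1))

# Usage: the same layer processes features from images taken with any camera,
# as long as that camera's intrinsics are supplied alongside the features.
layer = CameraAwareConv2d(in_channels=64, out_channels=64)
features = torch.randn(2, 64, 120, 160)
out = layer(features, intrinsics=(80.0, 80.0, 80.0, 60.0))
print(out.shape)  # torch.Size([2, 64, 120, 160])

Because the calibration maps are recomputed from whatever intrinsics accompany each image, the same weights can, in principle, be applied to images from different cameras, which is the generalization behaviour the abstract describes.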

Cited by 99 publications (51 citation statements)
References 50 publications (85 reference statements)
“…However, they did not consider strategies for optimal mixing, or study the impact of combining multiple datasets. Similarly, Facil et al. [43] used multiple datasets with a naive mixing strategy for learning monocular depth with known camera intrinsics. Their test data is very similar to half of their training collection, namely RGB-D recordings of indoor scenes.…”
Section: Related Work
Mentioning confidence: 99%
“…[18][19][20][21]. In addition, recent developments have shown that a pixel-level depth map can be recovered from a single image in an end-to-end manner using deep learning [22]. A variety of neural networks have demonstrated their effectiveness for monocular depth estimation, such as convolutional neural networks (CNNs) [23], recurrent neural networks (RNNs) [24], variational auto-encoders (VAEs) [25] and generative adversarial networks (GANs) [26].…”
Section: Introduction
Mentioning confidence: 99%
“…Several recent works have been proposed to tackle this practical open-world problem for stereo matching [62,50,49], but few works focus on online monocular depth adaptation. Compared with online stereo, online mono-depth learning is even more challenging in two respects: (i) the scale ambiguity and lack of geometrical support inherent to a monocular setting make the model extremely dependent on domain-specific visual features and prone to overfitting on the current domain [11,10]; (ii) environmental variation (e.g., speed or scene changes) introduces additional challenges for both depth and pose estimation, making the whole model fragile. On top of these, as deep networks are not flexible in an online learning setting, there is the issue of catastrophic forgetting.…”
Section: Introduction
Mentioning confidence: 99%