2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020
DOI: 10.1109/cvpr42600.2020.00257

Cascade Cost Volume for High-Resolution Multi-View Stereo and Stereo Matching

citations
Cited by 643 publications
(638 citation statements)
references
References 42 publications
“…NLCA-Net [32] replaces the concatenation operation by calculating the variance of extracted features, which can reduce the C channel by half. For the D channel, recent CSN [26] reduces this dimension by generating a disparity candidate range and gradually refining the disparity map in a coarse-to-fine manner. These methods can reduce the memory and computational cost to a certain extent.…”
Section: Related Work (mentioning)
confidence: 99%
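The channel reduction described in the statement above can be illustrated with a small sketch. It is not taken from NLCA-Net or the cited paper: the function names `concat_cost` and `variance_cost` and the toy feature vectors are assumptions made for illustration, and each feature vector stands in for one pixel's C-channel descriptor at one disparity candidate.

```python
def concat_cost(left_feat, right_feat):
    """Concatenation-style matching feature: output has 2*C channels."""
    return list(left_feat) + list(right_feat)


def variance_cost(feats):
    """Per-channel variance across views: output has C channels."""
    n = len(feats)
    channels = len(feats[0])
    out = []
    for c in range(channels):
        mean = sum(f[c] for f in feats) / n
        out.append(sum((f[c] - mean) ** 2 for f in feats) / n)
    return out


# Toy C = 4 descriptors for the left and right views (hypothetical values).
left = [1.0, 2.0, 3.0, 4.0]
right = [3.0, 2.0, 1.0, 0.0]

concat = concat_cost(left, right)        # 8 channels
variance = variance_cost([left, right])  # 4 channels
print(len(concat), len(variance))
```

Under these assumptions the variance form carries half as many channels into the 3D regularization network as the concatenation form, which is the saving the statement refers to.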
“…In terms of different matching cost computation methods, current neural network-based stereo methods can be mainly divided into the following: 2D networks [18, 19, 20, 21, 22, 23] with cost volumes generated by traditional methods or the correlation layer. 3D networks [24, 25, 26, 27] with cost volumes generated by concatenation. According to published papers on KITTI official website, these two architectures have obvious differences in speed and accuracy, as shown in Figure 1.…”
Section: Introduction (mentioning)
confidence: 99%
“…Their method only calculates one depth map at a time instead of calculating the entire 3D scene. Gu et al. [58] further improved MVSNet, which solved the cubic increase in computational complexity as the image resolution increased. Most 3D reconstruction algorithms are only applicable to static scenes.…”
Section: Related Work (mentioning)
confidence: 99%
“…The MVSNet [1] is proposed to estimate the depth map for each view by building a cost volume followed by 3D CNN regularization. Moreover, due to the unideal run-time and memory requirements, the cascade pyramid structure [2] is proposed to build cost volume and infer depth in coarse to fine, which greatly reduces run-time and memory consumption. Besides, some unsupervised methods [7,3] are proposed to overcome the difficulty of obtaining ground-truth depth maps.…”
Section: Related Work (mentioning)
confidence: 99%
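The coarse-to-fine idea in the statement above can be sketched as follows. This is a minimal illustration rather than the cited paper's implementation: the helper names, the image size, and the per-stage resolutions and plane counts are all hypothetical values chosen to show why narrowing the depth range at each stage shrinks the cost volume.

```python
def depth_hypotheses(center, num_planes, interval):
    """Evenly spaced depth hypotheses centred on a previous depth estimate."""
    half = (num_planes - 1) / 2
    return [center + (i - half) * interval for i in range(num_planes)]


def volume_size(width, height, num_planes):
    """Number of cells in a W x H x D cost volume (feature channels ignored)."""
    return width * height * num_planes


# Hypothetical single-stage volume: full resolution, wide depth range.
single_stage = volume_size(640, 512, 192)

# Hypothetical three-stage cascade: resolution grows while the number of
# depth planes shrinks, because each stage searches around the last estimate.
stages = [(160, 128, 48), (320, 256, 32), (640, 512, 8)]
cascade_total = sum(volume_size(w, h, d) for w, h, d in stages)

# Later stage: 5 planes spaced 0.5 apart around a coarse estimate of 10.0.
refined = depth_hypotheses(10.0, 5, 0.5)
print(single_stage, cascade_total, refined)
```

With these assumed numbers the three cascade volumes together hold roughly a tenth as many cells as the single full-resolution volume, which is consistent with the run-time and memory reduction the statement describes.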
“…Depth estimation from multi-view images has a wide range of applications, such as 3D reconstruction, scene understanding, view synthesis, and robot vision. Recently, deep learning-based MVS methods have achieved promising results [1,2], and most of them are used for 3D reconstruction tasks. However, most learning-based methods rely on ground-truth depth as supervision, which is difficult to obtain so that the application scenarios of supervised methods are very limited.…”
Section: Introduction (mentioning)
confidence: 99%