2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
DOI: 10.1109/cvpr.2018.00297
Learning for Disparity Estimation Through Feature Constancy

Abstract: Stereo matching algorithms usually consist of four steps: matching cost calculation, matching cost aggregation, disparity calculation, and disparity refinement. Existing CNN-based methods adopt CNNs for only some of these steps, or use different networks for different steps, making it difficult to obtain an overall optimal solution. In this paper, we propose a network architecture that incorporates all steps of stereo matching. The network consists of three parts. The first part calc…
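The abstract names the four steps without showing how they can share a single network, so here is a minimal sketch, in PyTorch, of one model covering all four: siamese features build a concatenation cost volume (cost calculation), 3D convolutions aggregate it, a soft-argmin regresses an initial disparity, and a small residual head refines it. All module names, channel widths, and the disparity range are illustrative assumptions, not the paper's actual architecture.

```python
# Minimal four-step stereo pipeline in one network (illustrative sketch).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyStereoNet(nn.Module):
    def __init__(self, max_disp=48, feat_ch=32):
        super().__init__()
        self.max_disp = max_disp
        # Step 1: shared (siamese) feature extractor for matching cost calculation.
        self.feature = nn.Sequential(
            nn.Conv2d(3, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True))
        # Step 2: cost aggregation over the (disparity, H, W) volume.
        self.aggregate = nn.Sequential(
            nn.Conv3d(2 * feat_ch, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(32, 1, 3, padding=1))
        # Step 4: refinement predicts a residual on the initial disparity.
        self.refine = nn.Sequential(
            nn.Conv2d(1 + 3, 16, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(16, 1, 3, padding=1))

    def forward(self, left, right):
        fl, fr = self.feature(left), self.feature(right)
        # Concatenation cost volume: B x 2C x D x H x W.
        b, c, h, w = fl.shape
        volume = fl.new_zeros(b, 2 * c, self.max_disp, h, w)
        for d in range(self.max_disp):
            volume[:, :c, d, :, d:] = fl[:, :, :, d:]
            volume[:, c:, d, :, d:] = fr[:, :, :, : w - d]
        cost = self.aggregate(volume).squeeze(1)  # B x D x H x W
        # Step 3: soft-argmin disparity regression.
        prob = F.softmax(-cost, dim=1)
        disps = torch.arange(self.max_disp, device=cost.device, dtype=cost.dtype)
        disp0 = (prob * disps.view(1, -1, 1, 1)).sum(dim=1, keepdim=True)
        # Step 4: residual refinement conditioned on the left image.
        disp1 = disp0 + self.refine(torch.cat([disp0, left], dim=1))
        return disp0, disp1
```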

Cited by 342 publications (266 citation statements). References 32 publications.
“…Zbontar et al and Luo et al [19,40] use siamese networks to extract patch-wise features, which are then processed into a traditional cost volume by classic post-processing. Some recent methods [5,12,13,16,20,23] replace post-processing with 2D/3D convolutions applied to the cost volume, producing SOTA performance on the KITTI benchmark. Surprisingly, however, none of them outperforms traditional methods on Middlebury, possibly due to 1) memory constraints and 2) lack of training data.…”
Section: Related Work
confidence: 99%
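As a companion to the statement above, here is a hedged sketch of the patch-wise, correlation-style matching cost it describes: siamese features compared by a per-disparity dot product yield a B x D x H x W volume that post-processing (or further convolutions) can then filter. The function name and shapes are assumptions for illustration, not any cited implementation.

```python
# Correlation-style matching cost between siamese feature maps.
import torch

def correlation_cost(feat_left, feat_right, max_disp):
    """Return a B x D x H x W cost volume; higher values mean better matches."""
    b, c, h, w = feat_left.shape
    cost = feat_left.new_zeros(b, max_disp, h, w)
    for d in range(max_disp):
        # Compare left pixel x with right pixel x - d (shifted right features).
        cost[:, d, :, d:] = (feat_left[:, :, :, d:] *
                             feat_right[:, :, :, : w - d]).mean(dim=1)
    return cost
```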
“…When applied to down-scaled images, these methods run faster, but give blurry results and inaccurate disparity estimates for the far-field. Recent "deep" stereo methods perform well on low-resolution benchmarks [5,11,16,21,38], while failing to produce SOTA results on high-res benchmarks [26]. This is likely because: 1) their architectures are not efficiently designed to operate on high-resolution images.…”
Section: Introduction
confidence: 99%
“…They also offered a large synthetic dataset called Scene Flow with dense disparity ground truth. [7] and [8] improved DispNet by adding a second-stage disparity correction network. [9] integrated an explicit edge detection network to improve disparity estimation, while [10] proposed a unified segmentation and disparity estimation architecture in order to incorporate semantic information.…”
Section: CNN Based End-to-end Methods
confidence: 99%
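A minimal sketch of the second-stage disparity correction idea mentioned above, assuming the second network sees the first-stage disparity, the left image, and a reconstruction error map, and predicts an additive residual. The module name and channel sizes are hypothetical, not the exact designs of [7] or [8].

```python
# Second-stage correction network: predicts a residual on the initial disparity.
import torch
import torch.nn as nn

class DispCorrection(nn.Module):
    def __init__(self, in_ch=5):  # disparity (1) + left image (3) + error map (1)
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 1, 3, padding=1))

    def forward(self, disp_init, left, recon_error):
        residual = self.net(torch.cat([disp_init, left, recon_error], dim=1))
        return disp_init + residual  # corrected disparity
```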
“…Our proposed refinement network takes geometric error E_g, photometric error E_p and unrefined disparity as input and produces refined disparity (via residual learning) and the occlusion map. Refinement procedures proposed in CRL [17], iResNet [12], StereoNet [8] and FlowNet2 [5] only use photometric error (either in the image or feature domain) as part of the input to the refinement networks. To the best of our knowledge we are the first to explore the importance of geometric error and occlusion training for disparity refinement.…”
Section: Related Work
confidence: 99%
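To make the two error terms above concrete: a common photometric error compares the left image with the right image warped by the estimated disparity. The geometric error sketched below is only one plausible reading (a left-right disparity consistency check); the cited paper's exact definition of E_g may differ.

```python
# Photometric and (assumed) geometric error terms via disparity warping.
import torch
import torch.nn.functional as F

def warp_right_to_left(right, disp_left):
    """Sample the right image at x - d(x) for every left pixel (B x C x H x W)."""
    b, _, h, w = right.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, device=right.device, dtype=right.dtype),
        torch.arange(w, device=right.device, dtype=right.dtype),
        indexing="ij")
    x_src = xs.unsqueeze(0) - disp_left.squeeze(1)        # B x H x W
    grid_x = 2.0 * x_src / (w - 1) - 1.0                  # normalize to [-1, 1]
    grid_y = 2.0 * ys.unsqueeze(0).expand(b, -1, -1) / (h - 1) - 1.0
    grid = torch.stack([grid_x, grid_y], dim=-1)          # B x H x W x 2
    return F.grid_sample(right, grid, align_corners=True)

def photometric_error(left, right, disp_left):
    # E_p: per-pixel absolute color difference after warping.
    return (left - warp_right_to_left(right, disp_left)).abs().mean(1, keepdim=True)

def geometric_error(disp_left, disp_right):
    # Assumed E_g: left-right consistency of the two disparity maps.
    return (disp_left - warp_right_to_left(disp_right, disp_left)).abs()
```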
“…The proposed refinement network described in Table 7 is inspired by the refinement procedures proposed in CRL [17], iResNet [12], StereoNet [8], and ActiveStereoNet [31]. We adopted the basic architecture for refinement as described in StereoNet [8], with dilated residual blocks [28] to increase the receptive field of filtering without compromising resolution.…”
Section: D Dilation In Cost Filtering
confidence: 99%
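A minimal sketch of a dilated residual block of the kind referenced above: dilation widens the receptive field of the cost-filtering convolutions while matching padding keeps the spatial resolution, so no downsampling is needed. The channel count is illustrative.

```python
# Dilated residual block: larger receptive field at unchanged resolution.
import torch.nn as nn

class DilatedResBlock(nn.Module):
    def __init__(self, channels, dilation):
        super().__init__()
        pad = dilation  # preserves spatial size for 3x3 kernels
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=pad, dilation=dilation),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=pad, dilation=dilation))
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(x + self.body(x))  # identity skip, same resolution
```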