2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP)
DOI: 10.1109/mlsp.2017.8168183
End-to-End learning of cost-volume aggregation for real-time dense stereo

Abstract: We present a new deep learning-based approach for dense stereo matching. Compared to previous works, our approach does not use deep learning of pixel appearance descriptors, employing very fast classical matching scores instead. At the same time, our approach uses a deep convolutional network to predict the local parameters of the cost volume aggregation process, which in this paper we implement using a differentiable domain transform. By treating such a transform as a recurrent neural network, we are able to train ou…
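The abstract's core idea is edge-aware cost aggregation via a recursive domain transform (Gastal & Oliveira, 2011), whose per-pixel smoothing parameters the paper's CNN predicts. The following is a minimal sketch of a single left-to-right pass of that recurrence; here fixed `sigma_s`/`sigma_r` values stand in for the network-predicted parameters, and the function name is illustrative, not from the paper.

```python
import numpy as np

def domain_transform_1d(cost, guide, sigma_s=10.0, sigma_r=0.1):
    """One left-to-right pass of the recursive domain-transform filter
    over a 1-D slice of the cost volume.

    cost  : (N,) matching costs to be aggregated
    guide : (N,) guidance signal (e.g. image intensities); large jumps
            in `guide` suppress aggregation across that boundary
    """
    # Base feedback coefficient of the recursive (IIR) filter.
    a = np.exp(-np.sqrt(2.0) / sigma_s)
    # Distance between neighbouring samples in the transformed domain:
    # grows with the local guide gradient, so edges widen the gap.
    d = 1.0 + (sigma_s / sigma_r) * np.abs(np.diff(guide))
    w = a ** d  # per-pixel recurrence weights; ~0 at strong edges
    out = np.asarray(cost, dtype=np.float64).copy()
    for i in range(1, len(out)):
        out[i] = (1.0 - w[i - 1]) * out[i] + w[i - 1] * out[i - 1]
    return out
```

Because each output sample depends linearly on the previous one, the pass unrolls naturally into a recurrent network through which gradients can flow, which is what makes end-to-end training of the aggregation parameters possible.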

Cited by 22 publications (12 citation statements). References 34 publications.
“…DispNetC [30] gives very accurate results, and this can be seen especially for foreground objects. In terms of accuracy, our solution is as good as DeepCostAggr [25], both methods relying on good object boundary estimates for enhanced reliability. These results prove that our refinement method can compensate for the reduced accuracy of cost computation.…”
Section: Results on KITTI 2015 Testing Dataset
confidence: 90%
“…This approach controls the aggregation window with two parameters corresponding to locality and intensity similarity. A different technique is proposed by [25], in which a residual neural network is responsible for edge detection. Edges are detected at different scales and are later used for setting the aggregation window boundaries.…”
Section: A Classic Taxonomy
confidence: 99%
“…Instead, we propose a method that uses only a single, raster-friendly minimisation, but one that incorporates information from four directions at once. Our approach compares favourably with the two state-of-the-art GPU-based methods [15], [24] that can process the KITTI dataset in real time, achieving similar levels of accuracy whilst reducing the power consumption by two orders of magnitude. Moreover, in comparison to other FPGA-based methods on the Middlebury dataset, we achieve comparable accuracy either at a much higher frame-rate (c.f.…”
Section: Discussion
confidence: 96%
“…On KITTI, we compare our approach to the only two published approaches from the benchmark that are able to achieve state-of-the-art performance in real time [15], [24], both of which require a powerful GPU (an Nvidia GTX Titan X) to run. Since, unlike these approaches, our approach does not naturally produce disparities for every single pixel in the image, we interpolate as specified by the KITTI evaluation protocol in order to make our results comparable with theirs.…”
Section: Methods
confidence: 99%
“…The goal of our work is accurate but real-time stereo estimation, which we achieve through the use of traditional matching costs and 2D (instead of 3D) convolution layers. In this context, it is useful to discuss the work of Kuzmin et al [17], who also use traditional matching costs as well as a largely traditional pipeline for aggregation, using a learned deep network to control the parameters of this aggregation in different regions. This allows them to achieve a low runtime of 0.034 seconds (i.e., 29.4 frames per second) but with worse accuracy.…”
Section: Background and Related Work
confidence: 99%