While machine learning has been instrumental to the ongoing progress in most areas of computer vision, it has not been applied to the problem of stereo matching with similar frequency or success. We present a supervised learning approach for predicting the correctness of stereo matches based on a random forest and a set of features that capture various forms of information about each pixel. We show highly competitive results in predicting the correctness of matches and in confidence estimation, which allows us to rank pixels according to the reliability of their assigned disparities. Moreover, we show how these confidence values can be used to improve the accuracy of disparity maps by integrating them with an MRF-based stereo algorithm. This is an important distinction from the current literature, which has mainly focused on sparsification, that is, removing potentially erroneous disparities to generate quasi-dense disparity maps.
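As a rough illustration of what ranking pixels by confidence makes possible, the sketch below sorts pixels from most to least confident and reports the error rate of the kept fraction at a few densities. The function name `error_rate_by_density`, the density values, and the toy inputs are assumptions made for this example; the confidences would come from a trained classifier, which is not shown here.

```python
def error_rate_by_density(confidences, is_correct, densities=(0.25, 0.5, 1.0)):
    """Rank pixels by confidence and report the error rate of the kept fraction.

    confidences: one confidence score per pixel (higher means more reliable).
    is_correct:  one boolean per pixel, True if its disparity is correct.
    Both inputs are hypothetical stand-ins for classifier output and ground truth.
    """
    # Sort pixel indices from most to least confident.
    order = sorted(range(len(confidences)),
                   key=lambda i: confidences[i], reverse=True)
    rates = {}
    for density in densities:
        # Keep the top fraction of pixels, but always at least one.
        keep = max(1, round(density * len(order)))
        kept = order[:keep]
        wrong = sum(1 for i in kept if not is_correct[i])
        rates[density] = wrong / keep
    return rates


# Toy data: the two most confident pixels happen to be the correct ones.
conf = [0.9, 0.2, 0.8, 0.4]
correct = [True, False, True, False]
rates = error_rate_by_density(conf, correct)
```

If the confidence measure is informative, the error rate should rise as the density increases, because less reliable pixels are admitted last.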
Stereo matching, like many problems in computer vision, has been addressed by a multitude of algorithms, each with its own strengths and weaknesses. Instead of following the conventional approach and trying to tune or enhance one of the algorithms so that it dominates the competition, we resign ourselves to the idea that a truly optimal algorithm may not be discovered soon and take a different approach. We present a novel methodology for combining a large number of heterogeneous algorithms that is able to clearly surpass the accuracy of the most accurate algorithms in the set. At the core of our approach is the design of an ensemble classifier trained to decide whether a particular stereo matcher is correct on a certain pixel. In addition to features describing the pixel, our feature vector encodes the agreement and disagreement between the matcher under consideration and all other matchers. This formulation leads to high accuracy in disparity estimation on the KITTI stereo benchmark.
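One simple way to picture the agreement and disagreement part of such a feature vector is sketched below. For a single pixel, it compares the disparity of the matcher under consideration with the disparity of every other matcher and emits one flag per comparison plus a total count. The function name `agreement_features`, the one-pixel tolerance, and the exact layout of the vector are assumptions for illustration only; the abstract does not specify how the features are encoded.

```python
def agreement_features(disparities, matcher, tolerance=1.0):
    """Describe how one matcher relates to all others at a single pixel.

    disparities: one disparity estimate per matcher for this pixel.
    matcher:     index of the matcher under consideration.
    tolerance:   hypothetical threshold (in pixels) for counting agreement.
    Returns one 0/1 flag per other matcher, followed by the agreement count.
    """
    own = disparities[matcher]
    flags = [
        1 if abs(own - d) <= tolerance else 0
        for j, d in enumerate(disparities)
        if j != matcher
    ]
    return flags + [sum(flags)]


# Four hypothetical matchers vote on one pixel; matcher 0 is being judged.
pixel = [10.0, 10.5, 14.0, 9.2]
features = agreement_features(pixel, matcher=0)
```

In a full pipeline these values would be concatenated with per-pixel descriptors and passed to the classifier, once for each matcher.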
In this paper we propose an approach for estimating the confidence of stereo matches for superpixel-based disparity estimation. To our knowledge, this is the first such method reported in the literature. Starting from a simple superpixel stereo algorithm, we present a representative set of features that can be extracted from the disparity map and the superpixel fitting process. A random forest classifier is then trained on these features to predict whether the disparity assigned to each pixel of a test disparity map is correct or not. We perform experiments on the KITTI stereo benchmark and show that our confidence estimator is very accurate in predicting which disparities are correct and which are not. We also present a post-processing algorithm for improving the accuracy of the disparity maps that exploits the confidence estimates to reject wrong disparity values and achieves significant error reduction.
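The idea of using confidence estimates to reject wrong disparities can be sketched for a single scanline as follows. Disparities whose confidence falls below a threshold are discarded and then filled from the nearest accepted value on the same row. The threshold, the scanline fill strategy, and the name `reject_and_fill` are all assumptions for this example; the abstract does not describe how rejected values are replaced, so this is not the authors' post-processing algorithm.

```python
def reject_and_fill(row, confidences, threshold=0.5):
    """Drop low-confidence disparities in one scanline and fill from neighbors.

    row:         disparities along one image row.
    confidences: predicted probability that each disparity is correct.
    threshold:   hypothetical cut-off below which a disparity is rejected.
    """
    # Mark rejected pixels with None.
    filled = [d if c >= threshold else None for d, c in zip(row, confidences)]

    # Forward pass: propagate the last accepted disparity to the right.
    last = None
    for i, d in enumerate(filled):
        if d is None:
            filled[i] = last
        else:
            last = d

    # Backward pass: fill rejected pixels at the start of the row.
    last = None
    for i in range(len(filled) - 1, -1, -1):
        if filled[i] is None:
            filled[i] = last
        else:
            last = filled[i]

    # If every pixel was rejected, the row stays all None.
    return filled


# The outlier 30 and the leading 5 have low confidence and get replaced.
cleaned = reject_and_fill([5, 7, 30, 8], [0.2, 0.9, 0.1, 0.8])
```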