3D scene understanding plays an essential role in intelligent vehicle applications. In these applications, passive stereo vision systems offer significant advantages for depth estimation over active systems such as 3D LIDAR. To apply stereo vision to autonomous driving, we propose a new real-time stereo matching algorithm paired with an online auto-rectification framework. The method uses a bidirectional Viterbi algorithm along 4 paths to decode the matching cost space, and a hierarchical structure (shown in Fig. 1) merges the 4 paths to further reduce the decoding error. We introduce a Total Variation [1] constraint into the Viterbi path to approximately model 3D planes at different orientations, achieving an effect similar to that of slanted-plane models. Structural similarity (SSIM) [3] is used to measure the pixel difference between the left and right images along epipolar lines, which improves robustness to luminance variation. In the equation for one Viterbi path, e(p, u) is the energy of the Viterbi node at pixel p and disparity u, G is the gradient information of the image, λ is a parameter, and L_u denotes the Viterbi nodes connected to the node at pixel p and disparity u. Based on the output of the Viterbi process, a convex optimization equation is derived to estimate the epipolar line distortion. We summarize the properties of the epipolar line distortion caused by common factors in intelligent vehicle applications. Based on these properties, and inspired by the well-known optical flow problem, we convert distortion estimation into an optimization problem and solve it with convex optimization theory. The Viterbi process and the convex optimization are integrated into an online framework (shown in Fig. 2), in which the two parts benefit each other without sacrificing speed.
The framework automatically maintains the epipolar line constraint, avoiding the degradation of stereo matching results that usually occurs when other stereo matching methods are applied on moving vehicles. Extensive experiments were conducted to compare the proposed algorithm with other practical state-of-the-art methods for intelligent vehicle applications. On the KITTI [2] training dataset, which contains 194 images in total, our method achieves a 7.38% average error rate, compared with 12.88% for SGBM and 11.99% for ELAS. We also tested the proposed algorithm on our experimental autonomous vehicle in real driving environments. For 640x480 images with a maximum of 40 disparities, the running time is about 196 ms on a GTX TITAN GPU and a Xeon E5-2620 CPU. Real driving videos, including featured cases and typical failure cases, can be found in the supplementary material.
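To make the idea of Viterbi decoding with a Total Variation style constraint more concrete, the following is a minimal sketch of a single left-to-right pass over one scanline of a matching-cost slice. It is an illustration under stated assumptions, not the algorithm described above: it omits the bidirectional pass, the 4-path hierarchical merge, the SSIM cost, and the gradient term G, and it uses a plain penalty `lam * |u - u_prev|` between neighbouring pixels. The function name `viterbi_scanline`, the parameter `lam`, and the array layout are all hypothetical.

```python
import numpy as np


def viterbi_scanline(cost, lam=1.0):
    """Decode one scanline with a TV-style smoothness penalty (sketch).

    cost : array of shape (W, D); cost[p, u] is the matching cost of
           assigning disparity u to pixel p (assumed layout).
    lam  : weight of the |u - u_prev| penalty (hypothetical parameter).

    Returns the disparity path minimizing
        sum_p cost[p, u_p] + lam * sum_p |u_p - u_{p-1}|.
    """
    cost = np.asarray(cost, dtype=np.float64)
    width, ndisp = cost.shape
    d = np.arange(ndisp)

    # penalty[u_prev, u] = lam * |u - u_prev|
    penalty = lam * np.abs(d[:, None] - d[None, :])

    energy = np.empty_like(cost)                 # best path energy per node
    backptr = np.zeros((width, ndisp), dtype=np.int64)
    energy[0] = cost[0]

    # Forward pass: accumulate the cheapest way to reach each node.
    for p in range(1, width):
        # total[u_prev, u] = energy at (p-1, u_prev) + transition penalty
        total = energy[p - 1][:, None] + penalty
        backptr[p] = np.argmin(total, axis=0)
        energy[p] = cost[p] + np.min(total, axis=0)

    # Backward pass: follow the stored pointers from the best end node.
    path = np.empty(width, dtype=np.int64)
    path[-1] = np.argmin(energy[-1])
    for p in range(width - 1, 0, -1):
        path[p - 1] = backptr[p, path[p]]
    return path
```

With `lam = 0` the sketch reduces to a per-pixel winner-takes-all choice, while a positive `lam` suppresses isolated disparity outliers at the price of penalizing genuine depth discontinuities. Because the penalty grows linearly with the disparity jump, gradual disparity ramps remain relatively cheap.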