Ego-motion estimation and 3D scene reconstruction from image data have been long-standing goals in both the Robotics and Computer Vision communities. Nevertheless, while visual SLAM and Structure from Motion already provide accurate ego-motion estimates, visual scene estimation does not yet offer such satisfactory results, being in most cases limited to a sparse set of salient points. In this paper we propose an algorithm to densify a sparse point-based reconstruction into a dense multi-plane one, taking only a set of sparse images as input. The method starts by recovering a sparse set of 3D salient points and uses them to robustly estimate the dominant planes of the scene. The number of planes is not known in advance, and the point cloud may contain outliers that belong to no plane. In a second step, the image data and the estimated 3D structure are combined to determine which parts of each plane actually belong to the scene, exploiting photoconsistency and geometric constraints. Experimental results with real images show that the described approach achieves accurate and dense estimation in man-made environments. Moreover, the method is able to recover textureless areas, where salient points are usually absent.
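The first step described above, robustly extracting an unknown number of dominant planes from a sparse, outlier-contaminated point cloud, can be sketched with a sequential RANSAC scheme: repeatedly fit the plane with the most inliers, remove those inliers, and stop when no plane gathers enough support. This is an illustrative sketch, not the paper's exact method; all function names and thresholds (`dist_thresh`, `min_inliers`, etc.) are assumptions for the example.

```python
import numpy as np

def fit_plane(pts):
    # Least-squares plane through 3+ points via SVD of the centered
    # coordinates; returns unit normal n and offset d with n.x + d = 0.
    centroid = pts.mean(axis=0)
    _, _, vt = np.linalg.svd(pts - centroid)
    n = vt[-1]
    return n, -n @ centroid

def sequential_ransac_planes(points, dist_thresh=0.02, min_inliers=50,
                             max_planes=10, iters=200, seed=None):
    # points: (N, 3) array of 3D salient points (may include outliers).
    # Returns a list of (normal, offset) planes and the leftover points.
    rng = np.random.default_rng(seed)
    remaining = points.copy()
    planes = []
    while len(remaining) >= min_inliers and len(planes) < max_planes:
        best_mask, best_count = None, 0
        for _ in range(iters):
            # Hypothesize a plane from a minimal sample of 3 points.
            sample = remaining[rng.choice(len(remaining), 3, replace=False)]
            n, d = fit_plane(sample)
            # Score by the number of points within dist_thresh of the plane.
            mask = np.abs(remaining @ n + d) < dist_thresh
            if mask.sum() > best_count:
                best_mask, best_count = mask, mask.sum()
        if best_count < min_inliers:
            break  # no more dominant planes; leftovers are outliers
        # Refine the plane on all of its inliers, then remove them.
        n, d = fit_plane(remaining[best_mask])
        planes.append((n, d))
        remaining = remaining[~best_mask]
    return planes, remaining
```

Since the number of planes is unknown in advance, the stopping rule (`min_inliers`) doubles as the model-selection criterion: extraction halts once the best remaining hypothesis no longer explains enough points.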