Traditional structure from motion is hard in indoor environments with only a few detectable point features. These environments, however, have other useful characteristics: they often contain several visible lines, and their layout typically conforms to a Manhattan world geometry. We introduce a new algorithm to cluster visible lines in a Manhattan world, seen from two different viewpoints, into coplanar bundles. This algorithm is based on the notion of a "characteristic line", which is an invariant of a set of parallel coplanar lines. Finding coplanar sets of lines becomes a problem of clustering characteristic lines, which can be accomplished using a modified mean shift procedure. The algorithm is computationally light and produces good results in real-world situations.
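The clustering step can be illustrated with a minimal sketch of standard Gaussian-kernel mean shift. This is not the modified procedure the abstract refers to, whose details are not given here. The sketch assumes, purely for illustration, that each characteristic line has already been reduced to a small parameter vector (for example a 2D point); the function name `mean_shift` and the parameters `bandwidth`, `n_iter`, and `tol` are hypothetical.

```python
import numpy as np

def mean_shift(points, bandwidth, n_iter=50, tol=1e-6):
    """Plain Gaussian mean shift (illustrative; not the paper's modified variant).

    points: (n, d) array; each row is assumed to be a parameter vector
            standing in for one characteristic line (hypothetical encoding).
    Returns (labels, centers).
    """
    pts = np.asarray(points, dtype=float)
    modes = pts.copy()
    for _ in range(n_iter):
        # Squared distances between every current mode and every data point.
        d2 = ((modes[:, None, :] - pts[None, :, :]) ** 2).sum(axis=2)
        w = np.exp(-d2 / (2.0 * bandwidth ** 2))
        # Each mode moves to the kernel-weighted mean of the data.
        new_modes = (w @ pts) / w.sum(axis=1, keepdims=True)
        shift = np.abs(new_modes - modes).max()
        modes = new_modes
        if shift < tol:
            break
    # Points whose modes ended up close together share a cluster label.
    labels = np.full(len(pts), -1, dtype=int)
    centers = []
    for i, m in enumerate(modes):
        for k, c in enumerate(centers):
            if np.linalg.norm(m - c) < bandwidth / 2.0:
                labels[i] = k
                break
        else:
            centers.append(m)
            labels[i] = len(centers) - 1
    return labels, np.array(centers)

# Toy data: two well-separated groups of parameter vectors.
pts = [[0, 0], [0.1, 0], [0, 0.1], [5, 5], [5.1, 5], [5, 5.1]]
labels, centers = mean_shift(pts, bandwidth=1.0)
```

In this toy run, each group of nearby vectors collapses onto a single mode, which is the sense in which a cluster of characteristic lines would correspond to one coplanar bundle.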
The human visual system possesses the remarkable ability to pick out salient objects in images. Even more impressive is its ability to do so in the presence of disturbances: the ability persists despite noise, poor weather, and other impediments to perfect vision. Meanwhile, noise can significantly degrade the accuracy of automated computational saliency detection algorithms. In this article, we set out to remedy this shortcoming. Existing computational saliency models generally assume that the given image is clean, and a fundamental and explicit treatment of saliency in noisy images is missing from the literature. Here we propose a novel and statistically sound method for estimating saliency based on a nonparametric regression framework. We also investigate the stability of saliency models for noisy images and analyze how state-of-the-art computational models respond to noisy visual stimuli. The proposed model of saliency at a pixel of interest is a data-dependent weighted average of dissimilarities between a center patch around that pixel and other patches. To further improve both the accuracy of predicting human fixations and the stability to noise, we incorporate a global and multiscale approach by extending the local analysis window to the entire input image, and further to multiple scaled copies of the image. Our method consistently outperforms six other state-of-the-art models (Bruce & Tsotsos, 2009; Garcia-Diaz, Fdez-Vidal, Pardo, & Dosil, 2012; Goferman, Zelnik-Manor, & Tal, 2010; Hou & Zhang, 2007; Seo & Milanfar, 2009; Zhang, Tong, & Marks, 2008) for both noise-free and noisy cases.
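The core idea, saliency as a data-dependent weighted average of patch dissimilarities, can be sketched as follows. The abstract does not specify the dissimilarity measure or the weights, so the choices here are assumptions made only for illustration: mean squared difference between patches as the dissimilarity, and Gaussian weights `exp(-d / h**2)` that down-weight very dissimilar patches. The names `patch_saliency`, `patch_radius`, `window_radius`, and `h` are hypothetical, and the sketch covers only a local, single-scale window on a grayscale image with values in [0, 1].

```python
import numpy as np

def box_mean(a, r):
    """Mean over each (2r+1)x(2r+1) window; output shrinks by r per side."""
    k = 2 * r + 1
    h, w = a.shape[0] - 2 * r, a.shape[1] - 2 * r
    out = np.zeros((h, w))
    for dy in range(k):
        for dx in range(k):
            out += a[dy:dy + h, dx:dx + w]
    return out / (k * k)

def patch_saliency(image, patch_radius=1, window_radius=2, h=0.1):
    """Weighted average of center-vs-neighbor patch dissimilarities.

    Assumptions (not from the source): dissimilarity is the mean squared
    difference between patches; weights are exp(-d / h**2); the image is
    2D with values in [0, 1].
    """
    img = np.asarray(image, dtype=float)
    H, W = img.shape
    pr, wr = patch_radius, window_radius
    padded = np.pad(img, pr + wr, mode="reflect")
    # Region holding every center patch (image plus a patch-sized margin).
    center = padded[wr:wr + H + 2 * pr, wr:wr + W + 2 * pr]
    num = np.zeros((H, W))
    den = np.zeros((H, W))
    for dy in range(-wr, wr + 1):
        for dx in range(-wr, wr + 1):
            if dy == 0 and dx == 0:
                continue  # skip comparing a patch with itself
            y0, x0 = wr + dy, wr + dx
            shifted = padded[y0:y0 + H + 2 * pr, x0:x0 + W + 2 * pr]
            # Per-pixel dissimilarity between center patch and shifted patch.
            d = box_mean((center - shifted) ** 2, pr)
            w = np.exp(-d / (h * h))
            num += w * d
            den += w
    return num / den

# Toy example: a single bright pixel on a dark background.
img = np.zeros((9, 9))
img[4, 4] = 1.0
sal = patch_saliency(img)
```

On this toy image the saliency is positive at the bright pixel and zero in the flat corners. Extending the window to the whole image and to several rescaled copies, as the abstract describes, is left out of the sketch.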
We present an end-to-end system for structure and motion computation in a Manhattan layout from monocular videos. Unlike most SFM algorithms, which rely on point feature matching, this work considers only line matches. This may be convenient in indoor environments characterized by extended textureless walls, where point features may be scarce. Our system relies on the notion of "characteristic lines", which are invariants of two views of the same parallel line pairs on a surface of known orientation. Experiments with indoor video sequences demonstrate the robustness of the proposed system.