The paper introduces an accurate solution to dense orthographic Non-Rigid Structure from Motion (NRSfM) in scenarios with severe occlusions or, likewise, inaccurate correspondences. We integrate a shape prior term into a variational optimisation framework. This term penalises irregularities of the time-varying structure at the per-pixel level whenever a correspondence quality indicator, such as an occlusion tensor, is available. We make the realistic assumption that several non-occluded views of the scene are sufficient to estimate an initial shape prior, even though the entire observed scene may exhibit non-rigid deformations. Experiments on synthetic and real image data show that the proposed framework significantly outperforms state-of-the-art methods for correspondence establishment in combination with state-of-the-art NRSfM methods. Together with profound insights into the optimisation methods, implementation details for heterogeneous platforms are provided.
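To make the idea of a per-pixel shape prior term concrete, the following is a minimal sketch, not the paper's actual energy. It assumes the prior is a single 3D shape, that the occlusion tensor is a per-frame, per-point weight in [0, 1], and that the penalty is a weighted squared distance. The function name `shape_prior_energy` and the weight `gamma` are hypothetical.

```python
import numpy as np

def shape_prior_energy(shapes, prior, occlusion, gamma=1.0):
    """Occlusion-weighted squared deviation from a shape prior (illustrative).

    shapes:    (F, 3, N) array, one 3D shape with N points per frame.
    prior:     (3, N) array, the assumed shape prior.
    occlusion: (F, N) array in [0, 1]; 1 marks an occluded (unreliable) point.
    gamma:     hypothetical weight of the prior term.
    """
    diff = shapes - prior[None, :, :]        # deviation from the prior, (F, 3, N)
    per_point = np.sum(diff ** 2, axis=1)    # squared distance per point, (F, N)
    # Only points flagged by the occlusion indicator contribute to the penalty.
    return gamma * float(np.sum(occlusion * per_point))
```

Under these assumptions, points with reliable correspondences (weight 0) are left to the data term, while occluded points are pulled towards the prior.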
Establishing correspondences from image to 3D has long been a key task in 6DoF object pose estimation. To predict pose more accurately, deeply learned dense maps replaced sparse templates. Dense methods also improved pose estimation in the presence of occlusion. More recently, researchers have shown improvements by learning object fragments as segmentation. In this work, we present a discrete descriptor that can represent the object surface densely. By incorporating a hierarchical binary grouping, we can encode the object surface very efficiently. Moreover, we propose a coarse-to-fine training strategy, which enables fine-grained correspondence prediction. Finally, by matching predicted codes with the object surface and using a PnP solver, we estimate the 6DoF pose. Results on the public LM-O and YCB-V datasets show major improvement over the state of the art w.r.t. the ADD(-S) metric, even surpassing RGB-D based methods in some cases.
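A hierarchical binary grouping of surface points can be sketched as follows. This is an illustration only: the splitting rule used here (a median split along the axis of largest extent) is an assumption, and the abstract does not state how the groups are formed. The function names `hierarchical_binary_codes` and `code_to_point` are hypothetical, and the final PnP step is left out.

```python
import numpy as np

def hierarchical_binary_codes(vertices, depth):
    """Assign each vertex a `depth`-bit code by recursive two-way splitting."""
    n = len(vertices)
    codes = np.zeros((n, depth), dtype=np.uint8)
    groups = [np.arange(n)]                      # start with one group: all vertices
    for level in range(depth):
        next_groups = []
        for idx in groups:
            if len(idx) < 2:                     # cannot split further; bit stays 0
                next_groups.append(idx)
                continue
            pts = vertices[idx]
            axis = np.argmax(pts.max(axis=0) - pts.min(axis=0))
            order = np.argsort(pts[:, axis], kind="stable")
            half = len(idx) // 2
            lower, upper = idx[order[:half]], idx[order[half:]]
            codes[upper, level] = 1              # upper half gets bit 1 at this level
            next_groups.extend([lower, upper])
        groups = next_groups
    return codes

def code_to_point(codes, vertices, query):
    """Map a predicted code back to the centroid of the matching vertex group."""
    mask = np.all(codes == np.asarray(query, dtype=np.uint8), axis=1)
    if not mask.any():
        raise KeyError("no vertex group carries this code")
    return vertices[mask].mean(axis=0)
```

With `depth` bits, the surface is partitioned into at most `2 ** depth` groups, and the leading bits of a code identify coarser groups, which fits a coarse-to-fine prediction scheme. The resulting 2D-3D matches could then be passed to a PnP solver.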
Handling large occlusions in non-rigid structure from motion (NRSfM) currently requires either an expensive correspondence correction or the estimation of a shape prior from several non-occluded views. To save computational cost and remove the dependency on additional pre-processing steps, this paper introduces the concept of depth fields. With the proposed depth fields, NRSfM is interpreted as an alternation between the estimation of vector fields with fixed origins on the one side, and the estimation of displacements of the origins along the depth dimension on the other. The core of the new energy-based Coherent Depth Fields (CDF) approach is the spatial smoothness coherency term (CT) applied to the depth fields. Having its origins in the Motion Coherence Theory, CT interprets data as a displacement vector field and penalises irregularities in displacements. CT has multiple advantages over previously proposed regularisers such as total variation, not only for handling occlusions but also for unoccluded scenes. We show experimentally that CDF achieves state-of-the-art results in dense NRSfM, including scenarios with long and large occlusions, inaccurate correspondences and inaccurate initialisations, without requiring any additional pre-processing steps.
© 2017. The copyright of this document resides with its authors. It may be distributed unchanged freely in print or electronic forms.
Contributions. This paper introduces two novel concepts that make it possible to design a computationally efficient, accurate and easy-to-implement (practical) approach, and to overcome the dependency on pre-processing steps in NRSfM while handling large occlusions. The first concept is the notion of a depth vector field or, concisely, a depth field. A depth field is a 2D parametrisation of a surface embedded into 3D space so that every tracked 2D point is associated with a displacement along the depth dimension. This definition implies that all displacements are parallel to each other, or, in other words, a depth field is an irrotational vector field. With the new parametrisation, the problem of NRSfM is interpreted as a filtering of a depth field in the first alternating step (during which the point origins are fixed), and shape refinement in the second alternating step. The updated shape, in turn, alters the depth field, which is filtered again, and so on until convergence.
Next, we propose the coherency term (CT) as a new soft spatial regulariser on adjacent depth vectors. CT derives its origin from the motion coherence theory (MCT) [43,44], which studies principles of coherent motion and perception. MCT, in accordance with the human visual system, states that neighbouring structures tend to move coherently, i.e., with a common velocity and direction. We call the proposed approach Coherent Depth Fields (CDF) and formulate it as an energy-minimisation problem with CT. The main reason lies in the expressiveness of energy-based methods: an energy functional explicitly encodes assumptions on the underlying physical processes and relates input data with the sou...
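The filtering step on a depth field can be illustrated with a small sketch. It is not the CDF energy itself: it assumes the common Gaussian-kernel form of a motion-coherence regulariser, in which the smoothed displacements are written as `G @ w` and the weights `w` solve a regularised linear system. The names `gaussian_kernel` and `filter_depth_field`, and the parameters `beta` and `lam`, are hypothetical.

```python
import numpy as np

def gaussian_kernel(points2d, beta):
    """Pairwise Gaussian affinities between the fixed 2D origins."""
    diff = points2d[:, None, :] - points2d[None, :, :]
    return np.exp(-np.sum(diff ** 2, axis=-1) / (2.0 * beta ** 2))

def filter_depth_field(points2d, depth, beta=1.0, lam=1.0):
    """Smooth per-point depth displacements while keeping the origins fixed.

    Minimises ||depth - G w||^2 + lam * w^T G w over w and returns G w.
    beta sets the spatial scale of coherence, lam the smoothing strength.
    """
    G = gaussian_kernel(points2d, beta)
    w = np.linalg.solve(G + lam * np.eye(len(depth)), depth)
    return G @ w

# Usage: a single spike in an otherwise flat depth field is damped.
origins = np.stack([np.arange(10.0), np.zeros(10)], axis=1)
depth = np.zeros(10)
depth[5] = 1.0
smoothed = filter_depth_field(origins, depth, beta=2.0, lam=1.0)
```

In an alternating scheme of the kind described above, a filtering step like this would be interleaved with a shape refinement step until the estimates stop changing.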