Abstract. This paper describes an approach for tracking rigid and articulated objects using a view-based representation. The approach builds on and extends work on eigenspace representations, robust estimation techniques, and parameterized optical flow estimation. First, we note that the least-squares image reconstruction of standard eigenspace techniques has a number of problems, and we reformulate the reconstruction problem as one of robust estimation. Second, we define a "subspace constancy assumption" that allows us to exploit techniques for parameterized optical flow estimation to solve for both the view of an object and the affine transformation between the eigenspace and the image. To account for large affine transformations between the eigenspace and the image, we define a multi-scale eigenspace representation and a coarse-to-fine matching strategy. Finally, we use these techniques to track objects over long image sequences in which the objects simultaneously undergo both affine image motions and changes of view. In particular, we use this "EigenTracking" technique to track and recognize the gestures of a moving hand.
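The robust reconstruction step mentioned above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the choice of the Geman-McClure error norm, and the use of iteratively reweighted least squares (IRLS) are assumptions made here for concreteness; the abstract only states that least-squares reconstruction is replaced by robust estimation.

```python
import numpy as np

def robust_coefficients(image, basis, sigma=0.5, n_iters=20):
    """Robust eigenspace reconstruction sketch (hypothetical helper).

    Instead of the least-squares coefficients c = U^T I, minimize
    sum_i rho(I_i - (U c)_i) with the Geman-McClure norm
    rho(e) = e^2 / (sigma^2 + e^2), via IRLS: each iteration solves
    a weighted least-squares problem with weights w ~ sigma^2 / (sigma^2 + e^2)^2,
    which downweights outlier pixels that the subspace cannot explain.
    """
    c = basis.T @ image                       # least-squares initialization
    for _ in range(n_iters):
        e = image - basis @ c                 # per-pixel residuals
        w = sigma**2 / (sigma**2 + e**2)**2   # IRLS weights (psi(e)/e up to a constant)
        WU = basis * w[:, None]
        # weighted normal equations: (U^T W U) c = U^T W I
        c = np.linalg.solve(basis.T @ WU, WU.T @ image)
    return c
```

With no outliers the weights are uniform and the result coincides with the least-squares solution; with a few corrupted pixels the robust coefficients stay close to the true ones while the least-squares projection is pulled away.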
We present a technique for the computation of 2D component velocity from image sequences. Initially, the image sequence is represented by a family of spatiotemporal velocity-tuned linear filters. Component velocity, computed from spatiotemporal responses of identically tuned filters, is expressed in terms of the local first-order behavior of surfaces of constant phase. Justification for this definition is discussed from the perspectives of both 2D image translation and deviations from translation that are typical in perspective projections of 3D scenes. The resulting technique is predominantly linear, efficient, and suitable for parallel processing. Moreover, it is local in space-time, robust with respect to noise, and permits multiple estimates within a single neighborhood. Promising quantitative results are reported from experiments with realistic image sequences, including cases with sizeable perspective deformation.

1 Introduction

This article addresses the quantitative measurement of velocity in image sequences. The important issues are (1) the accuracy with which velocity can be computed; (2) robustness with respect to smooth contrast variations and affine deformation (i.e., deviations from 2D image translation that are typical in perspective projections of 3D scenes); (3) localization in space-time; (4) noise robustness; and (5) the ability to discern different velocities within a single neighborhood. Our approach is based on the phase information in a local-frequency representation of the image sequence that is produced by a family of velocity-tuned linear filters. The velocity measurements are limited to component velocity: the projected components of 2D velocity onto directions normal to oriented structure in the image (a definition is given in Section 3).
The combination of these measurements to derive the full 2D velocity is briefly discussed. Our reasons for concentrating on component velocity (also referred to as normal velocity) stem from a desire for local measurements, and the well-known aperture problem (Marr and Ullman 1981). Local measurements allow smoothly varying velocity fields to be estimated based on translational image velocity, as opposed to more complicated descriptions of the velocity field over larger image patches. However, in narrow spatiotemporal apertures the intensity structure is often roughly one-dimensional, so that only one component of the image velocity can be accurately determined. To obtain full 2D velocity fields, larger space-time support is therefore required. In our view, the common assumptions of smoothness, uniqueness, and the coherence of neighboring measurements that are involved in combining local measurements to determine 2D velocity, to fill in regions without measurements, and to reduce the effects of noise, should be viewed as aspects of interpretation and, as such, are distinct issues. In considering just normal components of velocity we hope to obtain more accurate estimates of motion within smaller apertures, which leads to better spatial resolution of velocity.
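The phase-based definition of component velocity above can be sketched in a few lines. This is a schematic illustration under assumed notation, not the paper's algorithm: writing the local phase as phi(x, y, t), a surface of constant phase moves with normal speed -phi_t / |grad phi| in the direction of the spatial phase gradient. For a pure 2D translation this equals the projection of the true velocity onto that direction.

```python
import numpy as np

def component_velocity(phase_x, phase_y, phase_t):
    """Component (normal) velocity from local phase derivatives.

    Hypothetical helper: given the spatial derivatives (phase_x, phase_y)
    and temporal derivative phase_t of the local phase, the level surface
    phi(x, y, t) = const moves with vector normal velocity
        v_n = (-phase_t / |grad phi|) * (grad phi / |grad phi|).
    """
    grad = np.array([phase_x, phase_y])
    mag = np.hypot(phase_x, phase_y)
    return (-phase_t / mag) * (grad / mag)
```

For a translating sinusoid I = cos(k_x x + k_y y - omega t) with omega = k . v, the phase derivatives are exactly (k_x, k_y, -omega), and the formula returns the projection of v onto the unit normal k / |k|.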
As an observer moves and explores the environment, the visual stimulation in his/her eye is constantly changing. Somehow he/she is able to perceive the spatial layout of the scene, and to discern his/her movement through space. Computational vision researchers have been trying to solve this problem for a number of years with only limited success. It is a difficult problem to solve because the optical flow field is nonlinearly related to the 3D motion and depth parameters.

Here, we show that the nonlinear equation describing the optical flow field can be split by an exact algebraic manipulation to form three sets of equations. The first set relates the flow field to only the translational component of 3D motion. Thus, depth and rotation need not be known or estimated prior to solving for translation. Once the translation has been recovered, the second set of equations can be used to solve for rotation. Finally, depth can be estimated with the third set of equations, given the recovered translation and rotation.

The algorithm applies to the general case of arbitrary motion with respect to an arbitrary scene. It is simple to compute, and it is plausible biologically. The results reported in this article demonstrate the potential of our new approach, and show that it performs favorably when compared with two other well-known algorithms.
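The structure of the flow equation and the third (depth-recovery) step can be illustrated as follows. This is a sketch under standard assumptions, not the paper's implementation: it uses the textbook perspective flow equations for a camera with unit focal length, and the function names are invented for illustration. It shows that once translation T and rotation W are known, depth enters linearly and can be recovered per pixel.

```python
import numpy as np

def flow(x, y, Z, T, W):
    """Instantaneous optical flow at normalized image coords (x, y),
    depth Z, camera translation T = (Tx, Ty, Tz) and rotation
    W = (Wx, Wy, Wz), for unit focal length (standard flow equations)."""
    Tx, Ty, Tz = T
    Wx, Wy, Wz = W
    u = (-Tx + x * Tz) / Z + x * y * Wx - (1 + x**2) * Wy + y * Wz
    v = (-Ty + y * Tz) / Z + (1 + y**2) * Wx - x * y * Wy - x * Wz
    return u, v

def depth_from_flow(x, y, u, v, T, W):
    """Sketch of the third step described in the abstract: with T and W
    recovered, subtract the (depth-independent) rotational flow and fit
    the inverse depth to the translational flow template by least squares."""
    Tx, Ty, Tz = T
    Wx, Wy, Wz = W
    ur = u - (x * y * Wx - (1 + x**2) * Wy + y * Wz)   # translational part of u
    vr = v - ((1 + y**2) * Wx - x * y * Wy - x * Wz)   # translational part of v
    a = np.array([-Tx + x * Tz, -Ty + y * Tz])          # translation template
    inv_Z = (a @ np.array([ur, vr])) / (a @ a)          # linear in 1/Z
    return 1.0 / inv_Z
```

The rotational terms carry no depth information, which is why they can be removed exactly once W is known; the remaining flow is proportional to 1/Z.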