Abstract-The evolution of an image sequence obtained by a real camera from a real scene can be conceptually separated into two parts: 1) motion of the camera and 2) motion of the objects in the scene. Most existing motion estimation algorithms use the block matching algorithm (BMA) to model both the camera motion and the local motion due to objects. In doing so, successive frames are divided into small blocks and the movement of each block is approximately modeled by a translation, resulting in one motion vector per block. In this paper, we propose two classes of algorithms for modeling camera motion in video sequences. The first class can be applied in situations where there is no camera translation and the motion of the camera can be adequately modeled by zoom, pan, and rotation parameters. The second class is more general in that it can be applied to situations where the camera is undergoing translational motion as well as rotation, zoom, and pan. This class uses seven parameters to describe the motion of the camera and requires the depth map to be known at the receiver. The salient feature of both of our algorithms is that the camera motion is estimated using binary matching of the edges in successive frames. In doing so, we show that, unlike local motion estimation, edge matching can be sufficient for estimating camera motion parameters. Finally, we compare the rate-distortion characteristics of our algorithms with those of the BMA and show that we can achieve similar performance with reduced computational complexity.
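The first class of models described above maps each pixel through a global zoom/pan/rotation transform, and the estimation criterion is a binary match between edge maps of successive frames. A minimal sketch of both pieces follows; the parameterization (zoom factor, rotation angle, pan offsets) and the XOR-count matching cost are illustrative assumptions, not the paper's exact formulation.

```python
import math

def warp_point(x, y, zoom, pan_x, pan_y, theta):
    # Illustrative 4-parameter global camera-motion model:
    # scale by zoom, rotate by theta, then translate by (pan_x, pan_y).
    c, s = math.cos(theta), math.sin(theta)
    xw = zoom * (c * x - s * y) + pan_x
    yw = zoom * (s * x + c * y) + pan_y
    return xw, yw

def edge_mismatch(edges_a, edges_b):
    # Binary matching cost between two same-sized binary edge maps:
    # the number of positions where the maps disagree (an XOR count).
    # Candidate motion parameters can be scored by warping one edge
    # map and minimizing this count.
    return sum(a != b
               for row_a, row_b in zip(edges_a, edges_b)
               for a, b in zip(row_a, row_b))
```

In a full estimator, one edge map would be warped by candidate (zoom, pan, rotation) parameters and the parameters minimizing `edge_mismatch` retained; because the maps are binary, each comparison is a cheap bit operation, which is the source of the complexity advantage over block matching on intensities.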
These algorithms are then applied to two problems: predictive video coding and three-dimensional (3D) scene reconstruction. In predictive coding, the 3D camera motion is estimated from a sequence of frames and used to predict the apparent change in successive frames. In 3D scene reconstruction, a 3D object is "scanned" by translational/rotational motion of the camera along a pre-specified path. The information in successive frames is then used to recover depth at a few select locations. The depth and intensity information at these locations is then used to recover the intensity at intermediate points on the scanning path. Experimental results on both of these applications are shown.
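The seven-parameter model mentioned earlier (zoom, rotation, two pan offsets, and a 3D translation) separates into a depth-independent zoom/pan/rotation part and a translational part scaled by the inverse of the per-pixel depth, which is why the receiver must know the depth map. The following sketch illustrates one plausible first-order form of such a model under perspective projection; the exact parameterization and the focal-length convention are assumptions, not taken from the paper.

```python
import math

def warp_point_7param(x, y, depth, zoom, theta, pan_x, pan_y,
                      tx, ty, tz, focal=1.0):
    # Illustrative 7-parameter camera-motion model.
    # Depth-independent part: zoom, rotation by theta, pan.
    c, s = math.cos(theta), math.sin(theta)
    xr = zoom * (c * x - s * y) + pan_x
    yr = zoom * (s * x + c * y) + pan_y
    # Depth-dependent part: first-order image displacement induced by
    # a 3D camera translation (tx, ty, tz) under perspective
    # projection, scaled by 1/depth. With depth unknown at the
    # decoder, this term could not be reproduced, hence the need to
    # transmit or otherwise know the depth map.
    dx = (focal * tx - x * tz) / depth
    dy = (focal * ty - y * tz) / depth
    return xr + dx, yr + dy
```

Note that as depth grows large the translational term vanishes and the model degenerates to the four-parameter zoom/pan/rotation case, matching the intuition that camera translation is imperceptible for distant scenes.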