The aim of this chapter is to present a general overview of feature-based 3-D pose estimation and tracking techniques. Principles, classical techniques and recent advances are presented and discussed in the context of a monocular camera. The objective is to focus on techniques employed within both the visual servoing and registration fields for the wide class of rigid objects. The main assumption of this problem relies on the availability of a 3-D model of the object to track.
Introduction: Model-based tracking and pose estimation

The recovery of 3-D geometric information from 2-D images is a fundamental problem in computer vision. When only one view is available, the appearance or the relative arrangement of the object features of interest should be modelled in a symbolic description so as to be compared with the image descriptors through a similarity criterion. Geometric-based approaches restrict the search for correspondence to a sparse set of geometrical features; they use the numerical and symbolic properties of the available entities. To automatically compute a rigid-body transformation (the pose), it is necessary to match 3-D model features with part of the visible 2-D image features, a process referred to as the correspondence problem. For the past four decades, the model-based pose estimation of objects with a simple geometry has been intensively studied. The major goal is tracking the pose parameters in the world space at camera frame rate. Therefore, features such as points, lines and ellipses are not only extracted from 2-D images, but the 3-D model and the pose of the object also have to be exploited.
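The correspondence-driven pose computation described above can be illustrated with a minimal synthetic sketch: given matched 3-D model points and their 2-D projections under a pinhole camera, the six pose parameters (axis-angle rotation and translation) are recovered by minimising the reprojection error. The Gauss-Newton refinement with a numerical Jacobian shown here is one generic choice among many, not a method advocated in this chapter, and all names and values are illustrative.

```python
import numpy as np

def rodrigues(w):
    # Axis-angle vector -> rotation matrix (Rodrigues' formula)
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3)
    k = w / theta
    K = np.array([[0, -k[2], k[1]],
                  [k[2], 0, -k[0]],
                  [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

def project(X, pose, K):
    # Rigid transform of the 3-D model points, then pinhole projection
    w, t = pose[:3], pose[3:]
    Xc = (rodrigues(w) @ X.T).T + t
    x = (K @ Xc.T).T
    return x[:, :2] / x[:, 2:3]

def estimate_pose(X, u, K, pose0, iters=50):
    # Gauss-Newton on the 2-D reprojection error, numerical Jacobian
    pose = pose0.astype(float)
    for _ in range(iters):
        r = (project(X, pose, K) - u).ravel()
        J = np.zeros((r.size, 6))
        eps = 1e-6
        for j in range(6):
            dp = np.zeros(6)
            dp[j] = eps
            J[:, j] = ((project(X, pose + dp, K) - u).ravel() - r) / eps
        delta = np.linalg.lstsq(J, -r, rcond=None)[0]
        pose = pose + delta
        if np.linalg.norm(delta) < 1e-10:
            break
    return pose

# Synthetic example: cube corners observed under a known ground-truth pose
K = np.array([[800., 0., 320.], [0., 800., 240.], [0., 0., 1.]])
X = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1],
              [1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1]], float)
true_pose = np.array([0.1, -0.2, 0.05, 0.3, -0.1, 5.0])
u = project(X, true_pose, K)
est = estimate_pose(X, u, K, pose0=np.array([0., 0., 0., 0., 0., 4.0]))
print(np.max(np.abs(project(X, est, K) - u)))  # residual shrinks toward zero
```

In a real tracker the initial guess `pose0` comes from the previous frame, which is what makes frame-rate tracking tractable; an analytic Jacobian and a robust cost would replace the finite differences used here.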
Related work on model-based 3-D tracking

Model-based 3-D tracking is closely related to the pose estimation problem. It can cope with abrupt motions and is generally better suited to dealing with partial occlusions of the object of interest than 2-D tracking. However, it needs the correspondence problem to be solved at least once. The definition of object tracking algorithms in image sequences is an important issue for research and applications related to robot vision. A robust extraction and real-time spatiotemporal tracking of measurements is one of the keys to successful visual servoing-based tracking, in particular for position-based visual servoing approaches
(PBVS) (Hutchinson, 1996), where the tracking error is computed in the task space and thereafter used by the robot control system. The reference work on vision-based navigation is the dynamic tracking approach introduced by Dickmanns and Graefe (Dickmanns, 1988). In this work, the steering of cars, vehicle docking and aircraft landing are performed by dynamic modelling together with edge feature extraction and processing. Feature extraction and matching play an important role, and matching is commonly restricted by a windowing technique. Harris and Stennett (Harris, 1990), Papanikolopoulos et al. (Papanikolopoulos, 1993) and Hager and Toyama (Hager, 1998) with the XVision system restrict the search space of matching to a neighbourhood of th...