A method is described for visually tracking a known three-dimensional object as it moves with six degrees of freedom. The method uses the predicted position of known features on the object to find the features in images from one or more cameras, measures the position of the features in the images, and uses these measurements to update the estimates of position, orientation, linear velocity, and angular velocity of the object model. The features usually used are brightness edges that correspond to markings or the edges of solid objects, although point features can be used. The solution for object position and orientation is a weighted least-squares adjustment that includes filtering over time, which reduces the effects of errors, allows extrapolation over times of missing data, and allows the use of stereo information from multiple-camera images that are not coincident in time. The filtering action is derived so as to be optimum if the acceleration is random. (Alternatively, random torque can be assumed for rotation.) The filter is equivalent to a Kalman filter, but for efficiency it is formulated differently in order to take advantage of the dimensionality of the observations and the state vector that occur in this problem. The method can track accurately with arbitrarily large angular velocities, as long as the angular acceleration (or torque) is small. Results are presented showing the successful tracking of partially obscured objects in the presence of clutter.
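The filtering idea in the abstract can be illustrated with a minimal sketch. This is not the paper's formulation (which is restructured for efficiency and handles full six-degree-of-freedom pose); it is a standard Kalman filter in one dimension with a constant-velocity state, where the process noise models the random acceleration the filter is derived to be optimum for. All names and parameter values are illustrative assumptions.

```python
import numpy as np

def predict(x, P, dt, accel_var):
    """Extrapolate state [position, velocity] and covariance over dt."""
    F = np.array([[1.0, dt],
                  [0.0, 1.0]])            # constant-velocity transition
    # Process noise for white random acceleration of variance accel_var.
    Q = accel_var * np.array([[dt**4 / 4, dt**3 / 2],
                              [dt**3 / 2, dt**2]])
    return F @ x, F @ P @ F.T + Q

def update(x, P, z, meas_var):
    """Fold in a scalar position measurement z (a weighted least-squares step)."""
    H = np.array([[1.0, 0.0]])            # we observe position only
    S = H @ P @ H.T + meas_var            # innovation covariance
    K = P @ H.T / S                       # Kalman gain
    x = x + (K * (z - H @ x)).ravel()
    P = (np.eye(2) - K @ H) @ P
    return x, P

# Track a point moving at constant velocity 2.0, starting from a weak prior.
x = np.array([0.0, 0.0])
P = np.eye(2) * 100.0
for k in range(1, 50):
    x, P = predict(x, P, dt=0.1, accel_var=0.01)
    x, P = update(x, P, z=2.0 * 0.1 * k, meas_var=0.01)
# x now holds good estimates of position (~9.8) and velocity (~2.0).
```

Because the prediction step can be run over times with no measurements, the same structure gives the extrapolation over missing data mentioned in the abstract, and measurements from different cameras at different times can each be folded in with their own `update` call.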
A method is described for accurately calibrating cameras including radial lens distortion, by using known points such as those measured from a calibration fixture. Both the intrinsic and extrinsic parameters are calibrated in a single least-squares adjustment, but provision is made for including old values of the intrinsic parameters in the adjustment. The distortion terms are relative to the optical axis, which is included in the model so that it does not have to be orthogonal to the image sensor plane. These distortion terms represent corrections to the basic lens model, which is a generalization that includes the perspective projection and the ideal fish-eye lens as special cases. The position of the entrance pupil point as a function of off-axis angle also is included in the model. (The complete camera model including all of these effects often is called CAHVORE.) A way of adding decentering distortion also is described. A priori standard deviations can be used to apply weight to given initial approximations (which can be zero) for the distortion terms, for the difference between the optical axis and the perpendicular to the sensor plane, and for the terms representing movement of the entrance pupil, so that the solution for these is well determined when there is insufficient information in the calibration data. For the other parameters, initial approximations needed for the nonlinear least-squares adjustment are obtained in a simple manner from the calibration data and other known information. (Weight can be given to these also, if desired.) Outliers among the calibration points that disagree excessively with the other data are removed by means of automatic editing based on analysis of the residuals. The use of the camera model also is described, including partial derivatives for propagating both from object space to image space and vice versa. These methods were used to calibrate the cameras on the Mars Exploration Rovers.
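To make the projection side of such a camera model concrete, the sketch below shows a generic perspective projection with polynomial radial distortion applied relative to the optical axis. The actual CAHVORE model is considerably richer (a generalized lens model with fish-eye and perspective special cases, an optical axis that need not be orthogonal to the sensor plane, and entrance-pupil movement); the function name, the intrinsics `fx, fy, cx, cy`, and the coefficients `k1..k3` here are illustrative assumptions, not the model's parameterization.

```python
import numpy as np

def project(point_world, R, t, fx, fy, cx, cy, k1=0.0, k2=0.0, k3=0.0):
    """Map a 3-D world point to distorted image coordinates (illustrative)."""
    Xc = R @ point_world + t              # extrinsics: world -> camera frame
    x, y = Xc[0] / Xc[2], Xc[1] / Xc[2]   # perspective division
    r2 = x * x + y * y                    # squared radius from the optical axis
    d = 1.0 + k1 * r2 + k2 * r2**2 + k3 * r2**3   # radial distortion factor
    return np.array([fx * d * x + cx, fy * d * y + cy])

# With all distortion terms zero this reduces to the ideal pinhole projection.
uv = project(np.array([0.1, -0.2, 2.0]),
             R=np.eye(3), t=np.zeros(3),
             fx=800.0, fy=800.0, cx=320.0, cy=240.0,
             k1=-0.3)
```

Calibration in the least-squares sense amounts to adjusting the intrinsic parameters (and, jointly, the extrinsics) so that projections of the known fixture points best match their measured image positions, with a priori standard deviations supplying weight for terms the calibration data cannot determine on their own.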