This article examines the problem of a moving robot tracking a moving object with its cameras, without requiring the ability to recognize the target to distinguish it from distracting surroundings. A novel aspect of the approach taken is the use of controlled camera movements to simplify the visual processing necessary to keep the cameras locked on the target. A gaze-holding system implemented on a robot's binocular head demonstrates this approach. Even while the robot is moving, the cameras are able to track an object that rotates and moves in three dimensions.

The central idea is that localizing attention in 3-D space makes precategorical visual processing sufficient to hold gaze. Visual fixation can help separate the target object from distracting surroundings. Converged cameras produce a horopter (surface of zero stereo disparity) in the scene. Binocular features with no disparity can be located with a simple filter, showing the object's location in the image. Similarly, an object that is being tracked is imaged near the center of the field of view, so spatially localized processing helps concentrate visual attention on the target. Instead of requiring a way to recognize the target, the system relies on active control of camera movements and binocular fixation segmentation to locate the target.
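To make the zero-disparity idea concrete, the following is a minimal sketch and not the system's actual implementation, whose filter is not specified here. It assumes rectified grayscale images stored as nested lists, a simple horizontal-difference edge detector, and an illustrative threshold. The names `vertical_edges`, `zero_disparity_mask`, `centroid`, and `center_fraction` are hypothetical. A feature is kept only if it appears at the same pixel coordinates in both images, which is what zero disparity means for converged cameras. An optional central window stands in for the spatially localized processing described above.

```python
from typing import List, Optional, Tuple

Image = List[List[float]]
Mask = List[List[bool]]


def vertical_edges(img: Image, threshold: float) -> Mask:
    """Mark pixels whose horizontal central difference exceeds the threshold.

    Assumption: this simple detector is only a stand-in for whatever
    feature detector a real system would use.
    """
    height, width = len(img), len(img[0])
    edges = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(1, width - 1):
            edges[y][x] = abs(img[y][x + 1] - img[y][x - 1]) > threshold
    return edges


def zero_disparity_mask(left: Image, right: Image,
                        threshold: float = 0.5,
                        center_fraction: float = 1.0) -> Mask:
    """Keep edge pixels found at identical coordinates in both images.

    center_fraction (hypothetical parameter) restricts the result to the
    central part of the image; 1.0 keeps the whole image.
    """
    height, width = len(left), len(left[0])
    left_edges = vertical_edges(left, threshold)
    right_edges = vertical_edges(right, threshold)
    margin_x = width * (1.0 - center_fraction) / 2.0
    margin_y = height * (1.0 - center_fraction) / 2.0
    mask = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            in_window = (margin_x <= x < width - margin_x
                         and margin_y <= y < height - margin_y)
            mask[y][x] = in_window and left_edges[y][x] and right_edges[y][x]
    return mask


def centroid(mask: Mask) -> Optional[Tuple[float, float]]:
    """Return the mean (x, y) of the marked pixels, or None if there are none."""
    points = [(x, y) for y, row in enumerate(mask)
              for x, hit in enumerate(row) if hit]
    if not points:
        return None
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


if __name__ == "__main__":
    # Target bar at columns 5-6 in both views (zero disparity);
    # distractor at column 2 in the left view and column 9 in the right.
    left = [[0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0] for _ in range(4)]
    right = [[0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0] for _ in range(4)]
    print(centroid(zero_disparity_mask(left, right)))  # (5.5, 1.5)
```

In this toy scene the distractor produces edges in each view, but at different image positions, so the conjunction rejects it and only the fixated target survives. In a gaze-holding loop, the centroid's offset from the image center could then serve as the error signal for the camera movements, although that control step is outside this sketch.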