Abstract

In the context of content-oriented applications such as video surveillance and video retrieval, this paper proposes a stable object tracking method based on both object segmentation and motion estimation. The method focuses on the issues of speed of execution and reliability in the presence of noise, coding artifacts, shadows, occlusion, and object split.

Objects are tracked based on the similarity of their features in successive images. This is done in three steps: object segmentation and motion estimation, object matching, and feature monitoring and correction. In the first step, objects are segmented and their spatial and temporal features are computed. In the second step, using a non-linear voting strategy, each object of the previous image is matched with an object of the current image, creating a unique correspondence. In the third step, object segmentation errors, such as when objects occlude or split, are detected and corrected. These new data are then used to update the results of the previous steps, i.e., object segmentation and motion estimation. The contributions of this paper are the multi-voting strategy and the monitoring and correction of segmentation errors.

Extensive experiments on indoor and outdoor video shots containing over 6000 images, including images with multi-object occlusion, noise, and coding artifacts, have demonstrated the reliability and real-time response of the proposed method.
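
As an illustration of the second step only, the following is a minimal sketch of how feature-based voting could produce a unique correspondence between objects of successive images. It is not the method of this paper: the feature set (`area`, centroid, motion vector), the thresholds `tol` and `max_dist`, the `min_votes` cutoff, and the greedy one-to-one assignment are all assumptions chosen for brevity.

```python
from dataclasses import dataclass


@dataclass
class Obj:
    """Hypothetical per-object features (assumed, not the paper's feature set)."""
    ident: int
    area: float
    cx: float  # centroid x
    cy: float  # centroid y
    dx: float  # estimated horizontal motion per image
    dy: float  # estimated vertical motion per image


def votes(prev: Obj, cur: Obj, tol: float = 0.25, max_dist: float = 20.0) -> int:
    """Count binary similarity votes; each feature casts at most one vote."""
    v = 0
    # Size vote: relative area change stays within the tolerance.
    if abs(prev.area - cur.area) <= tol * max(prev.area, cur.area):
        v += 1
    # Position vote: current centroid lies near the motion-compensated
    # centroid of the previous object.
    px, py = prev.cx + prev.dx, prev.cy + prev.dy
    if ((px - cur.cx) ** 2 + (py - cur.cy) ** 2) ** 0.5 <= max_dist:
        v += 1
    # Direction vote: the two motion vectors do not point in opposite directions.
    if prev.dx * cur.dx + prev.dy * cur.dy >= 0:
        v += 1
    return v


def match_objects(prev_objs, cur_objs, min_votes: int = 2) -> dict:
    """Greedy one-to-one matching: highest vote counts are assigned first."""
    candidates = sorted(
        ((votes(p, c), p.ident, c.ident) for p in prev_objs for c in cur_objs),
        key=lambda t: -t[0],
    )
    used_prev, used_cur, result = set(), set(), {}
    for v, pid, cid in candidates:
        if v < min_votes:
            break  # remaining pairs have too few votes to be matched
        if pid in used_prev or cid in used_cur:
            continue  # keep the correspondence unique
        result[pid] = cid
        used_prev.add(pid)
        used_cur.add(cid)
    return result


prev_objs = [Obj(1, 100, 10, 10, 5, 0), Obj(2, 400, 50, 50, 0, -5)]
cur_objs = [Obj(10, 390, 50, 46, 0, -4), Obj(11, 105, 15, 10, 5, 0)]
print(match_objects(prev_objs, cur_objs))
```

Because the votes are thresholded per feature, a single badly estimated feature (for example, an area distorted by a shadow) costs only one vote rather than dominating a summed distance, which is one plausible reading of why a voting scheme suits noisy segmentation.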