In this paper, we propose a pixel-wise visual tracking method using a novel tri-model representation. The tri-model is composed of three models that respectively learn the target object, the background, and other non-target moving objects online. The proposed method performs tracking by simultaneously estimating the holistic position of the target object and the pixel-wise labels. By utilizing the information in the background and foreground models as well as the target model, our method obtains robust results even under background clutter and partial occlusion in complex scenes. Furthermore, our method produces pixel-wise results and uses them in the learning process to prevent drifting. We test the method extensively against seven representative trackers, both quantitatively and qualitatively, and it shows promising results.