In gastrointestinal biopsy, online tracking and relocation of the region of interest are essential to the early diagnosis and surgical intervention of colorectal cancer. However, it is challenging for the examiner to track and retarget the optical biopsy site because of interfering factors such as violent rotation of the lens, illumination variation, shape deformation, and long-term loss of the target. Previous works may not handle these challenges effectively, owing to the complexity of the gastrointestinal environment and the limited availability of data. In this work, we construct an online tracking and relocation framework based on the concept of detection and tracking, tailored to the inherent characteristics of gastrointestinal biopsy images. To distinguish the target area from the rest of the gastrointestinal biopsy image, we design a new rotation-invariant Haar-like statistical descriptor that is robust to rotation and illumination changes. The descriptor is based on sector-ring differences computed over a circular sampling area. A simplified statistical random forest discriminator based on confidence statistics is proposed to perform the preliminary screening of potential tracking targets. To further estimate the location of the target, a supervised support vector machine is introduced to rank the candidate target regions. Based on the proposals of a Siamese network and the random forest, a location refinement fusion step is proposed to determine the location and confidence of the tracking area. Extensive experiments on various gastrointestinal videos, which consist of open-source and self-collected data, demonstrate that the proposed framework is superior to mainstream methods in accuracy and robustness.
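
To make the idea of a sector-ring difference descriptor concrete, the following is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not the descriptor defined in this work: the function names, the numbers of rings and sectors, the use of cell means, the DFT magnitude over the sector axis (used here to obtain invariance to cyclic sector shifts), and the final L2 normalization are all hypothetical choices made for the example.

```python
import numpy as np


def sector_ring_means(image, center, radius, n_rings=3, n_sectors=8):
    """Mean intensity of every (ring, sector) cell inside a circular area.

    Hypothetical helper: the partition into equal-width rings and
    equal-angle sectors is an assumption of this sketch.
    """
    h, w = image.shape
    cy, cx = center
    ys, xs = np.mgrid[0:h, 0:w]
    dy, dx = ys - cy, xs - cx
    r = np.sqrt(dy ** 2 + dx ** 2)
    theta = np.mod(np.arctan2(dy, dx), 2 * np.pi)
    inside = r < radius

    # Index of the ring and sector each pixel falls into.
    ring = np.minimum((r / radius * n_rings).astype(int), n_rings - 1)
    sector = np.minimum((theta / (2 * np.pi) * n_sectors).astype(int),
                        n_sectors - 1)
    cell = (ring * n_sectors + sector)[inside]

    n_cells = n_rings * n_sectors
    sums = np.bincount(cell, weights=image[inside].astype(float),
                       minlength=n_cells)
    counts = np.bincount(cell, minlength=n_cells)
    return (sums / np.maximum(counts, 1)).reshape(n_rings, n_sectors)


def sector_ring_descriptor(image, center, radius, n_rings=3, n_sectors=8):
    """Haar-like differences of cell means, made shift-invariant over sectors."""
    m = sector_ring_means(image, center, radius, n_rings, n_sectors)

    # Haar-like contrasts: angularly adjacent sectors (cyclic) and
    # radially adjacent rings. Differences cancel an additive offset.
    sector_diff = m - np.roll(m, 1, axis=1)
    ring_diff = m[1:] - m[:-1]

    # Rotating the patch by a multiple of the sector angle cyclically
    # shifts the sector axis; the DFT magnitude ignores such shifts.
    feats = np.concatenate([
        np.abs(np.fft.rfft(sector_diff, axis=1)).ravel(),
        np.abs(np.fft.rfft(ring_diff, axis=1)).ravel(),
    ])

    # L2 normalization removes a multiplicative gain in illumination.
    norm = np.linalg.norm(feats)
    return feats / norm if norm > 0 else feats
```

In this sketch the invariance is exact only for rotations by a multiple of the sector angle (for example, a 90-degree rotation with four sectors and a center placed between pixels); for other angles it holds only approximately, with the error depending on how finely the circle is divided and on pixel discretization.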