Identifying and tracking objects in surveillance videos is an important means of extracting information from surveillance footage. Currently, most object‐tracking methods rely only on image features, which cannot accurately express the motion of the tracked object in real geographical scenes and are easily disturbed by occlusion and by surrounding objects with similar features. However, tracked objects such as pedestrians and vehicles usually move through geographical space with fixed patterns of motion: over a short time their motion is constrained by geographical space and time, so the trajectory is predictable and the range of motion is limited. Therefore, this study introduces geographical spatiotemporal constraints into the SiamFC object‐tracking framework and proposes the GeoSiamFC method. The method has three objectives: (1) to construct the mapping relationship between geographical space and image space, because pixel movement in the image after perspective imaging cannot accurately express the motion of the tracked object in a real geographical scene; (2) to add candidate search areas at the predicted trajectory location to correct tracking errors caused by occlusion of the object; and (3) to restrict the search to the image‐space region that corresponds to the limited geographical range of motion of the tracked object within a given time, reducing interference from similar objects in the search area. Experiments were conducted with multiple tracking methods on a common test dataset that includes challenges such as occlusion and illumination changes. In addition, a robust test dataset, derived from the common test dataset by adding noise and adjusting luminance, was used.
The results show that, while using only shallow networks, GeoSiamFC outperforms most other object‐tracking algorithms on the common test dataset, with a precision score of 0.995 and a success score of 0.756. Moreover, GeoSiamFC maintained the highest precision score (0.970) and a high success score (0.734) on the more challenging robust test dataset. Its tracking speed of 59 frames per second (FPS) is well above the real‐time requirement of 25 FPS. Incorporating geographical spatiotemporal constraints improves tracker performance while providing real‐time feedback on the motion trajectory of the target in geographical space. Thus, the proposed method is suitable for real‐time tracking of the motion trajectory of a target in real geographical scenes in various surveillance videos.
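
The three geographical constraints described above can be illustrated with a minimal sketch. It is not the GeoSiamFC implementation. It assumes that the ground is planar, so that the geographical-to-image mapping can be written as a 3x3 homography, that the trajectory is predicted with a constant-velocity model, and that the reachable region is a circle of radius `max_speed * dt`. The function names and parameters (`apply_homography`, `predict_position`, `search_window`, `max_speed`, `dt`) are hypothetical and do not come from the paper.

```python
import math

def apply_homography(H, x, y):
    """Map a 2D point through a 3x3 homography given as nested lists.

    Assumption: a planar ground surface, so geographic and pixel
    coordinates are related by a single projective transform.
    """
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    return u / w, v / w

def predict_position(prev_xy, curr_xy):
    """Constant-velocity extrapolation of the next geographic position.

    Assumption: equal time steps between frames. The paper's actual
    trajectory predictor is not specified in the abstract.
    """
    return (2 * curr_xy[0] - prev_xy[0], 2 * curr_xy[1] - prev_xy[1])

def search_window(H_geo_to_img, geo_xy, max_speed, dt, samples=16):
    """Pixel bounding box of the ground positions reachable within dt.

    The reachable region is modeled as a circle of radius max_speed * dt
    around geo_xy (hypothetical choice). Points sampled on the circle are
    projected into the image and their bounding box is returned as
    (u_min, v_min, u_max, v_max).
    """
    radius = max_speed * dt
    gx, gy = geo_xy
    projected = []
    for k in range(samples):
        angle = 2 * math.pi * k / samples
        px = gx + radius * math.cos(angle)
        py = gy + radius * math.sin(angle)
        projected.append(apply_homography(H_geo_to_img, px, py))
    us = [p[0] for p in projected]
    vs = [p[1] for p in projected]
    return min(us), min(vs), max(us), max(vs)

# Illustrative values only: a made-up homography with a mild perspective term.
H = [[50.0, 0.0, 320.0],
     [0.0, 50.0, 240.0],
     [0.0, 0.02, 1.0]]
predicted = predict_position((1.0, 2.0), (1.5, 2.4))       # candidate location
window = search_window(H, predicted, max_speed=2.0, dt=0.04)  # one frame at 25 FPS
```

In this sketch the predicted location supplies an additional candidate search area when the target is occluded, and the projected window bounds where the tracker looks, so similar objects outside the physically reachable region are ignored.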