Quantifying rat behavior from video is important in medicine, neuroscience, and other fields. In this paper, we address the challenging problem of estimating landmark points, such as the rat's eyes and joints, using image processing alone, and we use these estimates to quantify the rat's motion. First, we placed the rat on a special treadmill and captured its motion with a high-frame-rate camera. Second, we designed two structures for extracting image features: the cascade convolution network (CCN) and the cascade hourglass network (CHN). We used three coordinate calculation methods to locate the landmark points: fully connected regression (FCR), heatmap maximum position (HMP), and heatmap integral regression (HIR). Third, using a strict normalized evaluation criterion, we analyzed the accuracy of the different structures and coordinate calculation methods for rat landmark point estimation at various feature map sizes. The results showed that the CCN structure with the HIR method achieved the highest estimation accuracy, 75%, which is sufficient to track and quantify rat joint motion accurately.
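
As a rough illustration of how the two heatmap-based coordinate calculations differ, the NumPy sketch below contrasts HMP with HIR on a single heatmap. It is not the implementation used in this work: the function names, the synthetic Gaussian heatmap, and the sum-based normalization (which assumes a non-negative heatmap) are all assumptions made for illustration.

```python
import numpy as np


def heatmap_max_position(heatmap):
    """HMP sketch: (x, y) of the largest heatmap value, on the pixel grid."""
    row, col = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return float(col), float(row)


def heatmap_integral_regression(heatmap):
    """HIR sketch: expected (x, y) under the normalized heatmap.

    Assumes a non-negative heatmap with a positive sum (an assumption of
    this sketch, not something stated in the text).
    """
    h, w = heatmap.shape
    prob = heatmap / heatmap.sum()          # normalize to a probability map
    xs = np.arange(w)
    ys = np.arange(h)
    x = float((prob.sum(axis=0) * xs).sum())  # marginal over rows, then E[x]
    y = float((prob.sum(axis=1) * ys).sum())  # marginal over columns, then E[y]
    return x, y


def gaussian_heatmap(h, w, cx, cy, sigma=1.5):
    """Hypothetical test input: a Gaussian bump centered at (cx, cy)."""
    ys, xs = np.mgrid[0:h, 0:w]
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))


if __name__ == "__main__":
    hm = gaussian_heatmap(16, 16, cx=5.4, cy=9.7)
    print("HMP:", heatmap_max_position(hm))
    print("HIR:", heatmap_integral_regression(hm))
```

On this synthetic input, HMP snaps to the nearest grid cell, whereas HIR returns a sub-pixel estimate close to the true center. HIR is also differentiable with respect to the heatmap, while the argmax in HMP is not.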