Visual navigation on roads relies on accurate road models. This study proposes an improved scale-invariant feature transform (SIFT) algorithm for recovering depth information from farmland road images, with the aim of providing a reliable path for visual navigation. The mean image of the pixel values in five channels (R, G, B, S, and V) was treated as the inspected image, and its feature points were extracted with the Canny algorithm to locate them precisely and to ensure their uniformity and density. The mean pixel values in 5×5 neighborhoods around each feature point, taken in eight directions at 45° intervals, were then used as the feature vector, and the differences between feature vectors were calculated for preliminary matching of the feature points in the left and right images. To obtain the depth information of the farmland road images, the energy method of feature points was used to eliminate mismatched points. Experiments with a binocular stereo vision system showed that the matching accuracy and time consumption for depth recovery with the improved SIFT algorithm were 96.48% and 5.6 s, respectively, and that the depth recovery accuracy was within -7.17% to 2.97% over a certain sight distance. Across all 60 images taken under various climates and road conditions, the mean uniformity was 50%-70%, the time consumption was 5.0-6.5 s, and the matching accuracy was higher than 88%, indicating that the improved SIFT algorithm outperformed the conventional SIFT algorithm in obtaining feature points (e.g., in uniformity, matching accuracy, and real-time performance). This study provides an important reference for machine-vision-based navigation technology for agricultural equipment.
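
The descriptor and preliminary matching steps summarized above can be illustrated with a minimal NumPy sketch. It is an interpretation under stated assumptions, not the authors' implementation: all five channels are scaled to [0, 1] before averaging, each 5×5 window is centred a hypothetical `radius` pixels from the feature point, descriptors are compared by the sum of absolute differences, and depth is taken from the standard pinhole stereo relation Z = f·B/d. Canny feature extraction and the energy-based mismatch removal are not shown, and every function and parameter name is illustrative.

```python
import numpy as np

def five_channel_mean(rgb):
    """Average R, G, B, S and V into one image (assumes 8-bit RGB, channels scaled to [0, 1])."""
    rgb = rgb.astype(np.float64) / 255.0
    v = rgb.max(axis=2)                      # HSV value
    chroma = v - rgb.min(axis=2)
    s = np.where(v > 0, chroma / np.where(v > 0, v, 1.0), 0.0)  # HSV saturation
    return (rgb[..., 0] + rgb[..., 1] + rgb[..., 2] + s + v) / 5.0

# Unit steps (row, col) for the eight directions 0°, 45°, ..., 315°.
DIRECTIONS = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]

def descriptor(img, row, col, radius=5):
    """Eight means, one per direction, each over a 5x5 window `radius` pixels away.

    `radius` is an assumed parameter; the abstract does not state the offset.
    """
    pad = radius + 2                         # keeps every window inside the array
    padded = np.pad(img, pad, mode="edge")
    vec = []
    for dr, dc in DIRECTIONS:
        r = row + dr * radius + pad
        c = col + dc * radius + pad
        vec.append(padded[r - 2:r + 3, c - 2:c + 3].mean())
    return np.array(vec)

def match(desc_left, desc_right):
    """For each left descriptor, index of the right descriptor with the smallest
    sum of absolute differences (an assumed difference measure)."""
    diff = np.abs(desc_left[:, None, :] - desc_right[None, :, :]).sum(axis=2)
    return diff.argmin(axis=1)

def depth_from_disparity(x_left, x_right, focal_px, baseline_m):
    """Pinhole stereo depth Z = f * B / d; non-positive disparities map to infinity."""
    d = np.asarray(x_left, dtype=float) - np.asarray(x_right, dtype=float)
    depth = np.full(d.shape, np.inf)
    valid = d > 0
    depth[valid] = focal_px * baseline_m / d[valid]
    return depth
```

In this sketch the feature point coordinates are taken as given; in practice they would come from an edge detector applied to the five-channel mean image, and the preliminary matches would still need a mismatch-rejection step before depth is computed.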