Indoor visual positioning is a key technology in a variety of indoor location services and applications. The particular spatial structures and environments of indoor spaces is a challenging scene for visual positioning. To address the existing problems of low positioning accuracy and low robustness, this paper proposes a precision single-image-based indoor visual positioning method for a smartphone. The proposed method includes three procedures: First, color sequence images of the indoor environment are collected in an experimental room, from which an indoor precise-positioning-feature database is produced, using a classic speed-up robust features (SURF) point matching strategy and the multi-image spatial forward intersection. Then, the relationships between the smartphone positioning image SURF feature points and object 3D points are obtained by an efficient similarity feature description retrieval method, in which a more reliable and correct matching point pair set is obtained, using a novel matching error elimination technology based on Hough transform voting. Finally, efficient perspective-n-point (EPnP) and bundle adjustment (BA) methods are used to calculate the intrinsic and extrinsic parameters of the positioning image, and the location of the smartphone is obtained as a result. Compared with the ground truth, results of the experiments indicate that the proposed approach can be used for indoor positioning, with an accuracy of approximately 10 cm. In addition, experiments show that the proposed method is more robust and efficient than the baseline method in a real scene. In the case where sufficient indoor textures are present, it has the potential to become a low-cost, precise, and highly available indoor positioning technology.