Street spaces in tourist-oriented traditional villages serve both the daily lives of villagers and the leisure activities of tourists. However, because their spatial characteristics are insufficiently understood and their spatial genes under-explored, these spaces often become homogenized during tourism development. Identifying the characteristics and connotations of such streets, understanding the relationship between tourists’ perceptions and built environment elements, and developing optimization strategies for these rural street spaces are therefore urgent issues. Many studies have evaluated street space characteristics from the perspective of tourists’ behavior, but few have focused on rural areas. In particular, research that uses new technologies such as artificial intelligence to study tourists’ psychological perceptions is still in its infancy. This study used a typical traditional village as a case study and applied the YOLOv5 deep-learning model to build a perception evaluation system based on three dimensions: tourists’ aggregation degree, stay time, and facial expressions. A multivariate regression analysis was then conducted on 21 factors across 4 aspects: street scale morphology, environmental facilities, ground-floor interface, and street business types. The results indicated that the functional business type of the scene had the greatest impact on tourists’ perception of the street space environment, followed by ground-floor features and environmental facilities. The regression coefficient between business in situ values and spatial perception was 0.47, highlighting it as a key factor influencing characteristic perception. Landscape water systems, flat ground-floor façades, and business diversity also positively affected tourists’ perception.
This study used the YOLOv5 object detection model, known for its speed and accuracy, to analyze tourists’ behavior and perceptions as feedback on, and an evaluation of, the village’s built environment. Empirical analysis of Yuanjia Village validated the effectiveness of the multidimensional approach and of spatial gene theory. The method identified 12 street characteristic factors that significantly affected tourists’ perceptions. The study’s distinctiveness lies in its comprehensive approach, which combines empirical research, spatial gene theory, and object detection technology to provide new insights for village spatial planning and construction.
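The analysis pipeline summarized above (three perception indicators regressed on street characteristic factors) can be illustrated with a minimal sketch. It is not the study's implementation: the abstract does not state how the three indicators were combined or how the regression was specified, so the equal-weight composite score, the min-max normalization, the standardized ordinary least squares fit, and all function and variable names below are assumptions made for illustration. The demonstration data are synthetic and do not reproduce any reported result.

```python
import numpy as np


def min_max(x):
    """Rescale a 1-D array to [0, 1]; a constant array maps to zeros."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    return np.zeros_like(x) if span == 0 else (x - x.min()) / span


def composite_perception(aggregation, stay_time, expression):
    """Equal-weight mean of the three normalized indicators.

    Equal weighting is an assumption; the source does not give the weights.
    """
    return (min_max(aggregation) + min_max(stay_time) + min_max(expression)) / 3.0


def standardized_ols(X, y):
    """Ordinary least squares on z-scored data; returns (betas, r_squared).

    After z-scoring, every column has zero mean, so no intercept is needed
    and the coefficients are directly comparable across factors.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xz = (X - X.mean(axis=0)) / X.std(axis=0)
    yz = (y - y.mean()) / y.std()
    betas, *_ = np.linalg.lstsq(Xz, yz, rcond=None)
    residuals = yz - Xz @ betas
    r_squared = 1.0 - (residuals @ residuals) / (yz @ yz)
    return betas, r_squared


if __name__ == "__main__":
    # Synthetic street segments with three hypothetical factors (not real data).
    rng = np.random.default_rng(0)
    n_segments = 40
    factor_names = ["business_in_situ", "water_landscape", "facade_flatness"]
    X = rng.random((n_segments, len(factor_names)))

    # Hypothetical indicators, as might be derived from per-frame detections.
    aggregation = 5 * X[:, 0] + 2 * X[:, 1] + rng.normal(0, 0.5, n_segments)
    stay_time = 3 * X[:, 0] + 1 * X[:, 2] + rng.normal(0, 0.5, n_segments)
    expression = 2 * X[:, 1] + 1 * X[:, 2] + rng.normal(0, 0.5, n_segments)

    score = composite_perception(aggregation, stay_time, expression)
    betas, r2 = standardized_ols(X, score)
    for name, beta in zip(factor_names, betas):
        print(f"{name}: standardized coefficient = {beta:.2f}")
    print(f"R^2 = {r2:.2f}")
```

Standardizing both the factors and the perception score puts the coefficients on a common scale, which is one common way to rank factors by relative influence; whether the reported coefficient of 0.47 is standardized in this sense is not stated in the source.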