With the advancement of computer vision technology, the demand for fruit recognition in agricultural automation is increasing. To improve the accuracy and efficiency of recognizing young red pears, this study proposes an improved model based on the lightweight YOLOv9s, termed YOLOv9s-Pear. By constructing a feature-rich and diverse image dataset of young red pears and introducing spatial-channel decoupled downsampling (SCDown), C2FUIBELAN, and the YOLOv10 detection head (v10detect) modules, the YOLOv9s model was enhanced to achieve efficient recognition of small targets in resource-constrained agricultural environments. Images of young red pears were captured at different times and locations and underwent preprocessing to establish a high-quality dataset. For model improvements, this study integrated the general inverted bottleneck blocks from C2f and MobileNetV4 with the RepNCSPELAN4 module from the YOLOv9s model to form the new C2FUIBELAN module, enhancing the model’s accuracy and training speed for small-scale object detection. Additionally, the SCDown and v10detect modules replaced the original AConv and detection head structures of the YOLOv9s model, further improving performance. The experimental results demonstrated that the YOLOv9s-Pear model achieved high detection accuracy in recognizing young red pears, while reducing computational costs and parameters. The detection accuracy, recall, mean precision, and extended mean precision were 0.971, 0.970, 0.991, and 0.848, respectively. These results confirm the efficiency of the SCDown, C2FUIBELAN, and v10detect modules in young red pear recognition tasks. The findings of this study not only provide a fast and accurate technique for recognizing young red pears but also offer a reference for detecting young fruits of other fruit trees, significantly contributing to the advancement of agricultural automation technology.