To address the inefficiency and limited accuracy of manually identifying young citrus fruits during thinning, this study proposes a detection method named you only look once for complex backgrounds of young citrus fruits (YCCB-YOLO). A dataset of young citrus fruit images captured in a real orchard environment is first constructed. To improve detection accuracy while maintaining computational efficiency, the detection head and backbone network are reconstructed using a lightweight network based on pointwise convolution (PWConv), which reduces model complexity without degrading performance. In addition, a fusion attention mechanism is integrated to enhance the model's ability to detect young citrus fruits accurately against complex backgrounds. A simplified spatial pyramid pooling fast-large kernel separated attention (SimSPPF-LSKA) feature pyramid is then introduced to further enhance the multi-feature extraction capability of the model. Finally, the Adam optimization function is used to strengthen the nonlinear representation and feature extraction ability of the model. Experimental results show that the model achieves 91.79% precision (P), 92.75% recall (R), and 97.32% mean average precision (mAP) on the test set, improvements of 1.33%, 2.24%, and 1.73%, respectively, over the original model, with a model size of only 5.4 MB. The proposed method can meet the performance requirements for young citrus fruit identification and provides technical support for fruit thinning.
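As a rough illustration of why pointwise (1×1) convolution lowers model complexity, the following minimal sketch compares the parameter and multiply-accumulate counts of a standard 3×3 convolution with those of a pointwise convolution. The channel counts and feature-map size are hypothetical values chosen for illustration only; they are not taken from the YCCB-YOLO architecture, and the helper functions are not part of any library.

```python
def conv_params(c_in: int, c_out: int, k: int, bias: bool = False) -> int:
    """Number of learnable parameters in a k x k convolution layer."""
    params = c_in * c_out * k * k
    if bias:
        params += c_out  # one bias term per output channel
    return params


def conv_macs(c_in: int, c_out: int, k: int, h_out: int, w_out: int) -> int:
    """Multiply-accumulate operations for one forward pass (bias ignored)."""
    return h_out * w_out * c_in * c_out * k * k


# Hypothetical layer: 64 input channels, 128 output channels,
# 80 x 80 output feature map (assumed values, not from the paper).
C_IN, C_OUT, H, W = 64, 128, 80, 80

standard_params = conv_params(C_IN, C_OUT, k=3)
pointwise_params = conv_params(C_IN, C_OUT, k=1)

standard_macs = conv_macs(C_IN, C_OUT, 3, H, W)
pointwise_macs = conv_macs(C_IN, C_OUT, 1, H, W)

print(f"3x3 conv: {standard_params} params, {standard_macs} MACs")
print(f"1x1 conv: {pointwise_params} params, {pointwise_macs} MACs")
print(f"reduction factor: {standard_params / pointwise_params:.1f}x")
```

For equal channel counts, a 1×1 kernel carries k² = 9 times fewer weights than a 3×3 kernel, so a network that replaces part of its spatial convolutions with pointwise ones becomes smaller and cheaper to run. The overall saving in a full network depends on how the layers are combined.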