To shorten detection times and improve average precision on embedded devices, a lightweight, high-accuracy model is proposed for detecting passion fruit in complex environments (e.g., under backlighting, occlusion, overlap, or sunny, cloudy, or rainy conditions). First, the backbone network of YOLOv5 is replaced with the lightweight GhostNet, reducing the number of parameters and the computational complexity while increasing detection speed. Second, a new feature branch is added to the backbone network and the feature fusion layer in the neck network is reconstructed to combine lower- and higher-level features effectively, improving accuracy while keeping the model lightweight. Finally, knowledge distillation is used to transfer knowledge from a more capable teacher model to the smaller student model, significantly improving detection accuracy. The improved model is denoted G-YOLO-NK. Its average precision is 96.00%, 1.00 percentage point higher than that of the original YOLOv5s model; its model size is 7.14 MB, half that of the original model; and it achieves a real-time detection frame rate of 11.25 FPS on the Jetson Nano. The proposed model outperforms state-of-the-art models in average precision and detection performance. The present work provides an effective model for real-time detection of passion fruit in complex orchard scenes, offering valuable technical support for the development of orchard picking robots and improving the intelligence level of orchards.
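The abstract does not specify the distillation objective used to train G-YOLO-NK, so the following is only an illustrative sketch of the classic Hinton-style logit distillation that "transfer knowledge from the teacher model to the student model" commonly refers to. All names and hyperparameter values (`temperature`, `alpha`) are assumptions for illustration, not the paper's actual settings.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax; a higher temperature softens the
    # distribution so the teacher's "dark knowledge" is easier to match.
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, true_idx,
                      temperature=4.0, alpha=0.7):
    """Illustrative knowledge-distillation loss (not the paper's exact form).

    Combines a soft term, KL(teacher || student) on temperature-softened
    class distributions, with the usual hard cross-entropy against the
    ground-truth label index `true_idx`.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # Scale the soft term by T^2 to keep its gradient magnitude
    # comparable to the hard term, as in standard distillation recipes.
    soft = temperature ** 2 * sum(
        pt * math.log(pt / ps) for pt, ps in zip(p_teacher, p_student))
    hard = -math.log(softmax(student_logits)[true_idx])
    return alpha * soft + (1.0 - alpha) * hard
```

When student and teacher logits agree, the soft term vanishes and only the hard cross-entropy remains; as the student drifts from the teacher, the soft term grows, pulling the lightweight student toward the teacher's output distribution.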