To shorten detection time and improve average precision on embedded devices, a lightweight, high-accuracy model is proposed for detecting passion fruit in complex environments (backlight, occlusion, overlap, sunny, cloudy, rainy). First, the backbone network of YOLOv5 is replaced with a lightweight GhostNet model, which reduces the number of parameters and the amount of computation while improving detection speed. Second, a new feature branch is added to the GhostNet network, and the feature fusion layer in the neck network is reconstructed to combine lower-level and higher-level features effectively, which improves the accuracy of the model while keeping it lightweight. Finally, knowledge distillation is used to transfer knowledge from a more capable teacher model to a less capable student model, which significantly improves detection accuracy. The improved model is denoted G-YOLO-NK. The average accuracy of the G-YOLO-NK network is 96.00%, which is 1.00% higher than that of the original YOLOv5s model. Furthermore, the size of the improved model is 7.14 MB, half that of the original model, and its real-time detection frame rate on the Jetson Nano is 11.25 FPS. The proposed model outperforms state-of-the-art models in terms of average precision and detection performance. The present work provides an effective model for real-time detection of passion fruit in complex orchard scenes, which can provide valuable technical support for the development of orchard picking robots and greatly improve the intelligence level of orchards.
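
The abstract does not state which distillation loss is used, so the following is only an illustrative sketch of the general teacher-to-student idea. It assumes the classic soft-target formulation applied to a single classification output: the student is trained on a weighted sum of a temperature-softened KL term against the teacher and an ordinary cross-entropy term against the ground-truth label. The names `softmax`, `distillation_loss`, `temperature`, and `alpha` are hypothetical and are not taken from the paper. A detection model such as YOLOv5 would normally also need box-regression and objectness terms, which are omitted here.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Soft-target distillation loss for one sample (illustrative assumption).

    alpha weights the teacher-matching term; (1 - alpha) weights the
    ordinary cross-entropy against the ground-truth class index `label`.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)

    # KL(teacher || student) on the softened distributions.
    soft_term = sum(
        pt * math.log(pt / ps)
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0.0
    )

    # Cross-entropy of the student's unsoftened prediction with the true label.
    hard_term = -math.log(softmax(student_logits)[label])

    # The temperature**2 factor keeps the soft term's gradient scale
    # comparable to the hard term as the temperature changes.
    return alpha * (temperature ** 2) * soft_term + (1.0 - alpha) * hard_term


if __name__ == "__main__":
    teacher = [4.0, 1.0, 0.5]   # hypothetical teacher logits
    student = [2.0, 1.5, 0.2]   # hypothetical student logits
    print(distillation_loss(student, teacher, label=0))
```

In this sketch the soft term vanishes when the student reproduces the teacher's logits exactly, so the loss then reduces to the weighted cross-entropy alone.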