Pests have caused significant losses to agriculture, greatly increasing the detection of pests in the planting process and the cost of pest management in the early stages. At this time, advances in computer vision and deep learning for the detection of pests appearing in the crop open the door to the application of target detection algorithms that can greatly improve the efficiency of tomato pest detection and play an important technical role in the realization of the intelligent planting of tomatoes. However, in the natural environment, tomato leaf pests are small in size, large in similarity, and large in environmental variability, and this type of situation can lead to greater detection difficulty. Aiming at the above problems, a network target detection model based on deep learning, YOLONDD, is proposed in this paper. Designing a new loss function, NMIoU (Normalized Wasserstein Distance with Mean Pairwise Distance Intersection over Union), which improves the ability of anomaly processing, improves the model’s ability to detect and identify objects of different scales, and improves the robustness to scale changes; Adding a Dynamic head (DyHead) with an attention mechanism will improve the detection ability of targets at different scales, reduce the number of computations and parameters, improve the accuracy of target detection, enhance the overall performance of the model, and accelerate the training process. Adding decoupled head to Head can effectively reduce the number of parameters and computational complexity and enhance the model’s generalization ability and robustness. The experimental results show that the average accuracy of YOLONDD can reach 90.1%, which is 3.33% higher than the original YOLOv5 algorithm and is better than SSD, Faster R-CNN, YOLOv7, YOLOv8, RetinaNet, and other target detection networks, and it can be more efficiently and accurately utilized in tomato leaf pest detection.