Current detection methods for grassland fires rely mainly on manual means, which are costly and inefficient and cannot readily provide real-time, full-coverage detection. Therefore, this study proposes the YOLOv5m-D model, together with static and dynamic feature recognition, to detect and identify smoke and flame throughout the entire process of a grassland fire, and verifies its effectiveness. In smoke recognition and detection, the experimental results showed that the YOLOv5m-D model converged slowly and only locally at a low learning rate, and that its mAP was 86.4% at a batch size of 16. In the comparison of mAP values under the optimal hyperparameters, Faster R-CNN reached 72.34%, SSD 75.90%, YOLOv5m 86.75%, and YOLOv5m-D 89.28%, the highest among the compared models. In flame recognition and detection, the Hu1 moment values of the ordinary image sequence and the infrared thermal imaging sequence mostly remained between about 0.3 and 0.7, with a few around 0.8, while the color tent values of both sequences were below 0.1. After infrared images were combined with flame static and dynamic feature recognition, the flames were essentially all recognized. In the comparison of single-feature recognition times, the recognition time for the current frame was lower than that for the reference frame for the vast majority of features across the different images. Overall, the proposed YOLOv5m-D model and the static and dynamic feature recognition are effective in detecting and identifying smoke and flame throughout the entire process of a grassland fire, and have high practical value for grassland fire detection.

INDEX TERMS YOLOv5m-D, Static and dynamic characteristics, Grassland fires, Smoke, Flame