Wild animal monitoring is of great significance for discovering populations and for research on animal behavior and habits. Early wild animal monitoring relied mainly on human effort, which is time-consuming and poses safety risks. In recent years, with the continuous development of pattern recognition techniques, automated wildlife detection algorithms based on image content analysis have also made progress. However, due to the complexity of field scenes, the recognition accuracy and robustness of existing methods cannot meet the requirements of practical applications. Based on these considerations, we propose a field animal detection method based on YOLOv5, which aims to localize and recognize wild animals. We analyze in detail how recognition accuracy changes across different scenes, especially scenes containing multiple targets, small targets, or occluded targets. Extensive experiments verify the feasibility of the method.