Fine-grained monitoring and identification of fishing operations is of great significance and value for fishing vessels. To address the inaccurate statistics of current Engraulis japonicus fishing quotas and catch classification, this paper proposes an improved identification algorithm based on YOLOv5. The method introduces the SENet attention mechanism into the YOLOv5 backbone network, integrates target information from different periods of the fishing operation, reduces interference from complex backgrounds, and improves detection precision while maintaining real-time detection efficiency. Manually recorded videos of Engraulis japonicus fishing serve as the dataset; the videos are converted into images, which are then pre-labeled and processed. The 5550 images are divided into training, validation, and test sets in an 8:1:1 ratio. As a control experiment to verify validity, the YOLOv5 backbone network was replaced by MobileNetV2 and the SENet attention mechanism was introduced, and four models were implemented for comparison. The experimental results show that the proposed algorithm achieves a mean average precision (mAP) of 99.3%, a precision of 98.9%, and a recall of 98.7%, improvements of 1.4%, 1.7%, and 2.5%, respectively, over the original model. These results match expectations. For category-level statistics, the Kalman filter and the Hungarian matching method are used to count fishing baskets, the main category, with an accuracy of 96.5%. The threshold method applied to fishing nets and processing vessels achieves accuracies of 85.8% and 75%, respectively. These results show that this target detection research can provide new ideas for identifying Engraulis japonicus fishing operations and an auxiliary means for operation statistics.
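
As a concrete illustration of the 8:1:1 division, the following minimal sketch splits 5550 items into training, validation, and test subsets. The file names, the fixed random seed, and the per-image random shuffle are assumptions made for illustration only; the abstract does not state how the images were assigned to each subset.

```python
import random


def split_dataset(items, ratios=(8, 1, 1), seed=0):
    """Shuffle items and split them into train/val/test by integer ratios."""
    items = list(items)
    # Seeded shuffle so the split is reproducible (the seed is an assumption).
    random.Random(seed).shuffle(items)
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # any remainder goes to the test set
    return train, val, test


# Hypothetical file names standing in for the 5550 extracted video frames.
frames = [f"frame_{i:04d}.jpg" for i in range(5550)]
train, val, test = split_dataset(frames)
print(len(train), len(val), len(test))  # 4440 555 555
```

With 5550 images, an 8:1:1 ratio yields 4440 training, 555 validation, and 555 test images.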