To date, general-purpose object-detection methods have achieved a great deal. However, challenges such as degraded image quality, complex backgrounds, and the detection of marine organisms at different scales arise when identifying underwater organisms. To solve such problems and further improve the accuracy of relevant models, this study proposes a marine biological object-detection architecture based on an improved YOLOv5 framework. First, the backbone framework of Real-Time Models for object Detection (RTMDet) is introduced. The core module, Cross-Stage Partial Layer (CSPLayer), includes a large convolution kernel, which allows the detection network to precisely capture contextual information more comprehensively. Furthermore, a common convolution layer is added to the stem layer, to extract more valuable information from the images efficiently. Then, the BoT3 module with the multi-head self-attention (MHSA) mechanism is added into the neck module of YOLOv5, such that the detection network has a better effect in scenes with dense targets and the detection accuracy is further improved. The introduction of the BoT3 module represents a key innovation of this paper. Finally, union dataset augmentation (UDA) is performed on the training set using the Minimal Color Loss and Locally Adaptive Contrast Enhancement (MLLE) image augmentation method, and the result is used as the input to the improved YOLOv5 framework. Experiments on the underwater datasets URPC2019 and URPC2020 show that the proposed framework not only alleviates the interference of underwater image degradation, but also makes the mAP@0.5 reach 79.8% and 79.4% and improves the mAP@0.5 by 3.8% and 1.1%, respectively, when compared with the original YOLOv8 on URPC2019 and URPC2020, demonstrating that the proposed framework presents superior performance for the high-precision detection of marine organisms.
Underwater target detection is a critical task in various applications, including environmental monitoring, underwater exploration, and marine resource management. As the demand for underwater observation and exploitation continues to grow, there is a greater need for reliable and efficient methods of detecting underwater targets. However, the unique underwater environment often leads to significant degradation of the image quality, which results in reduced detection accuracy. This paper proposes an improved YOLOv5 underwater-target-detection network to enhance accuracy and reduce missed detection. First, we added the global attention mechanism (GAM) to the backbone network, which could retain the channel and spatial information to a greater extent and strengthen cross-dimensional interaction so as to improve the ability of the backbone network to extract features. Then, we introduced the fusion block based on DAMO-YOLO for the neck, which enhanced the system’s ability to extract features at different scales. Finally, we used the SIoU loss to measure the degree of matching between the target box and the regression box, which accelerated the convergence and improved the accuracy. The results obtained from experiments on the URPC2019 dataset revealed that our model achieved an mAP@0.5 score of 80.2%, representing a 1.8% and 2.3% increase in performance compared to YOLOv7 and YOLOv8, respectively, which means our method achieved state-of-the-art (SOTA) performance. Moreover, additional evaluations on the MS COCO dataset indicated that our model’s mAP@0.5:0.95 reached 51.0%, surpassing advanced methods such as ViDT and RF-Next, demonstrating the versatility of our enhanced model architecture.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.