2023
DOI: 10.3390/s23073634

Swin-Transformer-Based YOLOv5 for Small-Object Detection in Remote Sensing Images

Abstract: This study aimed to address the low detection accuracy and inaccurate localization of small objects in remote sensing images. An improved architecture based on the Swin Transformer and YOLOv5 is proposed. First, Complete-IoU (CIoU) was introduced to improve the K-means clustering algorithm, which was then used to generate anchors of appropriate size for the dataset. Second, a modified CSPDarknet53 structure combined with Swin Transformer was proposed to retain sufficient global context information…
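The abstract does not say how CIoU is built into the K-means step, so the following is only a minimal sketch under stated assumptions: boxes are clustered by width and height alone and are treated as sharing a centre (so the centre-distance term of CIoU is zero), the clustering distance is `1 - CIoU`, and each anchor is updated to the mean of its cluster. The names `ciou_wh` and `kmeans_anchors` are hypothetical and are not taken from the paper.

```python
import math
import random

def ciou_wh(box, anchor):
    """CIoU of two (w, h) boxes, ASSUMING they share the same centre.

    With a shared centre the centre-distance penalty is zero, leaving
    IoU minus the aspect-ratio penalty alpha * v.
    """
    w1, h1 = box
    w2, h2 = anchor
    inter = min(w1, w2) * min(h1, h2)
    union = w1 * h1 + w2 * h2 - inter
    iou = inter / union
    v = (4 / math.pi ** 2) * (math.atan(w2 / h2) - math.atan(w1 / h1)) ** 2
    alpha = v / (1 - iou + v) if v > 0 else 0.0
    return iou - alpha * v

def kmeans_anchors(boxes, k, iters=50, seed=0):
    """K-means over (w, h) pairs with distance 1 - CIoU (an assumption)."""
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for b in boxes:
            # Assign each box to the anchor with the smallest 1 - CIoU.
            best = min(range(k), key=lambda i: 1 - ciou_wh(b, anchors[i]))
            clusters[best].append(b)
        new_anchors = []
        for i, c in enumerate(clusters):
            if c:
                mean_w = sum(w for w, _ in c) / len(c)
                mean_h = sum(h for _, h in c) / len(c)
                new_anchors.append((mean_w, mean_h))
            else:
                # Keep the previous anchor if its cluster is empty.
                new_anchors.append(anchors[i])
        if new_anchors == anchors:
            break
        anchors = new_anchors
    # Return anchors ordered from smallest to largest area.
    return sorted(anchors, key=lambda a: a[0] * a[1])
```

For example, `kmeans_anchors([(10, 10), (12, 11), (100, 50), (110, 55)], k=2)` returns two `(w, h)` anchors ordered by area. The paper's actual procedure may differ in initialization, update rule, and how the centre term is handled.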

Cited by 24 publications (6 citation statements)
References 32 publications
“…Poor detection of small objects is one of the challenges in object detection tasks in the context of UAV aerial photography. In many existing works [39, 40, 41, 42], detection scales are added to the model to reduce the missed detection rate of small objects, which is an effective improvement method. However, this approach can complicate the structure of the model and increase the consumption of computational and storage resources.…”
Section: Methods (mentioning)
confidence: 99%
“…Previous BM detection studies have used a confidence threshold of 50% [9, 31, 32] or confidence thresholds ranging from 0.1 to 0.9 [11]; however, these approaches may not lead to optimal results for BM detection. Other object detection research has utilized the F1-score, which represents the harmonic mean of precision and recall, to determine the optimal confidence threshold [33, 34, 35]. As the recall is more important than the precision in BM detection, we introduced the F2-score, which emphasizes the importance of recall by assigning it twice the weight of precision, to determine the optimal confidence threshold.…”
Section: Methods (mentioning)
confidence: 99%
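The threshold selection described in the statement above can be sketched with the standard F-beta score, F_beta = (1 + beta^2) * P * R / (beta^2 * P + R), where beta = 2 gives the F2-score. This is only an illustration under simplifying assumptions: each candidate detection is taken to carry a confidence score and a 0/1 ground-truth label, whereas a real detection pipeline would first match predictions to ground-truth boxes. The function names and the sample data are hypothetical.

```python
def fbeta(precision, recall, beta=2.0):
    """Standard F-beta score; beta=2 weights recall more than precision."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

def best_threshold(scores, labels, thresholds, beta=2.0):
    """Return the (threshold, F-beta) pair with the highest F-beta.

    ASSUMPTION: labels[i] is 1 if candidate i is a true object, else 0,
    and a candidate is predicted positive when its score >= threshold.
    """
    best_t, best_f = None, -1.0
    for t in thresholds:
        pairs = list(zip(scores, labels))
        tp = sum(1 for s, y in pairs if s >= t and y == 1)
        fp = sum(1 for s, y in pairs if s >= t and y == 0)
        fn = sum(1 for s, y in pairs if s < t and y == 1)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f = fbeta(precision, recall, beta)
        if f > best_f:
            best_t, best_f = t, f
    return best_t, best_f

# Hypothetical scores and labels, for illustration only.
scores = [0.9, 0.8, 0.4, 0.3, 0.2]
labels = [1, 0, 1, 1, 0]
print(best_threshold(scores, labels, [0.1, 0.3, 0.5, 0.7]))
```

Because beta = 2 penalizes missed objects more heavily than false alarms, the selected threshold tends to be lower than the one an F1-based search would pick on the same data.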
“…Additionally, some methods [15] leverage global context information between detected instances and images to eliminate the reliance on anchor boxes and non-maximum suppression (NMS). Moreover, employing attention mechanisms to focus on the surrounding environment of detected instances [16,17] has also yielded promising results in small object detection.…”
Section: Related Work (mentioning)
confidence: 99%