2022
DOI: 10.3390/s22186993
FEA-Swin: Foreground Enhancement Attention Swin Transformer Network for Accurate UAV-Based Dense Object Detection

Abstract: UAV-based object detection has recently attracted considerable attention due to its diverse applications. Most existing convolutional neural network (CNN)-based object detection models perform well in common object detection cases. However, because objects in UAV images are spatially distributed in a very dense manner, these methods have limited performance for UAV-based object detection. In this paper, we propose a novel transformer-based object detection model to improve the accuracy of object de…

Cited by 16 publications (7 citation statements) · References 45 publications
“…The Swin Transformer [33] module consists of a multi-layer perceptron (MLP), Window Multi-head Self-Attention (WMSA), Shifted Window-based Multi-head Self-Attention (SWMSA), and Layer Normalization (LN). The workflow is shown in Figure 2.…”
Section: Related Work
confidence: 99%
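The statement above lists the components of a Swin Transformer block; what distinguishes WMSA from SWMSA is how tokens are grouped before attention. A minimal sketch in plain Python, assuming a toy 4×4 token grid and window size 2 (the paper's models use larger values): WMSA attends within fixed non-overlapping windows, while SWMSA first cyclically shifts the grid so information crosses window boundaries.

```python
# Sketch of the windowing that distinguishes W-MSA from SW-MSA in a
# Swin block: attention is computed inside M x M windows, and every
# other block cyclically shifts the feature map by M // 2 first so
# tokens from neighbouring windows end up attending to each other.
# Entries are token ids on an H x W grid; real features are C-dim vectors.

def window_partition(grid, M):
    """Split an H x W grid into non-overlapping M x M windows."""
    H, W = len(grid), len(grid[0])
    windows = []
    for i in range(0, H, M):
        for j in range(0, W, M):
            windows.append([row[j:j + M] for row in grid[i:i + M]])
    return windows

def cyclic_shift(grid, s):
    """Roll the grid up and left by s positions (torch.roll equivalent)."""
    shifted = [row[s:] + row[:s] for row in grid]
    return shifted[s:] + shifted[:s]

H = W = 4   # toy feature-map size (illustrative assumption)
M = 2       # window size (illustrative; Swin typically uses 7)
grid = [[r * W + c for c in range(W)] for r in range(H)]

wmsa_windows = window_partition(grid, M)                          # W-MSA
swmsa_windows = window_partition(cyclic_shift(grid, M // 2), M)   # SW-MSA
```

After the shift, the first window mixes tokens (5, 6, 9, 10) that belonged to four different W-MSA windows, which is exactly the cross-window information flow the shifted scheme provides.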
“…This innovation was named the Transformer Enhanced FPN (TEF) module. In another study, Xu et al. [48] developed a weighted Bidirectional Feature Pyramid Network (BiFPN) by integrating skip-connection operations with the Swin Transformer. This approach effectively preserved information pertinent to small objects.…”
Section: Fast Attention for High-Resolution or Multi-Scale Feature Maps
confidence: 99%
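The weighted BiFPN mentioned above blends feature maps of one resolution level with learnable non-negative weights. A minimal sketch of that fast normalized fusion rule, assuming flattened feature lists and illustrative weight values (not the paper's learned parameters):

```python
# Sketch of BiFPN-style fast normalized fusion:
#   out = sum(w_i * f_i) / (sum(w_i) + eps)
# Weights are kept non-negative with a ReLU so the result is a convex-like
# blend of the inputs. Features here are flat lists of floats.

EPS = 1e-4  # small constant for numerical stability

def fuse(features, weights):
    w = [max(0.0, wi) for wi in weights]   # ReLU keeps weights >= 0
    norm = sum(w) + EPS
    return [sum(wi * f[k] for wi, f in zip(w, features)) / norm
            for k in range(len(features[0]))]

top_down = [1.0, 2.0, 3.0]   # feature from the top-down pathway (toy values)
lateral  = [3.0, 2.0, 1.0]   # lateral / skip connection from the backbone
fused = fuse([top_down, lateral], [1.0, 1.0])   # equal weights -> average
```

With equal weights the fusion reduces to an average; training would learn to up-weight whichever input carries more small-object detail.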
“…Moreover, DESTR uses a mini-detector to ensure proper content-query initialization in the decoder and enhances the self-attention module. Another study [48] introduced FEA-Swin, which leverages advanced foreground…”
[Fig. 9: DAB-DETR improves Conditional DETR and utilizes dynamic anchor boxes to sequentially provide better reference query points and anchor sizes (figure from [67]).]
Section: Architecture and Block Modifications
confidence: 99%
“…(2) Influence of M-ASFF. In this section, we explore the impact of micro adaptive feature fusion (M-ASFF) on the model. Since the main goal of M-ASFF is to achieve adaptive fusion of features at different scales, we selected comparative models with different feature-fusion methods: YOLOv3 using only the FPN structure, the YOLOv5s source model using FPN+PANet, and a combination of the Swin Transformer with a weighted Bidirectional Feature Pyramid Network (TBIFPN) [31]. From Table 6, we can conclude that the YOLOv5s model with the addition of the M-ASFF module performs best on the rail surface defect dataset.…”
Section: Ablation Experiments
confidence: 99%
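The adaptive fusion the ablation refers to differs from BiFPN's global weights in that ASFF-style modules predict a weight per spatial location, normalized with a softmax so the scale weights sum to 1 everywhere. A toy sketch under that assumption, with 1-D features and hand-picked logits standing in for the learned weight branches:

```python
# Sketch of ASFF-style adaptive feature fusion: at each spatial
# position k, a softmax over per-level logits yields weights that
# sum to 1, and the (already resized) level features are blended.
# Logits are hand-picked here; a real module predicts them with convs.

import math

def softmax(xs):
    m = max(xs)                              # subtract max for stability
    e = [math.exp(x - m) for x in xs]
    s = sum(e)
    return [v / s for v in e]

def asff_fuse(levels, logits):
    """levels: equal-length feature lists, one per scale (already resized);
    logits: raw per-position scores, one list per scale."""
    out = []
    for k in range(len(levels[0])):
        w = softmax([lg[k] for lg in logits])           # weights sum to 1
        out.append(sum(wk * lv[k] for wk, lv in zip(w, levels)))
    return out

level_a = [1.0, 1.0]
level_b = [3.0, 3.0]
# position 0: equal logits -> average; position 1: level_a dominates
fused = asff_fuse([level_a, level_b], [[0.0, 10.0], [0.0, -10.0]])
```

Position 0 blends the levels equally (2.0), while position 1 almost entirely selects level_a (≈1.0), which is the per-location selectivity that fixed fusion weights cannot express.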
“…In [30], the researchers used the fuzzy C-means algorithm to re-cluster the anchor boxes based on YOLOv4 and added a shallow feature layer to address the occlusion of hanging insulators and power components. In [31], contextual information is integrated into the backbone of the Swin Transformer, and a skip-connected BiFPN is used to improve the detection of small objects.…”
Section: Introduction
confidence: 99%