2022
DOI: 10.3390/s22072633

RODFormer: High-Precision Design for Rotating Object Detection with Transformers

Abstract: To address Transformers' lack of a local spatial receptive field and the discontinuous boundary loss in rotating object detection, we propose a Transformer-based high-precision rotating object detection model (RODFormer). First, RODFormer uses a structured transformer architecture to collect feature information at different resolutions, widening the range of feature information collected. Second, a new feed-forward network (spatial-FFN) is constructed. Spatial-FFN fuses the local sp…

Cited by 12 publications (13 citation statements)
References 35 publications
“…To tackle the challenges posed by the lack of local spatial perceptual field and discontinuous boundary loss in transformer-based rotating target detection, Dai [133] proposes RODFormer, a high-precision model that uses a structured transformer architecture to collect feature information at different resolutions and improve the range of feature information collection. Additionally, a spatial-FFN feedforward network is designed to address the deficiency of FFN in local spatial modeling by fusing local spatial features of depth-separable convolution with the global channel features of multilayer perceptron.…”
Section: Problem Solved Optimization Strategies (mentioning)
confidence: 99%
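
The spatial-FFN idea described in the statement above can be illustrated with a minimal, dependency-free sketch. It is not the RODFormer implementation: the function names, the additive fusion of the two branches, the GELU activation, and the 3x3 kernel size are all assumptions made for illustration, and only the depthwise stage of a depthwise-separable convolution is shown (the MLP's linear layers stand in for the pointwise stage).

```python
import math


def linear(tokens, weight, bias):
    """Per-token linear layer: tokens [N][C_in], weight [C_out][C_in], bias [C_out]."""
    return [
        [sum(w * x for w, x in zip(row, tok)) + b for row, b in zip(weight, bias)]
        for tok in tokens
    ]


def gelu(v):
    """Exact GELU activation (an assumed choice of nonlinearity)."""
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))


def depthwise_conv3x3(tokens, height, width, kernels):
    """Zero-padded 3x3 depthwise convolution.

    tokens are laid out row-major on a height x width grid;
    kernels[c] is the 3x3 kernel applied to channel c only.
    """
    channels = len(tokens[0])
    out = [[0.0] * channels for _ in range(height * width)]
    for y in range(height):
        for x in range(width):
            for c in range(channels):
                acc = 0.0
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        yy, xx = y + dy, x + dx
                        if 0 <= yy < height and 0 <= xx < width:
                            acc += kernels[c][dy + 1][dx + 1] * tokens[yy * width + xx][c]
                out[y * width + x][c] = acc
    return out


def spatial_ffn(tokens, height, width, w1, b1, w2, b2, kernels):
    """Hypothetical spatial-FFN: MLP channel branch plus depthwise-conv spatial branch."""
    # Global channel branch: a two-layer MLP applied to each token independently.
    hidden = [[gelu(v) for v in tok] for tok in linear(tokens, w1, b1)]
    channel_feats = linear(hidden, w2, b2)
    # Local spatial branch: each channel mixes information from its 3x3 neighbourhood.
    spatial_feats = depthwise_conv3x3(tokens, height, width, kernels)
    # Fusion by element-wise addition (an assumption, not taken from the paper).
    return [[a + b for a, b in zip(ct, st)] for ct, st in zip(channel_feats, spatial_feats)]
```

The point of the sketch is the division of labour: the MLP branch never looks at neighbouring tokens, while the depthwise branch never mixes channels, so fusing them gives each output both kinds of context.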
“…The great achievements of transformer in the field of natural language processing have greatly encouraged researchers to explore the role of transformer in the field of computer vision. In recent two years, transformer structure and its variants have been successfully applied to visual tasks such as image classification [44], image captioning [45], object detection [46], and segmentation [47]. The Google team also analyzed the training of vision transformer and provided an effective guidance for future research on visual transformers [48].…”
Section: Transformer-based Architectures (mentioning)
confidence: 99%
“…Many works have been devoted to investigating high-precision vehicle detection methods by now [12, 13, 14, 15]. These works mainly focus on the two categories of vehicle detection technology based on traditional image processing technology or convolutional neural networks (CNN).…”
Section: Introduction (mentioning)
confidence: 99%