2022
DOI: 10.3390/rs14061488

CRTransSar: A Visual Transformer Based on Contextual Joint Representation Learning for SAR Ship Detection

Abstract: Synthetic-aperture radar (SAR) image target detection is widely used in military, civilian and other fields. However, existing detection methods have low accuracy due to the limitations presented by the strong scattering of SAR image targets, unclear edge contour information, multiple scales, strong sparseness, background interference, and other characteristics. In response, for SAR target detection tasks, this paper combines the global contextual information perception of transformers and the local feature re…

Cited by 91 publications (50 citation statements)
References 39 publications
“…After the recent appearance of the popular visual transformer (ViT), some transformer models have been introduced into SAR ship detection, considering the advantages of ViT in establishing long-range dependencies. Xia et al [36] proposed a ViT architecture, called CRTransSar, to enhance context learning by combining transformer and CNN. Li et al [37] introduced the Swin transformer as the backbone in cascade R-CNN to improve feature extraction ability and proposed a feature fusion module to optimize the feature fusion capability of the feature pyramid.…”
Section: The Problem (mentioning)
confidence: 99%
“…To minimize damage from maritime accidents, a control system that can urgently grasp the current situation and respond to accidents is needed. To this end, remote sensing platforms such as drones, helicopters, manned aircraft and satellites have attracted considerable attention as surveillance means and applied in various scenarios such as maritime debris detection (Choi, 2021), ship detection (Xia et al, 2022), and oil spill monitoring (Zhang et al, 2022). Drones, helicopters and manned aircraft can perform precise detection with high spatial resolution, but the coverage area is narrow as they fly at low altitude.…”
Section: Introduction (mentioning)
confidence: 99%
“…Deep learning has the advantages of high precision, speed, and ability to perform end-to-end target detection, and deep learning-based networks have been widely used in the field of SAR ship detection in recent years. Xia et al. [13] proposed the CRTransSar, which is a visual transformer framework based on contextual joint-representation learning. While enhancing the SAR target feature attributes, the CRTransSar can extract richer context feature information, enabling high accuracy, but its speed is limited.…”
Section: Introduction (mentioning)
confidence: 99%