2021
DOI: 10.1109/jstars.2021.3119654
|View full text |Cite
|
Sign up to set email alerts
|

STransFuse: Fusing Swin Transformer and Convolutional Neural Network for Remote Sensing Image Semantic Segmentation

Abstract: The applied research in remote sensing images has been pushed by convolutional neural network (CNN). Because of the fixed size of the perceptual field, CNN is unable to model global semantic relevance. Modeling global semantic information is possible with the self-attentive Transformer-based model. However, the method of patch computation used by Transformer for self-attentive computation ignores the spatial information inside each patch. To address these issues, we offer the STransFuse model as a new semantic… Show more

Help me understand this report
View preprint versions

Search citation statements

Order By: Relevance

Paper Sections

Select...
2
1
1

Citation Types

0
70
0

Year Published

2021
2021
2024
2024

Publication Types

Select...
8
2

Relationship

0
10

Authors

Journals

citations
Cited by 151 publications
(70 citation statements)
references
References 38 publications
0
70
0
Order By: Relevance
“…The models based on Swin Transformer architecture have demonstrated superior performance in computer vision fields such as image classification, target detection, and semantic segmentation [39]. In this paper, we proposed the LEG Transformer method to classify different fault states.…”
Section: Leg Transformer Methodsmentioning
confidence: 99%
“…The models based on Swin Transformer architecture have demonstrated superior performance in computer vision fields such as image classification, target detection, and semantic segmentation [39]. In this paper, we proposed the LEG Transformer method to classify different fault states.…”
Section: Leg Transformer Methodsmentioning
confidence: 99%
“…A self-supervised multitask representation learning method was designed to capture effective visual representations of remote sensing images in [22] for semantic segmentation. In [23], authors introduced the STransFuse model as a new semantic segmentation method for remote sensing images. Gao et al [24] proposed a novel unsupervised domain adaptive semantic segmentation method by selecting some classes from a source domain image and softly pasting the corresponding image patch on both source and target training images with a fusion weight.…”
Section: Related Workmentioning
confidence: 99%
“…However, most of this research has been conducted on high-resolution remote sensing images because of their high spatial information and the appropriate feature scale of the target. On the contrary, the scale of features contained in the medium-resolution remote sensing images varies greatly [33]. The poor performance of medium-resolution remote sensing image segmentation was due to its insufficient spatial feature information on the one hand [34], and many large scale features cannot be extracted in medium resolution due to the perceptual field limitation of CNNs.…”
Section: Introductionmentioning
confidence: 99%