2024
DOI: 10.1109/jstars.2023.3326967

Swin Transformer Embedding Dual-Stream for Semantic Segmentation of Remote Sensing Imagery

Xuanyu Zhou,
Lifan Zhou,
Shengrong Gong
et al.

Abstract: The acquisition of global context and boundary information is crucial for the semantic segmentation of remote sensing (RS) images. In contrast to convolutional neural networks (CNNs), transformers exhibit superior performance in global modeling and shape feature encoding, which provides a novel avenue for obtaining global context and boundary information. However, current methods fail to effectively leverage these distinctive advantages of transformers. To address this issue, we propose a novel single encoder …

Citations: cited by 12 publications (2 citation statements)
References: 60 publications
“…The output of one block is used as the input for the next, allowing the model to capture global dependencies across patches Zhou et al. (39). Finally, following hierarchical processing, a classification head is attached to the Swin Transformer to predict class labels for image classification tasks.…”
Section: Methods (mentioning)
confidence: 99%
“…Remote sensing images have been widely applied in object detection [1,41], image semantic segmentation [42], image classification [43], and change detection [44]. However, the captured remote sensing images under diverse light conditions, such as overexposure and underexposure, often suffer from low dynamics and noise [45,46].…”
Section: Remote Sensing Image Enhancement (mentioning)
confidence: 99%