2021
DOI: 10.3390/rs13224518

Memory-Augmented Transformer for Remote Sensing Image Semantic Segmentation

Abstract: The semantic segmentation of remote sensing images requires distinguishing local regions of different classes and exploiting a uniform global representation of the same-class instances. Such requirements make it necessary for the segmentation methods to extract discriminative local features between different classes and to explore representative features for all instances of a given class. While common deep convolutional neural networks (DCNNs) can effectively focus on local features, they are limited by their…

Cited by 12 publications (5 citation statements)
References 58 publications
“…Recently, many transformer-based methods were proposed to learn better feature representation. In [32], Swin Transformers [33] were introduced as the backbone to extract the context information, and a memory-augmented transformer [34] was proposed to model both the local and global feature representation. Beyond bandwise representations in classic transformers, SpectralFormer [35] was developed to learn spectrally local sequence representation.…”
Section: A Semantic Segmentation In Remote Sensing (mentioning)
confidence: 99%
“…TransUNet combines U-Net and Transformer to capture local and global features for medical image segmentation, and achieves excellent segmentation performance [18]. To model local and global context, nn-Former [19] exploited the combination of interleaved convolution and self-attention operations within the encoder and decoder for volumetric medical image segmentation. Furthermore, MISSFormer [20] embedded depth-wise convolution into the transformer block for capturing local and global dependencies.…”
Section: Introduction (mentioning)
confidence: 99%
“…However, this method requires high accuracy of false labels and is prone to receive interference from outliers. In addition, there are also some algorithms for SAR image classification with small samples using techniques in other branches of machine learning, such as generative admissible network [8,9], transfer learning [10,11], recurrent neural network [12], graph neural network [13] etc.…”
Section: Introduction (mentioning)
confidence: 99%