2022
DOI: 10.3390/rs14215415
Pixel Representation Augmented through Cross-Attention for High-Resolution Remote Sensing Imagery Segmentation

Abstract: Natural imagery segmentation has been transferred to land cover classification in remote sensing imagery with excellent performance. However, two key issues have been overlooked in the transfer process: (1) some objects were easily overwhelmed by the complex backgrounds; (2) interclass information for indistinguishable classes was not fully utilized. The attention mechanism in the transformer is capable of modeling long-range dependencies on each sample for per-pixel context extraction. Notably, per-pixel cont…
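The abstract describes augmenting per-pixel representations via cross-attention, where pixel features act as queries against a set of context vectors. The paper's exact formulation is not shown here, so the following is a minimal sketch of generic scaled dot-product cross-attention; all names and shapes (`pixel_feats`, `context_feats`, 16 pixels, 4 context tokens, 8 channels) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(pixel_feats, context_feats, d_k=None):
    """Augment per-pixel features (queries) with context features (keys/values).

    pixel_feats:   (N_pix, C) array, one row per pixel
    context_feats: (N_ctx, C) array, e.g. class or region representations
    """
    d_k = d_k or pixel_feats.shape[-1]
    scores = pixel_feats @ context_feats.T / np.sqrt(d_k)  # (N_pix, N_ctx)
    weights = softmax(scores, axis=-1)                     # rows sum to 1
    return weights @ context_feats                         # (N_pix, C)

# hypothetical shapes: 16 pixels, 4 context tokens, 8 channels
pix = np.random.randn(16, 8)
ctx = np.random.randn(4, 8)
out = cross_attention(pix, ctx)
print(out.shape)  # (16, 8)
```

Each output row is a convex combination of the context vectors, which is what lets distant (long-range) context flow into every pixel's representation regardless of spatial position.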


Cited by 4 publications (5 citation statements)
References 43 publications
“…To further validate the generalisation of the SABNet network, we conducted a series of comparison experiments using the GID-15 dataset with 15 classes and seven other advanced land cover classification methods, namely: BiseNet [22], PSPNet [17], Segformer [41], UNet [16], Deeplabv3+ [26], AMFFNet [48], HPSNet [53] and CAFHNet [54], and calculated three quantitative metrics as well as the number of parameters and computation of the model for the different network experimental results.…”
Section: G. Comparison With Other Advanced Networks on the GID-15 Dataset
confidence: 99%
“…During the forward propagation process of the CNN, the receptive field increases continuously with the convolution and pooling operations. Multiscale features from channel and spatial can be captured by fusing the features from CNN's different stages (Zheng et al, 2020a;Li Z. et al, 2021;Liu B. et al, 2022;Luo et al, 2022;Wang et al, 2022b;Zhao et al, 2022;Zheng et al, 2022).…”
Section: Hierarchical Structure
confidence: 99%
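The statement above describes fusing features from different CNN stages, whose spatial resolutions shrink as the receptive field grows. A minimal sketch of one common fusion pattern (upsample every stage to the finest resolution, then concatenate channels) follows; the function name, nearest-neighbour upsampling, and the divisibility assumption are illustrative choices, not the specific method of any cited work.

```python
import numpy as np

def fuse_multiscale(features):
    """Fuse CNN stage outputs of shape (C_i, H_i, W_i) by upsampling each
    to the finest stage's resolution and concatenating along channels.
    Assumes the target H, W are integer multiples of every stage's H_i, W_i."""
    target_h, target_w = features[0].shape[1:]
    upsampled = []
    for f in features:
        _, h, w = f.shape
        # nearest-neighbour upsampling via index repetition
        f_up = f.repeat(target_h // h, axis=1).repeat(target_w // w, axis=2)
        upsampled.append(f_up)
    return np.concatenate(upsampled, axis=0)

# hypothetical three-stage pyramid: channels grow as resolution halves
stages = [np.random.randn(8, 32, 32),
          np.random.randn(16, 16, 16),
          np.random.randn(32, 8, 8)]
fused = fuse_multiscale(stages)
print(fused.shape)  # (56, 32, 32)
```

The fused tensor carries both fine spatial detail (from early stages) and large-receptive-field semantics (from late stages) at every pixel location.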
“…It learned scale-invariant and small-object context information. Liu B. et al (2022) designed a method that can efficiently extract different scale features and generate maps, which helps to subdivide objects into small and different sizes. Luo et al (2022) extracted categorical object representations from multi-scale pixel features.…”
confidence: 99%
“…However, this can be tackled by hybrid approaches: introducing encoder-decoder style semantic segmentation models, leveraging existing deep learning backbones [70], and exploring diverse data settings and parameters in experimentation [124]. Other methods of structural enhancement include architectural modifications through the integration of attention mechanisms, transformer architectures, module fusion, and multi-scale feature fusion [125,126]. An example is the SCOCNN framework [127], which addresses the limitations faced by CNNs through module integration: a module for semantic segmentation, a module for superpixel optimization, and a module for fusion.…”
confidence: 99%