2022
DOI: 10.1109/jstars.2022.3177235

A CNN-Transformer Network With Multiscale Context Aggregation for Fine-Grained Cropland Change Detection

Abstract: Non-agriculturalization incidents are serious threats to local agricultural ecosystem and global food security. Remote sensing change detection (CD) can provide an effective approach for in-time detection and prevention of such incidents. However, existing CD methods are difficult to deal with the large intra-class differences of cropland changes in high-resolution images (HRIs). In addition, traditional CNN based models are plagued by the loss of long-range context information, and the high computational comp…

Cited by 158 publications (80 citation statements)
References 47 publications (48 reference statements)

“…Recently, many ViT-based methods are proposed to extract robust and discriminative features to represent RS images during CD tasks. Li et al [34] proposed a CNN-Transformer network to fulfill efficient cropland CD results, where they combine the merits of CNN and transformer to fuse the multiscale context information. Zhang et al [35] designed a pure transformer network named SwinSUNet with a Siamese U-shaped structure to realize CD.…”
Section: B ViT-based CD Methods
Citation type: mentioning
confidence: 99%
“…For the purpose of demonstrating the superiority of STCD-Former, we choose five representative methods for comparison: SVM, M3D-DCNN [34], ViT-spectral, ViT-spatial, and SST-Former [33]. We ditto used 0.5% train samples to train these networks.…”
Section: B Experimental Setup
Citation type: mentioning
confidence: 99%
“…In addition, transformers [40], which were originally widely used in natural language processing, have also shown performance comparable to or even better than CNNs in the task of CD in recent years [26, 27, 30, 41, 42]. Chen et al [26] represented bitemporal images as semantic tokens and used transformers to model the context and feed it back into pixel space to obtain more refined original features. Bandara et al [27] proposed a Siamese network using a hierarchical transformer encoder combined with a multilayer perceptron (MLP) decoder for CD tasks that can more effectively represent multiscale remote details.…”
Section: Based On Deep Learning Methods
Citation type: mentioning
confidence: 99%
“…Figure 2 shows an illustration of this process. A similar approach is also described by Liu et al [55], who use the labels of the semantic change detection dataset HRSCD [34] to only select cropland changes as their binary labels.…”
Section: A Dataset and Few-shot Tasks
Citation type: mentioning
confidence: 99%