In recent years, transformers have been introduced into remote sensing image change detection (CD) because of their excellent global context modeling. However, the global self-attention used by transformers is insensitive to local high-frequency information, which makes complex CD problems hard to solve. Some methods therefore combine convolutional neural networks and transformers to jointly harvest local and global features, but they pay little attention to the interaction between the features the two components extract. To address the difficulty existing CD methods have in balancing local and global features, as well as their inadequacy in complex scenes, we propose a frequency-driven transformer network (FDTNet) that improves both the self-attention mechanism and the overall architecture. The network first extracts primary and deep features, then uses a transformer encoder-decoder to derive context embeddings with spatiotemporal information from the primary features, which guide the subsequent processing of the deep features. In the transformer encoder, we introduce a frequency-driven attention module comprising a low-frequency attention (LFA) branch, a high-frequency attention (HFA) branch, and local window self-attention: LFA captures global dependencies, HFA handles important high-frequency information, and local window self-attention compensates for the loss of local detail. In the transformer decoder, an interactive attention module integrates context information from the encoder into the deep features. In addition, we propose an edge enhancement module, which strengthens boundary features using the Sobel operator, and a gate-controlled channel exchange operation, which swaps feature channels to obtain richer perspective information. Experimental results show that FDTNet achieves an F1 score of 90.95% on LEVIR-CD, 82.70% on NJDS, and 79.84% on SYSU, outperforming several state-of-the-art CD methods.
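To make the edge enhancement idea concrete, here is a minimal PyTorch sketch, not the paper's implementation: it applies fixed depthwise Sobel kernels to the feature map and adds the gradient magnitude back through a 1x1 convolution. The module name, the depthwise application, and the residual fusion are assumptions; only the use of the Sobel operator to strengthen boundary features comes from the abstract.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EdgeEnhancement(nn.Module):
    """Hypothetical Sobel-based edge enhancement: fixed horizontal and
    vertical Sobel kernels extract per-channel gradients, and the gradient
    magnitude is fused back onto the input to sharpen boundaries."""

    def __init__(self, channels: int):
        super().__init__()
        sobel_x = torch.tensor([[-1., 0., 1.],
                                [-2., 0., 2.],
                                [-1., 0., 1.]])
        sobel_y = sobel_x.t()
        # One fixed 3x3 kernel per channel (depthwise), not learned.
        self.register_buffer("kx", sobel_x.expand(channels, 1, 3, 3).clone())
        self.register_buffer("ky", sobel_y.expand(channels, 1, 3, 3).clone())
        self.channels = channels
        self.fuse = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        gx = F.conv2d(x, self.kx, padding=1, groups=self.channels)
        gy = F.conv2d(x, self.ky, padding=1, groups=self.channels)
        edge = torch.sqrt(gx ** 2 + gy ** 2 + 1e-6)  # gradient magnitude
        return x + self.fuse(edge)                   # residual enhancement
```

Fusing by residual addition keeps the enhanced features the same shape as the input, so such a module could sit anywhere in the feature pipeline.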
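The gate-controlled channel exchange can be sketched in the same spirit. The version below uses a soft, differentiable per-channel gate to mix channels between the two temporal branches; the abstract only says channels are swapped under a gate, so the parameterization and the soft mixing are assumptions.

```python
import torch
import torch.nn as nn

class GatedChannelExchange(nn.Module):
    """Hypothetical gate-controlled channel exchange between bitemporal
    features: a learned per-channel gate decides, softly, how much each
    channel is exchanged between the two temporal branches."""

    def __init__(self, channels: int):
        super().__init__()
        # Zero init gives sigmoid(0) = 0.5; a real design might start lower.
        self.gate = nn.Parameter(torch.zeros(channels))

    def forward(self, x1: torch.Tensor, x2: torch.Tensor):
        g = torch.sigmoid(self.gate).view(1, -1, 1, 1)  # exchange ratio per channel
        y1 = (1 - g) * x1 + g * x2                      # branch 1 receives branch 2's channels
        y2 = (1 - g) * x2 + g * x1                      # and vice versa
        return y1, y2
```

A hard threshold on the gate would swap channels outright, but it is not differentiable; the soft mix above trains end-to-end without a straight-through estimator.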
In remote sensing image (RSI) change detection (CD), existing methods often struggle to balance local and global features and to adapt to complex scenes. We therefore propose a bidirectional-enhanced transformer network to address these issues. In the encoder, we introduce a bidirectional-enhanced attention operation that encodes information both horizontally and vertically, together with deep convolution to strengthen local contextual connections, reducing computational complexity while improving the network's perception of global and local information. In the feature fusion stage, we propose a channel weighting fusion module that recalibrates channel-wise features to suppress noise and enhance semantic relevance. We evaluated the proposed method on two publicly available RSI CD datasets, LEVIR-CD and DSIFN-CD. Experimental results show that our model outperforms several state-of-the-art CD methods: one convolution-based, three attention-based, and three transformer-based.
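One plausible reading of the channel weighting fusion module is a squeeze-and-excitation style recalibration over the concatenated bitemporal features. The sketch below assumes that design; the reduction ratio, the 1x1 projection, and the module name are illustrative choices, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class ChannelWeightingFusion(nn.Module):
    """Hypothetical channel weighting fusion: concatenates bitemporal
    features, reweights channels with a squeeze-and-excitation style gate,
    then projects back to the original channel width."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.mlp = nn.Sequential(
            nn.Conv2d(2 * channels, 2 * channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(2 * channels // reduction, 2 * channels, kernel_size=1),
            nn.Sigmoid(),
        )
        self.proj = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, x1: torch.Tensor, x2: torch.Tensor) -> torch.Tensor:
        x = torch.cat([x1, x2], dim=1)      # fuse the two temporal branches
        w = self.mlp(self.pool(x))          # per-channel weights in (0, 1)
        return self.proj(x * w)             # recalibrated, fused features
```

Because the weights are computed from global average pooling, channels carrying mostly noise receive small weights, which matches the stated goal of suppressing noise while enhancing semantic relevance.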