A CBAM Based Multiscale Transformer Fusion Approach for Remote Sensing Image Change Detection

Wang, Wei; Zhang, Peng; Wang, Xin

doi:10.1109/jstars.2022.3198517

Cited by 89 publications

(34 citation statements)

References 39 publications

“…STransUNet [66] combined transformer and UNet architectures, which can capture shallow detail features and model global context in high-level features. In order to capture the spatial and channel information of feature maps, MTCNet [67] divides CBAM into a SAM and a CAM, which are applied to the front-end and back-end of the multi-scale transformer, respectively. However, these methods do not consider the fusion of multi-scale tokens when using transformers to model the long-range context information of images.…”

Section: B. Transformer-based Methods (mentioning)

confidence: 99%

MDFENet: A Multiscale Difference Feature Enhancement Network for Remote Sensing Change Detection

Liu et al. 2023

IEEE J. Sel. Top. Appl. Earth Observations Remote Sensing

The main task of remote sensing change detection (CD) is to identify object differences in bitemporal remote sensing images. In recent years, methods based on deep convolutional neural networks (CNNs) have made great progress in remote sensing CD. However, due to illumination changes and seasonal changes in the images acquired by the same sensor, the problem of "pseudo change" in the change map is still difficult to solve. In this article, in order to reduce "pseudo changes", we propose a multi-scale difference feature enhancement network (MDFENet) to extract the most discriminative features from bitemporal remote sensing images. MDFENet contains three procedures: first, multi-scale bitemporal features are generated by a shared weighted Siamese encoder. Then features of each scale are fed into a difference enhancement module to generate refined difference features. Finally, they are combined and reconstructed by a decoder to generate a change map. The difference enhancement module includes multiple layers of difference enhancement (DE) encoder and transformer decoder. They are applied to features of different scales to establish long-range relationships of pixel semantic changes, while high-level difference features participate in the generation of low-level difference features to enhance information transmission among features of different scales, reducing "pseudo changes". Compared with state-of-the-art methods, the proposed method achieved the best performance on two datasets, with F1 of 81.15% on the SYSU-CD dataset and 90.85% on the LEVIR-CD dataset.

“…Convolution Block Attention Module (CBAM) [30] is a lightweight and versatile attention module with structural diagram in the lower left corner of the Fig. 4.…”

Section: B. Attention Mechanism (mentioning)

confidence: 99%

Improved YOLOX-DeepSORT for Multitarget Detection and Tracking of Automated Port RTG

YU, ZHENG, YANG et al. 2024

IEEE Open J. Ind. Electron. Soc.

Rubber Tire Gantry (RTG) plays a pivotal role in facilitating efficient container handling within port operations. Conventional RTG, highly depending on human operations, is inefficient, labor-intensive and also poses safety issues in adverse environments. This paper introduces a multi-target detection and tracking (MTDT) algorithm specifically tailored for automated port RTG operations. The approach seamlessly integrates enhanced YOLOX for object detection and improved DeepSORT for object tracking to enhance MTDT performance in the complex port settings. In particular, Light-YoloX, an upgraded version of YOLOX incorporating separable convolution and attention mechanism, is introduced to improve real-time capability and small target detection. Subsequently, OSNet-DeepSORT, an enhanced version of DeepSORT, is proposed to mitigate ID switching challenges arising from unreliable data communication or occlusion in real port scenarios. The effectiveness of the proposed method is validated in various real-life port operations. Ablation studies and comparative experiments against typical MTDT algorithms demonstrate noteworthy enhancements in key performance metrics, encompassing small target detection, tracking accuracy, ID switching frequency, and realtime performance.

“…The CBAM attention mechanism is a typical hybrid attention mechanism that sequentially applies channel attention mechanism (CAM) and spatial attention mechanism (SAM) modules. Compared to using channel attention or spatial attention independently, CBAM can achieve better results [26]. As illustrated in Figure 10, the CBAM attention mechanism takes a given intermediate feature map F ∈ R^{C×H×W} as input.…”

Section: Soft-pooling and Multi-scale Convolution CBAM (mentioning)

confidence: 99%

SC-YOLOv8 Network with Soft-Pooling and Attention for Elevator Passenger Detection

Wang, Chen, et al. 2024

Applied Sciences

This paper concentrates on the elevator passenger detection task, a pivotal element for subsequent elevator passenger tracking and behavior recognition, crucial for ensuring passenger safety. To enhance the accuracy of detecting passenger positions inside elevators, we improved the YOLOv8 network and proposed the SC-YOLOv8 elevator passenger detection network with soft-pooling and attention mechanisms. The main improvements in this paper encompass the following aspects: Firstly, we transformed the convolution module (ConvModule) of the YOLOv8 backbone network by introducing spatial and channel reconstruction convolution (SCConv). This improvement aims to reduce spatial and channel redundancy in the feature extraction process of the backbone network, thereby improving the overall efficiency and performance of the detection network. Secondly, we propose a dual-branch SPP-Fast module by incorporating a soft-pooling branch into the YOLOv8 network’s SPP-Fast module. This dual-branch SPP-Fast module can preserve essential information while reducing the impact of noise. Finally, we propose a soft-pooling and multi-scale convolution CBAM module to further enhance the network’s performance. This module enhances the network’s focus on key regions, allowing for more targeted feature extraction, thereby further improving the accuracy of object detection. Additionally, the attention module enhances the network’s robustness in handling complex backgrounds. We conducted experiments on an elevator passenger dataset. The results show that the precision, recall, and mAP of our improved YOLOv8 network are 94.32%, 91.17%, and 92.95%, respectively, all surpassing those of the original YOLOv8 network.
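The citation statement above describes CBAM as applying channel attention and then spatial attention to a feature map F ∈ R^{C×H×W}. The following is a minimal NumPy sketch of that sequence, assuming the commonly described CBAM layout (a shared two-layer MLP over average- and max-pooled channel descriptors, then a 7×7 convolution over channel-wise average and max maps). The weights `W1`, `W2`, and `kernel`, the reduction ratio `r`, and all function names are hypothetical and randomly initialised for illustration; it does not reproduce the soft-pooling or multi-scale variant proposed in the citing paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(F, W1, W2):
    """Per-channel weights in (0, 1) from avg- and max-pooled descriptors."""
    avg = F.mean(axis=(1, 2))                      # (C,)
    mx = F.max(axis=(1, 2))                        # (C,)
    mlp = lambda v: W2 @ np.maximum(W1 @ v, 0.0)   # shared MLP with ReLU
    return sigmoid(mlp(avg) + mlp(mx))             # (C,)

def spatial_attention(F, kernel):
    """Per-pixel weights in (0, 1) from channel-wise avg and max maps."""
    pooled = np.stack([F.mean(axis=0), F.max(axis=0)])   # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))  # zero padding
    H, W = F.shape[1:]
    out = np.empty((H, W))
    # Plain sliding-window cross-correlation, as in deep learning "conv" layers.
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return sigmoid(out)                                   # (H, W)

def cbam(F, W1, W2, kernel):
    """Channel attention first, then spatial attention on the refined map."""
    F1 = F * channel_attention(F, W1, W2)[:, None, None]
    F2 = F1 * spatial_attention(F1, kernel)[None, :, :]
    return F2

# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
C, H, W, r = 8, 5, 6, 4
F = rng.standard_normal((C, H, W))
W1 = rng.standard_normal((C // r, C)) * 0.1
W2 = rng.standard_normal((C, C // r)) * 0.1
kernel = rng.standard_normal((2, 7, 7)) * 0.1
out = cbam(F, W1, W2, kernel)
print(out.shape)
```

Because both attention maps pass through a sigmoid, the output keeps the input shape and every element is scaled down in magnitude relative to the input; in a trained network the weights would be learned rather than sampled.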
