Spectral-Spatial Attention Networks for Hyperspectral Image Classification

Mei, Xiaoguang; Pan, Erting; Ma, Yong; Dai, Xiaobing; Huang, Jun; Fan, Fan; Du, Qian; Zheng, Hong; Ma, Jiayi

doi:10.3390/rs11080963

Cited by 234 publications

(95 citation statements)

References 38 publications


“…As a result, more emphasis should be laid on those important objects, and less emphasis should be laid on redundant objects when representing scene images. For this reason, the visual attention mechanism is studied in the CNN over recent years [15][16][17][18][19][20][21], and the literature on the attention mechanism is shown in Section 2.2. In the attention mechanism, some salient regions selected from the entire image rather than the entire image are processed by the visual attention mechanism at once.…”

Section: Introduction (mentioning)

confidence: 99%

RETRACTED: Attention-Based Deep Feature Fusion for the Scene Classification of High-Resolution Remote Sensing Images

Zhu, Yan, et al. 2019

Remote Sensing


Scene classification of high-resolution remote sensing images (HRRSI) is one of the most important means of land-cover classification. Deep learning techniques, especially the convolutional neural network (CNN), have been widely applied to the scene classification of HRRSI due to the advancement of graphic processing units (GPU). However, they tend to extract features from the whole images rather than discriminative regions. The visual attention mechanism can force the CNN to focus on discriminative regions, but it may suffer from the influence of intra-class diversity and repeated texture. Motivated by these problems, we propose an attention-based deep feature fusion (ADFF) framework that constitutes three parts, namely attention maps generated by Gradient-weighted Class Activation Mapping (Grad-CAM), a multiplicative fusion of deep features and the center-based cross-entropy loss function. First of all, we propose to make attention maps generated by Grad-CAM as an explicit input in order to force the network to concentrate on discriminative regions. Then, deep features derived from original images and attention maps are proposed to be fused by multiplicative fusion in order to consider both improved abilities to distinguish scenes of repeated texture and the salient regions. Finally, the center-based cross-entropy loss function that utilizes both the cross-entropy loss and center loss function is proposed to backpropagate fused features so as to reduce the effect of intra-class diversity on feature representations. The proposed ADFF architecture is tested on three benchmark datasets to show its performance in scene classification. The experiments confirm that the proposed method outperforms most competitive scene classification methods with an average overall accuracy of 94% under different training ratios.
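The abstract above names two components without giving formulas: a multiplicative fusion of deep features and a loss that combines cross-entropy with center loss. A minimal sketch of how such a combination could look is given below. It assumes element-wise multiplication for the fusion, the common center-loss form (half the squared distance to a class center), and a weighted sum with a hypothetical weight `lam`; none of these details are confirmed by the abstract, and all function names are illustrative.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """Negative log-probability assigned to the true class."""
    return -math.log(softmax(logits)[label])

def center_loss(feature, center):
    """Half the squared Euclidean distance between a feature and its class center."""
    return 0.5 * sum((f - c) ** 2 for f, c in zip(feature, center))

def multiplicative_fusion(feat_image, feat_attention):
    """Assumed fusion: element-wise product of two equal-length feature vectors."""
    return [a * b for a, b in zip(feat_image, feat_attention)]

def center_based_cross_entropy(logits, fused, label, centers, lam=0.5):
    """Assumed combination: cross-entropy plus lam times the center loss."""
    return cross_entropy(logits, label) + lam * center_loss(fused, centers[label])

# Toy example: features from the image branch and the attention-map branch.
fused = multiplicative_fusion([1.0, 2.0], [0.5, 1.5])
centers = {0: [0.5, 3.0], 1: [2.0, 0.0]}  # hypothetical class centers
loss = center_based_cross_entropy([2.0, 0.0], fused, 0, centers)
```

In this toy case the fused feature coincides with the center of class 0, so the center term vanishes and the loss reduces to plain cross-entropy. Moving the feature away from its class center increases the loss, which is the mechanism by which a center term would discourage intra-class spread.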


“…One of the most important deep learning models is the convolution neural network (CNN), which is widely used in VHR remote sensing image classification [37,38]. In general, traditional CNNs consist of five fundamental structures: the convolutional layer, non-linear mapping (NL) layer, pooling layer, full connection (FC) layer, and classification layer.…”

Section: Deep Convolutional Neural Network (mentioning)

confidence: 99%

Hierarchical Multi-View Semi-Supervised Learning for Very High-Resolution Remote Sensing Image Classification

Cheng, Yang, et al. 2020

Remote Sensing


Traditional classification methods used for very high-resolution (VHR) remote sensing images require a large number of labeled samples to obtain higher classification accuracy. Labeled samples are difficult to obtain and costly. Therefore, semi-supervised learning becomes an effective paradigm that combines the labeled and unlabeled samples for classification. In semi-supervised learning, the key issue is to enlarge the training set by selecting highly-reliable unlabeled samples. Observing the samples from multiple views is helpful to improving the accuracy of label prediction for unlabeled samples. Hence, the reasonable view partition is very important for improving the classification performance. In this paper, a hierarchical multi-view semi-supervised learning framework with CNNs (HMVSSL) is proposed for VHR remote sensing image classification. Firstly, a superpixel-based sample enlargement method is proposed to increase the number of training samples in each view. Secondly, a view partition method is designed to partition the training set into two independent views, and the partitioned subsets are characterized by being inter-distinctive and intra-compact. Finally, a collaborative classification strategy is proposed for the final classification. Experiments are conducted on three VHR remote sensing images, and the results show that the proposed method performs better than several state-of-the-art methods.
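The citation statement quoted for this entry lists five basic CNN structures: convolutional, non-linear mapping, pooling, fully connected, and classification layers. The sketch below chains one toy instance of each on a single-channel image in plain Python. The choice of ReLU for the non-linearity, max pooling, softmax for classification, and all weights and sizes are assumptions made for illustration, not details taken from the cited paper.

```python
import math

def conv2d(image, kernel):
    """Valid 2-D cross-correlation of a single-channel image with one kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)] for i in range(out_h)]

def relu(fmap):
    """Non-linear mapping layer (ReLU assumed)."""
    return [[max(0.0, v) for v in row] for row in fmap]

def max_pool(fmap, size=2):
    """Non-overlapping max pooling; rows/cols that do not fill a window are dropped."""
    return [[max(fmap[i + a][j + b] for a in range(size) for b in range(size))
             for j in range(0, len(fmap[0]) - size + 1, size)]
            for i in range(0, len(fmap) - size + 1, size)]

def fully_connected(vector, weights, bias):
    """One output per weight row: dot(row, vector) + bias."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, bias)]

def softmax(scores):
    """Classification layer: turns scores into class probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def forward(image, kernel, weights, bias):
    """conv -> ReLU -> pool -> flatten -> FC -> softmax."""
    pooled = max_pool(relu(conv2d(image, kernel)))
    flat = [v for row in pooled for v in row]
    return softmax(fully_connected(flat, weights, bias))

# Toy 5x5 image, 2x2 kernel, and a 2-class FC layer (all values hypothetical).
image = [[1.0, 0.0, 2.0, 1.0, 0.0],
         [3.0, 1.0, 0.0, 2.0, 1.0],
         [0.0, 2.0, 4.0, 0.0, 2.0],
         [1.0, 0.0, 1.0, 3.0, 0.0],
         [2.0, 1.0, 0.0, 1.0, 5.0]]
kernel = [[1.0, 0.0], [0.0, -1.0]]
weights = [[0.1, 0.2, 0.3, 0.4], [0.4, 0.3, 0.2, 0.1]]
bias = [0.0, 0.0]
probs = forward(image, kernel, weights, bias)
```

The 5x5 input becomes a 4x4 feature map after the 2x2 convolution and a 2x2 map after pooling, so the fully connected layer receives four values and `probs` holds two class probabilities.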


“…Compared with conventional methods, the deep which achieves high-order features of the hyperspectral data in a cascade manner and has an explicit physical meaning. Additionally, neural network using attention mechanism has the ability to focus on specific parts of information in the feature space [30,33], which is helpful in learning both spatial and spectral features in HSIs.…”

Section: Introduction (mentioning)

confidence: 99%

Attention-Based Residual Network with Scattering Transform Features for Hyperspectral Unmixing with Limited Training Samples

et al. 2020


This paper proposes a framework for unmixing of hyperspectral data that is based on utilizing the scattering transform to extract deep features that are then used within a neural network. Previous research has shown that using the scattering transform combined with a traditional K-nearest neighbors classifier (STFHU) is able to achieve more accurate unmixing results compared to a convolutional neural network (CNN) applied directly to the hyperspectral images. This paper further explores hyperspectral unmixing in limited training data scenarios, which are likely to occur in practical applications where access to large amounts of labeled training data is not possible. Here, it is proposed to combine the scattering transform with the attention-based residual neural network (ResNet). Experimental results on three HSI datasets demonstrate that this approach provides at least 40% higher unmixing accuracy compared to the previous STFHU and CNN algorithms when only limited training data, ranging from 5% to 30%, are available. The use of the scattering transform for deriving features within the ResNet unmixing system also leads to more than 25% improvement when unmixing hyperspectral data contaminated by additive noise.
