Hyperspectral Image Classification Based on 3-D Multihead Self-Attention Spectral–Spatial Feature Fusion Network

Zhou, Qigao; Zhou, Shuai; Shen, Feng; Yin, Jie; Xu, Dingjie

doi:10.1109/jstars.2022.3226758

Cited by 5 publications

(9 citation statements)

References 58 publications


“…employed 2D CNN and a transformer to gain joint spatial spectral features. Zhou et al. 19 used multi-scale convolution to mine features and combined multi-scale features to enhance feature expressiveness.…”

Section: Related Work (mentioning)

confidence: 99%

“…15,16 To address these limitations, spectral-spatial methods have been proposed. In recent spectral-spatial methods, [17][18][19][20] CNNs are commonly employed to extract spectral-spatial features from adjacent pixels, and convolution is an important component. [21][22][23] Recently, attention mechanisms were developed by simulating the visual system of humans, which selectively concentrates on prominent parts rather than handling each part consistently.…”

Section: Introduction (mentioning)

confidence: 99%

“…To address these limitations, spectral-spatial methods have been proposed. In recent spectral-spatial methods, 17 – 20 CNNs are commonly employed to extract spectral-spatial features from adjacent pixels, and convolution is an important component 21 – 23 …”

Section: Introduction (mentioning)

confidence: 99%


Center-similarity spectral-spatial attention network for hyperspectral image classification

Zhang, Liang, Niu et al. 2024

J. Appl. Rem. Sens.

Hyperspectral image (HSI) classification aims to assign a label to each pixel. The high-dimensional form of HSI and the introduction of spatial information can introduce challenges such as redundant spectral bands and interference pixels. Recently, many methods based on convolutional neural networks and attention mechanisms have been employed to address these issues. However, existing methods often fail to adequately utilize spatial information and exhibit interference pixel diffusion, which may leave them unable to extract discriminative features. In addition, convolutions are not capable of extracting robust features, which results in the poor robustness of recent methods when the HSI is rotated. To address these challenges, a center-similarity spectral-spatial attention network model (CS3AN) is proposed. First, a center-similarity spectral attention (CSSpeA) module is proposed to exploit the input spatial information rationally to better reduce the impact of redundant bands. Next, a center-similarity rectified spatial attention (CSRSpaA) module that selectively weighs neighboring pixels based on interference pixels to prevent the diffusion phenomenon is provided. Then, a similarity spatial aggregation module is employed to extract robust spectral-spatial features. Finally, the robust features are input into the softmax to get the label. The effectiveness of CS3AN is validated on three different datasets.
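As a rough illustration of the general idea of weighing neighboring pixels by their similarity to the center pixel, the sketch below computes cosine similarity between each pixel's spectrum and the center spectrum, turns the similarities into softmax weights, and aggregates the patch. This is not the CSSpeA or CSRSpaA module from the paper, whose details the abstract does not give; the function names, the cosine measure, and the `temperature` parameter are all assumptions made for illustration.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two spectra (lists of band values)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def center_similarity_weights(patch, temperature=0.1):
    """Hypothetical helper: softmax weights over an H x W patch of spectra,
    based on each pixel's similarity to the center pixel (H and W odd)."""
    h, w = len(patch), len(patch[0])
    center = patch[h // 2][w // 2]
    scores = [cosine_similarity(pix, center) / temperature
              for row in patch for pix in row]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [[exps[i * w + j] / total for j in range(w)] for i in range(h)]

def aggregate(patch, weights):
    """Weighted sum of all spectra in the patch, one value per band."""
    bands = len(patch[0][0])
    out = [0.0] * bands
    for row, weight_row in zip(patch, weights):
        for pix, wt in zip(row, weight_row):
            for b in range(bands):
                out[b] += wt * pix[b]
    return out

# Toy 3 x 3 patch: eight pixels share the center's spectrum, while the
# top-left pixel has a very different spectrum (an "interference" pixel).
same = [1.0, 2.0, 3.0]
other = [3.0, 0.1, 0.1]
patch = [[other, same, same],
         [same, same, same],
         [same, same, same]]
weights = center_similarity_weights(patch)
feature = aggregate(patch, weights)
```

In this toy patch the dissimilar corner pixel receives a much smaller weight than the pixels that resemble the center, so it contributes little to the aggregated feature.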


Section: Related Work (mentioning)

confidence: 99%

Section: Introduction (mentioning)

confidence: 99%


Center-similarity spectral-spatial attention network for hyperspectral image classification

Zhang, Liang, Niu et al. 2024

J. Appl. Rem. Sens.

“…Hyperspectral images (HSIs) are data cubes captured by hyperspectral sensors, which simultaneously reveal 2-D spatial and 1-D spectral information about land cover substances [1]. What distinguishes HSIs from panchromatic and multispectral images is that their pixels record the distinctive spectral signatures using hundreds of nearly continuous spectral bands [2][3][4]. The high-resolution spectral response curves reflect detailed characteristics of land cover substances [5].…”

Section: Introduction (mentioning)

confidence: 99%

A U-Shaped Convolution-Aided Transformer with Double Attention for Hyperspectral Image Classification

Qin, Wang et al. 2024

Remote Sensing

Convolutional neural networks (CNNs) and transformers have achieved great success in hyperspectral image (HSI) classification. However, CNNs are inefficient in establishing long-range dependencies, and transformers may overlook some local information. To overcome these limitations, we propose a U-shaped convolution-aided transformer (UCaT) that incorporates convolutions into a novel transformer architecture to aid classification. The group convolution is employed as parallel local descriptors to extract detailed features, and then the multi-head self-attention recalibrates these features in consistent groups, emphasizing informative features while maintaining the inherent spectral–spatial data structure. Specifically, three components are constructed using particular strategies. First, the spectral groupwise self-attention (spectral-GSA) component is developed for spectral attention, which selectively emphasizes diagnostic spectral features among neighboring bands and reduces the spectral dimension. Then, the spatial dual-scale convolution-aided self-attention (spatial-DCSA) encoder and spatial convolution-aided cross-attention (spatial-CCA) decoder form a U-shaped architecture for per-pixel classifications over HSI patches, where the encoder utilizes a dual-scale strategy to explore information in different scales and the decoder adopts the cross-attention for information fusion. Experimental results on three datasets demonstrate that the proposed UCaT outperforms the competitors. Additionally, a visual explanation of the UCaT is given, showing its ability to build global interactions and capture pixel-level dependencies.


“…However, the above-mentioned spatial attention modules generally deduce a few modes of attention. To express possible spatial dependency sufficiently, transformers [194,195], which originate from the field of natural language processing and have been the core component of the ChatGPT model [196], adopt multi-head SA (MHSA) modules [181][197][198][199] to integrate various types of attention from different subspaces into a linear representation [200][201][202]. Transformer is also good at handling long-distance spectral dependency.…”

Section: Introduction (mentioning)

confidence: 99%
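The multi-head self-attention (MHSA) mechanism mentioned in the statement above can be sketched in a few lines of plain Python. This is a minimal, generic version of standard scaled dot-product attention with randomly initialised projection matrices, not the 3-D MHSA network of the cited paper or any of the citing models; the helper names (`matmul`, `random_matrix`, `multi_head_self_attention`) and the toy sizes are assumptions chosen for illustration.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    top = max(row)
    exps = [math.exp(v - top) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for learned weights: small random values."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def multi_head_self_attention(tokens, num_heads, rng):
    """tokens: n x d matrix. Returns (n x d output, list of n x n attention maps)."""
    n, d_model = len(tokens), len(tokens[0])
    assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
    d_head = d_model // num_heads
    head_outputs, attention_maps = [], []
    for _ in range(num_heads):
        # Each head projects the tokens into its own lower-dimensional subspace.
        q = matmul(tokens, random_matrix(d_model, d_head, rng))
        k = matmul(tokens, random_matrix(d_model, d_head, rng))
        v = matmul(tokens, random_matrix(d_model, d_head, rng))
        k_t = [list(col) for col in zip(*k)]
        scores = matmul(q, k_t)
        # Scale by sqrt(d_head), then normalise each row into attention weights.
        attn = [softmax([s / math.sqrt(d_head) for s in row]) for row in scores]
        attention_maps.append(attn)
        head_outputs.append(matmul(attn, v))
    # Concatenate the heads per token and mix them with one linear projection.
    concat = [sum((head[i] for head in head_outputs), []) for i in range(n)]
    return matmul(concat, random_matrix(d_model, d_model, rng)), attention_maps

rng = random.Random(0)
tokens = random_matrix(5, 8, rng)  # e.g. 5 pixels of a patch, 8 features each
output, maps = multi_head_self_attention(tokens, num_heads=2, rng=rng)
```

Each head produces its own attention map over the same tokens, and the final projection combines the heads into a single representation of the original width.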

Discriminating Spectral–Spatial Feature Extraction for Hyperspectral Image Classification: A Review

Li, Wang, Cheikh 2024

Sensors

Hyperspectral images (HSIs) contain subtle spectral details and rich spatial contextures of land cover that benefit from developments in spectral imaging and space technology. The classification of HSIs, which aims to allocate an optimal label for each pixel, has broad prospects in the field of remote sensing. However, due to the redundancy between bands and complex spatial structures, the effectiveness of the shallow spectral–spatial features extracted by traditional machine-learning-based methods tends to be unsatisfying. Over recent decades, various methods based on deep learning in the field of computer vision have been proposed to allow for the discrimination of spectral–spatial representations for classification. In this article, the crucial factors to discriminate spectral–spatial features are systematically summarized from the perspectives of feature extraction and feature optimization. For feature extraction, techniques to ensure the discrimination of spectral features, spatial features, and spectral–spatial features are illustrated based on the characteristics of hyperspectral data and the architecture of models. For feature optimization, techniques to adjust the feature distances between classes in the classification space are introduced in detail. Finally, the characteristics and limitations of these techniques and future challenges in facilitating the discrimination of features for HSI classification are also discussed further.

