In recent years, deep learning has been successfully applied to hyperspectral image classification (HSI) problems, with several convolutional neural network (CNN) based models achieving an appealing classification performance. However, due to the multi-band nature and the data redundancy of the hyperspectral data, the CNN model underperforms in such a continuous data domain. Thus, in this article, we propose an end-to-end transformer model entitled SAT Net that is appropriate for HSI classification and relies on the self-attention mechanism. The proposed model uses the spectral attention mechanism and the self-attention mechanism to extract the spectral–spatial features of the HSI image, respectively. Initially, the original HSI data are remapped into multiple vectors containing a series of planar 2D patches after passing through the spectral attention module. On each vector, we perform linear transformation compression to obtain the sequence vector length. During this process, we add the position–coding vector and the learnable–embedding vector to manage capturing the continuous spectrum relationship in the HSI at a long distance. Then, we employ several multiple multi-head self-attention modules to extract the image features and complete the proposed network with a residual network structure to solve the gradient dispersion and over-fitting problems. Finally, we employ a multilayer perceptron for the HSI classification. We evaluate SAT Net on three publicly available hyperspectral datasets and challenge our classification performance against five current classification methods employing several metrics, i.e., overall and average classification accuracy and Kappa coefficient. Our trials demonstrate that SAT Net attains a competitive classification highlighting that a Self-Attention Transformer network and is appealing for HSI classification.
In recent years, image classification on hyperspectral imagery utilizing deep learning algorithms has attained good results. Thus, spurred by that finding and to further improve the deep learning classification accuracy, we propose a multi-scale residual convolutional neural network model fused with an efficient channel attention network (MRA-NET) that is appropriate for hyperspectral image classification. The suggested technique comprises a multi-staged architecture, where initially the spectral information of the hyperspectral image is reduced into a two-dimensional tensor, utilizing a principal component analysis (PCA) scheme. Then, the constructed low-dimensional image is input to our proposed ECA-NET deep network, which exploits the advantages of its core components, i.e., multi-scale residual structure and attention mechanisms. We evaluate the performance of the proposed MRA-NET on three public available hyperspectral datasets and demonstrate that, overall, the classification accuracy of our method is 99.82 %, 99.81%, and 99.37, respectively, which is higher compared to the corresponding accuracy of current networks such as 3D convolutional neural network (CNN), three-dimensional residual convolution structure (RES-3D-CNN), and space–spectrum joint deep network (SSRN).
Despite significant progress in object detection tasks, remote sensing image target detection is still challenging owing to complex backgrounds, large differences in target sizes, and uneven distribution of rotating objects. In this study, we consider model accuracy, inference speed, and detection of objects at any angle. We also propose a RepVGG-YOLO network using an improved RepVGG model as the backbone feature extraction network, which performs the initial feature extraction from the input image and considers network training accuracy and inference speed. We use an improved feature pyramid network (FPN) and path aggregation network (PANet) to reprocess feature output by the backbone network. The FPN and PANet module integrates feature maps of different layers, combines context information on multiple scales, accumulates multiple features, and strengthens feature information extraction. Finally, to maximize the detection accuracy of objects of all sizes, we use four target detection scales at the network output to enhance feature extraction from small remote sensing target pixels. To solve the angle problem of any object, we improved the loss function for classification using circular smooth label technology, turning the angle regression problem into a classification problem, and increasing the detection accuracy of objects at any angle. We conducted experiments on two public datasets, DOTA and HRSC2016. Our results show the proposed method performs better than previous methods.
In recent years, the deep learning-based hyperspectral image (HSI) classification method has achieved great success, and the convolutional neural network (CNN) method has achieved good classification performance in the HSI classification task. However, the convolutional operation only works with local neighborhoods, and is effective in extracting local features. It is difficult to capture interactive features over long distances, which affects the accuracy of classification to some extent. At the same time, the data from HSI have the characteristics of three-dimensionality, redundancy, and noise. To solve these problems, we propose a 3D self-attention multiscale feature fusion network (3DSA-MFN) that integrates 3D multi-head self-attention. 3DSA-MFN first uses different sized convolution kernels to extract multiscale features, samples the different granularities of the feature map, and effectively fuses the spatial and spectral features of the feature map. Then, we propose an improved 3D multi-head self-attention mechanism that provides local feature details for the self-attention branch, and fully exploits the context of the input matrix. To verify the performance of the proposed method, we compare it with six current methods on three public datasets. The experimental results show that the proposed 3DSA-MFN achieves competitive classification and highlights the HSI classification task.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.