“…In the last few years, deep learning, especially Convolutional Neural Networks (CNNs), has received widespread attention due to its ability to automatically learn nonlinear features for classification, thereby overcoming the challenges of hand-crafted features that arise when HSIC relies on traditional methods [16] such as Support Vector Machine (SVM), K-Nearest Neighbor (KNN), Random Forest, Ensemble Learning, Artificial Neural Network, and Extreme Learning Machine (ELM) [17,18]. Moreover, CNNs can jointly exploit spatial-spectral information, and such models can be categorized into two groups, single-stream and two-stream; more information on both groups can be found in [19]. This work explicitly investigates a single-stream method similar to the works proposed by Ahmad et al. [20] (A Fast and Compact 3D CNN for HSIC), Xie et al. [21] (Hyperspectral Face Recognition Based on Sparse Spectral Attention Deep Neural Network), Liu et al. [22] (A Semi-supervised CNN for HSIC), Hamida et al. [23] (3D Deep Learning Approach for Remote Sensing Image Classification), Lee et al. [24] (Contextual Deep CNN-based HSIC), Chen et al. [25] (Contextual Deep CNN-based HSIC), Li [26] (Spectral-Spatial Classification of HSI with 3D CNN), He et al. [27] (Multi-scale 3D Deep CNN Network for HSI), and Zhao et al. [28] (Hybrid Depth-Separable Residual Networks for HSIC).…”