In recent years, deep learning algorithms, particularly convolutional neural networks (CNNs), have significantly improved the performance of hyperspectral image (HSI) classification. However, because HSI data are high-dimensional and labeled training samples are limited, deep neural networks are prone to overfitting. In addition, treating all bands of an HSI dataset equally during feature learning, and failing to distinguish between the edge pixels and the center pixel of a neighborhood, reduces classification accuracy. In this paper, we therefore propose an end-to-end deep spectral-spatial residual attention network (DSSpRAN) for HSI classification, motivated by the attention mechanism of the human visual system. The DSSpRAN takes the input HSI data as a 3-D cube rather than applying dimensionality reduction. The proposed model jointly incorporates spectral and spatial features through a spectral residual attention network (SRAN) and a spatial residual attention network (SpRAN). In the SRAN, weights are assigned and learned adaptively to select essential features from each band. The SpRAN weights each neighboring pixel by its importance for classifying the center pixel, emphasizing surrounding pixels that share the center pixel's label and suppressing those with different labels. The proposed method is evaluated on five datasets covering various land use and land cover scenarios to demonstrate state-of-the-art performance. A comprehensive qualitative and quantitative analysis of the results shows that the proposed method significantly outperforms other state-of-the-art methods.
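To make the two attention ideas concrete, the following is a minimal NumPy sketch, not the DSSpRAN implementation. It assumes a common residual attention form, output = input × (1 + mask), a squeeze-and-excitation style band gate for the spectral branch, and cosine similarity to the center pixel for the spatial branch. The function names, the weight matrices `w1` and `w2`, the 7×7×30 patch size, and the random (untrained) weights are all hypothetical choices made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def spectral_residual_attention(cube, w1, w2):
    """Re-weight the bands of an (H, W, B) patch with a residual band mask.

    w1 (B, R) and w2 (R, B) stand in for learnable weights (assumed here).
    """
    descriptor = cube.mean(axis=(0, 1))        # (B,) one average per band
    hidden = np.maximum(descriptor @ w1, 0.0)  # (R,) ReLU bottleneck
    mask = sigmoid(hidden @ w2)                # (B,) band weights in (0, 1)
    return cube * (1.0 + mask), mask           # residual form keeps the input

def spatial_residual_attention(cube):
    """Re-weight the pixels of an (H, W, B) patch by similarity to its center.

    Cosine similarity is an assumed stand-in for a learned spatial mask.
    """
    h, w, _ = cube.shape
    center = cube[h // 2, w // 2]              # (B,) spectrum of center pixel
    norms = np.linalg.norm(cube, axis=2) * np.linalg.norm(center) + 1e-12
    similarity = (cube @ center) / norms       # (H, W) cosine similarity
    mask = sigmoid(similarity)                 # (H, W) pixel weights in (0, 1)
    return cube * (1.0 + mask[..., None]), mask

# Toy 7x7 patch with 30 bands and random, untrained weights.
rng = np.random.default_rng(0)
patch = rng.random((7, 7, 30))
w1 = rng.normal(scale=0.1, size=(30, 8))
w2 = rng.normal(scale=0.1, size=(8, 30))

spec_out, band_mask = spectral_residual_attention(patch, w1, w2)
spat_out, pixel_mask = spatial_residual_attention(spec_out)
```

In this sketch the residual term means that a mask value near zero leaves a band or pixel unchanged instead of removing it, and the center pixel always receives the largest spatial weight because its similarity to itself is 1. In a trained network both masks would be produced by learned layers rather than by the fixed rules assumed above.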