Deep learning has achieved strong performance in hyperspectral image classification (HSIC). Many deep-learning-based methods use deep, complex network structures to extract rich spectral and spatial features from hyperspectral images (HSIs) with high accuracy. In this process, accurately extracting features and information from pixel blocks in HSIs is crucial. However, most methods treat all spectral features equally during classification, and the network input often contains many uninformative pixels, which degrades classification accuracy. To address this problem, an enhanced spectral-spatial residual attention network (ESSRAN) is proposed for HSIC in this paper. The proposed network combines a spectral-spatial attention network (SSAN), a residual network (ResNet), and long short-term memory (LSTM) to extract more discriminative spectral and spatial features. More specifically, SSAN first extracts image features, using the spectral attention module to emphasize useful bands and suppress useless ones, and the spatial attention module to emphasize pixels that belong to the same category as the central pixel. These features are then fed into an improved ResNet that adopts LSTM to learn representative high-level semantic features of the spectral sequences, while the residual connections mitigate gradient vanishing and explosion. The proposed ESSRAN model is evaluated on three commonly used HSI datasets and compared with several state-of-the-art methods. The results confirm that ESSRAN effectively improves classification accuracy.
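To make the band-reweighting idea concrete, the spectral attention step can be sketched as follows. This is a minimal NumPy illustration assuming a squeeze-and-excitation-style gate (global average pooling over the spatial dimensions, a small two-layer MLP, and a sigmoid); the paper's exact module may differ, and the MLP parameters here are random placeholders standing in for learned weights.

```python
import numpy as np

def spectral_attention(patch, w1, b1, w2, b2):
    """Reweight the spectral bands of one HSI pixel block.

    patch: (bands, height, width) array.
    w1/b1, w2/b2: parameters of a small two-layer MLP that scores each
    band (hypothetical stand-ins for learned weights).
    """
    # Global average pooling over the spatial dims -> one descriptor per band.
    band_desc = patch.mean(axis=(1, 2))                   # shape (bands,)
    # Two-layer MLP with a sigmoid gate (squeeze-and-excitation style).
    hidden = np.maximum(0.0, w1 @ band_desc + b1)         # ReLU
    weights = 1.0 / (1.0 + np.exp(-(w2 @ hidden + b2)))   # sigmoid, in (0, 1)
    # Emphasize useful bands, suppress useless ones.
    return patch * weights[:, None, None], weights

# Toy example: an 8-band 5x5 pixel block with random MLP parameters.
rng = np.random.default_rng(0)
bands, h, w, hidden_dim = 8, 5, 5, 4
patch = rng.random((bands, h, w))
w1, b1 = rng.normal(size=(hidden_dim, bands)), np.zeros(hidden_dim)
w2, b2 = rng.normal(size=(bands, hidden_dim)), np.zeros(bands)
reweighted, att = spectral_attention(patch, w1, b1, w2, b2)
```

Each band is scaled by a gate in (0, 1), so informative bands pass through nearly unchanged while uninformative ones are attenuated; the spatial attention module applies the same idea over pixel positions instead of bands.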