With the development of deep learning, various convolutional neural network (CNN) based methods have been proposed for hyperspectral image (HSI) classification. Although most of them achieve good classification performance, their prediction maps still contain many misclassifications when few training samples are available. To address this shortcoming, this paper proposes to use pixels' spatial and spectral information simultaneously for HSI classification. Specifically, a new cross-mixing residual network, denoted CMR-CNN, is developed. It comprises a three-dimensional (3D) residual structure that extracts spectral characteristics, a two-dimensional (2D) residual structure that extracts spatial characteristics, and an assisted feature extraction (AFE) structure that links the first two. Experiments performed on five datasets (Indian Pines, University of Pavia, Salinas Scene, KSC, and Xuzhou) with different numbers of training samples show that, compared with several state-of-the-art methods, CMR-CNN achieves higher overall accuracy (OA), average accuracy (AA), and Kappa values. In particular, compared with the recently proposed HSI classification method OCT-MCNN, CMR-CNN improves OA, AA, and Kappa by 4.13%, 3.67%, and 2.75% on average, respectively.
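
For reference, the three reported metrics can be computed from a confusion matrix as in the minimal sketch below. It follows the conventional definitions of OA, AA, and Cohen's kappa and is not taken from the CMR-CNN implementation; the function name `classification_metrics` and the example matrix are illustrative assumptions.

```python
def classification_metrics(confusion):
    """Return (OA, AA, kappa) for a square confusion matrix.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j. (Hypothetical helper, for illustration.)
    """
    n_classes = len(confusion)
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n_classes))
                  for j in range(n_classes)]
    total = sum(row_totals)
    correct = sum(confusion[i][i] for i in range(n_classes))

    # Overall accuracy: fraction of all samples classified correctly.
    oa = correct / total

    # Average accuracy: mean of the per-class accuracies (recalls).
    # Classes with no true samples are skipped to avoid division by zero.
    recalls = [confusion[i][i] / row_totals[i]
               for i in range(n_classes) if row_totals[i] > 0]
    aa = sum(recalls) / len(recalls)

    # Cohen's kappa: observed agreement corrected for the agreement
    # p_e expected by chance. Undefined when p_e equals 1.
    p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    kappa = (oa - p_e) / (1 - p_e)
    return oa, aa, kappa


# Illustrative two-class example (made-up counts, not from any dataset).
example = [[8, 2],
           [1, 9]]
oa, aa, kappa = classification_metrics(example)
```

Because AA weights every class equally while OA weights every sample equally, the two can diverge sharply on datasets with imbalanced class sizes, which is why HSI classification results are usually reported with all three metrics.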