The shortage of training samples remains one of the main obstacles to applying neural networks to hyperspectral image classification. To fuse spatial and spectral information, pixel patches are often used to train a model, which may further aggravate this problem. In existing work, an ANN model supervised by center loss (ANNC) was introduced. Trained with spectral information alone, the ANNC yields discriminative spectral features suitable for subsequent classification tasks. In this paper, we propose a novel CNN-based spatial feature fusion (CSFF) algorithm, which integrates spatial information into the spectral features extracted by the ANNC. As a critical part of CSFF, a CNN-based discriminant model is introduced to estimate whether two pixels belong to the same class. At the testing stage, the discriminant model is applied to the pixel pairs formed by a test pixel and each of its neighbors; the local structure is thereby estimated and represented as a customized convolutional kernel. The spectral-spatial feature is generated by a convolution between the estimated kernel and the corresponding spectral features within a local region. The final label is determined by classifying the resulting spectral-spatial feature. Without increasing the number of training samples or involving pixel patches at the training stage, the CSFF framework achieves state-of-the-art performance, reducing classification failures by 20% to 50% in experiments on three well-known hyperspectral images.
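The testing-stage procedure described above can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes the per-pixel spectral features (as the ANNC would produce them) are already available as an array of shape (height, width, dim). The function `same_class_prob` is a hypothetical stand-in for the CNN-based discriminant model and uses clipped cosine similarity. The names `fuse_pixel` and `radius`, and the normalization by the kernel sum, are likewise assumptions made for illustration. The final classifier is omitted.

```python
import numpy as np

def same_class_prob(f_a, f_b):
    """Hypothetical stand-in for the CNN-based discriminant model.

    Returns a score in [0, 1] estimating whether two pixels share a class.
    Here it is just the cosine similarity clipped at zero (an assumption).
    """
    cos = np.dot(f_a, f_b) / (np.linalg.norm(f_a) * np.linalg.norm(f_b) + 1e-12)
    return max(float(cos), 0.0)

def fuse_pixel(features, row, col, radius=1):
    """Estimate a customized kernel around (row, col) and apply it to the
    spectral features in that local region."""
    h, w, _ = features.shape
    center = features[row, col]
    fused = np.zeros_like(center, dtype=float)
    total = 0.0
    for r in range(max(0, row - radius), min(h, row + radius + 1)):
        for c in range(max(0, col - radius), min(w, col + radius + 1)):
            # Kernel weight: score for the (test pixel, neighbor) pair;
            # the test pixel itself always gets weight 1.
            if (r, c) == (row, col):
                k = 1.0
            else:
                k = same_class_prob(center, features[r, c])
            fused += k * features[r, c]
            total += k
    # Normalizing by the kernel sum is an assumption of this sketch.
    return fused / total

# Toy 3x3 scene: two left columns belong to one class, right column to another.
features = np.zeros((3, 3, 2))
features[:, :2] = [1.0, 0.0]
features[:, 2] = [0.0, 1.0]

fused = fuse_pixel(features, 1, 1)
print(fused)
```

In this toy scene the neighbors from the other class receive zero weight, so the fused feature of the center pixel stays close to its own class feature rather than being blurred across the class boundary, which is the intuition behind a locally estimated kernel.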