Accurate and efficient short-term load forecasting (STLF) plays a significant role in the stable operation of the power grid and in improving economic benefits. This paper proposes a novel STLF model based on data mining and deep learning. First, the data are preprocessed: the historical load is normalized, and the influencing factors (meteorological factors, date types and economy) are fuzzified based on the Pearson correlation coefficient (PCC). Second, a kernel fuzzy c-means (KFCM) algorithm modified by particle swarm optimization (PSO-KFCM) clusters the daily load curves. In the clustering experiments, the within-cluster sum of squared error (SSE) index is used to determine the number of clusters, and clustering validity improves by 31.9% compared with the traditional FCM algorithm. Third, cosine similarity measures the resemblance between the prediction date and each cluster, and the cluster with the maximum similarity is selected as the similar cluster. Finally, a multivariate and multi-step hybrid model, MMCNN-LSTM, based on a convolutional neural network (CNN) and a long short-term memory (LSTM) neural network, is proposed to forecast the load over the following 24 hours, with the similar-cluster data used as the training set. To demonstrate the effectiveness of the proposed integrated technique, its accuracy is verified in three forecasting experiments. The results indicate that the average mean absolute percentage error (MAPE) over the entire test set was only 1.34%, a 3.02% reduction compared with a single LSTM.
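As an illustration of three of the steps named above (load normalization, selection of the similar cluster by maximum cosine similarity, and the MAPE metric), a minimal Python sketch might look as follows. It is not the authors' implementation. Min-max scaling is assumed as the normalization scheme, since the abstract does not specify one. The function names and the feature vectors in the usage example are hypothetical, and the cluster centers stand in for whatever PSO-KFCM would produce.

```python
import numpy as np


def min_max_normalize(load):
    """Scale a load series to [0, 1]. Min-max scaling is an assumption here."""
    load = np.asarray(load, dtype=float)
    lo, hi = load.min(), load.max()
    if hi == lo:
        # A constant series carries no shape information; map it to zeros.
        return np.zeros_like(load)
    return (load - lo) / (hi - lo)


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def select_similar_cluster(day_features, cluster_centers):
    """Return the index of the most similar cluster and all similarities."""
    sims = [cosine_similarity(day_features, c) for c in cluster_centers]
    return int(np.argmax(sims)), sims


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100.0)


# Hypothetical usage: a fuzzified feature vector for the prediction date and
# three made-up cluster centers.
day = [0.8, 0.2, 0.5]
centers = [[0.1, 0.9, 0.3], [0.7, 0.25, 0.55], [0.4, 0.4, 0.4]]
best_cluster, similarities = select_similar_cluster(day, centers)
error_pct = mape([100.0, 200.0], [110.0, 190.0])
```

In a full pipeline of the kind described, the days belonging to `best_cluster` would form the training set for the forecasting model, and `mape` would be evaluated on the 24-hour forecasts.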