Ultra-short-term load forecasting is important for rapid response and real-time dispatching on the power demand side. Because many random factors affect the load, this paper combines convolution, long short-term memory (LSTM), and gated recurrent unit (GRU) layers into a deep learning model for ultra-short-term load forecasting. First, more than 100,000 records of historical load and meteorological data for Beijing were collected, covering the three years from 2016 to 2018, and the meteorological data were divided into 18 types according to the actual meteorological characteristics of Beijing. Second, after the time-series samples were standardized, a convolution filter was used to extract features from the high-order samples and thereby reduce the number of training parameters. On this basis, an LSTM layer and a GRU layer were used to model the time series. A dropout layer was introduced after each layer to reduce the risk of overfitting. Finally, the load prediction was output through a dense layer. During training, the mean squared error (MSE) was used as the objective function to train the deep learning model and to select the optimal hyperparameters. The proposed model was then compared with four other models (GRU, LSTM, Conv-GRU, and Conv-LSTM) in terms of average training time, training error, and prediction error, which verifies its effectiveness and practicability.
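The preprocessing and training objective mentioned above can be illustrated with a minimal sketch. It assumes z-score standardization and a sliding-window sample layout, neither of which is specified in the abstract; the function names `standardize`, `make_windows`, and `mse` and the toy load values are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def standardize(series):
    """Z-score standardization (assumed): zero mean, unit population std."""
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        # A constant series carries no variation; map it to zeros.
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def make_windows(series, window, horizon=1):
    """Slice a series into (input window, target) pairs.

    Each input holds `window` consecutive values; the target is the value
    `horizon` steps after the end of that window.
    """
    inputs, targets = [], []
    for start in range(len(series) - window - horizon + 1):
        inputs.append(series[start:start + window])
        targets.append(series[start + window + horizon - 1])
    return inputs, targets


def mse(y_true, y_pred):
    """Mean squared error, the training objective named in the abstract."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy load values (illustrative only, not data from the paper).
load = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
scaled = standardize(load)
X, y = make_windows(scaled, window=3)

# Persistence baseline: predict the last value of each input window.
baseline = [w[-1] for w in X]
print(len(X), mse(y, baseline))
```

In a full pipeline, the windows in `X` would feed the convolution, LSTM, and GRU layers, and `mse` would be evaluated on the model's predictions rather than on the persistence baseline used here.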