Short-term load forecasting is critical to the safe and stable operation of power systems. This study proposes a load power prediction model that combines outlier correction, decomposition, and ensemble reinforcement learning. The contributions are as follows: first, the Hampel identifier (HI) is employed to correct outliers in the original data; second, complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) is used to fully extract the waveform characteristics of the data; third, a temporal convolutional network, an extreme learning machine, and a gated recurrent unit are selected as the base learners for forecasting load power data; and finally, an ensemble reinforcement learning algorithm based on Q-learning is adopted to generate optimal ensemble weights, with which the predictions of the three base learners are combined. Experimental results on three real load power datasets show that: (a) applying HI improves the model’s forecasting results; (b) CEEMDAN yields better forecasting performance than the other decomposition algorithms compared; and (c) the proposed ensemble method based on Q-learning outperforms the three single models, achieving smaller prediction errors.
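
To make the outlier-correction step concrete, the following is a minimal sketch of a Hampel identifier applied to a load series. It is an illustration under stated assumptions, not the implementation used in this study: the function name `hampel_correct`, the window half-width of 3, the threshold of 3 scaled MADs, and the choice to replace flagged points with the local median are all assumptions, and the sample values are made up for demonstration.

```python
import numpy as np


def hampel_correct(x, half_window=3, n_sigmas=3.0):
    """Flag points far from the local median and replace them with it.

    half_window and n_sigmas are illustrative defaults (assumptions),
    not values taken from the study.
    """
    x = np.asarray(x, dtype=float)
    corrected = x.copy()
    k = 1.4826  # scales the MAD to a standard deviation for Gaussian data
    outlier_idx = []
    for i in range(len(x)):
        # The window is truncated at the series boundaries.
        lo = max(0, i - half_window)
        hi = min(len(x), i + half_window + 1)
        window = x[lo:hi]
        med = np.median(window)
        mad = k * np.median(np.abs(window - med))
        if np.abs(x[i] - med) > n_sigmas * mad:
            corrected[i] = med  # assumption: correct by local median
            outlier_idx.append(i)
    return corrected, outlier_idx


# Made-up load values with one obvious spike at index 5.
load = [10, 11, 10, 12, 11, 95, 10, 11, 12, 10, 11]
corrected, outliers = hampel_correct(load)
print(outliers, corrected[5])
```

Because the median and the median absolute deviation are both robust statistics, a single spike barely shifts the local threshold, which is why this kind of filter can isolate the spike without disturbing neighbouring points.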