To effectively mine historical data and improve the accuracy of short-term load forecasting, this paper addresses the time-series and nonlinear characteristics of power load. Deep learning for load forecasting has received considerable attention in recent years and has become popular in electricity load forecasting analysis. Long short-term memory (LSTM) and gated recurrent unit (GRU) networks are specifically designed for time-series data. However, due to the vanishing and exploding gradient problem, recurrent neural networks (RNNs) struggle to capture long-term dependencies. The Transformer, a self-attention-based sequence model, has produced impressive results on a variety of generative tasks that demand long-range coherence, which suggests that self-attention could also be useful for power load forecasting. In this paper, to model large-scale load forecasting effectively and efficiently, we design a Transformer encoder with relative positional encoding, which consists of four main components: a single-layer neural network, a relative positional encoding module, an encoder module, and a feed-forward network. Experimental results on real-world datasets demonstrate that our method outperforms GRU, LSTM, and the original Transformer encoder.
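As a rough illustration of the four components named above, the following is a minimal PyTorch sketch of a relative-position Transformer encoder for one-step-ahead load forecasting. The layer sizes, window length, and the T5-style learnable relative-position bias are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn


class RelativeSelfAttention(nn.Module):
    """Multi-head self-attention with a learnable relative-position bias (assumed design)."""

    def __init__(self, d_model: int, n_heads: int, max_len: int):
        super().__init__()
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)
        # One learnable bias per (head, relative offset) pair.
        self.rel_bias = nn.Parameter(torch.zeros(n_heads, 2 * max_len - 1))
        self.max_len = max_len

    def forward(self, x):                                   # x: (batch, seq, d_model)
        b, t, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q = q.view(b, t, self.n_heads, self.d_head).transpose(1, 2)
        k = k.view(b, t, self.n_heads, self.d_head).transpose(1, 2)
        v = v.view(b, t, self.n_heads, self.d_head).transpose(1, 2)
        scores = q @ k.transpose(-2, -1) / self.d_head ** 0.5   # (b, h, t, t)
        # Relative offsets j - i, shifted to index the bias table.
        idx = torch.arange(t, device=x.device)
        rel = idx[None, :] - idx[:, None] + self.max_len - 1
        scores = scores + self.rel_bias[:, rel]             # broadcast over batch
        attn = scores.softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, t, -1)
        return self.out(out)


class LoadForecastEncoder(nn.Module):
    """Input projection (single-layer NN) -> relative-position encoder -> feed-forward head."""

    def __init__(self, n_features=1, d_model=64, n_heads=4, max_len=168):
        super().__init__()
        self.embed = nn.Linear(n_features, d_model)         # single-layer neural network
        self.attn = RelativeSelfAttention(d_model, n_heads, max_len)
        self.norm1 = nn.LayerNorm(d_model)
        self.ffn = nn.Sequential(nn.Linear(d_model, 4 * d_model),
                                 nn.ReLU(),
                                 nn.Linear(4 * d_model, d_model))
        self.norm2 = nn.LayerNorm(d_model)
        self.head = nn.Linear(d_model, 1)                   # next-step load value

    def forward(self, x):                                   # x: (batch, seq, n_features)
        h = self.embed(x)
        h = self.norm1(h + self.attn(h))
        h = self.norm2(h + self.ffn(h))
        return self.head(h[:, -1])                          # forecast from the last time step


# Example: forecast the next hourly load from one week (168 h) of history.
model = LoadForecastEncoder()
history = torch.randn(8, 168, 1)                            # dummy batch of load series
print(model(history).shape)                                 # torch.Size([8, 1])
```

The sketch uses a single encoder layer and a scalar load feature per time step; a practical model would likely stack several encoder layers and include calendar or weather covariates.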