Load forecasting is a complex, nonlinear task that plays a key role in power system planning, operation, and control. A recent study proposed a deep learning approach called historical data augmentation (HDA), which improves the accuracy of the load forecasting model by dividing the input data into several yearly sub-datasets. When the original data shows large step changes from one year to the next, however, the approach is less effective for long-term forecasting, because it disconnects the time-series information between the end of one yearly sub-dataset and the beginning of the next. As an alternative, this paper proposes the use of 2-year sub-datasets that connect the two ends of the yearly subsets. A correlation analysis is conducted to show how the yearly datasets are correlated with each other. In addition, a Simulink-based program is introduced to simulate the problem, which has the advantage of visualizing the algorithm. To improve model generalization, several inputs are considered, including the load demand profile, weather information, and important categorical data such as weekday and weekend indicators, which are embedded using one-hot encoding. The deep learning methods used in this study are the long short-term memory (LSTM) and gated recurrent unit (GRU) neural networks, which have been increasingly employed in recent years for time-series and sequence problems. To provide theoretical background on these models, a new illustrated description is presented. The proposed method is applied to the Kurdistan regional load demand and compared with classical data-input methods, demonstrating improvements in both model accuracy and training time.
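As an illustration of the data preparation described above, the following minimal Python sketch builds 2-year sub-datasets and one-hot encodes the day type. It is not the paper's Simulink implementation. It assumes that consecutive 2-year windows overlap by one year (years 1 and 2, then years 2 and 3), which the abstract does not state explicitly. The function names, the `(date, load)` record layout, and the default weekend days are hypothetical choices made for this example.

```python
from datetime import date, timedelta


def two_year_windows(years):
    """Pair each year with the next one: (y1, y2), (y2, y3), ...

    Assumption: windows overlap by one year, so every year boundary
    falls inside at least one sub-dataset.
    """
    return [(years[i], years[i + 1]) for i in range(len(years) - 1)]


def one_hot_day_type(d, weekend_days=(5, 6)):
    """Return [is_weekday, is_weekend] for a date.

    Assumption: weekend_days uses Python's numbering (Monday = 0) and
    defaults to Saturday/Sunday; adjust it for the region being modeled.
    """
    return [0, 1] if d.weekday() in weekend_days else [1, 0]


def build_sub_datasets(records, weekend_days=(5, 6)):
    """Split (date, load) records into 2-year sub-datasets.

    Each output row is (load, is_weekday, is_weekend). Records are
    expected to be sorted by date so each sub-dataset stays in time order.
    """
    years = sorted({d.year for d, _ in records})
    subsets = []
    for first, second in two_year_windows(years):
        rows = [
            (load, *one_hot_day_type(d, weekend_days))
            for d, load in records
            if d.year in (first, second)
        ]
        subsets.append(rows)
    return subsets


def make_synthetic_records(start, end):
    """Generate placeholder daily records with a constant load (not real data)."""
    records = []
    current = start
    while current <= end:
        records.append((current, 1.0))
        current += timedelta(days=1)
    return records


if __name__ == "__main__":
    data = make_synthetic_records(date(2015, 1, 1), date(2017, 12, 31))
    for subset in build_sub_datasets(data):
        print(len(subset), subset[0])
```

Each sub-dataset could then be fed to an LSTM or GRU model as one training sequence. Weather inputs, which the paper also uses, are omitted here for brevity.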