Purpose
This paper proposes an approach based on principal component analysis (PCA) that defines a contribution rate for each variable and then selects the main variables as inputs to a neural network for energy load forecasting in southeastern Brazil.
Design/methodology/approach
The proposed approach defines the contribution rate of each variable as a weighted sum of the inner products between the variable and each principal component. The contribution rate is then used to select the most important features from a data set of 27 variables and 6,815 electricity data points as inputs to a multilayer perceptron prediction model trained with backpropagation. Several tests are carried out to predict energy load (GWh), starting with the most significant variable as the only input and adding the next most significant variable at each step. The Kaiser–Meyer–Olkin and Bartlett sphericity tests were used to verify the overall suitability of the data for factor analysis.
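The contribution rate described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the weights are the explained-variance ratios of the principal components, that the inner products are taken in absolute value between each standardized variable and each component's score vector, and that the rates are normalized to sum to one. The function names `contribution_rates` and `select_variables` and the 0.85 threshold parameter are hypothetical, introduced only for illustration.

```python
import numpy as np

def contribution_rates(X):
    """Illustrative contribution rate per variable (assumed definition).

    X is an (n_samples, n_variables) array. Returns one non-negative rate
    per variable, normalized so that the rates sum to 1.
    """
    X = np.asarray(X, dtype=float)
    # Standardize each variable (zero mean, unit sample variance).
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

    # PCA via eigen-decomposition of the correlation matrix.
    corr = np.corrcoef(Xs, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]            # largest variance first
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]

    # Assumption: weight each component by its explained-variance ratio.
    weights = eigvals / eigvals.sum()

    # Component scores, then |<variable j, component k>| for every pair.
    scores = Xs @ eigvecs
    inner = np.abs(Xs.T @ scores)                # shape (p, p)

    raw = inner @ weights                        # weighted sum over components
    return raw / raw.sum()

def select_variables(rates, threshold=0.85):
    """Indices of the top-ranked variables whose cumulative rate reaches threshold."""
    order = np.argsort(rates)[::-1]
    cumulative = np.cumsum(rates[order])
    k = min(int(np.searchsorted(cumulative, threshold)) + 1, len(order))
    return order[:k]
```

Under these assumptions, the ranking returned by `np.argsort(rates)[::-1]` would also give the order in which variables are added one at a time in the incremental forecasting tests.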
Findings
Although energy load forecasting is an area for which databases with tens or hundreds of variables are available, the approach selected only six variables, which together contribute more than 85% to the model. The plant variables, together with energy exchange, account for only 14.14% of the contribution, whereas the stored energy variable alone has a contribution rate of 26.31%, making it fundamental to prediction accuracy.
Originality/value
Besides improving forecasting accuracy and providing a faster predictor, the proposed PCA-based approach for calculating the contribution rate of input variables provides a better understanding of the underlying process that generated the data, which is fundamental in the Brazilian context because of its pronounced climatic and economic variations.