Forecasting building energy consumption remains challenging because the many parameters that influence model performance are difficult to manage. Artificial intelligence (AI) has shown strong capability in forecasting problems and is considered highly effective in this domain. However, accurate prediction requires extracting meaningful historical knowledge from various features. Because exogenous data may affect the accuracy of an energy consumption forecasting model, we propose an approach that studies the importance of the input data and selects optimum time lags, in order to obtain a high-performance machine learning-based model while reducing its complexity. For energy consumption forecasting, we use a multilayer perceptron-based nonlinear autoregressive model with exogenous inputs (NARX), long short-term memory (LSTM), gated recurrent unit (GRU), decision tree, and XGBoost models. The best model performance is achieved by LSTM and GRU, with a root mean square error of 0.23. The Diebold–Mariano test is also applied to compare the prediction accuracy of the models. To measure how strongly the model relies on each input feature, the “model reliance” method is implemented. The proposed approach shows promising results for obtaining a well-performing model. The results are reported and discussed qualitatively.
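
The “model reliance” idea named above can be illustrated with a small permutation-based sketch. This is only one common way to express the concept and is not necessarily the implementation used in this work. The synthetic data, the least-squares stand-in model, and the names `model_reliance` and `predict` are all hypothetical and chosen for illustration. Reliance on a feature is taken here as the ratio of the mean squared error after shuffling that feature to the baseline mean squared error, so values well above 1 suggest the model depends on the feature.

```python
import numpy as np

def model_reliance(predict, X, y, feature, n_repeats=20, seed=0):
    """Ratio of permuted-feature MSE to baseline MSE for one feature column.

    Hypothetical helper: a ratio near 1 suggests the model barely uses the
    feature, while larger ratios suggest stronger reliance.
    """
    rng = np.random.default_rng(seed)
    baseline_mse = np.mean((y - predict(X)) ** 2)
    permuted_mse = []
    for _ in range(n_repeats):
        X_perm = X.copy()
        # Shuffle one column to break its link with the target.
        X_perm[:, feature] = rng.permutation(X_perm[:, feature])
        permuted_mse.append(np.mean((y - predict(X_perm)) ** 2))
    return float(np.mean(permuted_mse) / baseline_mse)

# Synthetic data (assumption): feature 0 matters most, feature 2 not at all.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(500, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + data_rng.normal(scale=0.1, size=500)

# Stand-in model (assumption): ordinary least squares with an intercept.
design = np.column_stack([X, np.ones(len(X))])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)

def predict(Z):
    return np.column_stack([Z, np.ones(len(Z))]) @ coef

scores = [model_reliance(predict, X, y, j) for j in range(X.shape[1])]
print(scores)
```

Any fitted forecaster that exposes a prediction function could replace `predict` here. For time-lagged inputs, each lag column would be treated as a separate feature, which is one way to judge which lags could be dropped to reduce model complexity.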