Urban growth and the need to manage energy resources make accurate short-term load forecasting (STLF) essential in energy management systems, both to improve economic gains and to reduce peak energy usage. Traditional deep learning models for STLF struggle to meet these demands efficiently because they are limited in modeling complex temporal dependencies and in processing large amounts of data. This study presents BiGTA-net, a hybrid deep learning model that integrates a bi-directional gated recurrent unit (Bi-GRU), a temporal convolutional network (TCN), and an attention mechanism. Designed specifically for day-ahead, 24-point multistep-ahead forecasting of building electricity consumption, BiGTA-net is evaluated against a range of neural network architectures and activation functions. On an educational building dataset, it achieves the lowest mean absolute percentage error (MAPE), 5.37%, along with a root mean squared error (RMSE) of 171.3. It also shows flexibility and competitive accuracy on the Appliances Energy Prediction (AEP) dataset. Compared to traditional deep learning models, BiGTA-net improves MAPE by approximately 36.9% on average. These results indicate that the proposed hybrid approach is effective for energy management and load forecasting, with applications in power system optimization and smart city energy management.
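For readers less familiar with the two reported metrics, the following minimal sketch shows how MAPE and RMSE are conventionally computed over one 24-point day-ahead forecast. It uses the standard textbook definitions and is not taken from the BiGTA-net implementation; the function names `mape` and `rmse` and the sample values are illustrative assumptions, not data from the study.

```python
import math


def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Assumes equal-length, non-empty sequences and no zero in `actual`
    (MAPE is undefined when an actual value is zero).
    """
    if not actual or len(actual) != len(forecast):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


def rmse(actual, forecast):
    """Root mean squared error, in the same unit as the inputs."""
    if not actual or len(actual) != len(forecast):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - f) ** 2 for a, f in zip(actual, forecast))
    return math.sqrt(squared / len(actual))


# Hypothetical 24-point day (one value per hour); NOT data from the study.
actual = [100.0 + 10.0 * hour for hour in range(24)]
# Every forecast point is 5% too high, so MAPE is about 5 by construction.
forecast = [value * 1.05 for value in actual]

day_mape = mape(actual, forecast)
day_rmse = rmse(actual, forecast)
print(f"MAPE: {day_mape:.2f}%  RMSE: {day_rmse:.2f}")
```

MAPE is scale-free, which makes it convenient for comparing models across buildings, whereas RMSE keeps the unit of the load series and penalizes large individual errors more heavily; reporting both, as the abstract does, gives complementary views of forecast quality.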