Accuracy requirements for short-term power load forecasting have risen with the rapid development of the electric power industry. However, short-term load is both elastic and unstable, which makes accurate forecasting difficult, and traditional prediction models suffer from inadequate precision and inefficient training. In this work, we propose a model called IWOA-CNN-BiGRU-CBAM. First, because the Squeeze-and-Excitation (SE) attention mechanism cannot effectively collect information in the spatial dimension, we replace it with the Convolutional Block Attention Module (CBAM) to strengthen the model's ability to capture location attributes. Second, we propose an improved Whale Optimization Algorithm (IWOA) that addresses limitations of the standard Whale Optimization Algorithm (WOA), namely its heavy reliance on the initial solution and its susceptibility to local optima. The IWOA is then applied to optimize the hyperparameters of the Convolutional Neural Network–Bidirectional Gated Recurrent Unit–Convolutional Block Attention Module (CNN-BiGRU-CBAM) network to improve prediction precision. Finally, the proposed model is applied to short-term power demand forecasting. The results show that the CBAM effectively addresses the SE attention mechanism's inability to fully capture spatial characteristics, and that the IWOA produces a homogeneously dispersed initial population and identifies the optimal solution effectively. Compared with other models, the proposed model improves R² by 0.00224, reduces the RMSE by 18.5781, and reduces the MAE by 25.8940, which validates its applicability and superiority.
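The three evaluation metrics quoted above, the coefficient of determination (R²), the root mean square error (RMSE), and the mean absolute error (MAE), can be computed with a minimal sketch such as the following. It uses the standard textbook definitions; the function name `evaluate_forecast` and the sample load values are hypothetical and are not taken from the paper's experiments.

```python
import math


def evaluate_forecast(actual, predicted):
    """Return (r2, rmse, mae) for two equal-length sequences of load values."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]

    # MAE: mean of the absolute errors
    mae = sum(abs(e) for e in errors) / n

    # RMSE: square root of the mean squared error
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    # R^2: 1 - (residual sum of squares / total sum of squares)
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    # R^2 is undefined when the actual series is constant
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return r2, rmse, mae


# Hypothetical load values, for illustration only
actual = [100.0, 120.0, 140.0, 160.0]
predicted = [110.0, 120.0, 130.0, 160.0]
r2, rmse, mae = evaluate_forecast(actual, predicted)
```

A higher R² and lower RMSE and MAE indicate a better fit. RMSE and MAE are expressed in the same unit as the load series, so differences in these two metrics are only comparable between models evaluated on the same data.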