Dissolved oxygen concentration (DOC) is an important indicator of water quality, and accurate DOC predictions can provide a scientific basis for water environment management and pollution prevention. This study proposes a hybrid DOC forecasting framework that combines Variational Mode Decomposition (VMD), a convolutional neural network (CNN), a Gated Recurrent Unit (GRU), and the Beluga Whale Optimization (BWO) algorithm. Specifically, the original DOC sequences were first decomposed using VMD. A CNN-GRU combined with an attention mechanism (AM) was then used to extract the key features and local dependencies of the decomposed sequences. Finally, the BWO algorithm was used to solve for the correction coefficients of the proposed system, with the aim of improving prediction accuracy. The study used urban water quality data from China, monitored at 4-h intervals from November 2020 to November 2023. Taking Lianyungang as an example, the VMD-BWO-CNN-GRU-AM model reduced the MSE, RMSE, MAE, and MAPE by 0.2859, 0.3301, 0.2539, and 0.0406, respectively, compared with a GRU. These results indicate that the proposed hybrid model achieves higher precision and lower prediction errors, enabling more accurate DOC forecasts. The proposed DOC forecasting system can support the sustainable monitoring and regulation of water quality, particularly in addressing pollution concerns.
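As a minimal sketch of how the four error metrics reported above are conventionally computed, and of how a reduction relative to a baseline would be obtained, consider the following. The function name `error_metrics`, the sample values, and the choice to express MAPE as a fraction rather than a percentage are illustrative assumptions; none of them are taken from the study.

```python
import math

def error_metrics(observed, predicted):
    """Return MSE, RMSE, MAE, and MAPE for paired observations and predictions."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mae = sum(abs(e) for e in errors) / n
    # MAPE as a fraction (assumption); undefined if any observation is zero.
    mape = sum(abs(e / o) for e, o in zip(errors, observed)) / n
    return {"MSE": mse, "RMSE": rmse, "MAE": mae, "MAPE": mape}

# Hypothetical DOC values in mg/L, for illustration only (not data from the study).
observed = [8.0, 10.0, 5.0, 4.0]
baseline = [7.0, 12.0, 5.0, 5.0]   # stand-in for a baseline model's predictions
hybrid = [8.0, 11.0, 5.0, 4.0]     # stand-in for a hybrid model's predictions

baseline_scores = error_metrics(observed, baseline)
hybrid_scores = error_metrics(observed, hybrid)

# Reduction = baseline error minus hybrid error; positive means the hybrid is better.
reductions = {name: baseline_scores[name] - hybrid_scores[name]
              for name in baseline_scores}
```

Because MSE and RMSE are on different scales (squared versus original units), their reductions are not directly comparable with each other; each should be read against its own baseline value.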