Global climate change and rapid urbanization are increasing the risk of urban flooding and drought, intensifying the need for more accurate streamflow forecasting methods. This study introduces an approach that combines the available network of real-time monitoring stations with advanced machine learning algorithms capable of modeling spatio-temporal problems. The Spatio-Temporal Attention Gated Recurrent Unit (STA-GRU) model is computationally effective at forecasting streamflow events over a forecast horizon of 7 days. The novel integration of groundwater level, precipitation, and river discharge as predictive variables offers a holistic view of the hydrological cycle, enhancing the model’s accuracy. Our findings reveal that, for a 7-day forecasting period, the STA-GRU model demonstrates superior performance, with improved mean absolute percentage error (MAPE) and R-squared (R²) values alongside reductions in the root mean squared error (RMSE) and mean absolute error (MAE), underscoring the model’s generalizability and reliability. A comparative analysis of seven deep learning models, namely the Long Short-Term Memory (LSTM), the Convolutional Neural Network LSTM (CNNLSTM), the Convolutional LSTM (ConvLSTM), the Spatio-Temporal Attention LSTM (STA-LSTM), the Gated Recurrent Unit (GRU), the Convolutional Neural Network GRU (CNNGRU), and the STA-GRU, confirms the superior predictive power of the STA-LSTM and STA-GRU models for long-term prediction. This research marks a significant shift towards integrating networks of real-time monitoring stations with advanced deep-learning algorithms for streamflow forecasting, emphasizing the importance of capturing the spatial and temporal variability of streamflow within an urban watershed’s stream network.
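For readers less familiar with the four evaluation metrics named above, the following is a minimal sketch of their standard textbook definitions in plain Python. It is not the study's evaluation code: the function names and the sample discharge values are hypothetical and chosen only for illustration, and the study may compute these metrics differently (for example, per station or per forecast day).

```python
import math

def mae(observed, predicted):
    """Mean absolute error, in the units of the data."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def rmse(observed, predicted):
    """Root mean squared error; penalizes large errors more than MAE."""
    return math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )

def mape(observed, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when an observed value is zero, so this sketch raises instead
    of silently skipping such points.
    """
    if any(o == 0 for o in observed):
        raise ValueError("MAPE is undefined when an observed value is zero")
    return 100.0 * sum(
        abs((o - p) / o) for o, p in zip(observed, predicted)
    ) / len(observed)

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical discharge values (illustrative only, not data from the study).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

for name, fn in [("MAE", mae), ("RMSE", rmse), ("MAPE", mape), ("R2", r_squared)]:
    print(f"{name}: {fn(observed, predicted):.3f}")
```

Lower MAE, RMSE, and MAPE indicate a better fit, while R² closer to 1 indicates that the forecasts explain more of the observed variance.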