Floods pose a serious threat to human life and property. Alongside the strengthening of flood control engineering measures and in keeping with the strategy of sustainable development, flood forecasting has advanced considerably in recent years. However, because traditional lumped and distributed models are complex, calibrating their hydrologic parameters is difficult, which leads to a long development cycle for a reasonable hydrologic prediction model. Even modern data-driven models do not fully exploit the spatial distribution characteristics of rainfall data. To address this, this paper abstracts the rainfall data into graph-structured data, uses remote sensing images to extract elevation information, introduces a graph attention mechanism to extract the spatial characteristics of rainfall, and employs a long short-term memory (LSTM) network to fuse the spatial and temporal characteristics for flood prediction. Experiments verify the forecasting performance for flood peak value and flood arrival time. Furthermore, a comparison with LSTM and BiGRU models that lack spatial feature extraction highlights the advantages of spatiotemporal feature fusion: the RMSE (root mean square error) and R² (coefficient of determination) of the proposed GA-RNN model are significantly improved. Finally, we conduct experiments on ten observed historical rainfall events in the target watershed. According to the hydrological prediction specifications, the model can be rated as a Class B flood forecasting model.
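
For readers unfamiliar with the two evaluation metrics named above, the following is a minimal sketch of how RMSE and the coefficient of determination are commonly computed. It assumes the standard definition of the coefficient of determination, 1 - SS_res / SS_tot, which may differ from the exact formula used in the paper. The function names and the sample discharge values are hypothetical and chosen only for illustration.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted series."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("series must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(observed))


def r2(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("series must be non-empty and of equal length")
    mean_observed = sum(observed) / len(observed)
    # Residual sum of squares: error left over after the model's prediction.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed series is constant; R2 is undefined")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical discharge values, not data from the paper.
    observed_flow = [120.0, 340.0, 560.0, 410.0, 230.0]
    predicted_flow = [130.0, 320.0, 540.0, 430.0, 220.0]
    print("RMSE:", rmse(observed_flow, predicted_flow))
    print("R2:", r2(observed_flow, predicted_flow))
```

A lower RMSE and an R² closer to 1 both indicate that the predicted series tracks the observed one more closely, which is the sense in which the comparison with the baseline models is made.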