Precipitation forecasting is a central task in meteorological prediction, and accurate forecasts support sectors such as transportation, agriculture, and tourism. In recent years, deep learning-based radar echo extrapolation has been applied effectively to precipitation forecasting. However, existing methods remain limited in their ability to extract and represent complex spatiotemporal features from radar echo images, which leads to suboptimal forecasting accuracy. This paper proposes a novel extrapolation algorithm based on a dual-branch encoder–decoder and a spatiotemporal gated recurrent unit. In this model, the dual-branch encoder–decoder structure encodes radar echo images separately in the temporal and spatial domains, thereby avoiding interference between temporal and spatial information. Additionally, we introduce a Multi-Scale Channel Attention Module (MSCAM) that learns global and local feature information at each encoder layer, thereby sharpening the model's focus on fine details in radar images. Furthermore, we propose a Spatiotemporal Attention Gated Recurrent Unit (STAGRU) that integrates attention mechanisms to model the temporal evolution and spatial relationships within radar data, enabling the extraction of spatiotemporal information from a broader receptive field. Experiments on real radar datasets show that the model accurately predicts the morphological changes and motion trajectories of radar echoes and outperforms existing models across multiple evaluation metrics. This study improves the accuracy of precipitation forecasting from radar echo images and provides technical support for short-range precipitation forecasting, with good application prospects.
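To make the idea of combining global and local channel information concrete, the following NumPy sketch shows one common way to build a multi-scale channel attention block. It is an illustration under stated assumptions, not the MSCAM implementation used in this paper: the two-branch layout (a globally pooled branch plus a per-pixel branch, summed and passed through a sigmoid), the function names, the reduction ratio, and the random weights are all hypothetical choices made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pointwise_bottleneck(feat, w_down, w_up):
    """Two 1x1 convolutions over the channel axis with a ReLU in between.

    feat: (C, H, W), w_down: (C_mid, C), w_up: (C, C_mid).
    """
    hidden = np.maximum(np.einsum("mc,chw->mhw", w_down, feat), 0.0)
    return np.einsum("cm,mhw->chw", w_up, hidden)

def multi_scale_channel_attention(x, params):
    """Reweight a feature map x of shape (C, H, W) (assumed layout)."""
    # Global branch: spatial average gives one descriptor per channel.
    global_ctx = x.mean(axis=(1, 2), keepdims=True)           # (C, 1, 1)
    g = pointwise_bottleneck(global_ctx, params["g_down"], params["g_up"])
    # Local branch: keeps the spatial resolution, so weights vary per pixel.
    l = pointwise_bottleneck(x, params["l_down"], params["l_up"])
    # Broadcasting adds the (C, 1, 1) global term to every spatial position.
    weights = sigmoid(g + l)                                   # (C, H, W)
    return x * weights, weights

def init_params(channels, reduction, rng):
    """Hypothetical random weights; a trained model would learn these."""
    mid = max(channels // reduction, 1)
    def w(rows, cols):
        return 0.1 * rng.standard_normal((rows, cols))
    return {
        "g_down": w(mid, channels), "g_up": w(channels, mid),
        "l_down": w(mid, channels), "l_up": w(channels, mid),
    }

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16, 16))       # stand-in for an encoder feature map
params = init_params(channels=8, reduction=4, rng=rng)
out, weights = multi_scale_channel_attention(x, params)
```

Because the attention weights lie strictly between 0 and 1, the block can only attenuate features, never amplify them; in this sketch the global branch sets a per-channel baseline and the local branch adjusts it at each pixel.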