Arctic sea ice concentration plays a key role in the global ecosystem. However, accurately predicting it remains challenging because of its inherent nonlinearity and complex spatiotemporal correlations. To address these challenges, we propose an encoder–decoder pyramid dilated convolutional long short-term memory network (PDED-ConvLSTM). The model builds on the convolutional long short-term memory network (ConvLSTM) and, for the first time, integrates the encoder–decoder architecture of ConvLSTM (ED-ConvLSTM) with a pyramidal dilated convolution strategy. This design aims to capture the spatiotemporal properties of sea ice concentration efficiently and to better identify its nonlinear relationships. By applying convolutional layers with different dilation rates, the PDED-ConvLSTM model captures spatial features at multiple scales and enlarges the receptive field without losing resolution. Furthermore, the pyramid convolution module strengthens the model's ability to represent complex spatiotemporal relationships, yielding notable improvements in prediction accuracy and generalization ability. The experimental results show that the sea ice concentration distribution predicted by the PDED-ConvLSTM model agrees closely with ground-based observations, with the residuals between predictions and observations remaining between −20% and 20%. PDED-ConvLSTM outperforms the other models in prediction performance, reducing the RMSE by 3.6% compared to the traditional ConvLSTM model, and it also performs well over a five-month prediction period. These experiments demonstrate the potential of PDED-ConvLSTM for predicting Arctic sea ice concentration, making it a viable tool for accurate prediction and for providing technical support for safe and efficient operations in the Arctic region.
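
As a rough illustration of how convolutions with different dilation rates enlarge the receptive field without reducing resolution, the following minimal sketch applies one 3×3 kernel at several dilation rates to a small grid, using plain Python only. The dilation rates (1, 2, 4), the kernel size, and all function names are assumptions made for illustration and are not taken from the model described above. The ConvLSTM gating and the encoder–decoder structure are omitted.

```python
def effective_kernel(k, d):
    """Spatial extent covered by a k-tap kernel with dilation d."""
    return d * (k - 1) + 1


def dilated_conv2d(image, kernel, dilation):
    """Zero-padded ('same') 2D dilated cross-correlation on nested lists."""
    h, w = len(image), len(image[0])
    k = len(kernel)                 # square, odd-sized kernel assumed
    pad = dilation * (k - 1) // 2   # padding that keeps the output h x w
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for a in range(k):
                for b in range(k):
                    # Kernel taps are spaced `dilation` cells apart.
                    y = i + a * dilation - pad
                    x = j + b * dilation - pad
                    if 0 <= y < h and 0 <= x < w:
                        acc += kernel[a][b] * image[y][x]
            out[i][j] = acc
    return out


def pyramid_features(image, kernel, rates=(1, 2, 4)):
    """One feature map per dilation rate (rates are illustrative)."""
    return [dilated_conv2d(image, kernel, d) for d in rates]


# A 9 x 9 grid with a single non-zero cell at the centre.
grid = [[0.0] * 9 for _ in range(9)]
grid[4][4] = 1.0
box = [[1.0] * 3 for _ in range(3)]

branches = pyramid_features(grid, box)
extents = [effective_kernel(3, d) for d in (1, 2, 4)]
```

Every branch keeps the 9 × 9 input size, while the extent covered by the same nine kernel weights grows from 3 to 5 to 9 cells. In a full model, the branch outputs would typically be concatenated or summed before being passed on to the recurrent cell.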