Traffic speed prediction is an indispensable element of intelligent transportation systems, and numerous studies have been devoted to high-precision prediction models. However, most existing methods take either link-wise or network-wide input. The former is time-consuming, especially for large-scale applications, while the latter may underfit owing to the heterogeneous traffic states within the network. Herein, we propose a novel prediction scheme based on spatiotemporal traffic pattern clustering. First, road segments are partitioned into several groups by the developed clustering approach, which considers both the observed data sequences and the spatial topology of the network. Subsequently, a sequence-to-sequence learning architecture is applied to each group, and the group-level outputs together yield predictions for the entire traffic network. Validated on a real-world dataset from Beijing, the proposed scheme significantly improves on well-known benchmarks across various prediction intervals in terms of both prediction accuracy and computational efficiency. This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
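To make the grouping step concrete, the following is a minimal sketch of one way road segments could be clustered using both their observed speed sequences and the road topology. It is not the clustering approach developed in this work, whose details are not given in the abstract. The function names `profile_distance` and `cluster_segments`, the mean-absolute-difference distance, the greedy adjacency-constrained merging, and the toy data are all assumptions made for illustration.

```python
from itertools import combinations


def profile_distance(a, b):
    """Mean absolute difference between two equal-length speed sequences.

    Assumption: a simple stand-in for whatever sequence similarity
    measure the actual clustering approach uses.
    """
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cluster_segments(speeds, adjacency, n_groups):
    """Greedy agglomerative clustering restricted to adjacent groups.

    speeds:    dict mapping segment id -> list of observed speeds
    adjacency: set of frozenset({u, v}) for directly connected segments
    n_groups:  target number of groups (hypothetical parameter)
    """
    groups = {seg: {seg} for seg in speeds}              # group id -> members
    centroid = {seg: list(seq) for seg, seq in speeds.items()}

    while len(groups) > n_groups:
        best = None
        for g, h in combinations(sorted(groups), 2):
            # Spatial constraint: only groups that touch in the road
            # graph are candidates for merging.
            touching = any(
                frozenset((u, v)) in adjacency
                for u in groups[g]
                for v in groups[h]
            )
            if not touching:
                continue
            d = profile_distance(centroid[g], centroid[h])
            if best is None or d < best[0]:
                best = (d, g, h)
        if best is None:
            break  # no adjacent groups left to merge

        _, g, h = best
        ng, nh = len(groups[g]), len(groups[h])
        # The merged centroid is the size-weighted mean of both profiles.
        centroid[g] = [
            (ng * x + nh * y) / (ng + nh)
            for x, y in zip(centroid[g], centroid[h])
        ]
        groups[g] |= groups.pop(h)
        del centroid[h]

    return sorted(sorted(members) for members in groups.values())


# Toy example: four segments in a chain A-B-C-D (made-up speeds in km/h).
speeds = {
    "A": [60, 58, 61],
    "B": [59, 57, 60],
    "C": [20, 22, 19],
    "D": [21, 23, 20],
}
adjacency = {frozenset(p) for p in [("A", "B"), ("B", "C"), ("C", "D")]}

groups = cluster_segments(speeds, adjacency, n_groups=2)
print(groups)  # [['A', 'B'], ['C', 'D']]
```

In a pipeline of the kind the abstract describes, each resulting group would then be given its own sequence-to-sequence model, so that every model is trained on segments with similar traffic states. That prediction stage is not sketched here.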