Short‐term traffic flow prediction on a large‐scale road network is challenging because of complex spatial–temporal dependencies, the directed network topology, and high computational cost. To address these challenges, this article develops a graph deep learning framework that predicts large‐scale network traffic flow with high accuracy and efficiency. Specifically, we model the dynamics of traffic flow on a road network as an irreducible and aperiodic Markov chain on a directed graph. Based on this representation, a novel spatial–temporal graph inception residual network (STGI‐ResNet) is developed for network‐based traffic prediction. The model integrates multiple spatial–temporal graph convolution (STGC) operators, residual learning, and the inception structure. The proposed STGC operators adaptively extract spatial–temporal features from multiple traffic periodicities while preserving the topology information of the road network. STGI‐ResNet inherits the advantages of residual learning and the inception structure to improve prediction accuracy, accelerate model training, and reduce the effort required for parameter tuning. The computational complexity is linear in the number of road links, which enables citywide short‐term traffic prediction. Experiments using a car‐hailing traffic data set at 10‐, 30‐, and 60‐min intervals for a large road network in a Chinese city show that the proposed model outperforms various state‐of‐the‐art baselines for short‐term network traffic flow prediction.
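
To make the Markov chain representation concrete, the following minimal sketch builds a random-walk transition matrix on a small directed graph and computes its stationary distribution by power iteration. It is an illustration under stated assumptions, not the construction used in this article: the 4-link network, the helper names (`row_normalize`, `add_teleport`, `stationary`), the teleport probability `alpha`, and the use of teleportation to guarantee irreducibility and aperiodicity are all hypothetical choices made for the example.

```python
def row_normalize(adj):
    """Turn a non-negative adjacency matrix into a row-stochastic matrix."""
    n = len(adj)
    P = []
    for row in adj:
        total = sum(row)
        if total == 0:
            # Dead-end link: assume a uniform jump to any link (illustrative).
            P.append([1.0 / n] * n)
        else:
            P.append([w / total for w in row])
    return P


def add_teleport(P, alpha=0.05):
    """Mix in a uniform jump so every entry is positive.

    A strictly positive transition matrix is irreducible and aperiodic.
    This is an assumed device for the sketch, not the article's method.
    """
    n = len(P)
    return [[(1 - alpha) * p + alpha / n for p in row] for row in P]


def stationary(P, tol=1e-12, max_iter=10000):
    """Power iteration for the stationary distribution pi, with pi = pi P."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


# Hypothetical directed network of 4 road links:
# 0 -> 1 -> 2 -> 0, plus 2 -> 3 -> 0.
adj = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 0],
]

P = add_teleport(row_normalize(adj), alpha=0.05)
pi = stationary(P)
print([round(p, 4) for p in pi])
```

Because the chain is irreducible and aperiodic, the stationary distribution exists, is unique, and is strictly positive, which is the property that allows a well-defined graph operator on a directed network where the adjacency matrix is not symmetric.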