Recent years have witnessed a drastic increase in the number of urban metro passengers, which inevitably causes overcrowding in the metro systems of many cities. Accurate prediction of passenger flows at metro stations is critical for a variety of metro system management operations, such as line scheduling and staff preallocation, that help alleviate such overcrowding. Thus, in this paper, we aim to address the problem of accurately predicting metro station passenger (MSP) flows. Similar to other traffic data, such as road traffic volume and highway speed, MSP flows are spatial-temporal in nature. However, existing methods for other traffic prediction tasks are usually suboptimal for predicting MSP flows because of the unique spatial-temporal characteristics of these flows. We therefore propose a novel deep learning framework, STP-TrellisNets, which for the first time augments the recently proposed temporal convolutional framework TrellisNet for spatial-temporal prediction. The temporal module of STP-TrellisNets (named CP-TrellisNets) employs two TrellisNets in series to jointly capture the short- and long-term temporal correlations of MSP flows. In parallel to CP-TrellisNets, the spatial module (named GC-TrellisNet) adopts a novel transfer flow-based metric to characterize the spatial correlation among MSP flows, and implements multiple diffusion graph convolutional networks (DGCNs) in time-series order, with their outputs connected to a TrellisNet, to capture the dynamics of this spatial correlation. GC-TrellisNet thus integrates TrellisNet with graph convolution and empowers TrellisNet with the ability to capture dynamic graph-structured correlation. We conduct extensive experiments with two large-scale real-world automated fare collection datasets, which contain about 1.5 billion records from Shenzhen, China and 70 million records from Hangzhou, China, respectively.
The experimental results demonstrate that STP-TrellisNets outperforms the state-of-the-art baselines.

CCS CONCEPTS
• Information systems → Spatial-temporal systems; Data mining; • Computing methodologies → Neural networks.
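
As a rough illustration of the diffusion graph convolution on which the spatial module builds, the following minimal sketch propagates station-level passenger counts over a graph whose edge weights are transfer-flow counts. It is not the authors' implementation: the abstract does not give the exact transfer flow-based metric or the DGCN formulation, so the row normalization, the single diffusion direction, the function names, the weights `thetas`, and the toy numbers are all assumptions made for illustration.

```python
# Illustrative sketch only: names, weights, and data are assumptions,
# not the formulation used in STP-TrellisNets.

def row_normalize(flow):
    """Turn a matrix of transfer-flow counts into a random-walk transition matrix."""
    normalized = []
    for row in flow:
        total = sum(row)
        # A station with no outgoing transfers keeps an all-zero row.
        normalized.append([v / total if total > 0 else 0.0 for v in row])
    return normalized


def matvec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def diffusion_conv(flow, signal, thetas):
    """K-step diffusion convolution: sum over k of thetas[k] * P^k @ signal."""
    transition = row_normalize(flow)
    diffused = list(signal)  # P^0 @ signal
    output = [0.0] * len(signal)
    for theta in thetas:
        output = [o + theta * d for o, d in zip(output, diffused)]
        diffused = matvec(transition, diffused)  # advance one diffusion step
    return output


# Hypothetical transfer counts between three stations in one time slot.
transfer_flow = [
    [0, 30, 10],
    [20, 0, 20],
    [5, 15, 0],
]
passenger_flow = [100.0, 200.0, 50.0]
smoothed = diffusion_conv(transfer_flow, passenger_flow, thetas=[0.5, 0.3, 0.2])
```

In a dynamic setting such as the one described above, one such convolution would be evaluated per time slot with that slot's transfer-flow matrix, and the resulting sequence of outputs would then be passed to a temporal model.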