Short-term passenger flow forecasting is a crucial task in the operation of urban rail transit, and emerging deep-learning technologies have become effective methods for addressing it. In this study, we propose a deep-learning architecture called Conv-GCN that combines a graph convolutional network (GCN) with a 3D convolutional neural network (3D CNN). First, we introduce a multi-graph GCN that handles three patterns (recent, daily, and weekly) of inflow and outflow separately. The multi-graph GCN captures spatiotemporal correlations and topological information across the whole network. Then, a 3D CNN is applied to deeply integrate the inflow and outflow information; it extracts high-level spatiotemporal features between different patterns of inflow and outflow, and between nearby and distant stations. Finally, a fully connected layer outputs the results. The Conv-GCN model is evaluated on smart card data from the Beijing subway at time intervals of 10 min, 15 min, and 30 min. Results show that the model outperforms seven other related models. In terms of RMSE, performance at the three time intervals improves by 9.402%, 7.756%, and 9.256%, respectively. This study can provide critical insights for subway operators seeking to optimize operations.
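As a rough illustration of the per-pattern graph convolution described above, the following minimal sketch applies one graph convolution to each of the recent, daily, and weekly inputs and stacks the results so that a later 3D convolution could mix them. It is not the authors' implementation: the abstract does not specify the propagation rule, so the sketch assumes the commonly used symmetric normalization with self-loops, and the three-station topology, array shapes, random inputs, and function names (`normalize_adjacency`, `gcn_layer`) are all hypothetical.

```python
import numpy as np


def normalize_adjacency(adj: np.ndarray) -> np.ndarray:
    """Return D^{-1/2} (A + I) D^{-1/2} (assumed normalization, with self-loops)."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(adj_norm: np.ndarray, features: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """One graph convolution: ReLU(A_norm @ X @ W)."""
    return np.maximum(adj_norm @ features @ weights, 0.0)


# Hypothetical toy network: three stations on a line, 0 - 1 - 2.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
adj_norm = normalize_adjacency(adj)

rng = np.random.default_rng(0)
# Hypothetical inputs: 3 stations x 4 past time steps for each pattern.
patterns = {name: rng.random((3, 4)) for name in ("recent", "daily", "weekly")}
# Separate (randomly initialized, untrained) weights per pattern: 4 -> 2 features.
weights = {name: rng.standard_normal((4, 2)) for name in patterns}

# Convolve each pattern separately, then stack along a new leading axis.
stacked = np.stack([gcn_layer(adj_norm, patterns[n], weights[n]) for n in patterns])
print(stacked.shape)  # (3, 3, 2): patterns x stations x features
```

In a full model the stacked inflow and outflow tensors would be passed to a 3D convolution and a fully connected layer, and the weights would be learned rather than sampled at random.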