Accurate forecasting of future transfer passenger flow from historical data is essential for helping travelers adjust their trips, allocating resources optimally, and alleviating traffic congestion. However, existing studies have mainly focused on predicting traffic parameters for a single transport mode, and little research has addressed transfer passenger flow, which is influenced by multiple factors across different transport modes. Additionally, effective traffic prediction relies on high-quality traffic data, yet data loss is inevitable and often ignored. To fill these gaps, we present, for the first time, a reliable joint deep learning model that combines long short-term memory with matrix factorization (Joint-IF) for accurate imputation and forecasting of transfer passenger flow between metro and bus. This hybrid Joint-IF model uses a repair-before-prediction strategy, imputing missing values before forecasting, to deliver high-quality final outputs. In particular, we simulate a variety of missing-data patterns that arise under natural conditions and apply low-rank matrix factorization to infer the lost values. In addition, we investigate the effects of crucial parameters and spatiotemporal features on transfer flow prediction. To validate the effectiveness of Joint-IF, we carry out extensive experiments on a real-world transfer passenger flow dataset from the Shenzhen public transport system. The results show that Joint-IF outperforms the baseline models in both imputation and forecasting of transfer passenger flow in terms of accuracy and stability.
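
To make the imputation step concrete, the following is a minimal sketch of low-rank matrix factorization applied to a partially observed matrix. It is not the Joint-IF implementation: the solver (alternating least squares with ridge regularization), the function name `impute_low_rank`, its parameters, the matrix layout (rows and columns standing in for, e.g., stations and time intervals), and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def impute_low_rank(X, mask, rank=2, reg=0.01, n_iters=100, seed=0):
    """Fill unobserved entries of X with a rank-`rank` approximation.

    X    : 2-D array; entries where mask is False may be NaN.
    mask : boolean array of the same shape, True where X is observed.
    Hypothetical helper: fits X ~ U @ V.T on observed entries only,
    using alternating least squares with ridge penalty `reg`.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = X.shape
    U = rng.normal(scale=0.1, size=(n_rows, rank))
    V = rng.normal(scale=0.1, size=(n_cols, rank))
    ridge = reg * np.eye(rank)

    for _ in range(n_iters):
        # Update each row factor from the observed entries in that row.
        for i in range(n_rows):
            obs = mask[i]
            if obs.any():
                Vo = V[obs]
                U[i] = np.linalg.solve(Vo.T @ Vo + ridge, Vo.T @ X[i, obs])
        # Update each column factor from the observed entries in that column.
        for j in range(n_cols):
            obs = mask[:, j]
            if obs.any():
                Uo = U[obs]
                V[j] = np.linalg.solve(Uo.T @ Uo + ridge, Uo.T @ X[obs, j])

    # Keep observed values untouched; use the low-rank estimate elsewhere.
    return np.where(mask, X, U @ V.T)


# Synthetic demonstration (not real passenger flow data).
rng = np.random.default_rng(42)
X_true = rng.random((30, 2)) @ rng.random((20, 2)).T  # exactly rank 2
mask = rng.random(X_true.shape) > 0.2                 # about 20% missing at random
X_obs = np.where(mask, X_true, np.nan)

X_filled = impute_low_rank(X_obs, mask, rank=2)
rmse = np.sqrt(np.mean((X_filled[~mask] - X_true[~mask]) ** 2))
print(f"RMSE on imputed entries: {rmse:.4f}")
```

In a repair-before-prediction pipeline, a matrix repaired in this way would then be passed to the forecasting model. The random missing pattern used here is only one of the patterns that could be simulated.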