Short-term passenger flow forecasting is of great significance for the safety and efficiency of urban rail transit systems. Existing forecasting methods mostly focus on traffic data and give little consideration to external factors available from multi-source data, such as meteorological data and point-of-interest data, which may exert a strong influence on passenger flow. This paper proposes a new type of LSTM with two attention mechanisms, named 2A-LSTM, which operates on multivariate time series to improve forecasting accuracy. The multivariate time series are constructed from the multi-source data by means of data fusion. These time series exhibit strong autocorrelation, periodicity and predictability, which are key to ensuring the prediction accuracy of 2A-LSTM. The 2A-LSTM uses temporal pattern attention and a soft attention mechanism to capture the correlation of each external factor across previous time steps. Based on Shanghai Metro traffic card data, we perform experiments that measure and compare the effect of external features on the accuracy of passenger flow forecasting by using different feature combinations. We input the time series data of a single station into our model and forecast the inbound passenger flow on weekdays. The experimental results show that accuracy is improved by using external features, and that our model performs well on multivariate time series forecasting.
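
As an illustration of the soft attention idea mentioned above, the following minimal sketch weights the hidden states of previous time steps by their relevance to a query vector and sums them into a context vector. It is not the 2A-LSTM implementation: the dot-product scoring function and the names `soft_attention`, `hidden_states` and `query` are assumptions made for this example, since the scoring function used by the model is not specified here.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention(hidden_states, query):
    """Soft attention over previous time steps (illustrative sketch).

    hidden_states: one vector per previous time step (hypothetical LSTM outputs).
    query: vector of the same dimension, e.g. the current hidden state.
    Dot-product scoring is an assumption, not necessarily the paper's choice.
    """
    # Relevance score of each previous time step with respect to the query.
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    # Context vector: attention-weighted sum of the hidden states.
    dim = len(query)
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return weights, context


# Toy example with three previous time steps and two-dimensional hidden states.
weights, context = soft_attention([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [1.0, 1.0])
```

In this toy example the third time step has the largest dot product with the query, so it receives the largest weight and contributes most to the context vector.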