Recently, with the increasing scale of the volume of freight transport and the number of passengers, the study of railway vehicle fault diagnosis and condition management is becoming more significant than ever. The axle temperature plays a significant role in the locomotive operating condition assessment that sudden temperature changes may lead to potential accidents. To realize accurate real-time condition monitoring and fault diagnosis, a new multi-data-driven model based on reinforcement learning and deep learning is proposed in this paper. The whole modeling process contains three steps: In step 1, the feature crossing and reinforcement learning methods are applied to select the suitable features that could efficiently shorten the redundancy of the input. In step 2, the stack denoising autoencoder is employed to extract deep fluctuation information in the features after the reinforcement learning. In step 3, the bidirectional gated recurrent unit algorithm is utilized to accomplish the forecasting model and achieve the final results. These parts of the integrated modeling structure contributed to increased forecasting accuracy than single models. By analyzing the forecasting results of three different data series, it could be summarized that: (1) The proposed two-stage feature selection method and feature extraction method could greatly optimize the input for the predictor and form the optimal axle temperature forecasting model. (2) The proposed hybrid model can achieve satisfactory forecasting results which are better than the contrast algorithms proposed by other researchers.