Methods such as deep learning improve the efficiency of bearing fault diagnosis and reduce train operation and maintenance costs. In practical applications, however, scarce historical data and imbalance across data classes often limit diagnostic effectiveness. Variability between operating conditions also restricts the applicability of transfer learning, including domain adaptation. To address these challenges, a digital twin (DT) framework is established to supplement the data available for train fault diagnosis. Within the DT framework, a train bearing dynamics model is optimized through virtual-reality mapping, with measured healthy-state data as the baseline, so that the generated data are closer to reality. Fault diagnosis then uses a hybrid dataset that mixes measured and simulated data as the source domain for transfer learning. With the Case Western Reserve University dataset as an example, the accuracy reaches up to 99.40%, which verifies the method’s effectiveness.