Rolling bearing fault diagnosis is of great importance in industrial production and everyday life. However, existing research still faces several challenges. For instance, source domain data for rolling bearing fault diagnosis usually come from laboratory experiments, because labeled real-world fault data are difficult to acquire for transfer learning. In addition, the training strategies of domain adaptation networks remain underdeveloped and do not fully exploit the advantages of different loss functions. To address these issues, this paper proposes a rolling bearing fault diagnosis method that combines a dynamic simulation model, used as the source domain, with improved alternating transfer learning (IATL) to the target domain. The model considers how the real-time position of a rolling element influences the radial displacement excitation function as the element enters the defect region, and it accounts for the effect of factors such as defect size and bearing speed on the impact force produced when a rolling element strikes the defect edge. The dynamic equations of the rolling bearing are modified accordingly to construct a dynamic simulation model of rolling bearing fault states, which yields a source domain dataset with rich fault label information. To exploit the high image recognition rate of convolutional neural networks (CNNs) and to speed up model training, the time-domain waveforms of the vibration signals are converted directly into grayscale images that serve as inputs to the neural network. An improved alternating transfer learning approach is proposed to enhance both the loss function and the training method used for transfer learning. It alternately computes loss functions in different layers to reduce the distance between domains and alternately updates the network parameters, thereby combining the complementary advantages of different loss functions. To validate the effectiveness of the proposed method, the Case Western Reserve University (CWRU) bearing dataset is used as the target domain dataset.
Three verification experiments are conducted on transfer from the simulation domain to the target domain: one with the same bearing model, one across bearing models, and one with a small-sample dataset. The results indicate that, compared with algorithms that compute only the correlation alignment (CORAL) loss or only the maximum mean discrepancy (MMD) loss, the proposed algorithm more effectively reduces the difference in feature distribution between the domains and achieves higher fault classification accuracy.
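
To make the two baseline loss functions concrete, the following is a minimal sketch, in dependency-free Python, of the standard CORAL loss (squared Frobenius distance between source and target feature covariances, scaled by 1/(4d²)) and a biased estimate of squared MMD with a Gaussian kernel. The function names, the kernel bandwidth `sigma`, and the step-parity schedule in `alternating_loss` are illustrative assumptions; they are not taken from the paper and do not reproduce the layer-wise alternation or parameter updates of the proposed IATL method.

```python
import math


def covariance(X):
    """Unbiased covariance matrix of X (n samples as rows, d features)."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    return [
        [
            sum((row[i] - means[i]) * (row[j] - means[j]) for row in X) / (n - 1)
            for j in range(d)
        ]
        for i in range(d)
    ]


def coral_loss(Xs, Xt):
    """CORAL: squared Frobenius distance between covariances, over 4*d^2."""
    d = len(Xs[0])
    Cs, Ct = covariance(Xs), covariance(Xt)
    frob2 = sum((Cs[i][j] - Ct[i][j]) ** 2 for i in range(d) for j in range(d))
    return frob2 / (4 * d * d)


def gaussian_kernel(x, y, sigma):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2 * sigma ** 2))


def mmd_loss(Xs, Xt, sigma=1.0):
    """Biased estimate of squared MMD; sigma is an assumed bandwidth."""

    def mean_kernel(A, B):
        total = sum(gaussian_kernel(a, b, sigma) for a in A for b in B)
        return total / (len(A) * len(B))

    return mean_kernel(Xs, Xs) + mean_kernel(Xt, Xt) - 2 * mean_kernel(Xs, Xt)


def alternating_loss(step, Xs, Xt):
    """Hypothetical schedule: CORAL on even steps, MMD on odd steps."""
    return coral_loss(Xs, Xt) if step % 2 == 0 else mmd_loss(Xs, Xt)


# Toy features: the target is the source shifted by a constant offset.
source = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
shifted = [[x + 3.0, y + 3.0] for x, y in source]
scaled = [[2.0 * x, 2.0 * y] for x, y in source]
```

On the toy data, a pure mean shift leaves the covariance unchanged, so CORAL does not register it while MMD does; a rescaling changes the covariance and is picked up by CORAL. This is one simple way to see why the two losses can be complementary.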