Deep learning-based methods have been widely used for rotating machinery fault diagnosis. However, they often perform poorly under variable working conditions because of the severe shift in data distribution. Therefore, we first develop an improved convolutional neural network consisting of a multi-scale convolutional (MSC) layer, a channel attention (CA) layer, and an inception network structure (INS). Compared with other models, our model has stronger feature extraction ability, fewer parameters, and lower training cost. Subsequently, based on transfer learning (TL), we propose the MSC-CA-INS-TL method. To improve the model’s generalization ability, we propose a fine-tuning strategy tailored to the model that attends to accuracy in both the source and target domains during transfer. The proposed method is verified on bearing datasets and gear experimental platforms, achieving high fault diagnosis accuracy and stability under variable working conditions and with small samples.
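As a rough illustration of how a multi-scale convolutional layer can feed a channel attention layer, the following minimal sketch uses only the Python standard library. It is not the architecture proposed here: the kernel sizes, the fixed averaging kernels, the parameter-free sigmoid gate, and the function names `conv1d_same`, `multi_scale_conv`, and `channel_attention` are all assumptions chosen for illustration. A practical channel attention layer would normally learn its gating weights.

```python
import math


def conv1d_same(signal, kernel):
    """1D cross-correlation with zero padding; output length equals input length.

    Assumes an odd kernel length so the padding is symmetric.
    """
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(padded[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal))
    ]


def multi_scale_conv(signal, kernels):
    """Apply several kernels of different sizes; each result is one channel."""
    return [conv1d_same(signal, kernel) for kernel in kernels]


def channel_attention(channels):
    """Simplified, parameter-free channel gate (an assumption, not a learned layer).

    Squeeze: global average of each channel.
    Gate: sigmoid of that average, giving one weight in (0, 1) per channel.
    Scale: multiply every value in the channel by its weight.
    """
    weights = []
    for channel in channels:
        descriptor = sum(channel) / len(channel)
        weights.append(1.0 / (1.0 + math.exp(-descriptor)))
    scaled = [
        [w * value for value in channel]
        for w, channel in zip(weights, channels)
    ]
    return weights, scaled


# Hypothetical input: a unit impulse standing in for a vibration segment.
signal = [0.0, 0.0, 1.0, 0.0, 0.0]

# Hypothetical averaging kernels at three scales (lengths 1, 3, and 5).
kernels = [[1.0], [1.0 / 3.0] * 3, [0.2] * 5]

features = multi_scale_conv(signal, kernels)
weights, reweighted = channel_attention(features)
```

Each kernel size responds to structure at a different time scale, and the gate rescales whole channels rather than individual samples, which is the general idea behind combining multi-scale convolution with channel attention.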