Bearing fault diagnosis has been achieved when extensive labeled fault data are available. In engineering practice, however, most machines operate in a healthy state, and when a fault does occur the machine is shut down as soon as possible, so it is difficult and uneconomical to collect enough labeled fault data to carry out a fault diagnosis. To solve this problem, a hybrid fault diagnosis method for rotating machinery, based on variational mode decomposition energy entropy (VMD-EE) and transfer learning (TL), is proposed. First, the original signal is decomposed using VMD, the EE values of the modal components are calculated, and a dataset is built using fault data from these components. Second, a deep residual neural network is used to extract high-dimensional features from the dataset, and the data are divided into source and target domains according to the working conditions. Finally, W-distances are introduced to dynamically evaluate the relative importance of the conditional and marginal probability distributions when minimizing the loss, and a feature-based TL method is used to dynamically balance the distribution adaptation and reduce the difference between the probability distributions of the two domains. The proposed method is validated using the Case Western Reserve University dataset and data from a machinery fault simulator platform. The VMD-based intelligent health detection and statistical analysis effectively resolve mode mixing and accurately determine whether a signal is faulty by comparison with
$EE_{\theta}$
, the threshold of VMD-EE. Meanwhile, the TL-based neural network reaches a fault diagnosis accuracy of up to 99.4%, with the loss kept at around 0.02. These results demonstrate the accuracy and robustness of the proposed method when fault data are scarce and operating conditions vary.
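
The energy entropy step described above can be illustrated with a minimal sketch. It assumes the commonly used definition of energy entropy (the Shannon entropy of the normalized energy distribution across modal components), which this abstract does not spell out. The function names `energy_entropy` and `exceeds_threshold`, the input layout (one list of samples per mode), and the direction of the threshold comparison are all illustrative assumptions; the VMD decomposition itself is taken as already done.

```python
import math


def energy_entropy(modes):
    """Shannon entropy of the normalized energy distribution across modes.

    `modes` is a sequence of modal components, each a sequence of samples.
    This follows the common textbook definition, which is an assumption here:
        E_i = sum(u_i[n]^2),  p_i = E_i / sum(E),  H = -sum(p_i * ln(p_i))
    """
    energies = [sum(x * x for x in mode) for mode in modes]
    total = sum(energies)
    if total == 0:
        raise ValueError("all modes have zero energy")
    probs = [e / total for e in energies]
    # Modes with zero energy contribute nothing to the entropy.
    return -sum(p * math.log(p) for p in probs if p > 0)


def exceeds_threshold(modes, threshold):
    """Hypothetical decision rule: flag a signal whose EE exceeds the threshold.

    Whether a fault corresponds to EE above or below the threshold is an
    assumption made for illustration only.
    """
    return energy_entropy(modes) > threshold


# Two modes with equal energy give the maximum entropy for K = 2, i.e. ln(2).
balanced = [[1.0, 0.0], [0.0, 1.0]]
# One dominant mode pushes the entropy toward zero.
dominated = [[10.0, 0.0], [0.0, 0.1]]

print(energy_entropy(balanced))
print(energy_entropy(dominated))
```

In this sketch, energy spread evenly over the modes yields a high entropy and energy concentrated in one mode yields a low one, so a single scalar threshold can separate the two regimes.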