In real engineering environments, rotating machines run normally most of the time and spend very little time in a fault state. Fault diagnosis datasets for rotating machinery are therefore severely imbalanced, which makes traditional network models unstable and inaccurate in practical engineering applications. To solve this problem, we propose a fault diagnosis method that combines a new Dual-stage Attention-based Recurrent Neural Network (DA-RNN) with a deep residual split-attention self-calibrated convolution network (SC-ResNeSt). First, a novel DA-RNN with gated recurrent units (GRUs) as the encoder and decoder units is designed and used to predict and expand the scarce fault signals. Second, to make full use of the time-domain information in vibration signals, a new image encoding method, the Gram Angle Product Field (GAPF), is proposed. Third, because the traditional convolution layer lacks the dynamic receptive field needed to extract more representative features, self-calibrated convolution modules are introduced into the split-attention network (ResNeSt) to build a new network model, SC-ResNeSt. Finally, the expanded vibration signals are converted into GAPF images, which are fed to SC-ResNeSt to classify the fault types. The model was evaluated on the Case Western Reserve University rolling bearing dataset and a planetary gearbox dataset. Good results were obtained in a prediction experiment for bearing fault samples and in the subsequent fault diagnosis experiments for bearings and gears, which verifies the feasibility and practicability of the model.
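
The abstract names GAPF but does not define it, so the proposed encoding cannot be reproduced from this text alone. As background only, the following minimal sketch shows the standard Gramian Angular Summation and Difference Fields (GASF and GADF), a well-known way to turn a one-dimensional signal into an image, and not the GAPF method itself. The function name `gramian_angular_fields` and the synthetic `segment` signal are hypothetical and chosen purely for illustration.

```python
import numpy as np


def gramian_angular_fields(signal):
    """Return (GASF, GADF) images for a 1-D signal.

    Standard Gramian Angular Field definitions, shown as background only;
    this is NOT the GAPF encoding, which the abstract does not define.
    """
    x = np.asarray(signal, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo:
        raise ValueError("signal must not be constant")
    # Rescale to [-1, 1] so that arccos is defined for every sample.
    x = 2.0 * (x - lo) / (hi - lo) - 1.0
    x = np.clip(x, -1.0, 1.0)  # guard against floating-point overshoot
    phi = np.arccos(x)  # angular (polar) representation of each sample
    # Pairwise angle sums and differences via broadcasting.
    gasf = np.cos(phi[:, None] + phi[None, :])
    gadf = np.sin(phi[:, None] - phi[None, :])
    return gasf, gadf


# Hypothetical vibration segment: two superimposed sinusoids.
t = np.linspace(0.0, 1.0, 64)
segment = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 20 * t)
gasf, gadf = gramian_angular_fields(segment)
print(gasf.shape, gadf.shape)
```

Each image has one row and one column per time sample, so a segment of length N yields an N × N image that a convolutional classifier can take as input.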