Fault diagnosis of rolling bearings is important for the safe operation of engineering equipment. Many intelligent diagnostic methods have been developed successfully. However, their performance often degrades in noisy environments and when few training samples are available, as is common in practical industrial applications. This paper therefore proposes a rolling bearing fault diagnosis method, WCNN-RSN, that fuses multimodal information from the time and time-frequency domains by combining an improved 1D-CNN with ResNet50. The algorithm employs a multi-head self-attention mechanism to fuse complementary fault features at different scales, so that fault features are fully extracted for diagnosis. Experimental results show that WCNN-RSN outperforms the comparison methods under noise interference and with small sample sizes, demonstrating that the proposed method has good noise robustness and generalization ability.
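
To make the fusion step concrete, the following is a minimal NumPy sketch of multi-head self-attention applied to two feature vectors, one standing in for the time-domain (1D-CNN) branch and one for the time-frequency (ResNet50) branch. It is an illustration under stated assumptions and not the WCNN-RSN implementation: the feature dimension (64), the number of heads (4), the random projection weights, the choice to treat each branch output as one token, and the mean pooling at the end are all hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_self_attention(tokens, w_q, w_k, w_v, w_o, num_heads):
    """Scaled dot-product self-attention with several heads.

    tokens: (n_tokens, d_model); each weight matrix: (d_model, d_model).
    Returns the attended tokens (n_tokens, d_model) and the attention
    weights (num_heads, n_tokens, n_tokens).
    """
    n, d_model = tokens.shape
    d_head = d_model // num_heads

    def split_heads(x):
        # (n, d_model) -> (num_heads, n, d_head)
        return x.reshape(n, num_heads, d_head).transpose(1, 0, 2)

    q = split_heads(tokens @ w_q)
    k = split_heads(tokens @ w_k)
    v = split_heads(tokens @ w_v)

    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, n, n)
    attn = softmax(scores, axis=-1)
    heads = attn @ v                                     # (heads, n, d_head)

    # Concatenate the heads back into d_model and apply the output projection.
    merged = heads.transpose(1, 0, 2).reshape(n, d_model)
    return merged @ w_o, attn


rng = np.random.default_rng(0)
d_model, num_heads = 64, 4  # assumed sizes, not taken from the paper

# Random stand-ins for the two branch outputs (assumption: same dimension).
time_feat = rng.standard_normal(d_model)  # placeholder for 1D-CNN features
tf_feat = rng.standard_normal(d_model)    # placeholder for ResNet50 features
tokens = np.stack([time_feat, tf_feat])   # (2, d_model): one token per modality

# Random projection weights; in a trained model these would be learned.
w_q, w_k, w_v, w_o = (
    rng.standard_normal((d_model, d_model)) / np.sqrt(d_model) for _ in range(4)
)

fused_tokens, attn = multi_head_self_attention(
    tokens, w_q, w_k, w_v, w_o, num_heads
)
fused = fused_tokens.mean(axis=0)  # assumed pooling before a classifier head
```

In this sketch each modality attends to itself and to the other modality, so every output token mixes information from both domains; a classifier would then operate on `fused`.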