In data-driven fault diagnosis (FD), deep learning methods have demonstrated excellent performance, especially on complex signals from rotating equipment such as bearings. However, fault features in vibration signals are often mixed with noise and distributed across different frequency scales, which makes effective feature extraction difficult. To address this problem, this paper proposes a high frequency-multiscale cascade network (HF-MSCN), which improves the model's noise suppression and feature learning capability by combining a high-frequency convolutional block (HFCB) with a multi-scale cascade block (MSCB). HFCB suppresses high-frequency noise through wide convolutional layers and self-attention mechanisms while retaining essential high-frequency fault signals. MSCB cascades convolutional layers of different scales to enhance the interaction between them, strengthening the model’s ability to capture subtle fault features, especially in periodic fault pulse signals. Finally, we investigate the internal functioning of the network using time–frequency analysis methods from signal processing, both to improve the interpretability of deep learning methods in FD applications and to further verify the contribution of HFCB and MSCB to feature extraction. We validate HF-MSCN on the Case Western Reserve University (CWRU) dataset and on a self-constructed bearing composite fault dataset. The experimental results show that the network outperforms six state-of-the-art fault diagnosis methods in high-noise environments.
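To make the idea of cascading convolutional layers at different scales concrete, the following is a minimal sketch in plain Python, not the HF-MSCN implementation. It rests on several assumptions that are not stated in the abstract: the kernel sizes (3, 7, 15), the use of moving-average kernels as stand-ins for learned filters, the ReLU activation, and the specific wiring in which each scale receives the input plus the previous scale's output. The names `conv1d_same` and `multiscale_cascade` are hypothetical.

```python
def conv1d_same(signal, kernel):
    """Zero-padded 1-D correlation that keeps the input length (odd kernel sizes only)."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(signal))
    ]


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def multiscale_cascade(signal, kernel_sizes=(3, 7, 15)):
    """Hypothetical cascade: each scale sees the input plus the previous scale's output."""
    outputs = []
    previous = [0.0] * len(signal)  # the first scale sees the raw input only
    for size in kernel_sizes:
        kernel = [1.0 / size] * size  # assumption: stand-in for a learned filter
        branch_input = [x + p for x, p in zip(signal, previous)]
        previous = relu(conv1d_same(branch_input, kernel))
        outputs.append(previous)
    return outputs  # one feature map per scale


# Toy periodic impulse train, loosely mimicking periodic bearing fault pulses.
signal = [1.0 if n % 16 == 0 else 0.0 for n in range(64)]
features = multiscale_cascade(signal)
```

In this sketch, `features` holds one feature map per kernel size, each the same length as the input. In a trained network the kernels would be learned and the per-scale maps would typically be combined along a channel dimension, but those details are assumptions here rather than something the abstract specifies.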