Health-condition-sensitive vibration information is easily swamped by pervasive noise. Denoising is therefore indispensable, but existing methods still lack adaptability. To address this, a novel intelligent denoising framework, the hybrid transformer masked time-domain denoising network (HTMTDN), is proposed. First, a dense dilation convolution block and a hybrid transformer are constructed to handle variations in fault impulse scale and unexpected noise frequency bands, respectively, which substantially improves adaptive denoising capability and reduces the burden of denoising parameter tuning. Second, interpretable joint time- and frequency-domain constraints are constructed to enhance the network’s optimization ability under strong noise. Finally, a novel strategy called overlapping reconstruction is introduced to recover 1D signals from 2D signal segments. Extensive experiments on two bearing fault datasets with variable loads and rotation speeds confirm the strong performance of the HTMTDN at low signal-to-noise ratios and demonstrate good adaptability across 15 health conditions without separate hyperparameter tuning, showing promise for real-world applications.
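To illustrate the idea of recovering a 1D signal from 2D signal segments, the following is a minimal sketch, not the HTMTDN implementation. It assumes that the 2D segments are overlapping windows cut from the 1D signal with a fixed hop, and that samples covered by several segments are merged by simple averaging. The function names `segment` and `overlap_reconstruct`, the `hop` parameter, and the averaging rule are all assumptions made for this example.

```python
def segment(signal, seg_len, hop):
    """Split a 1D sequence into overlapping rows (a 2D list of segments).

    Hypothetical helper: trailing samples that do not fill a whole
    segment are dropped.
    """
    if seg_len <= 0 or not 0 < hop <= seg_len:
        raise ValueError("require seg_len > 0 and 0 < hop <= seg_len")
    last_start = len(signal) - seg_len
    return [list(signal[s:s + seg_len]) for s in range(0, last_start + 1, hop)]


def overlap_reconstruct(segments, hop):
    """Rebuild a 1D signal by averaging samples where segments overlap.

    Assumption: segment i starts at sample i * hop in the output.
    """
    seg_len = len(segments[0])
    if not 0 < hop <= seg_len:
        raise ValueError("require 0 < hop <= segment length")
    total = hop * (len(segments) - 1) + seg_len
    acc = [0.0] * total    # running sum of contributions per sample
    count = [0] * total    # number of segments covering each sample
    for i, seg in enumerate(segments):
        start = i * hop
        for j, value in enumerate(seg):
            acc[start + j] += value
            count[start + j] += 1
    return [a / c for a, c in zip(acc, count)]


# Round trip on a toy signal: 10 samples, segments of 4 with hop 2.
signal = [float(i) for i in range(10)]
segments = segment(signal, seg_len=4, hop=2)
restored = overlap_reconstruct(segments, hop=2)
```

In this sketch, averaging over the overlap means that each output sample combines every segment-level estimate that covers it, so disagreements between adjacent denoised segments are smoothed rather than leaving a discontinuity at segment boundaries.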