Deep learning networks can automatically extract essential features from data, and intelligent classification based on them has attracted significant attention in fault diagnosis research. In industrial production, however, the collected data are inevitably contaminated by noise, which degrades the network’s diagnostic results. To improve denoising capability and reduce the risk of overfitting in deep learning networks, this paper introduces the input gate structure of long short-term memory (LSTM) and an attention module into the convolutional neural network, yielding a novel architecture called the gate convolutional attention neural network (gate-CANN), and applies it to fault diagnosis of squirrel-cage asynchronous motors. First, the time-domain vibration signals acquired by the sensors are converted into two-dimensional time–frequency images using the continuous wavelet transform (CWT). Next, the CWT images in the two directions are fed separately into gate-CANN for feature extraction. Finally, feature fusion and fault diagnosis are performed at the end of the network. The proposed method is validated on a fault diagnosis testbed for squirrel-cage asynchronous motors. The results show that, compared with other diagnostic methods, the proposed approach offers better noise resistance and generalization.
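The CWT preprocessing step can be illustrated with a minimal sketch. It is not the authors' implementation: the complex Morlet wavelet, the 1 kHz sampling rate, the synthetic 50 Hz test signal, the frequency grid, and the helper name `morlet_scalogram` are all assumptions chosen for illustration. The sketch turns a one-dimensional vibration signal into a two-dimensional time–frequency magnitude image of the kind that could be fed to a convolutional network.

```python
import numpy as np

def morlet_scalogram(signal, scales, w0=6.0):
    """Magnitude of a Morlet CWT; result has shape (len(scales), len(signal)).

    Hypothetical helper for illustration; `scales` are in samples.
    """
    signal = np.asarray(signal, dtype=float)
    n = signal.size
    out = np.empty((len(scales), n))
    for i, s in enumerate(scales):
        # Sample the wavelet on a support of +/- 4 scale widths.
        half = int(np.ceil(4 * s))
        t = np.arange(-half, half + 1) / s
        wavelet = np.pi ** -0.25 * np.exp(1j * w0 * t - 0.5 * t ** 2) / np.sqrt(s)
        # Correlating with the wavelet equals convolving with its
        # conjugated, time-reversed copy.
        full = np.convolve(signal, np.conj(wavelet)[::-1], mode="full")
        # Keep the n samples aligned with the original time axis.
        out[i] = np.abs(full[half:half + n])
    return out

# Assumed setup: 1 s of a 50 Hz vibration component plus mild noise.
fs = 1000.0
time = np.arange(0, 1.0, 1.0 / fs)
rng = np.random.default_rng(0)
signal = np.sin(2 * np.pi * 50.0 * time) + 0.1 * rng.normal(size=time.size)

# Analysis frequencies in Hz and the matching Morlet scales in samples.
freqs = np.arange(10, 201, 5)
w0 = 6.0
scales = w0 * fs / (2 * np.pi * freqs)

scalogram = morlet_scalogram(signal, scales, w0=w0)

# Rescale to [0, 1] so the array can be treated as a grayscale image.
image = (scalogram - scalogram.min()) / (scalogram.max() - scalogram.min())

# The row with the highest mean energy should sit near 50 Hz.
peak_freq = freqs[np.argmax(scalogram.mean(axis=1))]
print(image.shape, peak_freq)
```

Under these assumptions, each row of `image` corresponds to one analysis frequency and each column to one time sample. With vibration measured in two directions, the same function would simply be applied once per direction to obtain two images.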