As artificial intelligence technology progresses, deep learning models are increasingly utilized for machine fault classification. However, a significant drawback of current state-of-the-art models is their high computational complexity, which renders them unsuitable for deployment on portable devices. This paper presents a compact fault diagnosis model that integrates a self-attention SqueezeNet architecture with a hybrid texture representation technique combining empirical mode decomposition (EMD) and a gammatone spectrogram (GS) filter. In the proposed model, the dominant signal is first isolated from the audio fault signal by discarding the lower intrinsic mode functions (IMFs) obtained from EMD; the dominant signal is then transformed into a 2D texture map using the GS filter. These texture maps are fed as input into the modified self-attention SqueezeNet classifier, which features reduced model width and depth, for training and validation. Several attention modules were evaluated, including self-attention, channel attention, spatial attention, and the convolutional block attention module (CBAM). The models were tested on the MIMII and ToyADMOS datasets. The experimental results demonstrated that the self-attention mechanism with SqueezeNet achieved an accuracy of 97% on previously unseen data from the MIMII and ToyADMOS datasets. Furthermore, the proposed model outperformed SqueezeNet variants equipped with the other attention mechanisms, as well as state-of-the-art deep architectures, achieving higher precision, recall, and F1-score. Lastly, t-SNE was applied to visualize the features learned by the self-attention SqueezeNet for the different fault classes of both MIMII and ToyADMOS.
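The following is a minimal sketch of the EMD-plus-gammatone-spectrogram preprocessing step described above, assuming the PyEMD (EMD-signal) and gammatone Python packages. The helper name `audio_to_texture_map`, the IMF-selection rule (keeping the first `n_keep` modes), and the spectrogram parameters (`window_time`, `hop_time`, `channels`, `f_min`) are illustrative assumptions rather than the authors' exact settings.

```python
import numpy as np
from PyEMD import EMD                 # empirical mode decomposition
from gammatone.gtgram import gtgram   # gammatone spectrogram


def audio_to_texture_map(signal: np.ndarray, fs: int,
                         n_keep: int = 3,
                         window_time: float = 0.025,
                         hop_time: float = 0.010,
                         channels: int = 64,
                         f_min: float = 50.0) -> np.ndarray:
    """Isolate a dominant signal via EMD, then map it to a 2D gammatone texture."""
    # Decompose the audio clip into intrinsic mode functions (IMFs).
    imfs = EMD()(signal)

    # Reconstruct a dominant signal from a subset of IMFs and discard the rest;
    # which IMFs count as "lower" and are dropped is a hypothetical choice here.
    dominant = imfs[:n_keep].sum(axis=0)

    # Gammatone spectrogram: a 2D time-frequency texture map of the dominant signal.
    texture = gtgram(dominant, fs, window_time, hop_time, channels, f_min)

    # Log compression and per-image normalization before feeding the classifier.
    texture = np.log1p(texture)
    texture = (texture - texture.min()) / (texture.max() - texture.min() + 1e-8)
    return texture.astype(np.float32)


if __name__ == "__main__":
    fs = 16000
    t = np.linspace(0, 1.0, fs, endpoint=False)
    # Synthetic tone plus noise, standing in for a MIMII/ToyADMOS machine-sound clip.
    clip = np.sin(2 * np.pi * 120 * t) + 0.3 * np.random.randn(fs)
    tex = audio_to_texture_map(clip, fs)
    print(tex.shape)  # (channels, n_frames) texture map used as CNN input
```

In this sketch, the resulting texture map would be resized and stacked as needed to match the input shape expected by the modified self-attention SqueezeNet classifier.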