Because they operate under electrified conditions, key components of the battery swapping system (BSS) for electric heavy trucks (EHT) are prone to damage from electrical erosion, which poses challenges to the safety and efficiency of high-intensity transportation. Owing to the special working conditions of the BSS, fault diagnosis of its driving gear faces several difficulties, including reciprocating motion, low and fluctuating speed, and complicated noise. To address these problems, audio features, namely Mel-frequency cepstral coefficients (MFCC) and Gammatone cepstral coefficients (GTCC), are first extracted from the vibration signals. These features are then used to construct an original dictionary. Next, a robust dictionary is generated from the original dictionary by means of data augmentation and dictionary learning. Finally, using the robust dictionary, sparse representation-based classification is integrated into AdaBoost to achieve accurate fault diagnosis of the driving gear in the BSS. The effectiveness of the fault diagnosis scheme is validated on monitoring data from the BSS, and the fault diagnosis accuracy reaches 99.17%.
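For readers unfamiliar with sparse representation-based classification (SRC), the core idea can be illustrated with a minimal sketch. This is not the implementation used in this work: the function names `omp` and `src_predict` are hypothetical, the sparse coder is assumed to be a plain orthogonal matching pursuit, the dictionary is a toy identity matrix standing in for MFCC/GTCC feature vectors, and the data augmentation, dictionary learning, and AdaBoost stages are omitted.

```python
import numpy as np


def omp(D, y, n_nonzero):
    """Orthogonal matching pursuit: approximate y using at most
    n_nonzero columns (atoms) of D. Atoms are assumed unit-norm."""
    residual = y.astype(float).copy()
    support = []
    sol = np.zeros(0)
    for _ in range(n_nonzero):
        # Select the atom most correlated with the current residual.
        idx = int(np.argmax(np.abs(D.T @ residual)))
        if idx in support:
            break
        support.append(idx)
        # Refit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ sol
        if np.linalg.norm(residual) < 1e-10:
            break
    coef = np.zeros(D.shape[1])
    if support:
        coef[support] = sol
    return coef


def src_predict(D, labels, y, n_nonzero=3):
    """Assign y to the class whose atoms reconstruct it with the
    smallest residual, given a sparse code over the whole dictionary."""
    coef = omp(D, y, n_nonzero)
    classes = np.unique(labels)
    residuals = []
    for c in classes:
        mask = labels == c
        # Reconstruct y using only the coefficients of class c.
        residuals.append(np.linalg.norm(y - D[:, mask] @ coef[mask]))
    return classes[int(np.argmin(residuals))]


# Toy dictionary (assumption): six unit-norm atoms, three per class.
D = np.eye(6)
labels = np.array([0, 0, 0, 1, 1, 1])

# Test sample built from class-1 atoms only.
y = 0.7 * D[:, 3] + 0.3 * D[:, 5]
print(src_predict(D, labels, y))  # prints 1
```

In a boosted setting, a classifier of this kind would serve as the base learner, with each boosting round reweighting the training samples; how that weighting interacts with the dictionary is specific to the proposed scheme and is not reproduced here.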