“…Currently, the performance of ASR systems in many human-robot interaction scenarios is unsatisfactory because of limited robustness. One critical factor is that various practical noises make it harder to extract features such as Mel-frequency cepstral coefficients (MFCC) [12][13][14], log-channel energies [15], and pitch-based features [12,16]. Several common noise types have been widely studied in ASR research, including background noise [9,17], reverberation [18][19][20][21], and squeal noise, as well as hardware-related noise such as thermal noise from amplifiers [22], quantization noise from analog-to-digital converters (ADCs) [23], and signal quality loss caused by coding [24], compression, and transmission [25]. However, noise related to gain controls has received less attention.…”