A Noise-Robust Continuous Speech Recognition System Using Block-Based Dynamic Range Adjustment

Sun, Yang; Miyanaga, Yoshikazu

doi:10.1587/transinf.e95.d.844

Cited by 1 publication

(1 citation statement)

References 14 publications


“…Although running spectrum analysis (RSA) is a well-known method focusing on modulation spectrum, it has mostly been applied for automatic continuous speech recognition [19]. Furthermore, in speech, its application has been mainly focused on the frequency components in the range of 2-8 Hz because this range contains the dominant elements of the amplitude envelope of speech [20][21]. The modulation frequency band at higher than 8 Hz can be regarded as miscellaneous noise or unnecessary speech components related to the speaker's characteristics, such as tone and pronunciation, among other factors [22].…”

Section: Introduction (mentioning)

confidence: 99%

Enhanced Running Spectrum Analysis for Robust Speech Recognition Under Adverse Conditions: A Case Study on Japanese Speech

et al. 2017

Self Cite


In any real environment, noise degrades the performance of Automatic Speech Recognition (ASR) systems. Additionally, when pronunciations are similar, high recognition accuracy is difficult to achieve. From this point of view, our work envisions an enhanced algorithm for processing the speech modulation spectrum, such as Running Spectrum Analysis (RSA). It was also applied to observed speech data. In the envisioned method, a modulation spectrum filtering (MSF) method directly modifies the observed cepstral modulation spectrum by a Fourier transform of the cepstral time frequency. Experiments carried out for various passbands showed an improvement of about 1-4% in recognition accuracy compared to conventional methods.
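The idea of filtering the modulation spectrum of cepstral features can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `modulation_bandpass`, the hard (brick-wall) passband mask, and the default 2-8 Hz band (taken from the citation statement above) are all assumptions made for illustration. Each cepstral coefficient is treated as a time series over frames, transformed with an FFT, masked outside the passband, and transformed back.

```python
import numpy as np


def modulation_bandpass(cepstra, frame_rate_hz, low_hz=2.0, high_hz=8.0):
    """Keep only modulation frequencies in [low_hz, high_hz].

    Hypothetical helper for illustration only.
    cepstra: array of shape (n_frames, n_coeffs), one row per frame.
    frame_rate_hz: number of feature frames per second.
    """
    cepstra = np.asarray(cepstra, dtype=float)
    n_frames = cepstra.shape[0]

    # FFT along the time (frame) axis gives the modulation spectrum
    # of each cepstral coefficient trajectory.
    spectrum = np.fft.rfft(cepstra, axis=0)
    freqs = np.fft.rfftfreq(n_frames, d=1.0 / frame_rate_hz)

    # Assumed brick-wall passband: zero every bin outside the band,
    # including the DC component.
    keep = (freqs >= low_hz) & (freqs <= high_hz)
    spectrum[~keep] = 0.0

    # Back to the time domain with the original number of frames.
    return np.fft.irfft(spectrum, n=n_frames, axis=0)
```

With a 10 ms frame shift, `frame_rate_hz` would be 100. A practical system would likely use a smoother filter response than the hard mask shown here, since abrupt band edges can introduce ringing in the filtered trajectories.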
