This paper deals with different speaker adaptation methods for speech recognition systems that adapt automatically to new and unknown speakers in a short training phase. The adaptation techniques aim at transformations of feature vectors, optimized with respect to some constraints. Two different adaptation strategies are appropriate. The first one is based on least mean squared error (MSE) optimization. The second method is a codebook-driven feature transformation. Both adaptation techniques are incorporated into two different recognition systems: dynamic time warping (DTW) and hidden Markov modelling (HMM). The results show that in both systems speaker-adaptive error rates are close to speaker-dependent error rates. In the best case, the mean error rate of four test speakers decreases by a factor of 6 (DTW recognizer) and 3 (HMM recognizer), respectively, compared to the interspeaker error rate without adaptation. Finally, a hardware realization of the speaker-adaptive HMM recognizer is described.
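The abstract does not state the form of the MSE-optimized transformation, so the following is only an illustrative sketch under explicit assumptions: feature vectors of the new speaker are taken to be already time-aligned with reference-speaker vectors, and the transformation is assumed to be an independent affine map per feature dimension. The function names (`fit_affine_per_dim`, `adapt`) are hypothetical and not taken from the paper.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def fit_affine_per_dim(
    src: Sequence[Vector], ref: Sequence[Vector]
) -> List[Tuple[float, float]]:
    """Fit ref[d] ~ a * src[d] + b for each dimension d by least squares.

    Assumption (not from the paper): src and ref are paired, time-aligned
    feature vectors of the new speaker and the reference speaker.
    """
    if not src or len(src) != len(ref):
        raise ValueError("src and ref must be non-empty and of equal length")
    n = len(src)
    params = []
    for d in range(len(src[0])):
        xs = [v[d] for v in src]
        ys = [v[d] for v in ref]
        mean_x = sum(xs) / n
        mean_y = sum(ys) / n
        var_x = sum((x - mean_x) ** 2 for x in xs)
        cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        # Closed-form least-squares slope; fall back to slope 1 (pure offset)
        # if this dimension is constant in the adaptation data.
        a = cov_xy / var_x if var_x > 0 else 1.0
        b = mean_y - a * mean_x
        params.append((a, b))
    return params


def adapt(vec: Vector, params: Sequence[Tuple[float, float]]) -> List[float]:
    """Map one new-speaker feature vector into the reference speaker's space."""
    return [a * x + b for x, (a, b) in zip(vec, params)]


def mean_squared_error(src: Sequence[Vector], ref: Sequence[Vector]) -> float:
    """Mean squared error over all vectors and all dimensions."""
    total = sum((x - y) ** 2 for u, v in zip(src, ref) for x, y in zip(u, v))
    return total / (len(src) * len(src[0]))


# Toy adaptation data: the reference features are an exact affine map of the
# new speaker's features, so the fit should drive the error to zero.
new_speaker = [[0.0, 1.0], [1.0, 3.0], [2.0, 5.0]]
reference = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
params = fit_affine_per_dim(new_speaker, reference)
adapted = [adapt(v, params) for v in new_speaker]
```

In a setup like this, the fitted parameters would be estimated once from the short training phase and then applied to every incoming feature vector before it reaches the DTW or HMM recognizer. A full-matrix transformation would capture correlations between dimensions that this per-dimension simplification ignores.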
This article describes work on the development of the speech understanding and dialog system EVAR. It covers the relevant knowledge bases, which contain the raw linguistic knowledge, and the preprocessors that convert this knowledge into the specialized form needed by the processing algorithms. Processing so far covers the levels from the speech signal up to pragmatic analysis. Some topics of ongoing and future work are mentioned briefly.