Major progress is being recorded regularly on both the technology and exploitation of automatic speech recognition (ASR) and spoken language systems. However, there are still technological barriers to flexible solutions and user satisfaction under some circumstances. This is related to several factors, such as the sensitivity to the environment (background noise), or the weak representation of grammatical and semantic knowledge.Current research is also emphasizing deficiencies in dealing with variation naturally present in speech. For instance, the lack of robustness to foreign accents precludes the use by specific populations. Also, some applications, like directory assistance, particularly stress the core recognition technology due to the very high active vocabulary (application perplexity). There are actually many factors affecting the speech realization: regional, sociolinguistic, or related to the environment or the speaker herself. These create a wide range of variations that may not be modeled correctly (speaker, gender, speaking rate, vocal effort, regional accent, speaking style, non-stationarity, etc.), especially when resources for system training are scarce. This paper outlines current advances related to these topics.
Hidden Markov models are widely used for automatic speech recognition. They inherently incorporate the sequential character of the speech signal and are statistically trained. However, the a-priori choice of the model topology limits their flexibility. Another drawback of these models is their weak discriminating power. Multilayer perceptrons are now promising tools in the connectionist approach for classification problems and have already been successfully tested on speech recognition problems. However, the sequential nature of the speech signal remains difficult to handle in that kind of machine. In this paper, a discriminant hidden Markov model is defined and it is shown how a particular multilayer perceptron with contextual and extra feedback input units can be considered as a general form of such Markov models.
m e Hidden Mwkov models are generalized b y &$ning a n e w e m i s s i o n p r o b a b i l i t y w h i c h takes the correlation between sltccessive feature vectors into account. Estimation f m u l a s fm the iterative Learning both along %teTbi a n d Mazimurn likelihood criteria are p r e s e n t e d .
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.