This paper gives an overview of a research system for phoneme-based, large-vocabulary continuous speech recognition. The system has been applied to the SPICOS task, the DARPA RM task, and a 12000-word dictation task, and experimental results for these three tasks are presented. As in many other systems, the recognition architecture is based on an integrated statistical approach. We describe the features that distinguish this system from others: (1) The Viterbi criterion is applied consistently in both training and testing. (2) Continuous mixture densities are used without any tying or smoothing; this approach can be viewed as a kind of ‘statistical template matching’. (3) Time-synchronous beam search is used throughout all tasks; extensions using a tree organization of the vocabulary and phoneme lookahead are presented so that the 12000-word task can be handled.
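As a rough illustration of feature (3), time-synchronous beam search under the Viterbi criterion can be sketched as below. This is a minimal sketch, not the system's implementation: the function names (`prune`, `viterbi_beam_search`), the dictionary-based model layout, and the beam parameter are assumptions made for illustration, and the tree-organized vocabulary and phoneme lookahead mentioned above are omitted.

```python
def prune(active, beam):
    """Keep only hypotheses whose log score is within `beam` of the best one."""
    if not active:
        raise ValueError("no active hypotheses left; model or beam too restrictive")
    best = max(score for score, _ in active.values())
    return {s: h for s, h in active.items() if h[0] >= best - beam}


def viterbi_beam_search(log_init, log_trans, log_emit, beam):
    """Time-synchronous Viterbi search with beam pruning (illustrative only).

    log_init:  dict state -> initial log probability
    log_trans: dict state -> dict next_state -> transition log probability
    log_emit:  list over time frames of dict state -> emission log likelihood
    beam:      pruning width in the log domain (hypothetical parameter)
    Returns (best_log_score, best_state_path).
    """
    # Hypotheses active at the current frame: state -> (score, path so far).
    active = {
        s: (lp + log_emit[0][s], [s])
        for s, lp in log_init.items()
        if s in log_emit[0]
    }
    active = prune(active, beam)

    # All hypotheses are advanced frame by frame, i.e. time-synchronously.
    for frame in log_emit[1:]:
        expanded = {}
        for s, (score, path) in active.items():
            for t, lt in log_trans.get(s, {}).items():
                if t not in frame:
                    continue
                cand = score + lt + frame[t]
                # Viterbi criterion: keep only the best path into each state.
                if t not in expanded or cand > expanded[t][0]:
                    expanded[t] = (cand, path + [t])
        active = prune(expanded, beam)

    best_state = max(active, key=lambda s: active[s][0])
    return active[best_state]
```

Because every hypothesis is extended over the same frame before pruning, scores are directly comparable at each time step, which is what makes a single beam threshold relative to the current best score meaningful.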