Abstract-This paper describes the first successfully implemented real-time Mandarin dictation machine developed in the world which recognizes Mandarin speech with very large vocabulary and almost unlimited texts for the input of Chinese characters into computers. Considering the special characteristics of the Chinese language, syllables are chosen as the basic units for dictation. The machine is speaker dependent, and the input speech is in the form of sequences of isolated syllables. The machine can be decomposed into two subsystems. The first subsystem is to recognize the syllables using hidden Markov models, in which special training algorithms and recognition approaches have been developed to recognize the 408 very confusing syllables (disregarding the tones), and special feature vectors have been used to recognize the five different tones including the very confusing neutral tone. But this does not help very much because every syllable can represent many different homonym characters and form different multi-syllabic words with syllables on its right or left. The second subsystem is then needed to identify the exact characters from the syllables and correct the errors in syllable recognition by first finding all possible word hypotheses and forming a word lattice for the sequence of recognized syllables through a lexical access process, and then obtaining the best path in the lattice with the maximum likelihood as the output sentence using a data-trained Markov Chinese language model. The real-time implementation is on an IBM PC/AT, connected to three sets of specially designed hardware boards on which seven TMS 320C25 chips operate in parallel. The preliminary test results indicate that it takes only about 0.45 s to dictate a syllable (or character) with an accuracy on the order of 90%. All techniques used in this machine are described and discussed in detail in this paper.
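The second subsystem described in the abstract above (lexical access into a word lattice, followed by a maximum-likelihood path search under a Markov language model) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a first-order (bigram) model, which the abstract does not specify, and the lexicon, the probabilities, and the names `LEXICON`, `BIGRAM_LOGP`, `FLOOR`, `build_lattice`, and `best_path` are all hypothetical toy values chosen for illustration. It also takes a single recognized syllable sequence as input, so the correction of syllable recognition errors mentioned in the abstract is not modeled.

```python
import math

# Hypothetical toy lexicon: syllable tuple -> candidate words (homonyms).
LEXICON = {
    ("shi",): ["是", "事"],
    ("dian",): ["电", "点"],
    ("nao",): ["脑"],
    ("dian", "nao"): ["电脑"],
}

# Hypothetical bigram log-probabilities log P(word | previous word).
# "<s>" marks the sentence start; unseen pairs fall back to FLOOR.
BIGRAM_LOGP = {
    ("<s>", "是"): math.log(0.5),
    ("<s>", "事"): math.log(0.1),
    ("是", "电脑"): math.log(0.4),
    ("事", "电脑"): math.log(0.05),
    ("是", "电"): math.log(0.05),
    ("电", "脑"): math.log(0.1),
}
FLOOR = math.log(1e-4)


def build_lattice(syllables):
    """Lexical access: list every (start, end, word) hypothesis."""
    edges = []
    n = len(syllables)
    for i in range(n):
        for j in range(i + 1, n + 1):
            for word in LEXICON.get(tuple(syllables[i:j]), []):
                edges.append((i, j, word))
    return edges


def best_path(syllables):
    """Return the highest-scoring word sequence, or None if none exists."""
    edges = build_lattice(syllables)
    n = len(syllables)
    # best[(position, last_word)] = (log-probability, word sequence so far)
    best = {(0, "<s>"): (0.0, [])}
    # Edges only go forward, so every state at `pos` is final once we reach it.
    for pos in range(n):
        states = [(key, val) for key, val in best.items() if key[0] == pos]
        for (_, prev), (score, path) in states:
            for start, end, word in edges:
                if start != pos:
                    continue
                cand = score + BIGRAM_LOGP.get((prev, word), FLOOR)
                key = (end, word)
                if key not in best or cand > best[key][0]:
                    best[key] = (cand, path + [word])
    finals = [val for key, val in best.items() if key[0] == n]
    if not finals:
        return None
    return max(finals, key=lambda val: val[0])[1]


if __name__ == "__main__":
    print(best_path(["shi", "dian", "nao"]))
```

With these toy numbers, the two-syllable word hypothesis for "dian nao" outscores the path built from single-syllable characters, which shows how the language model resolves homonyms using left and right context.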
This paper describes the first successfully implemented real-time Mandarin dictation machine developed in the world which recognizes Mandarin speech with unlimited texts and very large vocabulary for the input of Chinese characters to computers. Isolated syllables including the tones are first recognized using specially trained hidden Markov models with special feature parameters; the exact characters are then identified from the syllables using a Markov Chinese language model, because every syllable can represent many different homonym characters. The real-time implementation is on an IBM PC/AT, connected to a set of special hardware boards on which ten TMS 320C25 chips operate in parallel. It takes only 0.45 sec to dictate a character.
This paper describes a fully parallel real-time Mandarin dictation machine which recognizes Mandarin speech with almost unlimited texts and very large vocabulary for the input of Chinese characters to computers. Isolated syllables including the tones are first recognized using specially trained hidden Markov models with special feature parameters; the exact characters are then identified from the syllables using a Markov Chinese language model, because every syllable can represent many different homonym characters. The real-time implementation is in Occam language on a transputer system with 10 T800 processors operating in parallel. The overall correction rate for the final output characters is about 89%.