This paper describes several key experiments in large vocabulary speech recognition. We demonstrate that, counter to our intuitions, given a fixed amount of training speech, the number of training speakers has little effect on the accuracy. We show how much speech is needed for speaker-independent (SI) recognition in order to achieve the same performance as speaker-dependent (SD) recognition. We demonstrate that, though the N-Best Paradigm works quite well up to vocabularies of 5,000 words, it begins to break down with 20,000 words and long sentences. We compare the performance of two feature preprocessing algorithms for microphone independence, and we describe a new microphone adaptation algorithm based on selection among several codebook transformations.
In this paper, we present several approaches designed to increase the robustness of BYBLOS, the BBN continuous speech recognition system. We address the problem of degraded performance when the characteristics of the training and test microphones are mismatched. We introduce a new supervised adaptation algorithm that computes a transformation from the training microphone codebook to that of a new microphone, given some information about the new microphone. Results are reported for the development and evaluation test sets of the 1993 ARPA CSR Spoke 6 WSJ task, which consist of speech recorded with two alternate microphones, a stand-mount and a telephone microphone. The proposed algorithm improves the performance of the system when tested with the stand-mount microphone, reducing the difference in error rate between the high-quality training microphone and the alternate stand-mount microphone recordings by a factor of 2. Several results are presented for the telephone speech, leading to important conclusions: a) the performance on telephone speech is dramatically improved by simply retraining the system on the high-quality training data after it has been band-limited to the telephone bandwidth; and b) additional training data recorded with the high-quality microphone gives further substantial improvement in performance.
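As an illustration of conclusion a), the band-limiting step can be sketched as a band-pass filter applied to the high-quality waveform before feature extraction and retraining. The abstract does not state the filter design, the cutoff frequencies, or the sampling rate; the 300-3400 Hz passband, the 16 kHz sampling rate, the Hamming-windowed sinc design, and all function names below are assumptions made only for this sketch, not the authors' method.

```python
import math

def lowpass_kernel(cutoff_hz, sample_rate_hz, num_taps):
    """Hamming-windowed sinc low-pass FIR kernel; num_taps must be odd and >= 3."""
    if num_taps < 3 or num_taps % 2 == 0:
        raise ValueError("num_taps must be an odd integer >= 3")
    fc = cutoff_hz / sample_rate_hz  # cutoff in cycles per sample
    mid = (num_taps - 1) // 2
    kernel = []
    for n in range(num_taps):
        k = n - mid
        # Ideal low-pass impulse response (sinc), with the k == 0 limit handled explicitly.
        ideal = 2.0 * fc if k == 0 else math.sin(2.0 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        kernel.append(ideal * window)
    return kernel

def bandpass_kernel(low_hz, high_hz, sample_rate_hz, num_taps=101):
    """Band-pass kernel formed as the difference of two low-pass kernels."""
    upper = lowpass_kernel(high_hz, sample_rate_hz, num_taps)
    lower = lowpass_kernel(low_hz, sample_rate_hz, num_taps)
    return [u - l for u, l in zip(upper, lower)]

def fir_filter(signal, kernel):
    """Centered FIR convolution with zero padding; output has the input's length."""
    mid = (len(kernel) - 1) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, h in enumerate(kernel):
            idx = i + mid - j
            if 0 <= idx < len(signal):
                acc += h * signal[idx]
        out.append(acc)
    return out

def tone(freq_hz, sample_rate_hz, num_samples):
    """Unit-amplitude sine wave, used here only as a test signal."""
    return [math.sin(2.0 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(num_samples)]

def rms(samples):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

if __name__ == "__main__":
    fs = 16000  # assumed sampling rate of the wideband recordings
    kernel = bandpass_kernel(300.0, 3400.0, fs)  # assumed telephone passband
    for freq in (1000.0, 6000.0):
        filtered = fir_filter(tone(freq, fs, 1600), kernel)
        # Skip the edges, where zero padding distorts the output level.
        print(freq, round(rms(filtered[200:-200]), 3))
```

In such a setup, every wideband training utterance would be passed through `fir_filter` before feature extraction, so that the retrained models see spectra limited in roughly the same way as the telephone test speech.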