In this paper we introduce backward N-gram language model (LM) scores as a confidence measure in large vocabulary continuous speech recognition. Contrary to a forward N-gram LM, in which the probability of a word depends on the preceding words, a backward N-gram LM predicts a word based on the following words only. The backward LM is thus a model for sentences read from the end to the beginning. We show on the benchmark 20k-word Wall Street Journal recognition task that the backward LM scores contain information for the confidence measure that is complementary to the information in forward LM scores. The normalised cross entropy metric for confidence measures increases significantly, from 18.5% to 23.3%, when backward LM scores are added to a confidence measure that includes the forward LM scores.
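The backward N-gram idea can be illustrated with a toy sketch (not the paper's model): train a bigram LM on reversed sentences, so each word is predicted from the word that follows it. The corpus, markers, and function names below are purely illustrative.

```python
from collections import defaultdict

# Toy corpus for illustration only.
corpus = [
    ["the", "cat", "sat"],
    ["the", "dog", "sat"],
    ["the", "cat", "ran"],
]

# Count bigrams over reversed sentences; the sentence-end marker </s>
# now opens each reversed sentence and <s> closes it.
counts = defaultdict(lambda: defaultdict(int))
context_totals = defaultdict(int)
for sent in corpus:
    rev = ["</s>"] + sent[::-1] + ["<s>"]
    for ctx, w in zip(rev, rev[1:]):
        counts[ctx][w] += 1
        context_totals[ctx] += 1

def backward_bigram_prob(word, next_word):
    """P(word | next_word): probability of `word` given the word after it."""
    if context_totals[next_word] == 0:
        return 0.0
    return counts[next_word][word] / context_totals[next_word]

# "cat" precedes "sat" in 1 of the 2 sentences that end in "sat".
print(backward_bigram_prob("cat", "sat"))  # 0.5
```

A forward bigram trained on the same corpus answers the opposite question, P(word | previous word); combining both score streams is what gives the complementary information the abstract describes.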
In this paper, we investigate the use of the total likelihood (the weighted sum of the likelihoods of all possible state sequences) instead of its approximation by the Viterbi likelihood (the likelihood of the best state sequence) normally used in speech recognition. Besides its use in a recognizer, the use of total likelihoods in an automatic word alignment task is also briefly addressed. We describe how the search algorithm must be modified and how word lattices based on total likelihoods can be constructed. The total likelihood framework also requires us to distinguish between upgrading the language model scores and downgrading the acoustic model scores in the recognizer. To help decide between these two alternatives, some theoretical foundation is given for the practice of making a weighted combination of language and acoustic scores. Finally, the total likelihood and Viterbi frameworks are compared in terms of accuracy and computational effort on the Wall Street Journal recognition task, while the accuracy of word alignments is evaluated on a large Dutch corpus.
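The distinction between the two likelihoods can be sketched on a small HMM (this is a generic textbook illustration, not the paper's recognizer): the Viterbi likelihood keeps only the best-scoring state path, while the total likelihood sums over all paths via the forward recursion. All parameters below are made up for the example.

```python
import numpy as np

pi = np.array([0.6, 0.4])                # initial state probabilities
A = np.array([[0.7, 0.3], [0.4, 0.6]])   # transition probabilities
B = np.array([[0.9, 0.1], [0.2, 0.8]])   # emission probs for symbols 0/1
obs = [0, 1, 0]                          # toy observation sequence

def viterbi_likelihood(obs):
    # Max-product recursion: keep only the best path into each state.
    delta = pi * B[:, obs[0]]
    for o in obs[1:]:
        delta = (delta[:, None] * A).max(axis=0) * B[:, o]
    return delta.max()

def total_likelihood(obs):
    # Sum-product (forward) recursion: sum over all paths into each state.
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
    return alpha.sum()

print(viterbi_likelihood(obs))  # likelihood of the single best state path
print(total_likelihood(obs))    # sum over all paths; never smaller
```

The only change between the two functions is `max` versus `sum` in the recursion, which is exactly why the search algorithm, and not the model, must be modified to use total likelihoods.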